html entities

Bruce_D_Arcus1 · June 17, 2009, 5:43pm

So on HTML entities, they’re not per se a problem in the test suite,
or in HTML, but they are a more general problem, particularly if you
move that content to XML.

Illustration:

An x.xml file with this content …

&ldquo;

Fails on parsing:

$ xmllint x.xml
x.xml:1: parser error : Entity 'ldquo' not defined
&ldquo;

An XML parser will simply choke on it.
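The same failure can be reproduced without xmllint. The following is a minimal sketch using Python's standard-library XML parser; the `<x>` wrapper element and the `parses` helper are assumptions made for illustration and are not part of the original test file. It compares the named HTML entity against a numeric character reference and the literal character.

```python
import xml.etree.ElementTree as ET


def parses(xml_text):
    """Return True if xml_text is well-formed XML, False otherwise."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        # Raised for undefined entities such as &ldquo;
        return False


# Hypothetical one-element documents, used only for this comparison.
named = "<x>&ldquo;</x>"     # HTML named entity, not defined in plain XML
numeric = "<x>&#8220;</x>"   # numeric character reference for U+201C
literal = "<x>\u201c</x>"    # the character itself

for label, doc in (("named", named), ("numeric", numeric), ("literal", literal)):
    print(label, parses(doc))
```

Only the named entity is rejected; the numeric reference and the literal character both parse, since XML predefines just `amp`, `lt`, `gt`, `quot` and `apos`.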

So my concern is that if you build a library that is assuming it can
just output these HTML entities, that might be a problem when you go
to other output formats.

Now, this may not be an issue; don’t know. But I should probably raise it now.

Bruce

Frank_Bennett · June 17, 2009, 8:23pm

Unicode would sure be better. It’s easier to read, and if as you say
entities break everything, that clinches it. Anyone know the Unicode
addresses of double and single left and right quotes?

Frank

Bruce_D_Arcus1 · June 17, 2009, 8:32pm

http://en.wikipedia.org/wiki/List_of_XML_and_HTML_character_entity_references

Probably better to use the characters directly.
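For reference, the four quote characters are U+201C and U+201D (double left and right) and U+2018 and U+2019 (single left and right). As a rough sketch of "using the characters directly", the snippet below looks the code points up in Python's `html.entities` table and swaps named entities for the literal characters. The `entities_to_unicode` helper is a hypothetical name, and leaving the five XML-predefined entities untouched is a design assumption so that escaped markup stays escaped.

```python
import re
from html.entities import name2codepoint

# Entities that XML itself defines; these must stay escaped.
XML_PREDEFINED = {"amp", "lt", "gt", "quot", "apos"}


def entities_to_unicode(text):
    """Replace HTML named entities with literal characters (hypothetical helper)."""
    def repl(match):
        name = match.group(1)
        if name in XML_PREDEFINED or name not in name2codepoint:
            return match.group(0)  # leave unknown or XML-safe entities alone
        return chr(name2codepoint[name])

    return re.sub(r"&([A-Za-z][A-Za-z0-9]*);", repl, text)


# Print the code points asked about in this thread.
for name in ("ldquo", "rdquo", "lsquo", "rsquo"):
    print(f"&{name}; is U+{name2codepoint[name]:04X}")

print(entities_to_unicode("&ldquo;a &amp; b&rdquo;"))
```

Output produced this way is valid in both HTML and XML, provided the file is saved in an encoding that can represent the characters, such as UTF-8.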

In related thoughts, we should probably add “ltquote” and “rtquote”
(or similar) terms to locales?

Bruce

Frank_Bennett · June 17, 2009, 8:56pm

Quotes need to flip-flop, so lsquote and rsquote will also be needed.
If these will always be one character each, you wouldn’t lose anything
by sticking all four in a single field. It’s up to you; I’ll set up
whatever appears in the locales.
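A rough sketch of the flip-flop, assuming the four terms end up in the locale under the placeholder names used in this thread (ltquote, rtquote, lsquote, rsquote). The `LOCALE_QUOTES` table and the `quote` function are hypothetical and do not reflect any existing processor code.

```python
# Hypothetical locale terms; names are the placeholders from this thread.
LOCALE_QUOTES = {
    "ltquote": "\u201c",  # left double quotation mark
    "rtquote": "\u201d",  # right double quotation mark
    "lsquote": "\u2018",  # left single quotation mark
    "rsquote": "\u2019",  # right single quotation mark
}


def quote(text, depth=0, terms=LOCALE_QUOTES):
    """Wrap text in quotes, alternating double and single by nesting depth."""
    if depth % 2 == 0:
        return terms["ltquote"] + text + terms["rtquote"]
    return terms["lsquote"] + text + terms["rsquote"]


inner = quote("Title within a title", depth=1)
print(quote("An article about " + inner))
```

Even depths use the double quotes and odd depths the single ones, so a locale that prefers the opposite order would only need to swap the term values.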

Simon_Kornblith · June 17, 2009, 9:46pm

Is there a use case for these locale terms? We already have a quotes
attribute on each formatting element, and in light of locale-specific
formatting rules for quoted text (e.g., punctuation inside or outside
quotes), this seems like a better solution.

Simon
