HTML entities

HTML entities are not a problem per se in the test suite, or in HTML,
but they are a problem more generally, particularly if you move that
content to XML.

Illustration:

An x.xml file with this content …

fails to parse:

$ xmllint x.xml
x.xml:1: parser error : Entity 'ldquo' not defined

An XML parser will simply choke on it.
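
The same failure can be reproduced without xmllint. This is a minimal sketch using Python's standard-library XML parser; the sample strings are hypothetical stand-ins, since the actual content of x.xml is not shown above. It suggests that the named entity is rejected, while a numeric character reference or the literal character is accepted.

```python
import xml.etree.ElementTree as ET


def parses_as_xml(text: str) -> bool:
    """Return True if text is well-formed XML, False otherwise."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


# Hypothetical sample documents (not the original x.xml).
with_entity = "<p>&ldquo;quoted&rdquo;</p>"
with_numeric_ref = "<p>&#x201C;quoted&#x201D;</p>"
with_literal = "<p>\u201cquoted\u201d</p>"

print(parses_as_xml(with_entity))       # named HTML entity, undefined in plain XML
print(parses_as_xml(with_numeric_ref))  # numeric character reference
print(parses_as_xml(with_literal))      # literal Unicode characters
```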

My concern is that a library that assumes it can simply output these
HTML entities may run into problems when you move to other output
formats.

Now, this may not be an issue; I don't know. But I should probably raise it now.

Bruce

Unicode would sure be better. It's easier to read, and if, as you say,
entities break everything, that clinches it. Anyone know the Unicode
code points of the double and single left and right quotes?

Frank
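
One way to answer the code-point question is to look the entity names up in Python's standard `html.entities` table. This is only a sketch; the helper name `quote_code_points` is made up for illustration.

```python
from html.entities import name2codepoint

# The four quote entities discussed in this thread.
QUOTE_ENTITIES = ("ldquo", "rdquo", "lsquo", "rsquo")


def quote_code_points() -> dict[str, str]:
    """Map each entity name to its code point in U+XXXX notation."""
    return {name: f"U+{name2codepoint[name]:04X}" for name in QUOTE_ENTITIES}


for name, code_point in quote_code_points().items():
    print(f"&{name};  {code_point}")
# &ldquo;  U+201C
# &rdquo;  U+201D
# &lsquo;  U+2018
# &rsquo;  U+2019
```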

http://en.wikipedia.org/wiki/List_of_XML_and_HTML_character_entity_references

Probably better to use the characters directly.
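
As a rough sketch of what "use the characters directly" could look like for existing content, the function below swaps named HTML entities for the literal characters. It assumes the five entities XML predefines (amp, lt, gt, quot, apos) should be left alone so the markup stays well-formed; the function name is hypothetical and not part of any library discussed here.

```python
import re
from html.entities import name2codepoint

# Entities that XML itself defines; replacing these could break the markup.
XML_PREDEFINED = {"amp", "lt", "gt", "quot", "apos"}

_ENTITY = re.compile(r"&([A-Za-z][A-Za-z0-9]*);")


def entities_to_characters(text: str) -> str:
    """Replace named HTML entities with literal Unicode characters."""

    def replace(match: re.Match) -> str:
        name = match.group(1)
        # Leave XML's own entities and unknown names untouched.
        if name in XML_PREDEFINED or name not in name2codepoint:
            return match.group(0)
        return chr(name2codepoint[name])

    return _ENTITY.sub(replace, text)


print(entities_to_characters("&ldquo;A &amp; B&rdquo;"))
```

The result should be written out in a Unicode encoding such as UTF-8, otherwise the literal characters may not survive.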

On a related note, we should probably add “ltquote” and “rtquote”
(or similar) terms to locales?

Bruce

Quotes need to flip-flop, so lsquote and rsquote will also be needed.
If these will always be one character each, you wouldn’t lose anything
by sticking all four in a single field. It’s up to you; I’ll set up
whatever appears in the locales.
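
To illustrate the flip-flop idea with a single four-character field, here is a sketch. It assumes the field holds the characters in the order left double, right double, left single, right single; that order, the default value, and the function names are all hypothetical and not an agreed locale format.

```python
def split_quote_field(field: str) -> tuple[tuple[str, str], tuple[str, str]]:
    """Split a four-character field into (outer pair, inner pair)."""
    if len(field) != 4:
        raise ValueError("expected exactly four quote characters")
    return (field[0], field[1]), (field[2], field[3])


def wrap_in_quotes(text: str, depth: int,
                   field: str = "\u201c\u201d\u2018\u2019") -> str:
    """Quote text, alternating outer and inner marks by nesting depth."""
    outer, inner = split_quote_field(field)
    # Even depths use the outer pair, odd depths the inner pair.
    open_mark, close_mark = outer if depth % 2 == 0 else inner
    return f"{open_mark}{text}{close_mark}"


inner_title = wrap_in_quotes("A Title", depth=1)
print(wrap_in_quotes(f"On {inner_title} and Others", depth=0))
```

A locale that nests the other way round would only need the two pairs swapped in its field.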

Is there a use case for these locale terms? We already have a quotes
attribute on each formatting element, and in light of locale-specific
formatting rules for quoted text (e.g., punctuation inside or outside
quotes), this seems like a better solution.

Simon