XML Normalization

Bruce_D_Arcus1 · June 23, 2020, 5:57pm

I’m still a bit unsure.

The Relax NG book has some discussion of whitespace normalization, but as it relates to validation.

From this perspective, when we use text in a pattern, we’re saying “preserve all whitespace.”

When we use "foo", that’s shorthand for a token value, which in this case means “normalize whitespace”, so I think "this that" and "this that " are treated as equivalent.
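To make that concrete, here is a rough Python sketch of what I understand token comparison to do: collapse runs of XML whitespace to a single space and trim both ends before comparing. The names normalize_token and token_equal are mine, made up for illustration, and this is not taken from any validator's code.

```python
import re

# XML whitespace is only space, tab, carriage return, and line feed.
_XML_WS = re.compile(r"[ \t\r\n]+")


def normalize_token(value: str) -> str:
    """Collapse runs of XML whitespace to one space and trim both ends."""
    return _XML_WS.sub(" ", value).strip(" ")


def token_equal(a: str, b: str) -> bool:
    """Compare two strings the way I assume a token value is compared."""
    return normalize_token(a) == normalize_token(b)


if __name__ == "__main__":
    print(token_equal("this that", "this that "))
    print(token_equal("this that", "thisthat"))
```

Under this reading, trailing or repeated whitespace does not matter for validation, but whitespace between words still has to be there.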

But you’re talking about parsing, where the concern is that a parser may throw out the whitespace before you see it.

If I just run xmllint on a sample file, it preserves whitespace.

❯ xmllint --format test.xml
<?xml version="1.0"?>
<style>
  <text foo="test, " bar="one    "/>
  <text foo="test,   " bar="one "/>
</style>

What library are you using? Is there not some option to modify whitespace processing, as here?
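For comparison, here is a quick check with Python's standard library parser (xml.etree.ElementTree), using the same sample inlined as a string. It is only a sketch of how one might test a given library, and says nothing about how other parsers behave.

```python
import xml.etree.ElementTree as ET

# Same content as test.xml above, inlined so the check is self-contained.
SAMPLE = """<style>
  <text foo="test, " bar="one    "/>
  <text foo="test,   " bar="one "/>
</style>"""


def attribute_values(xml_text: str) -> list:
    """Return the attributes of each <text> element as the parser reports them."""
    root = ET.fromstring(xml_text)
    return [dict(el.attrib) for el in root.findall("text")]


if __name__ == "__main__":
    for attrs in attribute_values(SAMPLE):
        # repr() makes trailing and repeated spaces visible.
        print({name: repr(value) for name, value in attrs.items()})
```

If the trailing and repeated spaces show up in the printed values, the parser is handing them through and any normalization is happening in a later step.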
