I’m still a bit unsure.
The Relax NG book has some discussion of whitespace normalization, but as it relates to validation.
From this perspective, when we use `text` in a pattern, we’re saying “preserve all whitespace.” When we use `"foo"`, that’s shorthand for a value of type `token`, which in this case means “normalize whitespace”, so I think "this that" and "this that " are equivalent.
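To make that concrete, here is a rough Python sketch of how I understand `token` comparison to work during validation. The helper names `normalize_token` and `token_equal` are mine for illustration; they are not from any Relax NG library, and a real validator may differ in details.

```python
import re

# Whitespace as XML defines it: space, tab, carriage return, line feed.
_XML_WS = re.compile(r"[ \t\r\n]+")


def normalize_token(value: str) -> str:
    # Collapse each run of whitespace to a single space, then trim both ends.
    # This is my reading of the built-in token datatype, not library code.
    return _XML_WS.sub(" ", value).strip(" ")


def token_equal(a: str, b: str) -> bool:
    # Two values match under token if they are identical after normalization.
    return normalize_token(a) == normalize_token(b)


print(token_equal("this that", "this  that "))  # True
print(token_equal("this that", "thisthat"))     # False
```

Note that this only affects whether a value matches the schema; it says nothing about what the parser hands to the application.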
But you’re talking about parsing, where the concern is that a parser may throw out the whitespace before you ever see it.
If I just run xmllint on a sample file, it preserves whitespace:
❯ xmllint --format test.xml
<?xml version="1.0"?>
<style>
<text foo="test, " bar="one "/>
<text foo="test, " bar="one "/>
</style>
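For comparison, here is the same kind of check done programmatically instead of with xmllint. It is only a sketch using Python's standard `xml.etree.ElementTree`, which may well not be the library you are using; the `SAMPLE` string and the `attribute_values` helper are made up for this example.

```python
import xml.etree.ElementTree as ET

# A cut-down version of the sample file, with trailing spaces in the attributes.
SAMPLE = '<style><text foo="test, " bar="one "/></style>'


def attribute_values(xml_text: str) -> list[dict[str, str]]:
    # Parse the document and return the attributes of every <text> element
    # exactly as the parser reports them.
    root = ET.fromstring(xml_text)
    return [dict(el.attrib) for el in root.iter("text")]


for attrs in attribute_values(SAMPLE):
    # repr() makes any trailing space visible in the printed output.
    print({name: repr(value) for name, value in attrs.items()})
```

With this parser the trailing spaces in the attribute values come through untouched, so if they are disappearing for you, I would suspect a library default or option rather than XML itself.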
What library are you using? Is there not some option to modify whitespace processing, as here?