Dear all,
I have a few questions about how cite processors are expected to handle text-case formatting. Many of the unit test inputs contain ‘nocase’ span elements, but I have found no mention anywhere in the specification that item fields are expected to be HTML fragments. Is it a general rule that items may contain HTML markup? And are there any special classes other than ‘nocase’?
More importantly, though, what is the exact meaning of the ‘nocase’ class? I was assuming it directs the processor to ignore the contents of the span when applying a given text-case format; however, this is not what the unit tests seem to imply. Consider the following examples:
textcase_Lowercase.txt:
input: "This is a pen that is a <span class="nocase">Smith</span> pencil"
expected result (using ‘lowercase’): “this is a pen that is a smith pencil”
So in this case, the processor strips the span tag from the input and applies the formatting rules regardless.
textcase_TitleCapitalization.txt:
input: "This IS a pen that is a <span class="nocase">smith</span> pencil"
expected result (using title-case): “This IS a Pen That Is a smith Pencil”
Now the processor seems to strip away the span tag but does not apply the format to its contents.
Furthermore, would it not be sensible to turn ‘IS’ into ‘Is’ when applying title-case?
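To make the question concrete, here is a rough Python sketch of the behaviour the two tests seem to imply. The function names and the stop-word list are my own assumptions and are not taken from the specification or from any processor; I am also assuming the span closes right after the single word (Smith / smith).

```python
import re

# Matches either a nocase span (group 1 = its contents) or a plain word (group 2).
TOKEN = re.compile(r'<span class="nocase">(.*?)</span>|([A-Za-z]+)')
NOCASE = re.compile(r'<span class="nocase">(.*?)</span>')

# Assumed stop-word list, for illustration only.
STOP_WORDS = {"a", "an", "and", "the", "of", "in", "on", "or", "to"}


def lowercase(text):
    """What textcase_Lowercase seems to expect: strip the tag, lowercase everything."""
    return NOCASE.sub(r"\1", text).lower()


def title_case(text):
    """What textcase_TitleCapitalization seems to expect: strip the tag,
    leave the span contents and any word containing capitals untouched."""
    seen_word = False

    def repl(match):
        nonlocal seen_word
        protected, word = match.group(1), match.group(2)
        is_first = not seen_word
        seen_word = True
        if protected is not None:
            return protected          # span contents pass through unchanged
        if word != word.lower():
            return word               # already contains capitals, e.g. "IS"
        if word in STOP_WORDS and not is_first:
            return word               # stop words stay lowercase
        return word.capitalize()

    return TOKEN.sub(repl, text)


print(lowercase('This is a pen that is a <span class="nocase">Smith</span> pencil'))
print(title_case('This IS a pen that is a <span class="nocase">smith</span> pencil'))
```

With these assumptions the sketch reproduces both expected results, which is exactly what puzzles me: ‘nocase’ is honoured in one function and ignored in the other.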
I would greatly appreciate it if anyone could help clarify these issues or point me to the document where these formatting rules are specified in more detail.
Thanks!
Sylvester