I’m having trouble squaring the expected test results with the spec.
Sentence case conversion (with `text-case` set to “sentence”) is performed by:
- For uppercase strings, the first character of the string remains capitalized. All other letters are lowercased.
- For lower or mixed case strings, the first character of the first word is capitalized if the word is lowercase. The case of all other words stays the same.
So:
- Example 1: `THIS IS A SENTENCE` is an uppercase string, which falls under rule 1, and I would expect `This is a sentence`.
- Example 2: `This is a Sentence` is a mixed case string, but strictly speaking falls under neither rule, because the first word is not lowercase, but mixed case. I would assume `This is a Sentence`.
- Example 3: `this is a Sentence` is a mixed case string, which falls under rule 2, so the first word is capitalised but the case of all other words stays the same: `This is a Sentence`.
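To make my reading concrete, here is a minimal Python sketch of the two rules applied literally. The function name `sentence_case_as_written` is my own invention for illustration, and splitting words on a single space is a simplifying assumption; this is not how any CSL processor actually implements it.

```python
def sentence_case_as_written(text: str) -> str:
    """Apply the two sentence-case rules literally, as I read them."""
    if not text:
        return text
    # Rule 1: an all-uppercase string keeps its first character and
    # has every other letter lowercased.
    if text.isupper():
        return text[0] + text[1:].lower()
    # Rule 2: for lowercase or mixed case strings, capitalize the first
    # word only if that word is entirely lowercase; leave the rest alone.
    # (Assumption: words are separated by a single space.)
    first, sep, rest = text.partition(" ")
    if first.islower():
        first = first[0].upper() + first[1:]
    return first + sep + rest


for example in ("THIS IS A SENTENCE", "This is a Sentence", "this is a Sentence"):
    print(example, "->", sentence_case_as_written(example))
```

Under this reading, the three examples give `This is a sentence`, `This is a Sentence` and `This is a Sentence` respectively.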
However … `textcase_SentenceCapitalization.txt` takes the input
`This is a Pen that is a <span class="nocase">Smith</span> Pencil`
and expects the output
`This is a pen that is a Smith pencil`
That seems to be applying a different rule 2, in which, for mixed case strings, the first letter of the string is capitalized and all other words are converted to lowercase (unless “protected” with a `nocase` tag). If I apply the rules as written, I’d expect to get
`This is a Pen that is a Smith Pencil`
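For comparison, here is a rough Python sketch of the rule the test seems to apply instead. It is only my guess at the behaviour, not the processor's actual code: the name `sentence_case_as_tested` is hypothetical, I am assuming the `nocase` span closes right after `Smith`, and the sketch drops the span markup from its output.

```python
import re

# Assumption: protected text is wrapped in exactly this markup.
NOCASE = re.compile(r'<span class="nocase">(.*?)</span>')


def sentence_case_as_tested(text: str) -> str:
    """Lowercase everything outside nocase spans, then capitalize the first letter."""
    # re.split with one capture group alternates unprotected text (even
    # indices) with the contents of nocase spans (odd indices).
    parts = NOCASE.split(text)
    pieces = [part if i % 2 else part.lower() for i, part in enumerate(parts)]
    result = "".join(pieces)
    # Simplification: the first character is uppercased even if it came
    # from a protected span.
    return result[:1].upper() + result[1:]


print(sentence_case_as_tested(
    'This is a Pen that is a <span class="nocase">Smith</span> Pencil'
))
```

This reproduces the output the test expects, `This is a pen that is a Smith pencil`, which is not what the two rules as written would give.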
Is the test here wrong?