I think there are some ambiguities in, and limitations of, is-numeric that we should resolve.
Per the spec, is-numeric has the following behavior:
Tests whether the given variables (Appendix IV - Variables) contain numeric content. Content is considered numeric if it solely consists of numbers. Numbers may have prefixes and suffixes (“D2”, “2b”, “L2d”), and may be separated by a comma, hyphen, or ampersand, with or without spaces (“2, 3”, “2-4”, “2 & 4”). For example, “2nd” tests “true” whereas “second” and “2nd edition” test “false”.
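To make the current wording concrete, here is a minimal Python sketch of one possible reading of it. This is not taken from any CSL processor. The names `NUMBER`, `SEPARATOR`, and `is_numeric_current` are hypothetical, and I am assuming that prefixes and suffixes are ASCII letters, which the spec does not actually say.

```python
import re

# One number: optional letter prefix, digits, optional letter suffix
# (e.g. "D2", "2b", "L2d"). Assumption: prefixes/suffixes are ASCII letters.
NUMBER = r"[A-Za-z]*\d+[A-Za-z]*"
# Separators named by the spec: comma, hyphen, ampersand, with or without spaces.
SEPARATOR = r"\s*[,\-&]\s*"
IS_NUMERIC = re.compile(rf"{NUMBER}(?:{SEPARATOR}{NUMBER})*")


def is_numeric_current(value: str) -> bool:
    """Return True if the whole value is a separated list of numbers."""
    return IS_NUMERIC.fullmatch(value.strip()) is not None


for sample in ["2nd", "2, 3", "2 & 4", "second", "2nd edition"]:
    print(sample, is_numeric_current(sample))
```

Under this reading, “2nd”, “2, 3”, and “2 & 4” test true, while “second” and “2nd edition” test false, matching the examples in the spec text. A value such as “5R01HD081252-04” also tests false here, because digits reappear after the letter suffix.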
I’m not exactly sure this description achieves the desired goals for the test.
The main applications of is-numeric are:
- Testing for things like edition: Revised ed. versus edition: 2 to control formatting (e.g., ordinalizing and adding “ed.” for edition: 2, or adding the prefix “No.” before a number).
- Extracting the integer number content for sorting.
For sorting, this is the relevant spec text about numbers:
numbers: Number variables called via the variable attribute are returned as integers (form is “numeric”). If the original variable value only consists of non-numeric text, the value is returned as a text string.
Number variables rendered within the macro with cs:number and date variables are treated the same as when they are called via variable. The only exception is that the complete date is returned if a date variable is called via the variable attribute. In contrast, macros return only those date-parts that would otherwise be rendered (respecting the value of the date-parts attribute for localized dates, or the listing of cs:date-part elements for non-localized dates).
“A2” should be sorted before “B1”, which by my reading contradicts the spec: if both are returned as integers (2 and 1), “B1” sorts first.
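A small Python sketch of the concern, under the assumption that “returned as integers” means taking the first run of digits in the value. The spec does not say how the integer is derived from something like “A2”, so `integer_sort_key` is a hypothetical helper for illustration only.

```python
import re


def integer_sort_key(value: str):
    """Hypothetical reading of 'returned as integers': use the first digit run."""
    match = re.search(r"\d+", value)
    # Values with no digits fall back to the text itself, and sort after numbers.
    return (0, int(match.group()), "") if match else (1, 0, value)


values = ["B1", "A2"]
by_integer = sorted(values, key=integer_sort_key)  # compares 1 and 2
by_text = sorted(values)                           # compares "A2" and "B1"
print(by_integer, by_text)
```

With this reading, the integer-based order puts “B1” ahead of “A2”, while plain text order puts “A2” first.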
For rendering, here are five potential variables:
edition: Revised ed.
number: Season 4, Episode 3
The first two are really clear, and I think this was the main impetus behind is-numeric. Those easily fit the pattern:

    <choose>
      <if is-numeric="edition">
        <group delimiter=" ">
          <number variable="edition" form="ordinal-short"/>
          <label variable="edition" form="short"/>
        </group>
      </if>
      <else>
        <text variable="edition"/>
      </else>
    </choose>
The third is also clearly a number, and the fourth is also clearly not a number. But I think the fifth should be treated like a number for formatting.
The case I am thinking about at the moment is that I want to test whether number is numeric to control whether “No.” is added before the number. The desired output for 3-5 above is:
Season 4, Episode 3
There currently isn’t a way to get (5) to be formatted like (3) rather than like (4).
I think a much simpler spec for is-numeric would be:
Content is considered numeric if all contained words include one or more numbers. Numbers may have prefixes and suffixes (“D2”, “2b”, “L2d”), may be separated by non-numeric characters (e.g., “5R01HD081252-04”), and may be separated by a comma, hyphen, or ampersand, with or without spaces (“2, 3”, “2-4”, “2 & 4”). Content is considered non-numeric if any word consists solely of non-numeric characters (e.g., “second”, “revised edition”, “number 3”).
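As a rough illustration of the proposed wording, here is a minimal Python sketch. It assumes that “words” are delimited by whitespace and by the listed separators (comma, hyphen, ampersand), and that “numbers” means the ASCII digits 0-9. Both are my assumptions, and `is_numeric_proposed` is a hypothetical name.

```python
import re


def is_numeric_proposed(value: str) -> bool:
    """Return True if every word in the value contains at least one digit."""
    # Assumption: words are split on whitespace, commas, ampersands, and hyphens.
    words = [w for w in re.split(r"[\s,&\-]+", value) if w]
    # Empty content is treated as non-numeric.
    return bool(words) and all(re.search(r"[0-9]", w) for w in words)


for sample in ["5R01HD081252-04", "2 & 4", "second", "revised edition", "number 3"]:
    print(sample, is_numeric_proposed(sample))
```

In this sketch “5R01HD081252-04” and “2 & 4” test true, while “second”, “revised edition”, and “number 3” test false, in line with the examples in the proposed text.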
I think this fits all of the needs correctly. If not, and we keep the existing is-numeric spec, I suggest we add an is-numberlike test to cover the above case.
Revised sorting text would be:
numbers: If a number variable consists solely of numbers, commas, hyphens, ampersands, and spaces, when called via the variable attribute, it is returned as the first integer before a space or punctuation (form is “numeric”). If the number variable value contains any other characters, the value is returned as a text string.
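A minimal Python sketch of the revised sorting text, assuming “numbers” means ASCII digits and that “the first integer before a space or punctuation” is the first run of digits in the value. `PLAIN_NUMBERS` and `sort_value` are hypothetical names used only for illustration.

```python
import re

# Only digits, commas, hyphens, ampersands, and spaces are allowed.
PLAIN_NUMBERS = re.compile(r"[0-9,\-& ]+")


def sort_value(value: str):
    """Return an int for plain numeric content, otherwise the original text."""
    if PLAIN_NUMBERS.fullmatch(value):
        first = re.search(r"[0-9]+", value)
        if first:
            return int(first.group())
    # Any other character (e.g. a letter prefix) keeps the value as text.
    return value


print(sort_value("2-4"), sort_value("12, 15"), sort_value("A2"))
```

Here “2-4” and “12, 15” come back as the integers 2 and 12, while “A2” and “B1” stay text, so “A2” sorts before “B1”. A real sort key would also have to define how integer and text values are ordered relative to each other, since Python cannot compare the two types directly.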