Locale ordinals and gender

Dear all,

I have several questions about CSL Locales in regard to the use of ordinals and the gender or gender-form attribute.

(Below are my original questions. I have meanwhile come across a previous posting, http://sourceforge.net/mailarchive/message.php?msg_id=26629141, which answers my question about the distinction between the two attributes. Still, what about cases where the gender is tied to the content of a field? For example, what if I had a different form of ‘ed.’ depending on the editor’s sex?)

  1. What is the difference between ‘gender’ and ‘gender-form’? Both are used, for example, in test case number_EditionOrdinalMasculine.txt. (I have a hunch that this is the answer to my question 3 below; that by using ‘gender’ on the term ‘edition’ I am declaring the variable ‘edition’ to be masculine and thus give precedence to the masculine form of the associated number.)

  2. How is gender specified in a style? I would imagine the process ought to be somewhat similar to how the processor distinguishes between singular and plural forms, i.e., by having a contextual or forced mode and, for the contextual mode, some way to specify the context. Are the rules for this described somewhere?

  3. To return to the number_EditionOrdinalMasculine test case: why is the masculine form prioritized here? The expected output is ‘stMASC’, but ‘ordinal-01’ is defined (without gender) as ‘st’ both in the style and in the fallback locale (en-US); so why is the masculine form picked by default?

  4. About ordinals in general: in the CSL schema ordinal-01 through ordinal-04 are defined, as well as long-ordinal-01 through long-ordinal-10; is this supposed to be an exhaustive list? My implementation currently checks for a direct match and, should that fail, checks the remainders consecutively. Thus, to ordinalize 123 I would try to match ‘ordinal-123’, then ‘ordinal-23’, then ‘ordinal-03’, which would result in ‘123rd’. To ordinalize 113 I would, again, try to match ‘ordinal-113’, ‘ordinal-13’, ‘ordinal-03’, which would lead to ‘113rd’; so obviously, for this to work correctly, I would need to define ‘ordinal-13’ as ‘th’ in the locale. For English this works quite well: I would need to define ‘ordinal-00’ through ‘ordinal-13’; furthermore, I could define any other ordinal should I run into a special case somewhere. My question is: is this still valid CSL? If not, how is a citeproc processor supposed to ordinalize, for example, 113 or 123?

  5. Similarly, are long-ordinals supposed to be limited to the numbers 1-10? Is the expected fallback for larger numbers to use the regular ordinalized form?
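
The lookup described in question 4 can be sketched roughly as follows. This is only an illustration in Python: the names `candidate_terms`, `ordinalize` and `EN_TERMS` are made up for the example, and the terms beyond ordinal-04 are the hypothetical additions discussed in that question, not terms the current schema defines.

```python
def candidate_terms(number: int) -> list[str]:
    """Term names to try, most specific first (e.g. 123 -> 123, 23, 03)."""
    digits = str(abs(number))
    names: list[str] = []
    for start in range(len(digits)):
        # Drop leading digits one at a time; pad a single digit to two places.
        name = f"ordinal-{digits[start:].zfill(2)}"
        if name not in names:
            names.append(name)
    return names


def ordinalize(number: int, terms: dict[str, str]) -> str:
    """Append the suffix of the first matching term; no match means no suffix."""
    for name in candidate_terms(number):
        if name in terms:
            return f"{number}{terms[name]}"
    return str(number)


# Hypothetical English locale: ordinal-00 through ordinal-13 (assumption,
# only ordinal-01 through ordinal-04 exist in the schema today).
EN_TERMS = {f"ordinal-{n:02d}": "th" for n in range(14)}
EN_TERMS.update({"ordinal-01": "st", "ordinal-02": "nd", "ordinal-03": "rd"})

# Only the four terms the schema currently defines.
SCHEMA_TERMS = {k: EN_TERMS[k] for k in
                ("ordinal-01", "ordinal-02", "ordinal-03", "ordinal-04")}

print(ordinalize(123, EN_TERMS))      # 123rd
print(ordinalize(113, EN_TERMS))      # 113th
print(ordinalize(113, SCHEMA_TERMS))  # 113rd
```

With only the four schema terms available, 113 ends up matching ‘ordinal-03’, which is the ‘113rd’ problem described in question 4.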

Thanks!

Sylvester

First of all, the whole gender thing is still a proposal, and isn’t part of
CSL 1.0 (although it could appear in 1.0.1). With that in mind:

still what about cases where the gender is tied to the content in a field?
For example, if I had a different form of ‘ed.’ dependent on the editor’s
sex?)

Are there styles that demand this kind of formatting? In addition to CSL
support, this would require hints in the input data about the gender of
individual contributors, which seems a bit overkill (and in many cases it’s
not trivial to get this information).

  1. What is the difference between ‘gender’ and ‘gender-form’? Both are
    used, for example, in test case number_EditionOrdinalMasculine.txt. (I have
    a hunch that this is the answer to my question 3 below; that by using
    ‘gender’ on the term ‘edition’ I am declaring the variable ‘edition’ to be
    masculine and thus give precedence to the masculine form of the associated
    number.)

We discovered we needed to distinguish between “gender” and “gender-form” to
allow terms to be properly redefined via the locale section of styles.
“gender” is used to indicate the gender of a particular noun, whereas
“gender-form” is used to indicate the various gender forms of terms like
ordinals (which can exist in more than one gender).

  2. How is gender specified in a style? I would imagine the process ought to
    be somewhat similar to how the processor distinguishes between singular and
    plural forms, i.e., by having a contextual or forced mode and, for the
    contextual mode, some way to specify the context. Are the rules for this
    described somewhere?

The “gender” is set as an attribute on the term.

  3. To return to the number_EditionOrdinalMasculine test case: why is the
    masculine form prioritized here? The expected output is ‘stMASC’, but
    ‘ordinal-01’ is defined (without gender) as ‘st’ both in the style and in
    the fallback locale (en-US); so why is the masculine form picked by default?

This is a bit of a tricky situation. For numbers rendered as ordinals, the
gender form picked for the ordinal term is based on the gender of the
related term. As the style defines (the “long” form of) the “edition” term
as masculine, the masculine ordinal term is used.
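
A rough sketch of that selection rule in Python, under stated assumptions: the dictionaries `TERM_GENDER` and `ORDINAL_TERMS` and the function `ordinal_suffix` are hypothetical stand-ins for a parsed locale, the feminine entry is invented for illustration, and falling back to the ungendered form when no matching gender form exists is an assumption rather than something the proposal spells out.

```python
from typing import Optional

# "gender" attribute: the gender of a noun term (set on its "long" form).
TERM_GENDER: dict[str, str] = {"edition": "masculine"}

# "gender-form" attribute: the gender variants of an ordinal term.
# The key is (term name, gender-form); None marks the ungendered definition.
ORDINAL_TERMS: dict[tuple[str, Optional[str]], str] = {
    ("ordinal-01", None): "st",
    ("ordinal-01", "masculine"): "stMASC",
    ("ordinal-01", "feminine"): "stFEM",  # invented for illustration
}


def ordinal_suffix(ordinal_name: str, related_term: str) -> str:
    """Pick the ordinal variant whose gender-form matches the related term."""
    gender = TERM_GENDER.get(related_term)  # None if the term has no gender
    if gender is not None and (ordinal_name, gender) in ORDINAL_TERMS:
        return ORDINAL_TERMS[(ordinal_name, gender)]
    # Assumed fallback: use the ungendered definition, or nothing at all.
    return ORDINAL_TERMS.get((ordinal_name, None), "")


print(ordinal_suffix("ordinal-01", "edition"))  # stMASC
print(ordinal_suffix("ordinal-01", "volume"))   # st
```

Here ‘edition’ carries the gender, so the masculine variant wins even though an ungendered ‘st’ is also defined; ‘volume’ has no gender in this toy locale and gets the plain form.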

  4. About ordinals in general: in the CSL schema ordinal-01 through
    ordinal-04 are defined, as well as long-ordinal-01 through long-ordinal-10;
    is this supposed to be an inclusive list?

Yes. The CSL schema doesn’t allow this list to be extended.

My implementation currently checks for a direct match and, should that
fail, by checking the remainders consecutively. Thus, to ordinalize 123 I
would try to match ‘ordinal-123’, then ‘ordinal-23’, then ‘ordinal-03’ which
would result in ‘123rd’; to ordinalize 113, again, I would try to match
‘ordinal-113’, ‘ordinal-13’, ‘ordinal-03’ which would lead to ‘113rd’, so
obviously for this to work correctly, I would need to define ‘ordinal-13’ as
‘th’ in the locale. For English this works quite well: I would need to
define ‘ordinal-00’ through ‘ordinal-13’; furthermore, I could define any
other ordinal should I run into a special case somewhere. My question is:
is this still valid CSL? If not, how is a citeproc processor supposed to
ordinalize, for example, 113 or 123?

long-ordinals fall back to ordinals (this, and overall ordinal behavior,
should probably be discussed in detail in the spec), so 113 would become
“113rd” if rendered as a long-ordinal.

  5. Similarly, are long-ordinals supposed to be limited to the numbers 1-10?
    Is the expected fallback for larger numbers to use the regular ordinalized
    form?

Yes.
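
As a minimal sketch of that fallback (Python, illustrative only): the names `LONG_ORDINALS`, `ORDINALS`, `short_ordinal` and `long_ordinal` are hypothetical, the English words are placeholder locale values, and choosing the suffix by the last digit is an assumption that reproduces the “113rd” result mentioned above, since the exact ordinal behavior is not yet spelled out in the spec.

```python
# long-ordinal-01 .. long-ordinal-10 (placeholder English values).
LONG_ORDINALS = {
    f"long-ordinal-{i:02d}": word
    for i, word in enumerate(
        ["first", "second", "third", "fourth", "fifth",
         "sixth", "seventh", "eighth", "ninth", "tenth"],
        start=1,
    )
}

# ordinal-01 .. ordinal-04, the only short ordinal terms in the schema.
ORDINALS = {"ordinal-01": "st", "ordinal-02": "nd",
            "ordinal-03": "rd", "ordinal-04": "th"}


def short_ordinal(number: int) -> str:
    """Assumed rule: last digit 1-3 picks ordinal-01..03, all else ordinal-04."""
    last = number % 10
    name = f"ordinal-{last:02d}" if 1 <= last <= 3 else "ordinal-04"
    return f"{number}{ORDINALS[name]}"


def long_ordinal(number: int) -> str:
    """Use the long-ordinal term if one exists, else fall back to the ordinal."""
    word = LONG_ORDINALS.get(f"long-ordinal-{number:02d}")
    return word if word is not None else short_ordinal(number)


print(long_ordinal(3))    # third
print(long_ordinal(10))   # tenth
print(long_ordinal(113))  # 113rd
```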

Rintze

On Wed, Mar 2, 2011 at 10:11 AM, Sylvester Keil <@Sylvester_Keil> wrote:

Rintze,

thanks for this!

First of all, the whole gender thing is still a proposal, and isn’t part of CSL 1.0 (although it could appear in 1.0.1). With that in mind:

still what about cases where the gender is tied to the content in a field? For example, if I had a different form of ‘ed.’ dependent on the editor’s sex?)

Are there styles that demand this kind of formatting? In addition to CSL support, this would require hints in the input data about the gender of individual contributors, which seems a bit overkill (and in many cases it’s not trivial to get this information).

I am not aware of a style that would require this, no; rather, I was being carried away by all the complexity that could possibly be necessary.

  1. What is the difference between ‘gender’ and ‘gender-form’? Both are used, for example, in test case number_EditionOrdinalMasculine.txt. (I have a hunch that this is the answer to my question 3 below; that by using ‘gender’ on the term ‘edition’ I am declaring the variable ‘edition’ to be masculine and thus give precedence to the masculine form of the associated number.)

We discovered we needed to distinguish between “gender” and “gender-form” to allow terms to be properly redefined via the locale section of styles. “gender” is used to indicate the gender of a particular noun, whereas “gender-form” is used to indicate the various gender forms of terms like ordinals (which can exist in more than one gender).

So, right now, the intention is that only localized terms can have gender specific variants, because the only way to define gender that the processor understands is by using the ‘gender’ attribute on the term in question, correct?

  3. To return to the number_EditionOrdinalMasculine test case: why is the masculine form prioritized here? The expected output is ‘stMASC’, but ‘ordinal-01’ is defined (without gender) as ‘st’ both in the style and in the fallback locale (en-US); so why is the masculine form picked by default?

This is a bit of a tricky situation. For numbers rendered as ordinals, the gender form picked for the ordinal term is based on the gender of the related term. As the style defines (the “long” form of) the “edition” term as masculine, the masculine ordinal term is used.

  • From your answer I infer that it is possible to define gender for different forms (e.g., long and short) independently. Have you agreed on fallback scenarios if a given gender form is not available? That is to say, if the masculine short form of a term were requested but missing, should the processor return, for example, the short form with another or no gender, or the long form with the correct gender, etc.?

  4. About ordinals in general: in the CSL schema ordinal-01 through ordinal-04 are defined, as well as long-ordinal-01 through long-ordinal-10; is this supposed to be an inclusive list?

Yes. The CSL schema doesn’t allow this list to be extended.

Is there any chance you would consider allowing for the addition of more ordinal definitions in the future?

Sylvester

So, right now, the intention is that only localized terms can have gender
specific variants, because the only way to define gender that the processor
understands is by using the ‘gender’ attribute on the term in question,
correct?

Yes

  3. To return to the number_EditionOrdinalMasculine test case: why is the
    masculine form prioritized here? The expected output is ‘stMASC’, but
    ‘ordinal-01’ is defined (without gender) as ‘st’ both in the style and in
    the fallback locale (en-US); so why is the masculine form picked by default?

This is a bit of a tricky situation. For numbers rendered as ordinals,
the gender form picked for the ordinal term is based on the gender of the
related term. As the style defines (the “long” form of) the “edition” term
as masculine, the masculine ordinal term is used.

  • From your answer I infer that it is possible to define gender for
    different forms (e.g., long and short) independently. Have you agreed on
    fallback scenarios if a given gender form is not available? That is to say,
    if the masculine short form of a term were requested but missing, should the
    processor return, for example, the short form with another or no gender, or
    the long form with the correct gender etc.?

In the proposal, gender can only be assigned to the default “long” form of a
term. This is based on the assumption that gender isn’t form-specific, and
circumvents any issues with fallback behavior and conflicting gender
assignments.

  4. About ordinals in general: in the CSL schema ordinal-01 through
    ordinal-04 are defined, as well as long-ordinal-01 through long-ordinal-10;
    is this supposed to be an inclusive list?

Yes. The CSL schema doesn’t allow this list to be extended.

Is there any chance you would consider allowing for the addition of more
ordinal definitions in the future?

As far as I know, it’s not on the roadmap. But there might very well be
locales for which the current set is too limited.

Rintze