CSL 1.0.1 release

Frank just reminded me that citeproc-js does not yet support the new scheme
for ordinal suffixes, so Zotero/Mendeley/etc. should probably hold off on
distributing CSL 1.0.1 locales until that situation changes.

Rintze

I finally got on the stick and implemented CSL 1.0.1 ordinals in citeproc-js.

There is a test fixture:

https://bitbucket.org/bdarcus/citeproc-test/src/4a67f523ac07/processor-tests/humans/number_NewOrdinals.txt

If anyone wants to check out the new ordinals behaviour in a word
processing environment before the new locales appear in Zotero and
elsewhere, I have installed the CSL 1.0.1 locales and an updated
citeproc-js version in MLZ:

http://citationstylist.org/tools

Cheers,
Frank

After that brief frolic, CSL 1.0.1 support has been rolled back out of
MLZ, pending refactoring in the processor.

Frank

While I’m thinking about it …

I assume that the gender fallback possibilities for a term should be
exhausted before we conclude that the term does not exist in a particular
form and move on to the next term form in priority. Is that correct?
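
A minimal sketch of the lookup order assumed in the question above, not taken from citeproc-js: for each term form in priority order, every gender possibility is tried before the next form is considered. The form chain follows the CSL fallback rules (verb-short to verb, symbol to short, verb and short to long). The `TERMS` table, its entries and the `lookup_term` helper are hypothetical, as is the choice to treat the non-gendered definition as the only gender fallback.

```python
# Hypothetical term table: (name, form, gender) -> text. A gender of None
# stands for the plain (non-gendered) definition. Entries are made up.
TERMS = {
    ("edition", "long", None): "edition",
    ("edition", "short", "feminine"): "ed.",
}

# CSL form fallback: each form falls back to the form it maps to.
FORM_FALLBACK = {
    "verb-short": "verb",
    "symbol": "short",
    "verb": "long",
    "short": "long",
    "long": None,
}


def lookup_term(terms, name, form, gender):
    """Try all gender possibilities for a form before trying the next form."""
    while form is not None:
        # Requested gender first, then the non-gendered definition.
        for candidate in (gender, None):
            text = terms.get((name, form, candidate))
            if text is not None:
                return text
        form = FORM_FALLBACK[form]
    return None
```

With this table, a masculine request for the short form finds nothing at "short" and only then falls through to the plain "long" definition.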

Frank, I tried the latest MLZ release with a style which redefines
ordinals, gender, etc.
This doesn’t work.
- limit-day-ordinals-to-day-1=“true” has no effect.

I haven’t implemented that one yet, so no worries there.

- when gender variants are defined in the ordinal suffix terms, there is no
output at all with terms other than months (e.g. “edition”)
[ʳᵉ
ᵉʳ]

That’s not good. Thanks for the report, I’ll take a look.

Is that the entire set of ordinals that are redefined in the style?
You need to provide the full set, and I think that a default term
(without gender-form) for each needs to be set. The ordinals travel
only as a full set.

Is that really necessary? The example I wrote up for the specification doesn’t define a genderless “ordinal-01” term.

I was wondering, too. Gracile explained that there is no genderless term in this case, and Rintze suggested generally falling back to the feminine version when looking up terms, since that is the more common case in the languages we looked at.

Let me think out loud here for a few lines.

(1) If a masc, fem or neut variable term is called, the
corresponding genderized term should be used. That much is common
sense.

(2) If a neuter term is defined, and a masc or fem variable term is
called, and no corresponding genderized term is defined, it defaults
to the neuter term version. That’s clear from the spec.

(3) If a fem term is defined, and no neut term is defined, the fem
term becomes the default, and will be used if a masc term variant is
undefined (following the suggestion above).

(4) If only a masc term is defined, I guess that becomes the
default, and will be used if (as would logically be so) a fem term is
undefined.
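
Points (1) to (4) can be sketched as a small resolver. This is only an illustration of the rules as stated, not the citeproc-js code; the function name and the dictionary layout are assumptions, and "neuter" is taken here to mean the term defined without a gender-form.

```python
def resolve_gendered_term(variants, requested):
    """Pick a term variant following points (1) to (4) above.

    `variants` maps "neuter", "feminine" and/or "masculine" to term text.
    """
    # (1) An exact match for the requested gender always wins.
    if requested in variants:
        return variants[requested]
    # (2)-(4) Otherwise use the default: neuter, then feminine, then masculine.
    for gender in ("neuter", "feminine", "masculine"):
        if gender in variants:
            return variants[gender]
    return None
```

For example, with only a feminine variant defined, a masculine request returns the feminine text, which is case (3).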

This looks very good. What I fear will complicate matters considerably are the other possible fallbacks: locale-wise (i.e., style and default region overrides) and for ordinals specifically. As I said, I haven’t yet been able to sit down and think this through, but I agree with Frank that this matter (locale, gender fallback precedence) will require some time to be resolved properly.

Sylvester

If we can agree that ordinal term sets should be provided as a full
set within a style, and that the prior ordinal terms (from the locale
file) will be entirely discarded in that case, it will simplify
things.

Frank

The scheme I had in mind is this:

  • if the target noun is neuter, only the neuter gender-variants are taken
    into consideration
  • if the target noun is feminine (or masculine), both the feminine (or
    masculine) and neuter gender-variants are taken into account. If an ordinal
    term exists both as a feminine (or masculine) and a neuter variant,
    the feminine (or masculine) variant is used

For example, with

.ª (1)
.ª (2)
.º (3)
.ª (4)
.ºº (5)

  • if the target noun is neuter, term definitions 1 and 2 are taken into
    account
  • if the target noun is feminine, term definitions 1 and 4 are taken into
    account
  • if the target noun is masculine, term definitions 1, 3 and 5 are taken
    into account
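
The selection rule above can be sketched in a few lines. The term names attached to the five definitions are not recoverable from the example as it appears here, so the names below are placeholders chosen only so that the outcome matches the three cases listed; the function name is likewise hypothetical.

```python
# Hypothetical stand-ins for the five definitions: (id, term name, gender).
# The real term names are not given above; these are chosen only so that
# the outcome matches the three bullet points.
DEFINITIONS = [
    (1, "ordinal", "neuter"),
    (2, "ordinal-01", "neuter"),
    (3, "ordinal-01", "masculine"),
    (4, "ordinal-01", "feminine"),
    (5, "ordinal-02", "masculine"),
]


def ordinal_candidates(definitions, target_gender):
    """Return the ids of the definitions considered for a target noun gender."""
    chosen = {}
    for ident, name, gender in definitions:
        if gender == "neuter":
            # Neuter variants are always eligible, but never replace a
            # gendered variant that was already picked for the same term.
            chosen.setdefault(name, ident)
        elif gender == target_gender:
            # A variant matching the target gender overrides the neuter one.
            chosen[name] = ident
    return sorted(chosen.values())
```

Under these placeholder names, a feminine target keeps definition 1 and lets definition 4 displace definition 2, while definitions 3 and 5 are ignored.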

Rintze

If we can agree that ordinal term sets should be provided as a full
set within a style, and that the prior ordinal terms (from the locale
file) will be entirely discarded in that case, it will simplify
things.

Absolutely.

Slightly off topic: I haven’t tackled this issue in the citeproc-ruby rewrite yet, but I know that I already thought about resolving locale fallbacks either by:

  • reconciling all the possible overrides at the beginning, i.e. generating a virtual locale which combines all locales in question (e.g., the style-locale, the locale, and the fallback region locale); the lookup would then just have to deal with a single locale

  • or considering all the possible locales individually during the lookup process

My plan was to go with the former approach since the lookup algorithm would become simpler that way. Did you come to a similar solution in citeproc-js or a completely different approach altogether?
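
A rough sketch of the first approach, assuming each locale is reduced to a flat name-to-text table (real locales also carry forms, plurals and gender variants). The helper name and the sample terms are made up for illustration.

```python
def build_virtual_locale(*locales):
    """Merge term tables into one; later arguments override earlier ones."""
    merged = {}
    for terms in locales:
        merged.update(terms)
    return merged


# Hypothetical term tables, lowest priority first.
fallback_region = {"and": "and", "volume": "volume"}
primary_locale = {"and": "et", "volume": "tome"}
style_locale = {"and": "&"}

virtual = build_virtual_locale(fallback_region, primary_locale, style_locale)
```

After the merge, every lookup is a single dictionary access on `virtual`, which is the simplification this approach is after.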

Sylvester

It seems like we haven’t revisited the topic of overwriting ordinal terms
since we introduced the new ordinalization scheme. Since the list of terms
is now practically open-ended (the number of “ordinal(-\d\d)?” terms in any
given locale file can range from 1 to 101), it indeed makes a lot of sense
to just ditch all of them whenever a style redefines one or more of these
terms.

(note that this wouldn’t be in agreement with the current spec language:
http://citationstyles.org/downloads/specification.html#ordinal-suffixes )
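
As a sketch of the proposal under discussion (which, as noted, is not what the current spec language says), the ordinal terms could be treated as a set when style terms are overlaid on locale terms. Terms are keyed by name only here, ignoring forms and gender variants, and the function name is an assumption.

```python
import re

# Matches "ordinal" and "ordinal-00" through "ordinal-99" (101 names in all).
ORDINAL_NAME = re.compile(r"ordinal(-\d\d)?")


def merge_terms(locale_terms, style_terms):
    """Overlay style terms on locale terms, replacing ordinals as a set."""
    style_has_ordinals = any(ORDINAL_NAME.fullmatch(n) for n in style_terms)
    merged = {
        name: text
        for name, text in locale_terms.items()
        # Drop every locale ordinal once the style defines at least one.
        if not (style_has_ordinals and ORDINAL_NAME.fullmatch(name))
    }
    merged.update(style_terms)
    return merged
```

A style that redefines only "ordinal" therefore ends up with that single ordinal term, while non-ordinal terms from the locale are kept.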

Rintze

citeproc-js does the former (reconciling all the overrides into a single merged locale up front), and it works quite well.

Note: I haven’t worked out a general algorithm for gender lookups yet that
considers all these fallbacks.

Okay, that’s helpful.

So it was a quick month.

I’ve put up a fresh release of MLZ (m245) that works with the updated
fr-FR locale and handles the above example as per Rintze’s description
of the gender fallback logic on ordinals.

What threw me is that the fallback priorities on ordinals differ from
those for straight terms (at least in my implementation). Terms always
have a hard-wired default value (assigned now according to the logic
of my earlier post). Getting that right was tricky, because the
default priorities can’t be evaluated until the full set of ordinal
terms is known. Ordinals need to steer clear of the default unless it
was a non-gendered term assigned explicitly.

In any case, I’m pretty confident that it’s right now, but it hasn’t
been extensively tested. Please try to break it; if problems turn up,
we’ll fix them.

Frank

Correction: the above should read “until the full set of gender
variants is known”.

Can you give an example? I don’t follow.

Rintze

Never mind, that actually didn’t make any sense. The “trickiness” was
code-specific stuff, nothing to do with the spec. Sorry for the
confusion.

Should be right now, anyway.

Frank

Date: Fri, 7 Sep 2012 11:31:29 +0900
From: Frank Bennett <@Frank_Bennett>
Subject: Re: [xbiblio-devel] CSL 1.0.1 release?
To: development discussion for xbiblio
xbiblio-devel@lists.sourceforge.net
Message-ID:
CAJgpGgDw-26SXsXCViLYsKPT9pHdu1Wk6hGzMDzeNSnRBqtHqQ@mail.gmail.com
Content-Type: text/plain; charset=UTF-8

Date: Thu, 6 Sep 2012 12:28:12 +0900
From: Frank Bennett <@Frank_Bennett>
Subject: Re: [xbiblio-devel] CSL 1.0.1 release
To: development discussion for xbiblio
xbiblio-devel@lists.sourceforge.net
Message-ID:
CAJgpGgA1Wvdvf156GXjbmqJnqG7zSr_J+aK1huusZqSj7Bu6xw@mail.gmail.com
Content-Type: text/plain; charset=UTF-8

Frank,

With the latest release (3.0m246):

-dates: ok (“limit-day-ordinals-to-day-1” works)
-“issue”, “volume”: ok

Yay.

There’s still a problem with “edition”: I always get the “general” ordinal
suffix (“e” in fr-FR). But it’s not correct when the edition is the first
one: it should be “1re éd.” (“edition” is defined as “feminine” in the
locale-fr-FR.xml). The processor, with the “edition” variable, disregards
the gender variants and chooses the “general” suffix.

[same results with a generic style (e.g. Chicago) or a custom style which
redefines the locale terms]

Thanks,
Gracile

This works for me. In the fr-FR locale, I get 1re for first edition.
Same result in the processor test-bed and in MLZ m246, both via the
test pane and in word processor documents.

The discrimination turns on the locale and plain vanilla JavaScript
logic in the processor. I’m not sure why we would be getting different
results. It’s unlikely to turn anything up, but can you send me a copy
of the custom style and an item that shows the error?

Frank

Belay that thought. With CMS Fullnote only, I get the masculine
suffix. Checking …

Frank

Gregoire: The 1re edition failure should be fixed in the latest MLZ
update. Thanks for reporting; I picked up another bug that would have
affected multi-locale styles while tracking down this one.

Frank

I haven’t tested this update extensively but it seems to work perfectly. Thanks!