Locale files - script variants

Rintze_Zelle · November 29, 2012, 3:32am

A user wishes to contribute a Latin variant of the Serbian CSL locale
file (in addition to the existing Cyrillic variant). See

Is it acceptable to add a script subtag to the locale file name and
xml:lang value? E.g. “locales-sr-RS-Latn.xml” (for Latin) and
“locales-sr-RS-Cyrl.xml” (for Cyrillic) (I guess we could omit the
script subtag from one variant, e.g. we could keep the Cyrillic
variant as “locales-sr-RS.xml”). See also
http://www.w3.org/International/articles/language-tags/Overview.en.php#script

Rintze

Avram_Lyon · November 29, 2012, 4:14pm

I know that Frank built citeproc-js to be aware of BCP 47 semantics, which
are used extensively in MLZ, but can we do this without demanding the same
of other processors?

Frank_Bennett · November 29, 2012, 8:35pm

> I know that Frank built citeproc-js to be aware of BCP 47 semantics, which
> are used extensively in MLZ, but can we do this without demanding the same
> of other processors?

Avram,

I think that in locale filenames, citeproc-js will only handle the
first two elements of a tag at the moment. What sort of syntax do you
have in mind?

Frank

Frank_Bennett · November 29, 2012, 9:09pm

Off-list, Rintze and I came to the conclusion that a processor can
just treat the RS-Latn pair as a single tag, with the same fallback
behaviour as a single element. In this case, sr-RS-Latn would fall
back to plain-vanilla sr, with whatever default mapping defined for it
in the processor (so sr-RS). That covers script variants under the RFC
5646 specification, and should be simple to implement in processors.
citeproc-js needs a small adjustment to handle the filename in that
case, but it should be easy to do.
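
The fallback described above could look roughly like the following. This is only a sketch of the idea, not how citeproc-js or any other processor actually does it; the function name `resolve_locale` and the `DEFAULT_LOCALES` table are made up for illustration.

```python
# Hypothetical default mapping from a bare language to its primary locale;
# a real processor would ship its own table.
DEFAULT_LOCALES = {"sr": "sr-RS", "en": "en-US"}


def resolve_locale(tag, available):
    """Pick a locale for `tag` from `available`, or None if nothing matches.

    Everything after the language subtag (e.g. "RS-Latn") is treated as one
    opaque element: either the full tag matches, or we fall back to the
    default locale for the bare language.
    """
    language, _, rest = tag.partition("-")
    candidates = [tag] if rest else []
    default = DEFAULT_LOCALES.get(language.lower())
    if default is not None:
        candidates.append(default)

    # Language tags are case-insensitive, so compare lowercased forms.
    by_lower = {name.lower(): name for name in available}
    for candidate in candidates:
        match = by_lower.get(candidate.lower())
        if match is not None:
            return match
    return None
```

With only `sr-RS` available, a request for `sr-RS-Latn` resolves to `sr-RS`; once a `sr-RS-Latn` locale exists, the same request picks it up directly.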

So I’m fine with including the file in the repo, if others are happy with it.

Frank

Bruce_D_Arcus1 · November 29, 2012, 9:13pm

Minor thing: shouldn’t file names be lower-case generally?

Frank_Bennett · November 29, 2012, 9:16pm

I’ll defer to others on filenaming conventions, but RFC 5646 language
tags are not case sensitive, so there would be no problem with
lowercasing as far as the specification is concerned.

Avram_Lyon · November 29, 2012, 9:16pm

Here, we run into a conflict with the standards for the language tags,
which are generally cased in a particular way.

Frank_Bennett · November 29, 2012, 9:25pm

Here is the language from RFC 5646:

The ABNF syntax also does not distinguish between upper- and
lowercase: the uppercase US-ASCII letters in the range ‘A’ through
‘Z’ are always considered equivalent and mapped directly to their
US-ASCII lowercase equivalents in the range ‘a’ through ‘z’. So the tag
“I-AMI” is considered equivalent to that value “i-ami” in the
‘irregular’ production.

Although case distinctions do not carry meaning in language tags,
consistent formatting and presentation of language tags will aid
users. The format of subtags in the registry is RECOMMENDED as the
form to use in language tags. This format generally corresponds to
the common conventions for the various ISO standards from which the
subtags are derived.

So Avram and I are not at odds. Case carries no meaning, but for
readability you would adhere to the registry convention of uppercase
region subtags and title-case script subtags (e.g. “RS”, “Latn”).
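
To make the two points concrete (comparison ignores case, presentation follows the registry convention), here is a small sketch. The helper names `tags_equal` and `format_tag` are invented for this example, and the formatting rule is simplified: it goes by subtag length only and ignores the special handling RFC 5646 gives to subtags that follow a singleton such as “x”. It also does not care about subtag order, so it works both for the “sr-RS-Latn” form used in this thread and for the script-before-region order that RFC 5646 itself defines (“sr-Latn-RS”).

```python
def tags_equal(a, b):
    """Language tags are case-insensitive, so compare lowercased forms."""
    return a.lower() == b.lower()


def format_tag(tag):
    """Return `tag` in the conventional casing (simplified rule).

    The first subtag (language) is lowercased. After that, four-letter
    alphabetic subtags are title-cased (scripts), two-letter alphabetic
    subtags are uppercased (regions), and everything else is lowercased.
    """
    parts = tag.split("-")
    formatted = [parts[0].lower()]
    for sub in parts[1:]:
        if len(sub) == 4 and sub.isalpha():
            formatted.append(sub.title())
        elif len(sub) == 2 and sub.isalpha():
            formatted.append(sub.upper())
        else:
            formatted.append(sub.lower())
    return "-".join(formatted)
```

So a lowercased file name such as “locales-sr-rs-latn.xml” could still be matched against a conventionally cased tag, as long as the processor compares the tag part case-insensitively.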

Frank
