requirements, example schema

OK, one thing that is critical to the design of CSL is that:

  • it should work for the humanities (often poorly supported), including
    translated and transliterated names and titles*
  • styles should be fully portable across implementations and user
    communities

In other words, if a math user creates some style and a biologist uses
it to format a bibliography, it should “just work.” This is one of the
reasons why CSL has this fallback and inheritance system. It is because
most records should be able to be formatted using generic templates.
Otherwise, styles will be fragile.
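
To make the fallback idea concrete, here is a minimal Python sketch, not the actual CSL mechanism. The type names, the PARENT_TYPE hierarchy and the resolve_template helper are all hypothetical and only illustrate how a record of any type could end up with a generic template when the style defines nothing more specific.

```python
from typing import Dict, Optional

# Hypothetical hierarchy (not from the CSL schema): each specific type
# names a more generic parent; anything unlisted falls back to "generic".
PARENT_TYPE: Dict[str, str] = {
    "article-journal": "article",
    "article": "generic",
    "chapter": "book",
    "book": "generic",
}


def resolve_template(ref_type: str, templates: Dict[str, str]) -> Optional[str]:
    """Walk up the hierarchy until the style defines a template."""
    current = ref_type
    seen = set()
    while current not in seen:
        if current in templates:
            return templates[current]
        seen.add(current)
        if current == "generic":
            break  # nothing more generic to try
        current = PARENT_TYPE.get(current, "generic")
    return None
```

With a style that only defines "book" and "generic" templates, a "chapter" would be formatted with the "book" template, and a type the style author never thought of would still get the "generic" one.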

Likewise, if a Zotero user creates a style, it should also "just work"
in any other implementation, no matter what internal data model they
use.

Hence the need for essentially a CSL data model that one can map onto.

If we’re going to be completely general and minimalist, I’m thinking
something like this:

style = element cs:style { macro+, context+ }

macro = element cs:macro { name, field* }

name = attribute name { token }

field = element cs:field { name, formatting }

e.g. “citation” and “bibliography”

context = element cs:contexts { name, type+ }

type = element cs:type { name, (macro|field)+ }

E.g.:

... ... ...

But we’d need to control the model (the variables, types, etc.) to
support the requirements I have. E.g. the “name” attributes could not
be an uncontrolled token.
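
As a rough illustration of what "controlling the model" could mean, the Python sketch below mirrors the minimalist schema above as plain data classes and checks names against controlled vocabularies. The vocabularies (CONTEXT_NAMES, TYPE_NAMES, FIELD_NAMES) and the uncontrolled_names helper are assumptions made up for this example; the real lists would have to come from the CSL data model.

```python
from dataclasses import dataclass, field
from typing import List, Union

# Hypothetical controlled vocabularies, for illustration only.
CONTEXT_NAMES = {"citation", "bibliography"}
TYPE_NAMES = {"book", "article"}
FIELD_NAMES = {"author", "title", "year"}


@dataclass
class Field:
    name: str
    formatting: dict = field(default_factory=dict)


@dataclass
class Macro:
    name: str
    fields: List[Field] = field(default_factory=list)


@dataclass
class Type:
    name: str
    content: List[Union[Macro, Field]]


@dataclass
class Context:
    name: str
    types: List[Type]


@dataclass
class Style:
    macros: List[Macro]
    contexts: List[Context]


def uncontrolled_names(style: Style) -> List[str]:
    """Return every field, context or type name outside the vocabularies."""
    bad: List[str] = []
    for macro in style.macros:
        bad += [f.name for f in macro.fields if f.name not in FIELD_NAMES]
    for ctx in style.contexts:
        if ctx.name not in CONTEXT_NAMES:
            bad.append(ctx.name)
        for t in ctx.types:
            if t.name not in TYPE_NAMES:
                bad.append(t.name)
            for item in t.content:
                if isinstance(item, Field) and item.name not in FIELD_NAMES:
                    bad.append(item.name)
    return bad
```

Macro names are deliberately left unchecked in this sketch; whether they should be controlled as well is an open design question.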

Also, this doesn’t include the tricky stuff like et al and such.

Bruce

  • I have a style I was working on for a Japanese religious studies
    journal. So, published in English, but for Japanese scholars. Titles
    and names and such need to include the original Kanji and the
    transliterated variants. This kind of thing must be supported by CSL,
    and it needs to be easy to do. Right now it is, so it needs to stay
    that way.

Hi,

Thank you for your responses.

Bruce D’Arcus wrote:

> OK, one thing that is critical to the design of CSL is that:
>
>   • it should work for the humanities (often poorly supported), including
>     translated and transliterated names and titles*
>   • styles should be fully portable across implementations and user
>     communities
>
> In other words, if a math user creates some style and a biologist uses
> it to format a bibliography, it should “just work.” This is one of the
> reasons why CSL has this fallback and inheritance system. It is because
> most records should be able to be formatted using generic templates.
> Otherwise, styles will be fragile.

You probably have a better overview of this topic. It is very good to
have validation classes, but we ran into the problem of having to change
the CSL schema just because we wanted to use the custom class and make
some modifications to it.

Currently in the CSL schema the custom class for a style is practically
not allowed (?):

CitationStyle =
  element cs:style {
    attribute xml:lang { xsd:language }?,
    (AuthorDateStyle
     | AuthorStyle
     | NumberStyle
     | LabelStyle
     | NoteStyle
     | InTextStyle
     | AnnotatedStyle)
  }

However, one additional question: is it possible to “split” the current
schema into two “modules”:

  • one that defines only the language (i.e. CSL syntax) itself (important
    for portability between various processors)
  • the other that people (as at present) use to validate particular
    styles that are derived from e.g. author-date class (important for
    validation/sharing).

> Likewise, if a Zotero user creates a style, it should also “just work”
> in any other implementation, no matter what internal data model they
> use.
>
> Hence the need for essentially a CSL data model that one can map onto.
>
> If we’re going to be completely general and minimalist, I’m thinking
> something like this:
>
> style = element cs:style { macro+, context+ }
>
> macro = element cs:macro { name, field* }
>
> name = attribute name { token }
>
> field = element cs:field { name, formatting }
>
> e.g. “citation” and “bibliography”
>
> context = element cs:contexts { name, type+ }
>
> type = element cs:type { name, (macro|field)+ }

Very, very nice: generic and minimalist. Even more pragmatic, since a
“macro” element cannot be built from other “macros”.

> E.g.:
>
> ... ... ...

This structure is in fact very much in line with the question above:
it would probably allow style-level macros to be defined that can be
imported from some “macro library” into the CSL style definition file
and further customized (?)

> But we’d need to control the model (the variables, types, etc.) to
> support the requirements I have. E.g. the “name” attributes could not
> be an uncontrolled token.

On this issue, do you think that all “name” attributes should be
controlled tokens? Or does it make sense to control the name attribute
for the context element only?
(Field names are controlled by the schema anyway.)
If macro names have to be controlled, then one will have to wait until
they are approved, and adding new macros becomes a bit cumbersome.
Internally, for example, we really do have a requirement to be able to
add new macros as the need arises, and that has to work ASAP :)

> Also, this doesn’t include the tricky stuff like et al and such.

True! Related to this: is an “et al”-like element applicable only to
creators (authors, editors, …) in current CSL?

In case it is useful as input:

  • We have extended the “et al”-like setting to be applicable to any
    repeatable element and experimented with attribute groups such as:
      maxCount - the max. number of occurrences at which the processor
        has to trigger the “et al.”
      maxCountEndsWith - what should appear at the end, e.g. “et al.” or
        “and others”
      delimiter - as in CSL
  • We have not defined any controlled vocabulary for abbreviations of
    e.g. creator roles (editor, author, translator etc.; the metadata
    schema controls the value), because here again we had so many
    differences. That is instead handled in the “macro” for editor fields
    and resolved with “prefix” and “postfix” attributes.
  • We even had an additional requirement to “limit” the length of a
    particular citation or macro output, i.e. the whole citation/macro
    output should not be “longer” than 300 chars. For this purpose we had
    to specify attribute groups such as:
      maxLength - the max. length of this macro output or citation,
        e.g. 300 chars
      maxLengthEndsWith - how it should end, e.g. “…” or “…”
    IMO this is still a disputable requirement, but we had it :))
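
The two attribute groups described above could behave roughly as in the following Python sketch. It is only an illustration under stated assumptions, not our actual implementation: the function names are made up, the parameter names simply mirror the attributes, the “et al.” is assumed to trigger once the number of values exceeds maxCount, and the ending is assumed to count towards maxLength.

```python
from typing import List


def join_repeatable(values: List[str], max_count: int,
                    max_count_ends_with: str = "et al.",
                    delimiter: str = ", ") -> str:
    """Join values; beyond max_count, keep the first max_count and add the ending."""
    if len(values) <= max_count:
        return delimiter.join(values)
    return delimiter.join(values[:max_count]) + " " + max_count_ends_with


def limit_length(text: str, max_length: int,
                 max_length_ends_with: str = "...") -> str:
    """Cut text so that the result, ending included, has at most max_length chars."""
    if len(text) <= max_length:
        return text
    keep = max(max_length - len(max_length_ends_with), 0)
    # The final slice guards the edge case where the ending alone is too long.
    return (text[:keep] + max_length_ends_with)[:max_length]


print(join_repeatable(["Smith", "Jones", "Lee"], max_count=2))  # Smith, Jones et al.
print(limit_length("abcdefghij", max_length=8))                 # abcde...
```

Because join_repeatable works on any list of strings, the same setting applies to any repeatable element, not only to creators.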

> Bruce
>
>   • I have a style I was working on for a Japanese religious studies
>     journal. So, published in English, but for Japanese scholars. Titles
>     and names and such need to include the original Kanji and the
>     transliterated variants. This kind of thing must be supported by CSL,
>     and it needs to be easy to do. Right now it is, so it needs to stay
>     that way

Absolutely agree! We also have this kind of data, since we deal with
multiple scripts and languages (Greek, Latin, Chinese) from the
Humanities sections etc.

Best,
Natasa

> Currently in the CSL schema the custom class for a style is practically
> not allowed (?):
>
> CitationStyle =
>   element cs:style {
>     attribute xml:lang { xsd:language }?,
>     (AuthorDateStyle
>      | AuthorStyle
>      | NumberStyle
>      | LabelStyle
>      | NoteStyle
>      | InTextStyle
>      | AnnotatedStyle)
>   }

Yes, true. I was probably trying to avoid people using that until they
actually needed it ;)

It’s easy enough to add the CustomStyle pattern to that choice list
there. I’ll do that now, though I do believe that you should rarely if
ever need that kind of flexibility. In general, if you’re getting
stuck, it’s because there’s some mistake in the schema that makes it too
tight, or you’re maybe using it in ways not entirely intended.

> However, one additional question: is it possible to “split” the current
> schema to 2 “modules”:
>
>   • one that defines only the language (i.e. CSL syntax) itself
>     (important for portability between various processors)
>   • the other that people (as at present) use to validate particular
>     styles that are derived from e.g. author-date class (important for
>     validation/sharing).

Yes. See csl-simple.rnc for an example (at least I think it’s in the
repo).

Bruce

Back to this, since I didn’t have time to address it earlier …