csl-data.rnc

So WRT these data issues that keep coming up, I’ve decided to create a
csl-data schema to formalize this. It is based on the CSL data model
(it actually imports it), but simply formalizes the expectations for
particular data types, which include:

  • contributor names
  • dates
  • simple variables
  • rich variables (at this point, only titles)

Example:

Some Title

As I say, this is intended as a formalization of input expectations;
not per se as some new exchange format that I want to widely promote.

Issues:

  1. what variables should be “rich variables”?

  2. what inline formatting are we supporting on rich variables? Right
    now, I have b, i, and sup. I’ve punted on the semantics stuff, because
    it’s more important that we get the meta stuff (RDFa and such)
    implemented first.

  3. names are still up in the air

  4. date details are still TBD (really the MHRA stuff; but I’d rather
    leave that out)

Schema is at:

http://bitbucket.org/bdarcus/csl-schema/src/

Bruce

Nice.

  1. what variables should be “rich variables”?

Title is all I can think of (but see below re ).

  2. what inline formatting are we supporting on rich variables? Right
    now, I have b, i, and sup. I’ve punted on the semantics stuff, because
    it’s more important that we get the meta stuff (RDFa and such)
    implemented first.

citeproc-js recognizes these:

(italic)
(bold)
(superscript)
(subscript)
(small caps)
(passthrough, for proper nouns)
" or locale outer quote (flipflops with ')
' or locale inner quote (flipflops with ")

  3. names are still up in the air

The two kinds of particles (dropping vs non-dropping) should be
discriminated in the input, to avoid the need for sub-field parsing.
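
A minimal sketch of what discriminating the two particle kinds in the input could buy a processor. The record type and the field names `dropping_particle` and `non_dropping_particle` are assumptions made for illustration; they are not taken from the schema. The point is only that, with the particles in separate fields, inverted and display forms can be built without parsing inside the family-name field.

```python
from dataclasses import dataclass


@dataclass
class Name:
    """Hypothetical input record; field names are illustrative only."""
    family: str
    given: str
    dropping_particle: str = ""      # moves behind the given name when inverted
    non_dropping_particle: str = ""  # always stays in front of the family name


def display_form(name: Name) -> str:
    """Given-name-first order: both particle kinds precede the family name."""
    parts = [name.given, name.dropping_particle,
             name.non_dropping_particle, name.family]
    return " ".join(p for p in parts if p)


def inverted_form(name: Name) -> str:
    """Family-name-first order: only the non-dropping particle stays in front."""
    family = " ".join(p for p in [name.non_dropping_particle, name.family] if p)
    given = " ".join(p for p in [name.given, name.dropping_particle] if p)
    return f"{family}, {given}" if given else family
```

For example, treating the "van" in "Ludwig van Beethoven" as dropping and the "van" in "Vincent van Gogh" as non-dropping, the inverted forms would come out as "Beethoven, Ludwig van" and "van Gogh, Vincent".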

  4. date details are still TBD (really the MHRA stuff; but I’d rather
    leave that out)

Circa is more critical; if MHRA doesn’t make it this time around,
there’s always later.

Is there a way to specify that the ends of the ranges have to have a
full set of matching date elements? That is, month + year to
end-month + end-year should be valid, but month + year to end-month
should not be.
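
To make the constraint concrete, here is a small sketch of the check expressed as processor-side code rather than as a schema pattern. The dict representation with `year`, `month`, and `day` keys is an assumption for illustration only; it is not how the schema encodes dates.

```python
from typing import Optional

DATE_PARTS = ("year", "month", "day")


def range_is_balanced(start: dict, end: Optional[dict]) -> bool:
    """Return True when the end of a range carries the same date parts as the start."""
    if end is None:  # a single date, not a range
        return True
    start_parts = {p for p in DATE_PARTS if p in start}
    end_parts = {p for p in DATE_PARTS if p in end}
    return start_parts == end_parts
```

Under this sketch, month + year to end-month + end-year passes, while month + year to end-month alone is rejected.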

Also, as Dan recently reminded me, there needs to be provision for
literal passthrough in the implementation. I don’t know whether you
want to formalize that, though, or whether the processor should just
do it.

Schema is at:

http://bitbucket.org/bdarcus/csl-schema/src/

There is at least one off-schema variable lurking in the citeproc-js
source code:

shortTitle (an externally provided explicit value, substituted for
the title variable when its form is set to “short”).

Frank

  2. what inline formatting are we supporting on rich variables? Right
    now, I have b, i, and sup. I’ve punted on the semantics stuff, because
    it’s more important that we get the meta stuff (RDFa and such)
    implemented first.

citeproc-js recognizes these:

(italic)
(bold)
(superscript)
(subscript)
(small caps)
(passthrough, for proper nouns)
" or locale outer quote (flipflops with ')
' or locale inner quote (flipflops with ")

Well, I’ll add q for quotes (forgot about those!), and maybe the others too.

  3. names are still up in the air

The two kinds of particles (dropping vs non-dropping) should be
discriminated in the input, to avoid the need for sub-field parsing.

Yeah, but this is still a feature I’d call highly experimental.

  4. date details are still TBD (really the MHRA stuff; but I’d rather
    leave that out)

Circa is more critical; if MHRA doesn’t make it this time around,
there’s always later.

Is there a way to specify that the ends of the ranges have to have a
full set of matching date elements? That is, month + year to
end-month + end-year should be valid, but month + year to end-month
should not be.

There probably is, but it won’t be pretty (not that it matters).

Also, as Dan recently reminded me, there needs to be provision for
literal passthrough in the implementation. I don’t know whether you
want to formalize that, though, or whether the processor should just
do it.

I’m not willing to formalize this notion beyond what I already have
(date “other” and contributor “name”).

Bruce

Well, I’ll add q for quotes (forgot about those!), and maybe the others too.

Not sure that it’s relevant, but the actual character used for quotes
can affect processing. It’s a bit arcane, but if you have a title
that contains double French quotes (what Wikipedia tells me are called
guillemets), and the work is being cited in English with outer double
quotes around the whole title, the French quotes would stay in place
verbatim. But if the work is cited in a French style, with
CSL-supplied French-style double quotes around the title, the quotes
embedded in the title would flip-flop to French-style inner quotes.
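
A rough sketch of flip-flopping for marked-up quotes, assuming a nested node structure and a tiny two-locale quote table (both hypothetical, and the French marks are shown without their usual spacing). It alternates outer and inner marks by nesting depth. Literal quote characters inside a string pass through untouched, so the case of literal guillemets being converted under a French style is deliberately not modeled here.

```python
from dataclasses import dataclass
from typing import List, Union

# (open, close) pairs per locale; an illustrative table, not a real locale file.
QUOTES = {
    "en": {"outer": ("“", "”"), "inner": ("‘", "’")},
    "fr": {"outer": ("«", "»"), "inner": ("“", "”")},
}


@dataclass
class Quoted:
    """A marked-up quotation inside a rich-text field (hypothetical node type)."""
    children: List[Union[str, "Quoted"]]


def render(nodes, locale: str, depth: int = 0) -> str:
    """Render nodes, alternating outer/inner quote marks by nesting depth."""
    out = []
    for node in nodes:
        if isinstance(node, str):
            out.append(node)  # literal characters pass through untouched
        else:
            kind = "outer" if depth % 2 == 0 else "inner"
            open_q, close_q = QUOTES[locale][kind]
            out.append(open_q + render(node.children, locale, depth + 1) + close_q)
    return "".join(out)
```

With a title such as `[Quoted(["Review of ", Quoted(["Candide"])])]`, the same input renders with English marks under `"en"` and with guillemets outside and double quotes inside under `"fr"`.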

I’m not willing to add this sort of complexity ATM to the spec or the
schema. There are quotes, and there are literal characters. Quotes are
localized and thus represented in one language.

Bruce

Yes, agreed that it’s a corner case, but it does need to work that
way. Japanese citation styles use 「wide corner brackets」for quotes,
which don’t mix well with English text (and conversely, 混ざった“形式”にすると
English quotes look odd when applied to Japanese text).

One option would be to just leave quotes out of the inline section of
the spec for the present.

“abstract” might make sense as well.

Rintze

Yes, agreed that it’s a corner case, but it does need to work that
way. Japanese citation styles use 「wide corner brackets」for quotes,
which don’t mix well with English text (and conversely, 混ざった“形式”にすると
English quotes look odd when applied to Japanese text).

OK.

One option would be to just leave quotes out of the inline section of
the spec for the present.

Or to simply incrementally build up support for this.
Cross-script/language issues are always going to be more complex than
when we’re dealing with one language/script.

Is the issue you identify fundamentally about the language, or the script?

Bruce

The language, and its associated script. Actually, thinking it
through, if English quotes don’t mix with Han characters well, then
any outer quotes should adjust to the title content. I’m not sure how
that is handled in practice in publishing. You’re right, this should
be taken a little at a time.