Short roles/verb roles/role capitalization

I apologize if I’m just missing something in the new CSL schema, but I can’t
see how to implement either short contributor roles (ed) or verb roles
(edited by) without redefining terms (which, presumably, we don’t want to
do, because that means we can’t localize these stylistic differences). I
also can’t seem to find a way to alter role capitalization (edited by vs.
Edited by vs. Edited By), although that’s not as big of a deal. Any tips?

Simon

I think this fell through the cracks …

I apologize if I’m just missing something in the new CSL schema, but I can’t
see how to implement either short contributor roles (ed) or verb roles
(edited by) without redefining terms (which, presumably, we don’t want to
do, because that means we can’t localize these stylistic differences). I
also can’t seem to find a way to alter role capitalization (edited by vs.
Edited by vs. Edited By), although that’s not as big of a deal. Any tips?

I think it should be:

The lowercase is easy and obvious, as is I think the form (it’s
consistent with titles and names). Not so sure about the “type”
attribute.

Any opinions?

I’ll add it once I hear back. I never did get any feedback on the
localization file and schema stuff.

Bruce

Hmm. What if we added classes for roles (or whatever you want to call them)
to locales.xml, e.g.:

editor editors translator translators ed eds tran trans

Then:

The mapping seems a bit clearer this way, but in implementation the
difference will be minimal.
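
To make the lookup concrete, here is a minimal Python sketch of the class-based mapping, assuming locales.xml has already been loaded into a nested dictionary. The class names "long" and "short", the ROLE_TERMS table, and the role_label function are hypothetical; only the term values come from the example above.

```python
# Hypothetical in-memory form of the role classes from locales.xml.
# Each role maps to a (single, multiple) pair; class names are assumptions.
ROLE_TERMS = {
    "long": {
        "editor": ("editor", "editors"),
        "translator": ("translator", "translators"),
    },
    "short": {
        "editor": ("ed", "eds"),
        "translator": ("tran", "trans"),
    },
}


def role_label(role, count, term_class="long"):
    """Return the localized label for a role, choosing singular or plural."""
    single, multiple = ROLE_TERMS[term_class][role]
    return single if count == 1 else multiple


print(role_label("editor", 2, "short"))
print(role_label("translator", 1))
```

A style would then only name the role and the class, and the locale decides the actual wording.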

Simon

Just to expand, the schema (with the content model borrowed from FO,
itself from CSS):

attribute text-transform { “none” | “lowercase” | “uppercase” |
“capitalize” }?

The localization file is your third option, “capitalize” would get your
second, and I’ve already told you how to get the first.
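
As a rough illustration of how a processor might apply this attribute, here is a small Python sketch. It assumes the localized term is stored as "Edited By" and reads "capitalize" as "uppercase the first character, lowercase the rest", which is what makes the three variants above line up. The text_transform function is hypothetical, and CSS's own "capitalize" works per word, so this is one reading rather than a settled definition.

```python
def text_transform(text, mode="none"):
    """Apply a text-transform value to a rendered term (sketch only)."""
    if mode == "none":
        return text
    if mode == "lowercase":
        return text.lower()
    if mode == "uppercase":
        return text.upper()
    if mode == "capitalize":
        # Assumption: first character uppercased, everything else lowercased.
        return text.capitalize()
    raise ValueError(f"unknown text-transform: {mode!r}")


term = "Edited By"  # value as stored in the localization file
print(text_transform(term, "lowercase"))   # edited by
print(text_transform(term, "capitalize"))  # Edited by
print(text_transform(term, "none"))        # Edited By
```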

Bruce

Hmm. What if we added classes for roles (or whatever you want to call them)
to locales.xml

I asked about that earlier, but recall some people complaining about it
:wink:

I have no strong opinion either way; really up to you guys.

Why “class”, rather than “form”? I guess the question could be turned
around though …

Bruce

So how would the verb form work? Right now I have it as a simple term
(since it’s the same value for single and multiple).

Edited By

… ?

Seems a little awkward (you have to look up a different value to get
it, though that may not be a big deal?).

Maybe:

Edited By Editor Editors

I dunno.

Bruce

So how would the verb form work? Right now I have it as a simple term
(since it’s the same value for single and multiple).

Edited By

… ?

Seems a little awkward (you have to look up a different value to get
it, though that may not be a big deal?).

I was planning to abstract terms in such a way that it wouldn’t matter
within my code (when a term contains only text, I’d just assume that the
text applies to both single and multiple).
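
A minimal Python sketch of that abstraction follows. The element names (term, single, multiple), the Term class, and parse_term are assumptions made for illustration, since the actual markup is not shown in this thread.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass


@dataclass
class Term:
    single: str
    multiple: str


def parse_term(element):
    """Build a Term from a hypothetical <term> element.

    A term that contains only text is used for both single and multiple.
    """
    single = element.find("single")
    multiple = element.find("multiple")
    if single is None and multiple is None:
        text = (element.text or "").strip()
        return Term(text, text)
    if single is None or multiple is None:
        raise ValueError("term needs both <single> and <multiple>")
    return Term(single.text.strip(), multiple.text.strip())


verb = parse_term(ET.fromstring('<term name="edited-by">Edited By</term>'))
noun = parse_term(ET.fromstring(
    '<term name="editor"><single>Editor</single>'
    '<multiple>Editors</multiple></term>'
))
print(verb)
print(noun)
```

With this in place, calling code can always ask for term.single or term.multiple without caring which shape the term had in the file.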

Maybe:

Edited By Editor Editors

I vaguely prefer the additional abstraction of my approach, but this would
work too.

I’m a bit unclear on locators. Why does contain a tag
while all bibliography elements contain elements? Why is there an
include-label attribute on the tag?

Sorry to be so anal about these details. It’s just that as I work through
updating my implementation for the new schema I want to make sure I’m doing
things right.

Simon

I was planning to abstract terms in such a way that it wouldn’t matter
within my code (when a term contains only text, I’d just assume that the
text applies to both single and multiple).

I just mean that “editor” and “edited-by” are different variables. You
need logic to account for that in the current approach. It’d be mildly
easier to just look up a role (“editor”) and then from there the
specific rendering.

I’m a bit unclear on locators. Why does contain a tag
while all bibliography elements contain elements? Why is there an
include-label attribute on the tag?

This is a tricky issue to deal with, and I’m not sure I have it exactly right.

Remember, locator is the base class so to speak. Pages are a kind of locator.

In reference lists, you really have to use the specific locators
because order and formatting are that specific.

In citations, however, you typically have a single locator, which is
usually – but not always – page numbers. So consider a style like
APA, which should be like (Doe, 1999:23). The locator label isn’t printed.
But what if you need to indicate a paragraph number?

That’s what we need to configure.

This is another one of those things that becomes more awkward,
incidentally, with the localized approach. Earlier, there would be
locator elements (terms), and if one was present, then the label would
get printed. If it wasn’t, it didn’t.

Sorry to be so anal about these details. It’s just that as I work through
updating my implementation for the new schema I want to make sure I’m doing
things right.

Don’t be sorry. Now is the time to be really picky; it’ll yield a
tighter schema and spec.

Bruce

I was planning to abstract terms in such a way that it wouldn’t matter
within my code (when a term contains only text, I’d just assume that the
text applies to both single and multiple).

I just mean that “editor” and “edited-by” are different variables. You
need logic to account for that in the current approach. It’d be mildly
easier to just look up a role (“editor”) and then from there the
specific rendering.

In that case, why couldn’t we just do:

Edited By

Since the term is in a separate class, we don’t have to worry about
conflicting names.

I’m a bit unclear on locators. Why does contain a tag
while all bibliography elements contain elements? Why is there an
include-label attribute on the tag?

This is a tricky issue to deal with, and I’m not sure I have it exactly right.

Remember, locator is the base class so to speak. Pages are a kind of locator.

In reference lists, you really have to use the specific locators
because order and formatting are that specific.

In citations, however, you typically have a single locator, which is
usually – but not always – page numbers. So consider a style like
APA, which should be like (Doe, 1999:23). The locator label isn’t printed.
But what if you need to indicate a paragraph number?

Ah. What threw me off is that, in csl.rnc, volume and issue also inherit
from locator.

That’s what we need to configure.

This is another one of those things that becomes more awkward,
incidentally, with the localized approach. Earlier, there would be
locator elements (terms), and if one was present, then the label would
get printed. If it wasn’t, it didn’t.

Ah, I see. This issue is pretty tricky. We have to be able to specify:

pp105-106
pp. 105-106
p105-106 (as Johan pointed out a while back)
pages 105-106

As well as:

¶1-2 (maybe?)
¶ 1-2
paragraphs 1-2

There’s also the situation of citing law, about which you probably know much
more than I do (is there only one format?). And finally, there’s the Bible,
the Constitution, and other similar documents, but I think we’d be wise to
consider these as special cases we don’t necessarily have to handle.

The main problem here is that, while there are short and long roles, there
are also many possible permutations of the two. There are more variations on
page number than on paragraphs, and prefix/suffix can change depending on
the locator (how do you specify that you want "pp. " for multiple pages but
"¶ " for paragraphs?).

Right now, there are three solutions that come to mind:

  1. Use a class-based approach similar to what I recommended for roles and
    define a very wide range of possible classes.

  2. Define only “short” and “long” forms and expect people who need "pp. "
    and "¶ " to re-define the terms at the top of the file. (It probably doesn’t
    matter too much that the localized version would be missing a period.)

  3. Expect people who want "pp. " and "¶ " to define the element
    separately from in to add a period to the label. (Of
    course, this still doesn’t solve the irritating “p105-106” situation, which
    would require redefining a term).

Either way, I would prefer to specify the label as a child element instead
of via a simple include-label attribute, since that way we can define a
prefix or suffix.

Perhaps you have some other solution to this issue in mind? I’d be happy to
hear something simpler/nicer than what I’ve got.

Simon

Ah, I see. This issue is pretty tricky. We have to be able to specify:

pp105-106
pp. 105-106
p105-106 (as Johan pointed out a while back)
pages 105-106

As well as:

¶1-2 (maybe?)
¶ 1-2

Actually:

¶ 1
¶¶ 1-2

paragraphs 1-2

Yup; all that. And lines, figures, equations, etc.

And to get into what is pretty much edge case territory, sometimes
there is more than one: “page 2, line 14.” [actually, I’ve seen this
in history books]
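
To see how these variants could fall out of a single mechanism, here is a rough Python sketch. It assumes locator terms come in "long" and "short" forms with single/multiple values, and that the label takes a suffix as proposed for a label child element. LOCATOR_TERMS, is_plural, and render_locator are all hypothetical names, and the "p105-106" case is deliberately not covered, since it would still need a redefined term.

```python
import re

# Hypothetical locator terms: form -> kind -> (single, multiple).
LOCATOR_TERMS = {
    "long": {
        "page": ("page", "pages"),
        "paragraph": ("paragraph", "paragraphs"),
        "line": ("line", "lines"),
    },
    "short": {
        "page": ("p", "pp"),
        "paragraph": ("¶", "¶¶"),
    },
}


def is_plural(locator):
    """Treat a locator as plural if it looks like a range or a list."""
    return bool(re.search(r"[-–,&]", locator))


def render_locator(kind, locator, form="short", prefix="", suffix=" "):
    """Render a label plus locator; prefix/suffix belong to the label."""
    single, multiple = LOCATOR_TERMS[form][kind]
    label = multiple if is_plural(locator) else single
    return f"{prefix}{label}{suffix}{locator}"


print(render_locator("page", "105-106", suffix=""))    # pp105-106
print(render_locator("page", "105-106", suffix=". "))  # pp. 105-106
print(render_locator("page", "105-106", form="long"))  # pages 105-106
print(render_locator("paragraph", "1"))                # ¶ 1
print(render_locator("paragraph", "1-2"))              # ¶¶ 1-2

# More than one locator in a single citation.
parts = [("page", "2"), ("line", "14")]
print(", ".join(render_locator(k, v, form="long") for k, v in parts))
```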

There’s also the situation of citing law, about which you probably know much
more than I do (is there only one format?).

Law is really difficult. OTOH, there is one dominant format, and a
handful of others.

And finally, there’s the Bible,
the Constitution, and other similar documents, but I think we’d be wise to
consider these as special cases we don’t necessarily have to handle.

I agree; we’ve got more than enough to handle with what we know (APA,
Chicago, etc.).

The main problem here is that, while there are short and long roles, there
are also many possible permutations of the two. There are more variations on
page number than on paragraphs, and prefix/suffix can change depending on
the locator (how do you specify that you want "pp. " for multiple pages but
"¶ " for paragraphs?).

Exactly.

Right now, there are three solutions that come to mind:

  1. Use a class-based approach similar to what I recommended for roles and
    define a very wide range of possible classes.

  2. Define only “short” and “long” forms and expect people who need "pp. "
    and "¶ " to re-define the terms at the top of the file. (It probably doesn’t
    matter too much that the localized version would be missing a period.)

  3. Expect people who want "pp. " and "¶ " to define the element
    separately from in to add a period to the label. (Of
    course, this still doesn’t solve the irritating “p105-106” situation, which
    would require redefining a term).

Either way, I would prefer to specify the label as a child element instead
of via a simple include-label attribute, since that way we can define a
prefix or suffix.

Yeah, that makes sense. I’ll definitely make that change.

I guess on the other ideas, one way to think of this is that the
difference between the paragraph symbol and the abbreviated text
label with the period after it is just that: one is text and the other
a symbol.

Perhaps we somehow incorporate that logic: short text labels may be
initialized, while symbols may not?
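
One possible reading of that rule, sketched in Python: a short label made only of letters gets a trailing period, while a symbol is left alone. Taking "initialized" to mean "given a trailing period" is an assumption, and with_abbreviation_period is a hypothetical helper, not part of any schema.

```python
def with_abbreviation_period(label):
    """Append a period to alphabetic short labels; leave symbols untouched."""
    return label + "." if label.isalpha() else label


for label in ("p", "pp", "ed", "¶", "¶¶"):
    print(with_abbreviation_period(label))
```

The appeal of this approach is that the style would not need separate suffix settings for pages and paragraphs; the processor would infer the period from the label itself.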

Not sure about the concrete XML outcome of the idea; need to sleep on
that. Feel free to post a followup …

Bruce

Any further thoughts on this issue? I’m eagerly awaiting a solution that
will allow me to use short labels, and I figure we should probably work this
out first.

Simon

I’m busy at the moment, but you’ve got SVN access. Want to just commit
a version that works for you and I can review it? Implement in the
locales.xml file, and then I’ll later add it to the schema if it seems
right.

Bruce