Questions, Comments, and Reminders

I’ve been working on a rather troublesome CSL for the Modern Humanities Research Association (MHRA). Here are some questions and comments that have come up:

  1. I suggest that label element in should have form=“short-initialized”. This way “p.” and “pp.” will display without using suffix=".". Any thoughts?

  2. In the defaults section, should the date element be allowed within ? The year seems to be bundled with the publication data in many styles – e.g. “(New York: Random House, 2004)”.

  3. Reminder: should in some way be able to display “and others”.

  4. Reminder: we should add the following form values to : form=“verb-short” and form=“verb-long”. This will display “ed. by” and “edited by” respectively.

  5. I would like access to the XBib repository so I can deposit and update styles. I’ve not done this before, but it’s about time I started.

====================

These questions are more directed toward Simon:

  1. The following does not work:

Nothing is displaying. Shouldn’t it print out Zotero’s “Series” field? Have you had any problems with this?

  1. What CSL element matches to Zotero’s “Series Number”?

  2. What CSL element matches to Zotero’s “# of Volumes”?

  3. Reminder: we’ll need you to map out the CSL elements and their relationship to Zotero’s fields. This will go a long way to avoid confusion.

  4. I am not comfortable making changes to the RNG schema. If changes need to be made and Bruce is busy would you mind doing them at your leisure? Ideally, any time Zotero’s parser diverges from the schema the RNG should be updated after community deliberations and Bruce’s approval.

====================

-Jim

  1. I suggest that label element in should have
    form=“short-initialized”. This way “p.” and “pp.” will display without
    using suffix=".". Any thoughts?

What’s wrong with adding the suffix?
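
To make the two options concrete, here is a minimal Python sketch (not Zotero's actual code). It assumes a hypothetical term table in which the "short" form stores "p"/"pp" without the period and the proposed "short-initialized" form stores "p."/"pp."; the table layout and the function name are illustrative only.

```python
# Hypothetical term table; the layout and the function name are assumptions.
PAGE_TERMS = {
    "short": {"single": "p", "multiple": "pp"},
    "short-initialized": {"single": "p.", "multiple": "pp."},
}


def page_label(form: str, plural: bool, suffix: str = "") -> str:
    """Look up the page label for the given form, then append the suffix."""
    term = PAGE_TERMS[form]["multiple" if plural else "single"]
    return term + suffix


# Route 1: short form plus an explicit suffix.
print(page_label("short", plural=True, suffix="."))
# Route 2: the proposed form, no suffix needed.
print(page_label("short-initialized", plural=True))
```

Both routes yield the same string; the only difference is whether the period lives in the term or in the style's suffix attribute.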

  1. In the defaults section, should the date element be allowed within
    ? The year seems to be bundled with the publication data in
    many styles – e.g. “(New York: Random House, 2004)”.

Hmm … good point. Maybe yes. But I think this might go back to
Simon’s issue the other day, the details of which I’m not quite
remembering. Simon, did you reach any conclusion on that since?

  1. Reminder: should in some way be able to display “and
    others”.

Right. We agreed on it being allowed in csl:defaults and that it should
allow the term-name attribute? I forget (been a busy few days).

  1. Reminder: we should add the following form values to :
    form=“verb-short” and form=“verb-long”. This will display “ed. by” and
    "edited by" respectively.

OK.
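
As a rough illustration of the two proposed form values, the sketch below maps each one to the string it should display. Only the two strings come from the request above; the dictionary, the function name, and the placeholder editor name are assumptions made for the example.

```python
# The two form values and their output strings are taken from the request;
# the lookup table and helper are hypothetical.
EDITOR_VERB_TERMS = {
    "verb-short": "ed. by",
    "verb-long": "edited by",
}


def editor_phrase(form: str, names: str) -> str:
    """Prefix the editor names with the verb term selected by `form`."""
    return f"{EDITOR_VERB_TERMS[form]} {names}"


print(editor_phrase("verb-short", "A. Editor"))
print(editor_phrase("verb-long", "A. Editor"))
```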

  1. I would like access to the XBib repository so I can deposit and
    update styles. I’ve not done this before, but it’s about time I
    started.

Sure, just send me your SF user name off-list.

Bruce

  1. I suggest that label element in should have
    form=“short-initialized”. This way “p.” and “pp.” will display without
    using suffix=".". Any thoughts?

What’s wrong with adding the suffix?

Hmm. Nothing, I guess. I assumed suffix and prefix should only be used for punctuation and whitespace, not abbreviation points. For that matter, I’m encountering times when I need to do this:

Something is fishy about using suffix this way; I don’t know. Are there any alternatives? Or do you think this usage is OK?

  1. Reminder: should in some way be able to display “and
    others”.

Right. We agreed on it being allowed in csl:defaults and that it should
allow the term-name attribute? I forget (been a busy few days).

You are correct. Below is our discussion:

Jim asked about using a different string for “et al.” It’s not

Yes:

<term suffix="." ... />

Been a while, but that ought to work.

Bruce

  1. The following does not work:

Nothing is displaying. Shouldn’t it print out Zotero’s "Series"
field? Have you had any problems with this?

It was set to print out the “Series Title” field, but I’ve updated it
so that it prints out Series Title preferentially and then Series if
no Series Title exists. Not sure why we have both fields.
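
The fallback described here can be sketched as follows. This is only an illustration in Python, not the actual Zotero code; the key names "seriesTitle" and "series" are assumed stand-ins for the two Zotero fields.

```python
from typing import Optional


def series_text(item: dict) -> Optional[str]:
    """Return Series Title if it is filled in, otherwise Series, otherwise None."""
    # Assumed key names for Zotero's "Series Title" and "Series" fields.
    for key in ("seriesTitle", "series"):
        value = (item.get(key) or "").strip()
        if value:
            return value
    return None  # neither field is filled in, so nothing is printed


print(series_text({"seriesTitle": "Title of series", "series": "Series"}))
print(series_text({"series": "Series"}))
```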

  1. What CSL element matches to Zotero’s “Series Number”?

  2. What CSL element matches to Zotero’s “# of Volumes”?

At the moment, I’m not sure there is a good CSL element to map to
either of these. There’s certainly nothing yet in the code. Bruce:
any ideas?

  1. Reminder: we’ll need you to map out the CSL elements and their
    relationship to Zotero’s fields. This will go a long way to avoid
    confusion.

  2. I am not comfortable making changes to the RNG schema. If
    changes need to be made and Bruce is busy would you mind doing them
    at your leisure? Ideally, any time Zotero’s parser diverges from
    the schema the RNG should be updated after community deliberations
    and Bruce’s approval.

Once we’re through getting b4 out the door I’ll get to work on these
two issues.

Simon

  1. The following does not work:

Nothing is displaying. Shouldn’t it print out Zotero’s "Series"
field? Have you had any problems with this?

It was set to print out the “Series Title” field, but I’ve updated it
so that it prints out Series Title preferentially and then Series if
no Series Title exists. Not sure why we have both fields.

Actually, a series title is really the “collection” level.

  1. What CSL element matches to Zotero’s “Series Number”?

  2. What CSL element matches to Zotero’s “# of Volumes”?

At the moment, I’m not sure there is a good CSL element to map to
either of these. There’s certainly nothing yet in the code. Bruce:
any ideas?

6 should be supported, though I don’t remember the precise details. 7
is not supported, though we can obviously add it if necessary.

Bruce

Hmm. Perhaps we ought to introduce some way of making macros:

And in the section:

Up until now, we have assumed that the only formatting of a
bibliography shared among different item types will be single fields,
but there are certainly situations in the styles I’ve written where
something like this could have been useful. This would provide an
easy solution to the issues Jim and I have had with the publisher
field, eliminate the need to have all fields permitted in ,
and probably solve other future issues. We could conceivably even
deprecate the section, although I haven’t fully considered
what the consequences of this would be. Implementation should be
fairly simple. Bruce and Jim, what do you think?
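
A minimal sketch of what such an expansion step might look like in a processor, written in Python. Every element name in the sample style (`macro`, `use-macro`, `type`, `publisher`, `place`, `name`) is hypothetical and chosen only for illustration; none of it is taken from the CSL schema or the Zotero parser, and nested macros are not handled.

```python
import copy
import xml.etree.ElementTree as ET

# Hypothetical style: one macro defined once, referenced from two item types.
STYLE = """
<style>
  <macro name="publisher-block">
    <publisher>
      <place suffix=": "/>
      <name/>
    </publisher>
  </macro>
  <bibliography>
    <type name="book"><use-macro name="publisher-block"/></type>
    <type name="chapter"><use-macro name="publisher-block"/></type>
  </bibliography>
</style>
"""


def expand_macros(root: ET.Element) -> ET.Element:
    """Replace each <use-macro> reference with copies of the macro's children."""
    macros = {m.get("name"): m for m in root.findall("macro")}
    # Snapshot the elements first so the tree can be edited while looping.
    for parent in list(root.iter()):
        expanded = []
        for child in list(parent):
            if child.tag == "use-macro":
                body = macros[child.get("name")]
                expanded.extend(copy.deepcopy(c) for c in body)
            else:
                expanded.append(child)
        parent[:] = expanded
    return root


root = expand_macros(ET.fromstring(STYLE))
for item_type in root.find("bibliography"):
    print(item_type.get("name"), [child.tag for child in item_type])
```

After expansion each item type carries its own copy of the publisher formatting, so the rest of the processor would not need to know that macros exist.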

Simon

An interesting idea, which actually matches some of my thinking that
emerged when I was trying to figure out how to:

a) model CSL in an RDBMS and a Rails app, and …

b) document the basic design

The idea that I came to was that there are basically a few core ideas
in CSL:

- definitions (bad name, but "citation" and "bibliography" are examples)
- templates (collections of variables; what you are calling “macro”)
- variables

I’m open to refactoring a bit along these (generalizing) lines if you
guys are.

All of these decisions involve trade-offs between different goals:
ease of programming vs. ease of authoring, etc.

Bruce

Bruce and Jim, what do you think?

I think it’s a fine idea, definitely worth consideration. This would, as you indicate, constitute a paradigmatic shift in the way CSLs are written and validated. So we should tread carefully. Some initial discussion points:

  1. In what way could this render the defaults section unnecessary? And how would this "eliminate the need to have all fields permitted in "? Not all elements can be bulked together in a macro. At least, I don’t think so. Elaborate on your vision.

  2. Should a macro’s name attribute allow any user-defined value? Or should there be an authority list of all possible name values? The former would be more flexible. The latter would ensure commonality among all CSLs. I’m leaning toward the former.

All of these decisions involve trade-offs between different goals:
ease of programming vs. ease of authoring, etc.

I’m admittedly biased here, but I think “ease of authoring” should take precedence over “ease of programming.” In an ideal world XML should be simple enough for a non-techie to understand and author.

Yes, that’s my bias too. That’s partly why CSL looks the way it does.

Bruce

Bruce and Jim, what do you think?

I think it’s a fine idea, definitely worth consideration. This
would, as you indicate, constitute a paradigmatic shift in the way
CSLs are written and validated. So we should tread carefully. Some
initial discussion points:

  1. In what way could this render the defaults section unnecessary?
    And how would this "eliminate the need to have all fields permitted
    in "? Not all elements can be bulked together in a macro.
    At least, I don’t think so. Elaborate on your vision.

There is effectively no difference between:

...

and:

...

The one sticky point is inheritance. Currently, takes
items like and , which aren’t fields in
themselves, but rather refer to classes of fields. Perhaps this is a
reason to keep ? But perhaps restrict it to these elements
that specify field classes?

  1. Should a macro’s name attribute allow any user-defined value? Or
    should there be an authority list of all possible name values? The
    former would be more flexible. The latter would ensure commonality
    among all CSLs. I’m leaning toward the former.

I would strongly encourage the former. A style should be allowed to
have as many or as few macros as the style author wants to put in it;
they are simply tools to eliminate repetition and enhance readability
of the XML.

Simon

Yes, for sure.

If you want to pursue this Simon, why not come up with a rough
proposal, preferably that you can represent in RNG (and of course
responding to Jim’s questions).

WRT validation, one thing to keep in mind is that attribute-based
validation is in general harder to do (and in XML Schema pretty much
impossible). That doesn’t always matter, but it’s just to say there are
trade-offs there, and validation is important to ensure consistency.

Bruce

This is the thing. Basically, the default children are macros, but
they are pre-defined and constrained (controlled by the schema).

So the question is, what additional functionality do you need that the
current system does not provide (other than the details Jim noted)? Do
we really need the added functionality of a generic macro approach?

Just asking …

Bruce

Ultimately, to address the publisher problem I mentioned earlier, I
will need to do one of the following:

  1. Repeat the same CSL to format the publisher at least twice in the
    style (more if I were to implement the supplemental types).
  2. Add and to publisher (in which case,
    consistency and foresight suggest that and
    should be added to all fields, which would be more complicated to
    implement in both the schema and the parser than /).
  3. Use a macro/template.

I would be happy to replace with /, if
we could come up with a suitable way of doing so. It seems like
fields that now contain as a child element (locators and
identifiers) should no longer require this child element to print
something, and fields like would have to be split into
and .

Obviously, we don’t need the added functionality of a generic macro
approach, but without it, style authors will have to deal with a
significant amount of repetition to handle the idiosyncrasies of some
styles.

Simon

So see if you can come up with a proposal.

To me, the goal ought to be that it enhances the consistency,
flexibility and I suppose brevity of the existing approach, but does
not introduce additional complexity either for us as implementors or
schema maintainers, or for people wishing to author the styles.

This might be a tough set of requirements to meet, but perhaps possible.

Bruce

fields should remain the same, since there’s no way to
eliminate the need for child elements and attributes on those elements.
fields could remain the same, or could be replaced with
separate top-level // fields. I lean toward the
latter.
should be replaced with top-level and
elements. An field might also be necessary (or
this could be an attribute on ).
Most fields inheriting from could be replaced with /
/etc. (as separate top-level fields). I’m not quite sure what
to do with itself, because has to be coupled with
it. Perhaps it would have to stay as is.
/ would be top-level elements, as they are now.
should be replaced with top-level , , and
elements.
would be replaced with top-level //
fields.

Ultimately, I might have a slight preference for simply tacking on
to the current schema, but either of these approaches would
work for me. My Chicago Manual of Style CSL now has the same code for
formatting repeated 4 times inside it and the same code for
formatting repeated twice inside it. I would be happy if
I could replace this code with a simple .

Simon

fields should remain the same, since there’s no way
to eliminate the need for child elements and attributes on those
elements.

Agreed.

fields could remain the same, or could be replaced
with separate top-level // fields. I lean toward
the latter.

How would we then differentiate between, say, the publication date of a book and the creation date of a letter within that book? For that matter, how would we do that under the current system? Should there be or , etc.?

Moreover, there are times when styles require inclusive publication dates for the full run of volumes. Obviously this should be explicitly stated in . But these styles may also require the individual year of the volume being cited, separate from the inclusive dates. How is this done under the current system?

should be replaced with top-level and
elements. An field might also be
necessary (or this could be an attribute on ).

I agree with an or . That way we could differentiate between, say, the original publisher and the reprint publisher. Maybe we could use for reprints. How is this done under the current system?

Most fields inheriting from could be replaced with
//etc. (as separate top-level fields). I’m not quite
sure what to do with itself, because has to be
coupled with it. Perhaps it would have to stay as is.

Agreed.

/ would be top-level elements, as they are now.

I’m surprised there is no series element in the schema. Maybe we should replace with or ? We also need some equivalent to .

Frankly, the relation="" attribute is confusing at times, but maybe necessarily so. What are your thoughts?

should be replaced with top-level , , and
elements.

Agreed.

would be replaced with top-level //
fields.

Agreed. Or and .

Simon,

should be replaced with top-level and
elements.

Can you explain why?

Most fields inheriting from could be replaced with /
/etc. (as separate top-level fields). I’m not quite sure what
to do with itself, because has to be coupled with
it. Perhaps it would have to stay as is.
/ would be top-level elements, as they are now.
should be replaced with top-level , , and
elements.
would be replaced with top-level //
fields.

Again, I’m not following why your impulse here is to decouple child
variables from the generic macros.

Also, keep in mind that none of the styles you’ve done (or indeed the
design of the Zotero data model, which still has the Western-specific
first/last name stuff) accounts for alternate languages and scripts.
Some of my design choices here (like “titles”) are designed to handle
that.

Ultimately, I might have a slight preference for simply tacking on
to the current schema, but either of these approaches would
work for me. My Chicago Manual of Style CSL now has the same code for
formatting repeated 4 times inside it and the same code for
formatting repeated twice inside it. I would be happy if
I could replace this code with a simple .

Why would you need to repeat access or publisher? More importantly, how
would having a (different) macro change that?

Bruce

I need to repeat publisher to do:

There is no way to format this properly otherwise, although nearly
every style uses this rule (or else a conditional on
that replaces it with n.p.). With a
macro, I could define this once at the top of the document, rather
than cluttering up the section with repeated XML.

This also explains why I prefer simple top-level fields. Once you
remove , is there any reason to have </

and not just ? Either we’ll be making authors use
everywhere (or define a macro to handle
this in nearly every document), or we’ll need to make
actually do something. In my view, the latter is much more intuitive
than the former.

Simon