CSL 2016 - second instalment (specification updates)

Hi all,

As promised, here’s the 2nd instalment of the conversation between Bruce,
Frank, Rintze, and me on CSL plans for 2016. This one focuses on the
general outline for updates of the CSL specification and schema.

  1. Minor Update (terms, types, and variables)
    There’s a fair number of proposed terms, variables, and types that we should
    add to CSL, most of them listed here:
    https://github.com/avram/zotero-bits/issues/
    We’d also, in that process,
    re-license the schema under MIT in line with our general governance
    standards.
    Adding these should be very straightforward for processors to
    implement, and the change would obviously be backward compatible for old
    styles. Our tentative goal for this would be within the next 3-4 months.

Two things for people to comment/check on:
a) if you have something you’d like to see added that fits in this broad
description and isn’t listed at the link above, now would be a good time to
bring that up.
b) any thoughts from implementers on how you’d like us to deal with this
on the repository? I think it makes sense to have two branches for some
time, but we don’t really want to handle multiple branches for an extended
period of time.

  2. Major update (unclear timeline, maybe summer 2016?)
    The main things we’d like to see would be
    a) distinction between continuously paginated journals and those that are
    not for APA and similar styles. This comes up a lot & is super annoying to
    automate. We think that from the CSL side, we would just introduce a
    variable like “continuously-paginated” which defaults to true (since most
    journals are) and leave it to reference managers on how exactly to
    implement, though we’d be happy to host metadata to help automatize this,
    crowdsourced or otherwise.

b) Implementing a distinction between (author date) and author (date). This
is being requested a lot, and the lack of proper support makes things hard,
particularly for citation styles like APA with changing et al. requirements
depending on the position of the cite. To ensure that one of CSL’s key
features, one-click conversion between author-date and footnote styles,
remains intact, authors also need to be included in the corresponding
citations for numeric and note styles, i.e. Smith (1776) needs to turn into
Smith [1] or Smith ^1 respectively. Pandoc-citeproc already does this, and
up to this point this could all be handled in the reference managers and
processors.

Where we would need a CSL change is to allow for different formatting
outside of the parentheses. E.g. APA (again!) has (Smith & Marx, 1776), but
Smith and Marx (1776), so we would need to allow for two formats in
cs:citation depending on the type of citation.
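To make the conversion requirement concrete, here is a minimal Python sketch of how a processor might render the same in-text cite for the three style classes. The function name `narrative_cite` and the `style_class` values are made up for illustration; none of this is existing CSL syntax or any processor's API.

```python
def narrative_cite(author: str, year: int, number: int, style_class: str) -> str:
    """Render an in-text cite with the author outside the brackets.

    `style_class` is a hypothetical label for the kind of style in use.
    """
    if style_class == "author-date":
        return f"{author} ({year})"
    if style_class == "numeric":
        # The author stays in the running text; only the marker changes.
        return f"{author} [{number}]"
    if style_class == "note":
        # "^" stands in for a superscript note marker, as in the message above.
        return f"{author} ^{number}"
    raise ValueError(f"unknown style class: {style_class}")
```

Switching the style class changes only the bracketed part, which is what keeps one-click conversion working for this kind of cite.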

c) Composite styles as used in chemistry. These look like numeric styles,
but within a single numeric citation combine multiple references using a),
b), c) etc. This is the one we understand least well, so no one is really
sure how this would look properly implemented. If someone knows an expert
in ACS-type citation styles who’d be willing to walk us through exactly how
this works, that could be very helpful.

Are there any objections to any of these? And are there any major issues
we’re missing that we should put on the agenda for the next major update?

Thanks,
Sebastian
--
Sebastian Karcher, PhD
www.sebastiankarcher.com

Sebastian Karcher wrote

  1. Minor Update (terms, types, and variables)
    There’s a fair number of proposed terms, variables, and types that we should
    add to CSL, most of them listed here
    Issues · zotero/zotero-bits · GitHub We’d also, in that process,
    re-license the schema under MIT in line with our general governance
    standards.
    Any addition of these should be very straightforward for processors to
    include and obviously the change would be backward compatible for old
    styles. Our tentative goal for this would be within the next 3-4 months

Two things for people to comment/check on:
a) if you have something you’d like to see added that fits in this broad
description and isn’t listed at the link above, now would be a good time to
bring that up.

One set of changes that probably still needs substantial discussion is
determining the set of fields that should be added for “nested” events
(e.g., a paper in a symposium in a conference). Initial discussion was here:
Presentation - add Session Title, maybe Chair · Issue #60 · zotero/zotero-bits · GitHub

The minimum set of fields that I think satisfies all use cases is: (1) the
title of the paper (smallest unit), (2) the title of the paper session where
the paper is presented, (3) the title of the sub-conference, section, or track
at the conference (this could also be used for things like lecture series),
(4) the name of the overall event.

Mapping to CSL fields:
Title of paper ↔ title
Title of session ↔ session-title
Title of track/subconference ↔ track-title (could also use series-title
here if that isn’t perverting the use of that term too much)
Title of conference ↔ event-name
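As a rough illustration of the mapping above, here is a Python sketch of what an item carrying these fields might look like in a CSL-JSON-style dictionary. The field names "session-title", "track-title", and "event-name" are the ones proposed in this message and are assumptions, not existing CSL variables; the helper `event_hierarchy` is likewise hypothetical.

```python
# Hypothetical item using the field names proposed above.
paper = {
    "type": "paper-conference",
    "title": "Title of paper",
    "session-title": "Title of session",
    "track-title": "Title of track",
    "event-name": "Title of conference",
}

# Smallest unit first, overall event last.
NESTING_ORDER = ["title", "session-title", "track-title", "event-name"]


def event_hierarchy(item):
    """Return the titles present on the item, from smallest unit to largest."""
    return [item[field] for field in NESTING_ORDER if item.get(field)]
```

An item without a session or track simply yields a shorter list, so the two intermediate levels stay optional.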

Sebastian Karcher wrote

  2. Major update (unclear timeline, maybe summer 2016?)
    The main things we’d like to see would be
    a) distinction between continuously paginated journals and those that are
    not for APA and similar styles. This comes up a lot & is super annoying to
    automate. We think that from the CSL side, we would just introduce a
    variable like “continuously-paginated” which defaults to true (since most
    journals are) and leave it to reference managers on how exactly to
    implement, though we’d be happy to host metadata to help automatize this,
    crowdsourced or otherwise.

If the default for “continuously-paginated” is true, would it be easier for
clients and processors to implement if the variable were
“paginated-by-issue” with default to false, so that the absence of the
variable implies false?

As for implementation, I imagine this could be handled in a similar way to
journal abbreviations, with either a curated list (à la Zotero) or a
user-editable database (à la Frank’s Abbreviations Plugin for Juris-M) being
first choice and whatever is defined for the item in the client being used
as a fallback.
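The lookup order described above could be sketched in Python roughly as follows. This assumes the proposed "paginated-by-issue" name, stores the flag as a Python bool, and checks the user database before the curated list; that ordering is my own guess, since either could be first choice. The function and parameter names are hypothetical.

```python
def paginated_by_issue(item, curated=None, user_db=None):
    """Decide whether the item's journal restarts pagination in each issue.

    `curated` and `user_db` are hypothetical dicts mapping journal titles
    to booleans. The item's own value is only used as a fallback, and a
    missing value means False (i.e. continuously paginated).
    """
    journal = item.get("container-title", "")
    for source in (user_db, curated):
        if source and journal in source:
            return source[journal]
    return bool(item.get("paginated-by-issue", False))
```

Because absence implies False, clients that know nothing about the field keep producing the common, continuously paginated output.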

Sebastian Karcher wrote

b) Implementing a distinction between (author date) and author (date). This
is being requested a lot, and the lack of proper support makes things hard,
particularly for citation styles like APA with changing et al. requirements
depending on the position of the cite. To ensure that one of CSL’s key
features, one-click conversion between author-date and footnote styles,
remains intact, authors also need to be included in the corresponding
citations for numeric and note styles, i.e. Smith (1776) needs to turn into
Smith [1] or Smith ^1 respectively. Pandoc-citeproc already does this, and
up to this point this could all be handled in the reference managers and
processors.

Where we would need a CSL change is to allow for different formatting
outside of the parentheses. E.g. APA (again!) has (Smith & Marx, 1776), but
Smith and Marx (1776), so we would need to allow for two formats in
cs:citation depending on the type of citation.

APA’s specific in-text formatting requirements are:

  1. “Smith and Marx (1776)” versus “(Smith & Marx, 1776)”
  2. “Smith, Marx, and Jones (1776)” for first appearance, “Smith et al.
    (1776)” for subsequent appearances
  3. “Smith et al. (1776)” for the first appearance in a paragraph, “Smith et
    al.” for subsequent appearances in the same paragraph (unless this is
    ambiguous, in which case the year is included each time)
  4. Alternative phrases (e.g., “and colleagues”) are allowed instead of et
    al. for in-text citations
  5. The year can be free-floating, instead of in parentheses (e.g., “In their
    1776 review, Smith and Marx argued…”)

The first two are what Sebastian already brought up. The third requires that
a system similar to the “near-note” or “ibid” functionality exist (note that
multiple in-text citations can be referred to without a year so long as they
are unambiguous–e.g., when comparing two papers) and that it can be
separately applied to in-text versus standard citations.

The fourth item is probably not worth dealing with in CSL. It is probably
best handled by just suppressing the author (current practice).

The fifth item would require both in-text citation formatting and the option
to suppress the year (which citeproc-js can already do). This way, the user
could manually type the year, then add a citation formatted as “Smith et
al.” so that it is formatted correctly. Including an option to format a
reference as “Year” without the parentheses is probably not worth the
hassle.

(For feature comparison, EndNote provides (Author, Year), Author (Year),
suppress Author, and suppress Year options.)
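To make items 1, 2, and 5 concrete, here is a small Python sketch of the formatting decisions involved. It is a simplification that covers only the cases listed above (no paragraph tracking for item 3 and no special handling of long author lists), and `join_names` and `apa_cite` are hypothetical names, not part of any processor.

```python
def join_names(names, conjunction):
    """Join family names with the given conjunction ("and" or "&")."""
    if len(names) == 1:
        return names[0]
    if len(names) == 2:
        return f"{names[0]} {conjunction} {names[1]}"
    return f"{', '.join(names[:-1])}, {conjunction} {names[-1]}"


def apa_cite(names, year, narrative=False, first=True, suppress_year=False):
    """Format one cite in either the in-text or the parenthetical form."""
    if len(names) >= 3 and not first:
        # Item 2: three or more authors collapse to "et al." after the first cite.
        authors = f"{names[0]} et al."
    else:
        # Item 1: "and" in running text, "&" inside parentheses.
        authors = join_names(names, "and" if narrative else "&")
    if narrative:
        # Item 5: with the year suppressed, the user types the year by hand.
        return authors if suppress_year else f"{authors} ({year})"
    return f"({authors})" if suppress_year else f"({authors}, {year})"
```

For example, `apa_cite(["Smith", "Marx", "Jones"], 1776, narrative=True, first=False, suppress_year=True)` gives "Smith et al.", which the user could place after a manually typed year.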
View this message in context: http://xbiblio-devel.2463403.n2.nabble.com/CSL-2016-second-instalment-specification-updates-tp7579446p7579447.html
Sent from the xbiblio-devel mailing list archive at Nabble.com.

Thanks for these

One set of changes that probably still needs substantial discussion is
determining the set of fields that should be added for “nested” events
(e.g., a paper in a symposium in a conference). Initial discussion was
here:
Presentation - add Session Title, maybe Chair · Issue #60 · zotero/zotero-bits · GitHub

yeah – we’ll see how far we get on that. The hope would be to push
out an update sooner rather than later and do these smaller ones more
frequently (like once a year) as needed, so if we can’t fit the more
complicated considerations like the nested aspect for conferences as well
as books in series in this time, I’m OK with that, though of course it’d be
cool if we did.

If the default for “continuously-paginated” is true, would it be easier for
clients and processors to implement if the variable were
“paginated-by-issue” with default to false, so that the absence of the
variable implies false? Alternatively it could be an attribute on the
issue variable.

Right, we don’t have “true” and “false” for variables, so that was sloppy;
I’ve spent too much time in JavaScript recently… So if we do this as a
variable (which would be kind of weird, because currently all of our
variables have an actual string), what would matter is what the reference
manager passes to the citeproc (which would then be the “default”), and we
could go either way, though it probably makes sense to only include it for
the rarer case, i.e. paginated-by-issue.

It might make more sense to include it as an attribute with variable=“volume”,
the way is-numeric works now: … Again, it
probably makes more sense to test for the rare condition, as you suggest.
But syntax details we can handle on GitHub, I think. I’m more interested in
the general view on whether this is something we should try to address.

As for implementation, I imagine this could be handled in a similar way as
journal abbreviations

yes, that’s the model we had in mind.

APA’s specific in-text formatting requirements are:

  1. “Smith and Marx (1776)” versus “(Smith & Marx, 1776)”
  2. “Smith, Marx, and Jones (1776)” for first appearance, “Smith et al.
    (1776)” for subsequent appearances
  3. “Smith et al. (1776)” for the first appearance in a paragraph, “Smith et
    al.” for subsequent appearances in the same paragraph (unless this is
    ambiguous, in which case the year is included each time)
  4. Alternative phrases (e.g., “and colleagues”) are allowed instead of et
    al. for in-text citations
  5. The year can be free-floating, instead of in parentheses (e.g., “In their
    1776 review, Smith and Marx argued…”)

The first two are what Sebastian already brought up. The third requires that
a system similar to the “near-note” or “ibid” functionality exist (note that
multiple in-text citations can be referred to without a year so long as they
are unambiguous–e.g., when comparing two papers) and that it can be
separately applied to in-text versus standard citations.

The fourth item is probably not worth dealing with in CSL. It is probably
best handled by just suppressing the author (current practice).

The fifth item would require both in-text citation formatting and the option
to suppress the year (which citeproc-js can already do). This way, the user
could manually type the year, then add a citation formatted as “Smith et
al.” so that it is formatted correctly. Including an option to format a
reference as “Year” without the parentheses is probably not worth the
hassle.

Agree on ignoring 4). I don’t think something like near-note is possible
with reasonable effort. Counting paragraphs is a lot harder than counting
footnotes. So I think this will be up to authors, and I think it could be done
reasonably well with a suppress-year function as needed for 5. Like
suppress author, that’s up to the citeprocs and reference managers to
implement, and they could already go ahead and do that should they want to
(which I think has a lot going for it, even though it makes for an ugly
interface with too many options).

I am thrilled to hear that a CSL update is on its way. It probably goes without saying, but in case there’s anyone new here reading this: along with the zotero-bits repository, there’s more at https://github.com/citation-style-language/schema/issues to be considered for the next release.

Sebastian Karcher wrote

One set of changes that probably still needs substantial discussion is
determining the set of fields that should be added for “nested” events
(e.g., a paper in a symposium in a conference). Initial discussion was
here:
Presentation - add Session Title, maybe Chair · Issue #60 · zotero/zotero-bits · GitHub

yeah – we’ll see how far we get on that. The hope would be to push
out an update sooner rather than later and do these smaller ones more
frequently (like once a year) as needed, so if we can’t fit the more
complicated considerations like the nested aspect for conferences as well
as books in series in this time, I’m OK with that, though of course it’d be
cool if we did.

What would be the best format for presenting proposals on events and book
series for discussion?

The asynchronous back-and-forth of a github ticket or a listserv seems
less-than-ideal, since these things get complicated.

Maybe a google document, linked from the tickets?

I recall laying out some ideas a while back for a standard template for
proposals, but can’t seem to find it ATM. Anybody know what I’m
talking about, and where it might be?

Maybe it would be handier to create a GitHub repo dedicated to CSL
development, which could be the central place to have such detailed
discussions. I’m not a big fan of storing proposals in GitHub issue
descriptions, since issues don’t have version history and can be
easily deleted, but we could just store the proposals as files within
the Git repo. Or use the repo’s wiki.

Rintze

I think a GitHub repo wiki is a good idea.

Bruce D’Arcus-3 wrote

The asynchronous back-and-forth of a github ticket or a listserv seems
less-than-ideal, since these things get complicated.

Maybe a google document, linked from the tickets?

I recall laying out some ideas a while back for a standard template for
proposals, but can’t seem to find it ATM. Anybody know what I’m
talking about, and where it might be?

Here is the proposal template you suggested.
