CSL 1.0.2 and 1.1 Development Window Open

Largely as a consequence of the pandemic (at least in my case), some of us (me, along with @bwiernik and @Denis_Maier, with key input from @Rintze_Zelle and @Sebastian_Karcher) have found time to whittle down the backlog of issues and pull requests on the various CSL repos.

Plan

The current plan is to push out two releases over the next two months:

  1. v1.0.2: variable additions mostly
  2. v1.1: remove some items previously marked for deprecation and add new features (such as an intext element and separate formatting for titles and subtitles)

A lot of this work is already done; the v1.1 work lives on the v1.1 branch.

I am hoping we can tag a pre-release version of 1.0.2 in the coming weeks, allow for a comment period before tagging v1.0.2, and then follow the same process a bit later with v1.1.

How you can contribute

We have set up a proposal review project (though it’s not as automated as it might be, and I’ve not been manually updating it recently), and milestones for v1.0.2 and v1.1. Hopefully this makes it easier to track progress.

If you think we’ve missed an issue, or want to submit or review PRs, including for documentation, now would be the time.

I’d in particular like to add some developers to a PR review team, so we can get feedback early, and make decisions quickly.

Just an update: the goal is to close work on v1.0.2 by the end of the week, tag v1.0.2-pre.1, and then have a comment period.

Tag is here. Please review and let us know if you have any feedback.

This release pretty much just adds new types and variables, and so closes a lot of issues and PRs that have been hanging around.

We were aggressive in trying to close as many outstanding issues as possible, but I think conservative in only accepting changes we thought would be uncontroversial.

Thanks much to @bwiernik and @Denis_Maier for all the help!

v1.1 should be even more cool!

Now would be the time to review 1.0.2, as I plan to tag it later this week.

I don’t necessarily expect any issues, but a thumbs up if that’s the case would be welcome too.

I just went through the output of git diff again, and it looks like this should be ok. There’s probably nothing controversial in there, and we can always add new terms/variables later.

As we work on wrapping up 1.1 in parallel, a note on those evolving plans.

Since this release will introduce new features that will require processor changes to support (though not any breaking changes), I think it wise that before merging to master we make sure all the new behavior is documented in the test suite, and that we get some of the processor developers (cc @Frank_Bennett, @cormacrelf, @asimonyi, @PaulStanley) to sign off on them.

So a test-driven release plan.

To wit, and at @Frank_Bennett’s suggestion, I created a separate branch of the test suite and a subdir for these tests.

I have marked these tests “experimental” as they are not yet merged to master.

Currently, there’s only one that’s done, but I hope we can fill this out over the coming week or two. Mostly, per below, this will just mean identifying and adapting existing CSL-M citeproc-js tests.

Request of @Denis_Maier: Citeproc-js has tests for two of the main features listed below (possible candidates for intext and for conditions). Could you please identify a few and add to our repo, adjusting as needed? Our intext implementation should be the same as in CSL-M, I think, but conditions may differ slightly.

The most important new features, and open questions that we still need feedback on:

Thanks for all the work on this.

I want to take another look, but mostly I think 1.0.1 is in good shape. One thing I think I noticed (unless I’m missing something) is that JSON schema and RNC schema aren’t 100% aligned. volume-title, e.g., is in the RNC schema but not in the JSON schema. If I’m right about this, that means there may be other issues and this requires another very careful look.
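
A rough sketch of how this kind of drift could be checked: collect the variable names from each schema and compare them as sets. The two sets below are placeholders (only volume-title comes from the observation above), and extracting the real names from the RNC and JSON schema files is left out.

```python
from __future__ import annotations


def schema_drift(rnc_vars: set[str], json_vars: set[str]) -> dict[str, list[str]]:
    """Report names present in one schema but missing from the other."""
    return {
        "missing_from_json": sorted(rnc_vars - json_vars),
        "missing_from_rnc": sorted(json_vars - rnc_vars),
    }


# Placeholder data, NOT the real schema contents.
rnc_vars = {"title", "volume", "volume-title"}
json_vars = {"title", "volume"}

drift = schema_drift(rnc_vars, json_vars)
print(drift)
```

Any non-empty list in the result would point to a name that needs a second look.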

Of the added variables, the only thing I have some misgivings about is https://github.com/citation-style-language/schema/pull/231 – given demand for IDs over the years, this is kind of a random selection. E.g. arXiv ID, NBN, probably OCLC number, ZbMath, etc. all have come up more. Given minimal demand for the new identifiers (it’s not like adding these solves some urgent, much noticed issue), and the fact that there are a ton of existing IDs and an expanding landscape of which ones are used and how, I’d suggest tabling these and thinking about a more global approach to IDs for 1.1+.

Any ideas on how that could work?

I hadn’t thought about this, but first impressions …

Maybe an input object (potentially even the new custom) for other IDs, and then some mechanism to access them in a style?

Would seem the simple and clean solution would be to use custom, and then allow variable="custom/some-id".
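
To make the idea concrete, here is a minimal sketch of how a processor might resolve such a name. The function name resolve_variable and the item layout are assumptions for illustration only; nothing here is specified behavior.

```python
def resolve_variable(item: dict, name: str):
    """Look up a variable on an item.

    Assumption: a name of the form 'custom/<key>' reads <key> from the
    item's 'custom' object; anything else is a top-level variable.
    """
    if name.startswith("custom/"):
        key = name[len("custom/"):]
        return item.get("custom", {}).get(key)
    return item.get(name)


# Hypothetical input item.
item = {"title": "Example", "custom": {"some-id": "1234"}}
print(resolve_variable(item, "custom/some-id"))
```

A missing custom key simply resolves to None here, the same as any other absent variable.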

That does raise larger compatibility questions though, so I’m wondering if there are any other options.

No, I think an identifier variable that carries ‘type’ and ‘value’ elements (or, probably better, an object with key-value pairs) would be better. Many of these identifiers are used in citations (e.g., arXiv ID).

For comparison, after discussion, Frank and I settled on legal-eid to include the ECLI (European Case Law Identifier) and other potential future electronic legal identifiers. He didn’t care for legal-identifier because that term might encompass things like a case’s Docket Number (CSL number). But I think if we had a general identifier variable and documented it well, it could encompass both legal and other identifiers.

We should add an identifier or electronic-identifier or e-identifier variable. It would work like locator. It would have type-value elements or be structured as a set of key-value pairs.
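
The two candidate shapes can be sketched side by side. The class and function names are hypothetical and the values are placeholders; this only illustrates the trade-off, not a settled data model.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Identifier:
    """Shape 1 (assumed): a list of type/value elements."""
    type: str
    value: str


def to_mapping(ids: list[Identifier]) -> dict[str, str]:
    """Shape 2 (assumed): a flat object of key-value pairs."""
    return {i.type: i.value for i in ids}


def from_mapping(mapping: dict[str, str]) -> list[Identifier]:
    return [Identifier(t, v) for t, v in mapping.items()]


ids = [Identifier("arxiv", "0000.00000"), Identifier("oclc", "1234")]
mapping = to_mapping(ids)
print(mapping)
```

One difference worth noting: the key-value form can hold only one value per identifier type, while the list form can repeat a type.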

The main question is exactly how to work with this in styles. Each identifier has its own formatting requirements (e.g., a URI scheme). The URI scheme should also inform the target of embedded links. I suggest we permit three forms: “uri”, “prefix”, and “bare”. This would be most easily handled with a new element similar to dates: cs:identifier.

We can curate a list of identifiers, prefixes, and URL schemes that are supported. This could be easily expanded over time. For identifiers not supported, all forms would just return “bare”.
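
A minimal sketch of that behavior, assuming a small curated registry. REGISTRY and render_identifier are hypothetical names, the two registry entries are illustrative rather than a proposed list, and the identifier values are placeholders.

```python
# Illustrative registry only; the real curated list would be decided elsewhere.
REGISTRY = {
    "doi": {"prefix": "doi:", "uri": "https://doi.org/"},
    "arxiv": {"prefix": "arXiv:", "uri": "https://arxiv.org/abs/"},
}

FORMS = ("uri", "prefix", "bare")


def render_identifier(id_type: str, value: str, form: str = "bare") -> str:
    """Render an identifier as 'uri', 'prefix', or 'bare'.

    Identifier types missing from the registry always render as 'bare'.
    """
    if form not in FORMS:
        raise ValueError(f"unknown form: {form!r}")
    entry = REGISTRY.get(id_type.lower())
    if entry is None or form == "bare":
        return value
    return entry[form] + value


print(render_identifier("doi", "10.1234/example", "uri"))
print(render_identifier("zbmath", "0001", "uri"))  # unsupported type, falls back to bare
```

Expanding support over time would then only mean adding entries to the registry, with no change to styles that already request a form.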

The existing identifiers in CSL 1.0.1 could be grandfathered (DOI, PMCID, PMID, ISBN, ISSN) to be available through either cs:text or cs:identifier.

Sounds good.

Seems like we need a PR to consider for 1.1.

If we settle that, we can follow Sebastian’s suggestion to remove these ids from 1.0.2.

Done. I’ve opened a pull request with changes from Sebastian’s feedback. https://github.com/citation-style-language/schema/pull/303

Thinking about the human-friendly input representation, I think we do want to keep the core ids as dedicated properties, and so treat a potential new identifier object as effectively for extended ids.
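
As a sketch of what that split could mean for lookup: the core ids stay as top-level properties, and everything else is read from a separate object. The property name extids and the function get_identifier are assumptions for illustration, and the values are placeholders.

```python
# Core identifiers that already exist as dedicated properties.
CORE_IDS = ("DOI", "ISBN", "ISSN", "PMCID", "PMID")


def get_identifier(item: dict, name: str):
    """Return a core id from its dedicated property, else from 'extids'.

    Assumption: extended ids live in a nested object called 'extids'.
    """
    for core in CORE_IDS:
        if core.lower() == name.lower():
            return item.get(core)
    return item.get("extids", {}).get(name)


# Hypothetical input item.
item = {"DOI": "10.1234/example", "extids": {"arxiv": "0000.00000"}}
print(get_identifier(item, "doi"), get_identifier(item, "arxiv"))
```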

Yeah, sure, that’s what I was thinking.

For human-friendly input, we could also accept a labeled delimited list in a flat string (which we’ve discussed for locator and page).

This I don’t like, because it requires custom parsing again.

Would rather just do something like:

extids:
   foo: 1234

For larger process – I’m a little worried by the small number of people involved in this so far* and I think it’s worth having a more widely accessible comment period for a couple of weeks both as a check and as a general matter of transparency.

Ideally, we’d have a more readily human-readable summary of the planned changes for 1.0.2 that we could point people to and then various ways to engage. I don’t think we’ll be overrun, but given that CSL is used by >1m people, I think getting feedback from >10 would be good and worth a small delay & a bit of extra work.

* this is not at all intended as criticism of the work done so far; on the contrary, I think for getting the actual work done, a small group is preferable

I think that’s a very valuable suggestion! For 1.0.2 this should be rather trivial. But for 1.1 we should really do this. Ideally, the same document would later become the changelog?

I’m fine on waiting a bit. I just can’t commit to having time beyond July to stay as involved as I have for the past few months.

There are also some practical considerations around managing multiple branches in git for non-experts.

Keep in mind, the decisions we made were mostly closing stale issues that had been lingering for years, with substantial feedback on them, often building on discussions in other fora, like the Zotero forums.

Even the 1.1 changes are mostly implementing features already in CSL-M. In fact, narrative citation support via intext is already implemented in multiple processors (pandoc and, I believe, citeproc-rs, for example).

I do think, once these two releases are done, we need to rethink process, and consider things like:

  • no longer doing x.x.x releases; just adding strings as they’re needed, and only versioning the spec and schemas (or versioning them differently) when we introduce features that impact processors; but this does raise the question of how best to get feedback
  • formalize a test-driven process for the latter; which requires feedback from a different audience (developers)
  • solicit labor contributions for all this work from the big projects that rely on it (though I’m unsure how practical, or what the specific asks would be; at minimum it would likely be more contribution to PR reviews)

I don’t expect this to be super controversial. For 1.0.2 I’d imagine the main function would be to see if there’s a major thing we overlooked (which we could then either consider for 1.1 or decide it was crucial enough for 1.0.2).

I’m also just thinking about this in terms of governance – I think it’s the right thing to do to Tweet about this and to link it from the Zotero forums and to send an email to all implementers for which we have contact info. At least for the two former, a quick summary would be nice – we could recycle that for the announcement of the final version, which I think is a pretty big deal.

Yeah, for 1.0.2 I imagine most feedback would be something like Sebastian’s suggestion to defer the identifiers.

It might also be nice to get folks from Mendeley, Zotero, and other applications in discussions about variable and type mapping updates.

Yes. We just heard from Paperpile (via citationstyles.org contact) and I pointed them to this thread & asked for feedback.
