Largely as a consequence of the pandemic (at least in my case), some of us (me, along with @bwiernik and @Denis_Maier, with key input from @Rintze_Zelle and @Sebastian_Karcher) have found time to whittle down the backlog of issues and pull requests on the various CSL repos.
Plan
Current plan is to push out two releases over the next two months:
v1.0.2: variable additions mostly
v1.1: remove some items previously marked to deprecate, add new features (like an intext element, allow different formatting for title and subtitles, etc.)
A lot of this work is already done, the latter on the v1.1 branch.
I am hoping we can tag a pre-release version of 1.0.2 in the coming weeks, allow for a comment period before tagging v1.0.2, and then follow the same process a bit later with v1.1.
How you can contribute
We have set up a proposal review project (though it’s not as automated as it might be, and I’ve not been manually updating it recently), and milestones for v1.0.2 and v1.1. Hopefully this makes it easier to track progress.
If you think we’ve missed an issue, or want to submit or review PRs, including for documentation, now would be the time.
I’d in particular like to add some developers to a PR review team, so we can get feedback early, and make decisions quickly.
Tag is here. Please review and let us know if you have any feedback.
This release pretty much just adds new types and variables, and so closes a lot of issues and PRs that have been hanging around.
We were aggressive in trying to close as many outstanding issues as possible, but I think conservative in only accepting changes we thought would be uncontroversial.
I just went through the output of git diff again, and it looks like this should be ok. There’s probably nothing controversial in there, and we can always add new terms/variables later.
As we work on wrapping up 1.1 in parallel, a note on those evolving plans.
Since this release will introduce new features that will require processor changes to support (though not any breaking changes), I think it wise that before merging to master we make sure all the new behavior is documented in the test suite, and that we get some of the processor developers (cc @Frank_Bennett, @cormacrelf, @asimonyi, @PaulStanley) to sign off on them.
I have marked these tests “experimental” as they are not yet merged to master.
Currently, there’s only one that’s done, but I hope we can fill this out over the coming week or two. Mostly, per below, this will just mean identifying and adapting existing CSL-M citeproc-js tests.
Request of @Denis_Maier: Citeproc-js has tests for two of the main features listed below (possible candidates for intext and for conditions). Could you please identify a few and add to our repo, adjusting as needed? Our intext implementation should be the same as in CSL-M, I think, but conditions may differ slightly.
The most important new features, and open questions that we still need feedback on:
I want to take another look, but mostly I think 1.0.1 is in good shape. One thing I think I noticed (unless I’m missing something) is that the JSON schema and the RNC schema aren’t 100% aligned. volume-title, e.g., is in the RNC schema but not in the JSON schema. If I’m right about this, there may be other mismatches, and this requires another very careful look.
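To make that careful look less manual, here is a rough sketch of how one might diff the variable names in the two schemas. The fragments below are toy stand-ins I made up for illustration, not copies of the real files, and the assumed shapes (quoted names in the RNC, an items/properties object in the JSON schema) would need checking against the repo.

```python
import re

# Toy fragments standing in for the real schema files.
# ASSUMPTION: these shapes are illustrative, not copied from the CSL repos.
RNC_FRAGMENT = '''
variables.standard =
  "title"
  | "volume"
  | "volume-title"
'''

JSON_SCHEMA_FRAGMENT = {
    "items": {
        "properties": {
            "title": {"type": "string"},
            "volume": {"type": ["string", "number"]},
        }
    }
}


def rnc_names(rnc_text):
    """Collect every double-quoted token found in the RNC text."""
    return set(re.findall(r'"([^"]+)"', rnc_text))


def json_names(schema):
    """Collect the property names declared for each item in the JSON schema."""
    return set(schema["items"]["properties"])


def schema_drift(rnc_text, schema):
    """Return (only_in_rnc, only_in_json) as two sorted lists of names."""
    rnc, js = rnc_names(rnc_text), json_names(schema)
    return sorted(rnc - js), sorted(js - rnc)


only_rnc, only_json = schema_drift(RNC_FRAGMENT, JSON_SCHEMA_FRAGMENT)
print("only in RNC:", only_rnc)
print("only in JSON:", only_json)
```

On the real files, the quoted-token approach would also pick up strings that are not variables (types, terms, attribute values), so the RNC side would need to be restricted to the variable definitions before the comparison means much.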
Of the added variables, the only thing I have some misgivings about is https://github.com/citation-style-language/schema/pull/231 – given demand for IDs over the years, this is kind of a random selection. E.g. arXiv ID, NBN, probably OCLC number, ZbMath, etc. have all come up more often. Given the minimal demand for the new identifiers (it’s not like adding these solves some urgent, much-noticed issue), and the fact that there are a ton of existing IDs and an expanding landscape of which ones are used and how, I’d suggest tabling these and thinking about a more global approach to IDs for 1.1+.
No, I think an identifier variable that carries ‘type’ and ‘value’ elements (or, probably better, an object with key-value pairs) would be better. Many of these identifiers are used in citations (e.g., arXiv ID).
For comparison, after discussion, Frank and I settled on legal-eid to include the ECLI (European Case Law Identifier) and other potential future electronic legal identifiers. He didn’t care for legal-identifier because that term might encompass things like a case’s Docket Number (CSL number). But I think if we had a general identifier variable and documented it well, it could encompass both legal and other identifiers.
We should add an identifier or electronic-identifier or e-identifier variable. It would work like locator. It would have type-value elements or be structured as a set of key-value pairs.
The main question is exactly how to work with this in styles. Each identifier has its own formatting requirements (e.g., a URI scheme). The URI scheme should also inform the target of embedded links. I suggest we permit three forms: “uri”, “prefix”, and “bare”. This would be most easily handled with a new element similar to dates: cs:identifier.
We can curate a list of identifiers, prefixes, and URL schemes that are supported. This could be easily expanded over time. For identifiers not supported, all forms would just return “bare”.
The existing identifiers in CSL 1.0.1 could be grandfathered (DOI, PMCID, PMID, ISBN, ISSN) to be available through either cs:text or cs:identifier.
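As a rough sketch of the three forms and the fallback to “bare”, something like the following is what I have in mind. The registry contents, the function name, and the lowercase type keys are all hypothetical; nothing here is in the spec or in any processor.

```python
# Hypothetical curated registry: identifier type -> (URI base, textual prefix).
# ASSUMPTION: the entries and key names are illustrative only.
REGISTRY = {
    "doi": ("https://doi.org/", "doi:"),
    "arxiv": ("https://arxiv.org/abs/", "arXiv:"),
}

FORMS = ("uri", "prefix", "bare")


def render_identifier(id_type, value, form="bare"):
    """Render an identifier in one of the three proposed forms.

    Identifier types missing from the registry always render as "bare",
    whatever form was requested.
    """
    if form not in FORMS:
        raise ValueError(f"unknown form: {form!r}")
    entry = REGISTRY.get(id_type.lower())
    if entry is None or form == "bare":
        return value
    uri_base, prefix = entry
    return uri_base + value if form == "uri" else prefix + value


print(render_identifier("doi", "10.1000/182", "uri"))
print(render_identifier("arxiv", "2001.00001", "prefix"))
print(render_identifier("zbmath", "1234.56789", "uri"))  # unsupported type, falls back to bare
```

Expanding support over time would then just mean adding rows to the registry, without touching styles that already request a given form.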
Thinking about the human-friendly input representation, I think we do want to keep the core ids as dedicated properties, and so treat a potential new identifier object as effectively for extended ids.
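To illustrate, an item might keep its DOI as a dedicated property while carrying other ids in a new object. This is only a sketch of one possible shape: the identifier property, its lowercase keys, and the lookup helper are assumptions of mine, not anything in the current schema.

```python
# Core ids that already exist as dedicated properties in CSL JSON.
CORE_IDS = ("DOI", "ISBN", "ISSN", "PMCID", "PMID")

# ASSUMPTION: "identifier" is a hypothetical new property holding extended ids
# as key-value pairs; it is not part of the current schema.
item = {
    "type": "article",
    "title": "Example",
    "DOI": "10.1000/182",
    "identifier": {"arxiv": "2001.00001"},
}


def get_identifier(item, id_type):
    """Look up an id: dedicated core property first, extended object second."""
    for core in CORE_IDS:
        if core.lower() == id_type.lower():
            return item.get(core)
    return item.get("identifier", {}).get(id_type.lower())


print(get_identifier(item, "doi"))
print(get_identifier(item, "arxiv"))
print(get_identifier(item, "ecli"))
```

Input stays as simple as it is today for the common ids, and the extended object only appears when someone actually needs it.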
On the larger process – I’m a little worried by the small number of people involved in this so far* and I think it’s worth having a more widely accessible comment period for a couple of weeks, both as a check and as a general matter of transparency.
Ideally, we’d have a more readily human-readable summary of the planned changes for 1.0.2 that we could point people to and then various ways to engage. I don’t think we’ll be overrun, but given that CSL is used by >1m people, I think getting feedback from >10 would be good and worth a small delay & a bit of extra work.
* this is not at all intended as criticism of the work done so far; on the contrary, I think for getting the actual work done, a small group is preferable
I think that’s a very valuable suggestion! For 1.0.2 this should be rather trivial. But for 1.1 we should really do this. Ideally, the same document would later become the changelog?
I’m fine on waiting a bit. I just can’t commit to having time beyond July to stay as involved as I have for the past few months.
There are also some practical considerations around managing multiple branches in git for non-experts.
Keep in mind, the decisions we made were mostly closing stale issues that had been lingering for years, with substantial feedback on them, often building on discussions in other fora, like the Zotero forums.
Even the 1.1 changes are mostly implementing features already in CSL-M. In fact, intext narrative citations are already supported in multiple processors (for example pandoc and, I believe, citeproc-rs).
I do think, once these two releases are done, we need to rethink process, and consider things like:
no longer doing x.x.x releases; just adding strings as they’re needed, and only versioning the spec and schemas (or versioning them differently) when we introduce features that impact processors; but this does raise the question of how best to get feedback
formalizing a test-driven process for the latter, which requires feedback from a different audience (developers)
soliciting labor contributions for all this work from the big projects that rely on it (though I’m unsure how practical that is, or what the specific asks would be; at minimum it would likely be more contribution to PR reviews)
I don’t expect this to be super controversial. For 1.0.2 I’d imagine the main function would be to see if there’s a major thing we overlooked (which we could then either consider for 1.1 or decide it was crucial enough for 1.0.2).
I’m also just thinking about this in terms of governance – I think it’s the right thing to do to Tweet about this and to link it from the Zotero forums and to send an email to all implementers for which we have contact info. At least for the two former, a quick summary would be nice – we could recycle that for the announcement of the final version, which I think is a pretty big deal.