CSL Discourse

Proposal: Make style ids immutable


#1

As I’ve noted in the past, it’s really inconvenient that style ids can change, and dealing with it creates extra complexity everywhere styles are handled, leading to bugs that cause unpleasant experiences for users. It’s also redundant to have a user-friendly id when there’s already a human-friendly title. The id should be for computers, and the title should be for humans.

It’d be much easier if this worked more like Zotero translator IDs, where ids are immutable, dependencies use those ids, and filenames are irrelevant. Then styles on GitHub could be renamed as desired without requiring downstream code to do anything to handle those changes.

As things are, to properly fix the issue in the above thread, the Zotero client would have to download and store updated renamed-style data from the repo on every style update check (instead of bundling it with version upgrades as it does now). That’s more development, complexity, and potential for bugs to solve a problem that doesn’t need to exist.

So here’s what I’d propose:

  1. Freeze all existing style ids as they are now. In an ideal world they’d all be changed to whatever new format we decided on (e.g., a UUID), but we don’t want to break existing clients or require massive changes across the whole CSL ecosystem. Fortunately this isn’t a problem, because the id is for computers, not people. http://zotero.org/styles/apa is just a string, no different from 90320768-db08-4d22-917c-4b7714273ff4. If a journal with an existing style changes its name, too bad. The title and filename can be updated, but the id has to stay the same.
  2. Make sure all implementers are treating ids and independent-parent as opaque strings, not dereferenceable URLs, and not interpreting the suffix as a meaningful short name. No assumption should be made that the filename on GitHub will remain consistent. Implementers can name local styles however they like (e.g., a filename-safe version of the title).
  3. Recommend UUIDs for new styles.
  4. Leave renamed-styles.json in the repo for the time being, or have existing implementers mirror a static copy of it to deal with old dependencies and then delete it. The advantage of the latter is that new implementers wouldn’t think they had to handle these mappings.
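
To make point 2 concrete, here is a minimal sketch of what "opaque string" handling could look like on the client side. The `installed` store, `find_style`, and `local_filename` are hypothetical names for illustration, not part of any existing implementation, and the APA title is a placeholder.

```python
import re

# Hypothetical in-memory store keyed by style id. The ids are opaque:
# a legacy URL-shaped id and a UUID are looked up the same way.
installed = {
    "http://www.zotero.org/styles/apa": {"title": "APA (placeholder title)"},
    "90320768-db08-4d22-917c-4b7714273ff4": {"title": "Nature Biotechnology"},
}

def find_style(style_id):
    """Exact string match only: no URL parsing, no suffix extraction."""
    return installed.get(style_id)

def local_filename(title):
    """Derive a filename-safe local name from the title; the id plays no part."""
    slug = re.sub(r"[^a-z0-9]+", "-", title.lower()).strip("-")
    return slug + ".csl"
```

With this shape, `find_style("apa")` finds nothing, because the short name is never treated as meaningful, and a style can be retitled or renamed on disk without its key changing.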

A note about independent-parent: there’s currently an ambiguity regarding whether the URI in independent-parent is meant to be understood as an id or a download location. In Zotero, and perhaps other tools, it’s both: it’s used as an id to identify a parent style that’s already installed, but it’s treated as a URI to download a missing style. That really shouldn’t be the case. (I’m not sure what happens if the id of the style that’s downloaded doesn’t match the independent-parent value, but probably nothing good.) If we switch to always considering the id an opaque string, I think we’d have to say that the style named by independent-parent has to exist in the official CSL repository (or an implementer’s mirror of it) so that it can be retrieved by id if necessary. While that’s a small bit of centralization, it also seems acceptable for dependent styles, since the whole point is that they allow for styles to be named after journals while tracking one of the more common styles. If you really need your style to track some style that isn’t in the main repo, you can just copy it and change the id and title (or automate that) rather than making a dependent style.
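
As a rough sketch of the id-only lookup described above: check the installed styles first, then ask the central repository (or a mirror) for the style by id, and refuse a download whose id doesn’t match. `resolve_parent` and the `fetch_by_id` callable are assumptions for illustration; how a repository would actually serve styles by id is not settled here.

```python
def resolve_parent(parent_id, installed, fetch_by_id):
    """Return the parent style record for a dependent style.

    installed:   dict mapping style id -> style record (each has an "id" key)
    fetch_by_id: hypothetical callable that asks the central repository
                 (or a mirror) for a style by id; returns a record or None
    """
    # 1. The parent id is only ever compared as a string.
    style = installed.get(parent_id)
    if style is not None:
        return style

    # 2. Missing locally: retrieve by id rather than dereferencing a URL.
    style = fetch_by_id(parent_id)
    if style is None:
        raise LookupError(f"parent style not found: {parent_id}")

    # 3. Guard against the mismatch case mentioned in the post.
    if style["id"] != parent_id:
        raise ValueError(
            f"requested {parent_id!r} but repository returned {style['id']!r}"
        )

    installed[parent_id] = style
    return style
```

The explicit mismatch check is a design choice in this sketch; a client could instead log and skip the dependent style.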


#2

We also use renamed-styles.json to offer redirection for styles we delete from the repository, e.g. for journals that stop publishing, or when organizations reduce the number of styles they use (ACS, for example, standardized on a single citation format earlier this year). Immutable IDs wouldn’t help with that.

P.S. Of course, we don’t really need to provide an update path for deleted styles, and we can also suggest that implementations fall back on, e.g., APA if an existing style is no longer available.


#3

And for tracking independent parent styles, we could just add a second element to separate the ID from the retrieval link, e.g.:

<title>Nature Biotechnology</title>
<id>90320768-db08-4d22-917c-4b7714273ff4</id>
<link href="http://www.zotero.org/styles/nature-biotechnology" rel="self"/>
<parent-id>73f1d413-9249-4464-8f40-4e77de7f95b0</parent-id>
<link href="http://www.zotero.org/styles/nature" rel="independent-parent"/>

instead of

<title>Nature Biotechnology</title>
<id>http://www.zotero.org/styles/nature-biotechnology</id>
<link href="http://www.zotero.org/styles/nature-biotechnology" rel="self"/>
<link href="http://www.zotero.org/styles/nature" rel="independent-parent"/>

#4

Yeah, I don’t think redirection for deleted styles is necessary — it’s much more straightforward for the client to just have the user reselect a valid style.

We could, and then the <link> would remain a dereferenceable URL, as you sort of expect it to be. If we did that, I think we’d have to say that an independent-parent link in the absence of a <parent-id> is also an id, so that existing styles without <parent-id> continue to work.
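
A sketch of that fallback rule, assuming the `<parent-id>` element proposed in #3 (it is not part of CSL today) and an `<info>` block in the usual CSL namespace. The `parent_id` function name is made up for illustration.

```python
import xml.etree.ElementTree as ET

NS = {"csl": "http://purl.org/net/xbiblio/csl"}

def parent_id(info_xml):
    """Return the parent style id for a dependent style's <info> block.

    Prefers the proposed (hypothetical) <parent-id> element; if it is
    absent, treats the independent-parent link href as the id, so that
    existing styles keep working.
    """
    info = ET.fromstring(info_xml)
    explicit = info.find("csl:parent-id", NS)
    if explicit is not None and explicit.text:
        return explicit.text.strip()
    for link in info.findall("csl:link", NS):
        if link.get("rel") == "independent-parent":
            return link.get("href")  # legacy: the href doubles as the id
    return None
```

Either way the result is just a string to match against installed style ids; only the `<link>` href would ever be fetched.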

There’s also the related issue of the rel="self" link. I don’t think we use that in Zotero: we update central styles by id from our own repo, and we never implemented updating of externally hosted styles. But if any clients do rely on those links, they’d become more important once ids were no longer also dereferenceable URLs. For central styles, new links would need to be in the form https://www.zotero.org/styles/90320768-db08-4d22-917c-4b7714273ff4, and we’d add a fallback to the Zotero repo for legacy ids so that https://www.zotero.org/styles/apa continued to work rather than it needing to be https://www.zotero.org/styles/http%3A%2F%2Fwww.zotero.org%2Fstyles%2Fapa (as logically it should be). Pre-UUID styles that were renamed would still retain their old URLs based on the legacy ids. (If we really didn’t like that, we could discuss alternatives, but accepting frozen ids and being content with updated titles is sort of the point here.)
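
For illustration, the URL scheme above could be generated roughly like this. `self_url` and the two constants are hypothetical names, and the legacy special case simply mirrors the fallback described in this post rather than any existing repo behavior.

```python
from urllib.parse import quote

BASE = "https://www.zotero.org/styles/"
LEGACY_PREFIX = "http://www.zotero.org/styles/"

def self_url(style_id):
    """Build the rel="self" URL for a centrally hosted style.

    Legacy URL-shaped ids keep their historical short URL; any other id
    (e.g. a UUID) is appended to the base, percent-encoded for safety.
    """
    if style_id.startswith(LEGACY_PREFIX):
        return BASE + style_id[len(LEGACY_PREFIX):]
    return BASE + quote(style_id, safe="")

# Without the legacy special case, the id would have to be encoded whole:
strict = BASE + quote("http://www.zotero.org/styles/apa", safe="")
```

Here `strict` comes out as the long percent-encoded form quoted above, which is what the fallback avoids.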