Proposal: Make style ids immutable

Well, I tried. But if we keep around the current style IDs, I see no benefit of adopting UUIDs for new styles.

Having two ID schemes would complicate CSL’s own workflows. UUIDs wouldn’t represent the majority of IDs for a long time (there isn’t much journal turnover, and we’re mostly just adding new styles for journals; see https://pinux.info/csls_counter/ for historic style count growth), and there would be no path to phase out the current style ID format. It would be much simpler to keep things as they are and just treat style IDs as immutable once assigned (or change them only in very limited cases, knowing that doing so breaks updating and retrieval in clients), and maybe get rid of renamed-styles.json.

Not reading this whole thread, so sorry if this is off-base, but just to throw in two cents …

The original intention was these ids were pure, immutable, URI identifiers, as URIs are supposed to be.

Where things got complicated is the desire to have them simultaneously resolve to the right document.

But there’s no technical issue with keeping them as is, right? It’s really just an aesthetic issue to do with human expectations?

Could also change the content of the id element to allow uuids, and encourage people to use them going forward?

We could use UUIDs, via the urn:uuid scheme, in the current id element?

This is valid against the current schema:

    <id>urn:uuid:16479a02-9e0d-11ea-b1c0-f875a4261b65</id>

So keep the current schema definition, and change the doc from this:

    ## Specify the URI to establish the identity of the style. The URI
    ## should be stable, unique and dereferenceable URI.
    element cs:id { xsd:anyURI }

… to something like:

    ## Specify the URI to establish the identity of the style. The URI
    ## must be stable and unique; preferably a urn uuid per RFC 4122. See:
    ## https://tools.ietf.org/html/rfc4122
    element cs:id { xsd:anyURI }

And so freeze current http URI ids, and strongly recommend (or even require for the styles we accept) the above going forward? We could even have a script substitute the uuid if not present.
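For what it's worth, the substitution script could be quite small. The following is only a sketch under assumptions: the function name `ensure_uuid_id` and the `frozen_ids` parameter are made up for illustration, random (version 4) UUIDs are assumed to be acceptable, and the output is re-serialized by ElementTree, which drops comments and the XML declaration.

```python
import re
import uuid
import xml.etree.ElementTree as ET

CSL_NS = "http://purl.org/net/xbiblio/csl"
URN_UUID = re.compile(
    r"urn:uuid:[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}",
    re.IGNORECASE,
)


def ensure_uuid_id(csl_xml, frozen_ids=frozenset()):
    """Return the style with a urn:uuid id, unless its id is frozen or already one.

    `frozen_ids` (hypothetical) holds existing http ids that must never change.
    """
    ET.register_namespace("", CSL_NS)  # serialize CSL as the default namespace
    root = ET.fromstring(csl_xml)
    info = root.find(f"{{{CSL_NS}}}info")
    if info is None:
        raise ValueError("style has no cs:info element")
    id_el = info.find(f"{{{CSL_NS}}}id")
    if id_el is None:
        id_el = ET.SubElement(info, f"{{{CSL_NS}}}id")
    current = (id_el.text or "").strip()
    if current in frozen_ids or URN_UUID.fullmatch(current):
        return csl_xml  # nothing to do; hand back the input untouched
    id_el.text = uuid.uuid4().urn  # e.g. "urn:uuid:<random uuid>"
    return ET.tostring(root, encoding="unicode")
```

Running it twice on the same style leaves the id from the first run in place, so it should be safe to apply on every submission.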

It’s a similar problem, though: without redirection, if the filename might change, then rel="self" can’t mirror the filename or things will break. I said above that rel="self" needs to point to the current version of the file, but what I really meant was that it needs to resolve to the current version of the file. A journal can use <link rel="self" href="https://example.com/jofb.csl"/> and later change it to <link rel="self" href="https://example.com/journal-of-foo-bar.csl"/>, but only if they add a redirect in .htaccess.

The difference from id is that, unlike many CSL clients, a central repo can be guaranteed to receive full updates from git more or less atomically. So while the id used in clients should never change, it would be acceptable to update rel="self" as long as renamed-styles.json was updated at the same time. An HTTP request for /jofb would then return a redirect to /journal-of-foo-bar.
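To make the redirect step concrete, here is a rough sketch of how a server could resolve an old name. It assumes renamed-styles.json is a flat JSON object mapping old names to new names; the function name `resolve_style_name` and the sample entries are illustrative only.

```python
import json
from typing import Mapping


def resolve_style_name(name: str, renamed: Mapping[str, str]) -> str:
    """Follow old-name -> new-name entries until reaching a name with no entry."""
    seen = {name}
    while name in renamed:
        name = renamed[name]
        if name in seen:  # guard against a -> b -> a loops in the mapping
            raise ValueError(f"redirect loop at {name!r}")
        seen.add(name)
    return name


# Assumed file shape: {"old-name": "new-name", ...}
renamed = json.loads('{"jofb": "journal-of-foo-bar"}')
target = resolve_style_name("jofb", renamed)  # "journal-of-foo-bar"
```

A request for a name with no entry resolves to itself, so the same lookup can run on every request and only issue a redirect when the result differs from the requested name.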

In the Zotero client and styles repo, we would just stop updating renamed-styles.json and treat it as a static list of id mappings going forward. The Zotero styles page would get renamed-styles.json updates at the same time as style updates and could therefore immediately redirect old links.

There’d still be a bunch of benefits:

  1. Authoring tools could just generate them automatically — including for editing of existing default styles, which almost uniformly results in people failing to update the id and having their style overwritten by Zotero.

  2. For new styles, it would avoid adding a string that looked like a resolvable URL but that could never change, whereas rel="self" could (as I describe above).

  3. There wouldn’t be confusion about whether it was necessary to use an http://www.zotero.org URI for a custom style hosted elsewhere.

That approach with renamed-styles for rel="self" sounds good. My key concern here is that these need to be human readable because they are entered and read by humans in pandoc workflows. Keeping with the current filename and having a redirect fed by renamed-styles would work well to that end.

Dan’s points 1 and 3 above are big points of confusion for lots of people customizing styles on the Zotero forums. Probably about 30% of people need repeated explanation of how and why to change the style ID.

But nothing prevents authoring tools from using UUIDs for style IDs right now. (Although I guess a benefit of accepting UUIDs as IDs for the official style repository would be that an edited style with a unique UUID could always retain its ID through admission to the repository, which currently isn’t always the case as we tweak style titles and file names.)

I don’t follow. If the copy of renamed-styles.json in the style repo becomes static, where do the updates for the Zotero styles page come from?

Sorry, I was referring to the Zotero styles repo — i.e., the thing the Zotero client connects to for updates. Not the CSL styles repo.


PR for spec.

A thought just occurred to me: rather than keeping renamed-styles.json around at all, why don’t we just make dependent styles for the old style ids? These dependent styles could be stored in a special folder like legacy-ids or renamed-styles, indicating that they should not be removed or changed.
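As a sketch of what such a stub might contain, the snippet below builds a minimal dependent style that keeps the old id and points at the renamed style through rel="independent-parent". The function name `legacy_stub` and the example values are hypothetical, and the output has not been validated against the CSL schema.

```python
import xml.etree.ElementTree as ET

CSL_NS = "http://purl.org/net/xbiblio/csl"


def legacy_stub(old_id: str, new_id: str, title: str, updated: str) -> str:
    """Build a minimal dependent style carrying `old_id` and pointing at `new_id`."""
    ET.register_namespace("", CSL_NS)  # serialize CSL as the default namespace

    def q(tag):
        return f"{{{CSL_NS}}}{tag}"

    style = ET.Element(q("style"), {"version": "1.0"})
    info = ET.SubElement(style, q("info"))
    ET.SubElement(info, q("title")).text = title
    ET.SubElement(info, q("id")).text = old_id
    ET.SubElement(info, q("link"), {"rel": "independent-parent", "href": new_id})
    ET.SubElement(info, q("updated")).text = updated
    return ET.tostring(style, encoding="unicode")


stub = legacy_stub(
    "http://www.zotero.org/styles/jofb",
    "http://www.zotero.org/styles/journal-of-foo-bar",
    "Journal of Foo Bar (legacy id)",
    "2020-05-24T00:00:00+00:00",
)
```

Clients that already handle dependent styles would then pick up the rename without needing to read a separate mapping file.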

cf. https://github.com/citation-style-language/styles/issues/4903

The resolution to this was not to change the schema, but to encourage UUIDs for the id value.

The styles repo can follow up by requiring UUIDs, if that’s the preference.

@bwiernik - I don’t have any input on that issue; really up to client app devs.

We never resolved the issues with dependent styles here, and the documentation change linked above left dependent-style handling in somewhat of an unimplementable state. cs:id is now described as an id, and a UUID is recommended, but independent-parent still refers to a URI. Any new styles created using UUIDs won’t work as parents unless clients ignore that and treat independent-parent as an id from the official repository, and a client can’t both retrieve a parent style from an unofficial source and identify it locally unless the parent style goes against the current spec and uses the same URI (which also can’t change) as the id.

If nothing else, independent-parent needs to be described as an id, in which case clients would be responsible for finding the style with that id, most likely in the official repository. I’m OK with that limitation — an unofficial dependent style could refer to an official style or it could just be an independent duplicate. But if we wanted to support retrieving parent styles from unofficial sources, we’d have to get into some of the changes discussed above re: adding a second element to specify the parent id separately from the URI.
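Treating independent-parent as an id would look roughly like this on the client side: the client never dereferences the value and only matches it against the cs:id of styles it already has. This is a sketch, not how any existing client works; `style_id`, `parent_id`, `find_parent`, and the `installed` index are names invented for the example.

```python
import xml.etree.ElementTree as ET
from typing import Mapping, Optional

NS = {"cs": "http://purl.org/net/xbiblio/csl"}


def style_id(csl_xml: str) -> str:
    """Return the text of cs:info/cs:id, or "" if it is absent."""
    root = ET.fromstring(csl_xml)
    return (root.findtext("cs:info/cs:id", default="", namespaces=NS) or "").strip()


def parent_id(csl_xml: str) -> Optional[str]:
    """Return the href of the rel="independent-parent" link, read as a flat id."""
    root = ET.fromstring(csl_xml)
    link = root.find("cs:info/cs:link[@rel='independent-parent']", NS)
    return link.get("href") if link is not None else None


def find_parent(dependent_xml: str, installed: Mapping[str, str]) -> Optional[str]:
    """Look up the parent by id in `installed` (id -> style XML); no network access."""
    pid = parent_id(dependent_xml)
    return installed.get(pid) if pid else None
```

Because the match is a plain string comparison, a parent whose id is a urn:uuid works just as well as one with a legacy http id.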


I think limiting dependents to CSL repository styles is fine.

If we wanted to support independent parent styles hosted outside the repo, then we could add a new rel option "independent-parent-uri" indicating where the parent should be retrieved from (and where update checks should be directed). The specification could then clarify which rel options should be treated as flat IDs and which should be treated as resolvable URIs.

A related question for styles with a UUID id: what is the rel="self" link for? Is it just a copy of id, or is it intended as the URI where the style can be found? I think the intention is the latter, correct?

If that is the case, another option to make things consistent would be to add the rel attribute to id as well as link. The current id becomes rel="self"; rel="independent-parent" and rel="template" would indicate the ids for those related styles. Then id would always be understood as a flat id and link would always be understood as a resolvable URI. This might be too disruptive to be worth it.

(Even if we stick with everything being a rel on link, making the meaning of rel="self" explicit would be good; e.g., Zotero could in the future do update checks for non-repo custom styles based on the rel="self" link.)

Yes, rel="self" is where the style can be found. It doesn’t really matter for official styles, but it’s needed for non-repo updating. We’ve just never gotten around to implementing that in Zotero.

I wouldn’t do that. rel on link feels natural because it’s borrowed from HTML. I think it’d be awkward to add it to id and have multiple id elements. id is clear — no reason to mess with it.

Rintze suggested <parent-id> above, allowing the rel="independent-parent" link to remain a download URL. But unless anyone wants to make a strong case for non-repo parents (non-repo dependents would still be fine), I’d say we ignore it and just update the documentation to clarify that independent-parent is an id. (It doesn’t even technically have to be an id in the official repo. It would just be the client’s job to find the style.)

I agree that updating the documentation to clarify that independent-parent is an id is a good plan.

I don’t see much downside to adding independent-parent-uri to indicate the URI where a non-repo independent parent can be found. It opens the possibility of some other party hosting its own independent parent and dependent styles without much extra burden on clients.

If we wanted to support non-repo parents, <parent-id> would be cleaner, since it would allow all <link>s to remain URLs, which would be more consistent.

I think it’d only make sense to specify rel="independent-parent" as an id if we didn’t actually care about supporting non-repo parents and just wanted to fix the spec.

Also, as I say above:

I don’t see much downside to that, then, so long as we specify the behavior when parent-id is missing.
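One possible fallback, sketched below, would be to treat the rel="independent-parent" href as the parent id whenever parent-id is absent, which matches how the link is effectively used today. Note that `<parent-id>` is only the element proposed in this thread and is not part of the CSL schema, and the function name `parent_reference` is invented for the example.

```python
import xml.etree.ElementTree as ET

NS = {"cs": "http://purl.org/net/xbiblio/csl"}


def parent_reference(csl_xml):
    """Return (parent_id, download_url) for a dependent style.

    With the proposed <parent-id>: the id comes from it and the link href is a URL.
    Without it (assumed fallback): the link href is the id and there is no URL.
    """
    info = ET.fromstring(csl_xml).find("cs:info", NS)
    if info is None:
        raise ValueError("style has no cs:info element")
    link = info.find("cs:link[@rel='independent-parent']", NS)
    href = link.get("href") if link is not None else None
    pid = info.findtext("cs:parent-id", default=None, namespaces=NS)
    if pid and pid.strip():
        return pid.strip(), href
    return href, None
```

Under this fallback, every existing dependent style keeps working unchanged, and only styles that opt in to parent-id get a separate download location.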