Citation Style Language

Citekey variable?

Practically speaking, it’s only a local token to link citation and source.

Which is what a citekey is.

Aside: early on, I went out of my way to avoid borrowing from bibtex :wink:

I don’t have a fundamental objection to using id for this, but I think you’re making this sound too simple:
Before turning an auto-generated, non-displayed variable into something that’s displayable and user-configurable (and that can be duplicated within a document, e.g. if multiple people edit it), it’d be worth checking with implementers how they use it and whether that would cause problems.

I don’t think Zotero does much with the id (for linking, they use the uri they display outside of the schema JSON), but Mendeley might, for example, and we don’t have much insight into how other tools like Paperpile and ReadCube/Papers handle this.

And even where the application doesn’t use it much, if this is going to be in 1.0.2, dealing with old ids in existing documents may still require more changes to word processor integration than would be desirable for a minor update.

Just to clarify: I, at least, was not talking about changing how Zotero interacts with word processors. All I was saying is this: if Zotero added a citekey field, it could simply be mapped to the CSL id at export time. That should not affect interaction with word processors at all.
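
A minimal sketch of that export-time mapping, assuming a hypothetical internal item record with `citekey` and `uri` fields. Neither field name nor the `to_csl_json` helper is taken from Zotero's actual data model; they are only here for illustration.

```python
def to_csl_json(item: dict) -> dict:
    """Build a CSL-JSON entry from a hypothetical internal item record."""
    # Copy every field except the hypothetical internal-only ones.
    csl = {k: v for k, v in item.items() if k not in ("citekey", "uri")}
    # Use the citekey as the CSL id when present; otherwise fall back to the URI.
    csl["id"] = item.get("citekey") or item["uri"]
    return csl


item = {
    "uri": "http://example.org/items/ABCD1234",  # placeholder URI
    "citekey": "doe2020",
    "type": "book",
    "title": "An Example Title",
}
entry = to_csl_json(item)
```

Nothing in this sketch touches word processor integration: the mapping only happens when a CSL-JSON entry is produced for export.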

Admittedly, I hadn’t thought about that.

I see three options:

  1. do nothing (id was designed, from the beginning, with this case in mind)
  2. do nothing with csl-data.json, but add a note in the spec (what I suggested above)
  3. add a new variable to hold a human-friendly id (say citekey), that may-or-may-not be the same as the id (what you suggested at top)

There are problems with all three, but I think we should do 2 or 3, and still lean towards 2.

I agree we should have feedback, but the only project developer who’s been around here and active recently is @Dan_Stillman, who I am certain has thought about this issue, given that it’s come up in the Zotero forums since the beginning.

On implementers: we really need to get them added to a PR review team, and a commitment to actually provide feedback, but I don’t even know at this point who we contact.

Right, I don’t think it should have to change, but since id is used in the current data model in the word processor integration, I’m wondering if using it for citekey would require a change.

Isn’t it already used for citekey, e.g. by pandoc?

But that’s just because that’s how pandoc understands internal ids. Using ids for citekeys means we’re changing how that variable is expected to work - it can now be printed, it can be changed by a user, it can be changed in an existing document, etc. There’s a fair chance that will cause problems for applications that in some way have relied on the previous behavior.
It’s possible that’s not the case, in which case I think Bruce’s 2) is definitely the right choice, but if it will require significant adjustments by implementers, that’d strengthen the case for 3)

How does citation-label fit?

I think citation-label has a different purpose.
That said, I still don’t really understand what the intended purpose of a citekey variable would be that id can not cover. Can someone give an example?

citation-label is entirely separate, at least as currently described:

citation-label
label identifying the item in in-text citations of label styles (e.g. “Ferr78”). May be assigned by the CSL processor based on item metadata.

So if you have a trigraph style (like American Mathematical Society or the DIN one we currently have available), you’d need both that and a citekey

It’s not about what id can’t “cover”; it’s about causing implementation issues given existing uses of id. E.g., if Mendeley, say, uses a UUID for id to connect citations to sources, it can’t just map it to citekey.

Edit: oh, I think I missed the question, which is: why should the citekey be available to CSL styles other than as a citekey? That’s because it’d greatly facilitate traveling between Word and plaintext applications. Has come up a number of times as a request: You could author in Word, then set the citations to citekey and easily convert to LaTeX.
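
To make the Word-to-LaTeX use case concrete, here is a rough sketch, assuming CSL-JSON entries carry a hypothetical `citekey` field next to `id`. The `render_latex_cite` helper and the sample data are invented for illustration and do not describe any existing tool.

```python
def render_latex_cite(cited_ids: list, entries: list) -> str:
    """Render one citation cluster as a LaTeX \\cite command using citekeys."""
    # Index the entries by their (application-provided) id.
    by_id = {e["id"]: e for e in entries}
    # Look up the hypothetical citekey for every cited item, in order.
    keys = [by_id[item_id]["citekey"] for item_id in cited_ids]
    return "\\cite{" + ",".join(keys) + "}"


entries = [
    {"id": "item-1", "citekey": "doe2020", "type": "book"},
    {"id": "item-2", "citekey": "roe2019", "type": "article-journal"},
]
latex = render_latex_cite(["item-1", "item-2"], entries)
```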

Ok, that’s a good use case. It could be possible to do the conversion with pandoc, and even extract the citations with some filter trickery. @retorquere has written a pandoc Lua filter that goes the opposite way: it produces a Word file with linked Zotero citations from a markdown source with pandoc-style citations. So maybe …

Let’s say we added citekey as a variable for use in styles, what would that mean for CSL JSON export? Will that citekey variable replace id or will we have both variables?

Zotero’s CSL JSON currently exports the item URI as id. Only BBT’s CSL JSON uses the citekey. I think leaving that to implementers (i.e. Emiliano and Zotero) is perfectly fine; I don’t see why we would need to prescribe this either way.
edit: I guess what we do want to prescribe is that they have both and they can but needn’t be identical?

Really? I have just done that, but no URI in id… (maybe BBT patches the standard export translators?)
Same with bibtex by the way…

(I’m using Juris-M, but I don’t think that should make a difference concerning this.)

Maybee… https://retorque.re/zotero-better-bibtex/installation/preferences/hidden-preferences/#citeprocnotecitekey

Additionally, BBT’s Better CSL exporters replace the id with the citekey, but I don’t touch the CSL as it moves through Zotero, except for that hidden pref above, and I don’t touch the standard CSL export.

Zotero and Mendeley both use their own “uri” field to manage items in documents. Dan Stillman recently commented somewhere that they really only include an “id” field because it is required by the spec.

(I haven’t checked out Mendeley’s new citation plugin, but I’d be surprised if this had changed.)

pandoc and citation.js seem to be the major consumers of the id field. pandoc treats it as a citekey.

citation.js unfortunately seems to treat citation-label as a citekey before falling back to id as a citekey: https://github.com/citation-js/citation-js/blob/master/packages/plugin-bibtex/src/output/label.js. We should recommend they change that behavior regardless.

Does anyone have a document with Papers citation data in it or know if their data model is documented somewhere?

If a citekey field were added to the data schema, would it also need to be in the CSL style schema? My thought would be “no”.

(I think that was schema, not id.)

In a word processor document, Zotero currently uses the numeric itemID for id. But what we do there isn’t really relevant — we can adjust as necessary. We embed URIs in the document that we use to reliably link to items.

When exporting CSL-JSON generally, however, we use a URI for id. While we don’t currently use that in any other way, it’s worth noting that, if id became a user-editable citekey, there wouldn’t be a guaranteed-permanent reference back to the Zotero item from the CSL-JSON entry. That might be an argument for keeping id as an application-provided identifier and adding a separate citekey field that was meant to be exposed to the user.
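
A small sketch of that separation, assuming a hypothetical `citekey` field in the CSL-JSON entry (it is not part of the current schema) and a placeholder URI. The `rename_citekey` helper is invented for illustration.

```python
def rename_citekey(entry: dict, new_key: str) -> dict:
    """Return a copy of the entry with a new user-facing citekey."""
    updated = dict(entry)
    # Only the hypothetical user-editable field changes; id is left alone.
    updated["citekey"] = new_key
    return updated


entry = {
    "id": "http://example.org/items/ABCD1234",  # application-provided, stable
    "citekey": "doe2020",                       # user-facing, editable
    "type": "book",
}
renamed = rename_citekey(entry, "doe2020a")
```

The user can change the citekey as often as they like, while the id still points back to the same item.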

This sort of depends on your view of the world:

  1. In a BibTeX-based workflow (as I understand it), you basically make sure that the citation key is set before any export and never touch it again, so the citekey effectively functions as a permanent identifier.

  2. If you view CSL-JSON as a data exchange format, a user-editable field, even if it’s semi-permanent, isn’t the same as an application-enforced identifier that’s guaranteed never to change.

  3. If you view CSL-JSON as primarily focused on citations, a citekey field, even if user-editable, might make sense as the primary identifier, similar to BibTeX, and permanent linking to internal database entries is simply out of scope.

I don’t have a strong opinion on this. I’d be inclined to go with (2) on principle, but in Zotero we don’t recommend CSL-JSON as a data exchange format beyond citation processing, and that’s not going to change, so (3) seems like a reasonable position.

I’ve obviously been thinking in terms of 3, but I also in the end don’t have a strong opinion.

If we want to allow 2, then it makes sense to add the new variable.

But what about a label style that prints the citekey as the label? That key then needs to be unique within the document.
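
One way an implementation might check that constraint is sketched below, again assuming a hypothetical `citekey` field on each entry. The `duplicate_citekeys` helper is illustrative only.

```python
from collections import Counter


def duplicate_citekeys(entries: list) -> list:
    """Return the citekeys that occur more than once among the given entries."""
    # Entries without a citekey are ignored rather than treated as duplicates.
    counts = Counter(e["citekey"] for e in entries if "citekey" in e)
    return sorted(key for key, n in counts.items() if n > 1)


cited = [
    {"id": "item-1", "citekey": "doe2020"},
    {"id": "item-2", "citekey": "roe2019"},
    {"id": "item-3", "citekey": "doe2020"},  # same key, different item
]
clashes = duplicate_citekeys(cited)
```

A processor or word processor plugin could run a check like this before rendering a label style and warn the user about any clashes.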