Citation Style Language

Citekey variable?

Just to clarify: I, at least, was not talking about changing how Zotero interacts with word processors. All I was saying is this: if Zotero added a citekey field, it could simply be mapped to the CSL id at export time. That should not affect interaction with word processors at all.

Admittedly, I hadn’t thought about that.

I see three options:

  1. do nothing (id was designed, from the beginning, with this case in mind)
  2. do nothing with csl-data.json, but add a note in the spec (what I suggested above)
  3. add a new variable to hold a human-friendly id (say citekey), that may-or-may-not be the same as the id (what you suggested at top)

There are problems with all three, but I think we should do 2 or 3, and still lean towards 2.

I agree we should have feedback, but the only project developer who’s been around here and active recently is @Dan_Stillman, who I am certain has thought about this issue, given that it’s come up in the Zotero forums since the beginning.

On implementers: we really need to get them added to a PR review team, and a commitment to actually provide feedback, but I don’t even know at this point who we contact.

Right, I don’t think it should have to change, but since id is used in the current data model in the word processor integration, I’m wondering if using it for citekey would require a change.


Isn’t it already used for citekey, e.g. by pandoc?

But that’s just because that’s how pandoc understands internal ids. Using ids for citekeys means we’re changing how that variable is expected to work - it can now be printed, it can be changed by a user, it can be changed in an existing document, etc. There’s a fair chance that will cause problems for applications that in some way have relied on the previous behavior.
It’s possible that’s not the case, in which case I think Bruce’s 2) is definitely the right choice, but if it will require significant adjustments by implementers, that’d strengthen the case for 3)

How does citation-label fit?

I think citation-label has a different purpose.
That said, I still don’t really understand what the intended purpose of a citekey variable would be that id cannot cover. Can someone give an example?

citation-label is entirely separate, at least as currently described:

label identifying the item in in-text citations of label styles (e.g. “Ferr78”). May be assigned by the CSL processor based on item metadata.

So if you have a trigraph style (like American Mathematical Society or the DIN one we currently have available), you’d need both that and a citekey

It’s not that id can’t “cover” it; it’s about causing implementation issues given existing uses of id. E.g., if, say, Mendeley uses a UUID for id that’s used for connecting citations to sources, it can’t just map it to citekey.

Edit: oh, I think I missed the question, which is: why should the citekey be available to CSL styles other than as a citekey? That’s because it’d greatly facilitate traveling between Word and plaintext applications. Has come up a number of times as a request: You could author in Word, then set the citations to citekey and easily convert to LaTeX.


Ok, that’s a good use case. It could be possible to do the conversion with pandoc, and even extract the citations with some filter trickery. @retorquere has written a pandoc Lua filter that goes the opposite way: it produces a Word file with linked Zotero citations from a Markdown source with pandoc-style citations. So maybe …

Let’s say we added citekey as a variable for use in styles, what would that mean for CSL JSON export? Will that citekey variable replace id, or will we have both variables?

Zotero’s CSL JSON currently exports the item URI as id. Only BBT’s CSL JSON uses the citekey. I think leaving that to implementers (i.e. Emiliano and Zotero) is perfectly fine; I don’t see why we would need to prescribe this either way.
edit: I guess what we do want to prescribe is that they have both and they can but needn’t be identical?
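
To make the "have both, may or may not be identical" idea concrete, here is a minimal sketch of what an exported entry could look like. Note the assumptions: citekey is not part of the current CSL-JSON schema, export_item is a made-up helper, and the URI and key values are placeholders rather than anything a real application emits.

```python
import json

def export_item(app_id, citekey, fields):
    """Hypothetical exporter: keep the application-provided id and, when one
    exists, add a separate user-facing citekey next to it.

    "citekey" is an assumed field name; it is not in the current schema.
    """
    item = {**fields, "id": app_id}  # the application id always wins
    if citekey is not None:
        item["citekey"] = citekey
    return item

item = export_item(
    "http://zotero.org/users/0000/items/ABCD1234",  # placeholder item URI
    "doe2020example",                               # placeholder citekey
    {"type": "book", "title": "An Example"},
)
print(json.dumps(item, indent=2))
```

An exporter that wants id and citekey to be identical can simply pass the same string for both; nothing in this shape forces them to differ.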

Really? I have just done that, but no URI in id… (maybe BBT patches the standard export translators?)
Same with bibtex by the way…

(I’m using Juris-M, but I don’t think that should make a difference concerning this.)


Additionally, BBT’s Better CSL exporters replace the id with the citekey, but I don’t touch the CSL as it moves through Zotero, except for that hidden pref above, and I don’t touch the standard CSL export.

Zotero and Mendeley both use their own “uri” field to manage items in a document. Dan Stillman recently commented somewhere that they really only include an “id” field because it is required by the spec.

(I haven’t checked out Mendeley’s new citation plugin, but I’d be surprised if this had changed.)

pandoc and citation.js seem to be the major consumers of the id field. pandoc treats it as a citekey.

citation.js unfortunately seems to treat citation-label as a citekey before falling back to id as a citekey: We should recommend they change that behavior regardless.

Does anyone have a document with Papers citation data in it or know if their data model is documented somewhere?

If a citekey field were added to the data schema, would it also need to be in the CSL style schema? My thought would be “no”.

(I think that was schema, not id.)

In a word processor document, Zotero currently uses the numeric itemID for id. But what we do there isn’t really relevant — we can adjust as necessary. We embed URIs in the document that we use to reliably link to items.

When exporting CSL-JSON generally, however, we use a URI for id. While we don’t currently use that in any other way, it’s worth noting that, if id became a user-editable citekey, there wouldn’t be a guaranteed-permanent reference back to the Zotero item from the CSL-JSON entry. That might be an argument for keeping id as an application-provided identifier and adding a separate citekey field that was meant to be exposed to the user.

This sort of depends on your view of the world:

  1. In a BibTeX-based workflow (as I understand it), you basically make sure that the citation key is set before any export and never touch it again, so the citekey effectively functions as a permanent identifier.

  2. If you view CSL-JSON as a data exchange format, a user-editable field, even if it’s semi-permanent, isn’t the same as an application-enforced identifier that’s guaranteed never to change.

  3. If you view CSL-JSON as primarily focused on citations, a citekey field, even if user-editable, might make sense as the primary identifier, similar to BibTeX, and permanent linking to internal database entries is simply out of scope.

I don’t have a strong opinion on this. I’d be inclined to go with (2) on principle, but in Zotero we don’t recommend CSL-JSON as a data exchange format beyond citation processing, and that’s not going to change, so (3) seems like a reasonable position.

I’ve obviously been thinking in terms of 3, but I also in the end don’t have a strong opinion.

If we want to allow 2, then it makes sense to add the new variable.

But what about a label style that prints the citekey as the label? That key then needs to be unique within the document.
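
For what it’s worth, the uniqueness requirement is easy to check on the data side. A rough sketch, assuming items carry a hypothetical citekey field (not in the current schema) and using a made-up helper name:

```python
from collections import Counter

def duplicate_citekeys(items):
    """Return the citekeys that occur more than once among the given items.

    Items without a "citekey" (an assumed, non-standard field) are ignored.
    """
    counts = Counter(item["citekey"] for item in items if "citekey" in item)
    return sorted(key for key, n in counts.items() if n > 1)

cited = [
    {"id": 1, "citekey": "doe2020"},
    {"id": 2, "citekey": "roe2019"},
    {"id": 3, "citekey": "doe2020"},  # clashes with item 1
    {"id": 4},                        # no citekey at all
]
print(duplicate_citekeys(cited))
```

This only detects clashes; what an application should do about them (rename, warn, refuse) is a separate question.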

I guess in my view, I mainly see CSL-JSON as filling a similar role as a .bib file for BibTeX.

An option could be to add a citekey variable to the CSL-JSON spec, with the indication that it could be used in the same manner as a BibTeX citekey, with id as a fallback if citekey is unavailable.

That would involve primarily minor changes with two players:

  1. pandoc adding citekey as primary source for the citation key. This would be easy enough as existing documents relying on id would still work.
  2. citation.js switching citation-label to citekey, which they should anyway.
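
The fallback rule above could be sketched like this on the consumer side. This is only an illustration: citekey is the proposed (not yet existing) field, the function names are invented, and this is not how pandoc or citation.js actually implement lookup.

```python
def citation_key(item):
    """Prefer the proposed "citekey" field; fall back to the required id.

    CSL-JSON allows id to be a string or a number, so it is normalised
    to a string here.
    """
    key = item.get("citekey")
    return key if key else str(item["id"])

def index_by_key(items):
    """Map each resolved citation key to its item (first occurrence wins)."""
    index = {}
    for item in items:
        index.setdefault(citation_key(item), item)
    return index

library = [
    {"id": "http://example.org/items/1", "citekey": "doe2020", "type": "book"},
    {"id": "roe2019", "type": "article-journal"},  # older data: id only
    {"id": 42, "type": "chapter"},                 # numeric id
]
index = index_by_key(library)
print(sorted(index))
```

Existing documents that cite by id keep resolving, because an item without citekey is still indexed under its id.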

But what about a label style that prints the citekey as the label? That key then needs to be unique within the document.

I think this is something we should explicitly not do. Showing the citekey in the document is the realm of un-rendering a citation back into its raw identifier. That’s not a label style à la DIN 1950. If anything, it would be a separate recommended API to produce a cite placeholder, like the ones used by pandoc or Sebastian/Frank’s ODF Scan plug-in for Zotero.


+1 for me on this: it seems like the most pragmatic approach to me, with very few downsides. I also agree with the 2nd part; that’s why we have citation-label as a separate variable.