Citation Style Language

Preprint/dataset repository: What CSL variable?

Cross-posted at https://github.com/citation-style-language/styles/issues/4367

I am encountering a wrinkle with preprints and datasets. From most preprint repositories, the name of the server is getting stored by Zotero’s translators as container-title. This had seemed like a good idea (à la journal titles). However, I am now seeing that some preprints and many datasets import with container-title used to indicate the collection or larger body of works the item belongs to (e.g., doi.org/10.3886/ICPSR35531.v3).

Conceptually, I think the preprint/data repository fits better as either archive or publisher than as container-title (this is, for example, how the OSF repositories export in their metadata [as publisher], and CrossRef’s posted content type uses “group-id”, which would fit well with publisher; ICPSR stores its name in publisher). Most citation styles print publisher, so it seems like this would work well without many changes to existing styles.

Could we come to a consensus as to where these best belong?

I didn’t realize we were storing these as container-title/publication-title – where?
I think repositories (for both data and preprints) should absolutely be publishers. Given the peculiarities of actual archival citations, I’d much prefer not to use that field for anything other than traditional archives.

I guess it’s just arXiv (http://arxiv.org/abs/1910.00533) and bioRxiv (https://www.biorxiv.org/content/10.1101/805705v1) for preprints. Both import to Zotero as Journal Articles with the server in Publication. (The handful of others I checked import as either Report or Journal Article with the server in Publisher [potentially getting dropped by Zotero].)
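To make the proposal concrete, here is a minimal sketch of what moving the server name from container-title to publisher could look like on a CSL-JSON-style item. The function name, the KNOWN_REPOSITORIES set, and the trimmed-down item are all hypothetical illustrations, not anything Zotero or a CSL processor actually does.

```python
# Hypothetical list of repository names; an assumption for illustration only.
KNOWN_REPOSITORIES = {"arXiv", "bioRxiv"}


def move_repository_to_publisher(item):
    """Return a copy of a CSL-JSON-style dict with the repository name
    moved from container-title to publisher.

    Items whose container-title is not a known repository (for example a
    dataset collection or series title) are returned unchanged, as are
    items that already have a publisher.
    """
    fixed = dict(item)  # shallow copy so the imported item is not mutated
    name = fixed.get("container-title")
    if name in KNOWN_REPOSITORIES and not fixed.get("publisher"):
        fixed["publisher"] = fixed.pop("container-title")
    return fixed


# Trimmed-down item mirroring how the bioRxiv preprint above imports:
# a journal article with the server name in the publication title.
imported = {
    "type": "article-journal",
    "container-title": "bioRxiv",
    "URL": "https://www.biorxiv.org/content/10.1101/805705v1",
}

fixed = move_repository_to_publisher(imported)
print(fixed)
```

With this mapping, container-title stays free for the collection or series a dataset belongs to, and styles that already print publisher would pick up the repository name without changes.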

Regarding archive: I’d previously been assuming that archive and archive_location could refer either to databases/electronic archives and their locators or to physical archives (with archive-place being used to distinguish the two). Cf. https://github.com/citation-style-language/styles/blob/c3fd4bdeadbfc4a713284ad15cca64c8198a7dc7/apa.csl#L474

I think this usage is both conceptually meaningful (an electronic archive is an archive) and necessary. For example, a thesis citation may need to refer to both the institution awarding the degree (publisher) and the database/archive where it is deposited (currently archive). Consider this example from APA7:

Hollander, M. M. (2017). Resistance to authority: Methodological innovations and new
lessons from the Milgram experiment (Publication No. 10289373) [Doctoral dissertation,
University of Wisconsin-Madison]. ProQuest Dissertations and Theses Global. https://search.proquest.com/docview/1975464986
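As a rough sketch, the APA7 example could be represented as a CSL-JSON-style item like the one below, with the degree-granting institution in publisher and the database in archive. The assignment of the publication number to number and of "Doctoral dissertation" to genre is my assumption about a plausible mapping, and render_source is a hypothetical helper, not how any CSL style or processor is implemented.

```python
# CSL-JSON-style item for the APA7 thesis example.
# Field assignments for "number" and "genre" are assumptions.
thesis = {
    "type": "thesis",
    "author": [{"family": "Hollander", "given": "M. M."}],
    "issued": {"date-parts": [[2017]]},
    "title": (
        "Resistance to authority: Methodological innovations and new "
        "lessons from the Milgram experiment"
    ),
    "number": "10289373",
    "genre": "Doctoral dissertation",
    "publisher": "University of Wisconsin-Madison",       # awarding institution
    "archive": "ProQuest Dissertations and Theses Global",  # where it is deposited
    "URL": "https://search.proquest.com/docview/1975464986",
}


def render_source(item):
    """Render the bracketed description followed by the depositing archive.

    Hypothetical helper: it only shows that publisher and archive are both
    needed, and that they play different roles in the citation.
    """
    bracket = f"[{item['genre']}, {item['publisher']}]"
    return f"{bracket}. {item['archive']}."


print(render_source(thesis))
```

If archive were reserved for physical archives only, the database name in this citation would have no obvious field left to live in.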

Similar things might occur when citing reports, if the publishing institution of the report and where it is deposited are different (e.g., https://apps.dtic.mil/docs/citations/AD1053350). From my understanding of how things are cited in history, etc., this should work, too, for physical archives or electronic archives (e.g., a museum archive published online).

Archive/Location in Archive is also where, for example, Zotero has been storing nucleotide information in the NCBI Nucleotide Database (https://www.ncbi.nlm.nih.gov/nuccore/66819080).

An alternative would perhaps be to use source (Zotero: libraryCatalog/Library Catalog) and call-number (Zotero: callNumber/Call Number), but users and institutions very commonly use these fields for internal organizational schemes, so I am cautious about those fields showing up in citations.