Dataset items not showing a URL: quick test

Hi,

I wasn’t sure where to share this (this Discourse forum or GitHub), whether it is of any interest, or whether it is already known.

I know that the “dataset” type is sadly not widely used by reference managers (at least not by Zotero or Mendeley). Still, with Zotero and Better BibTeX it’s possible to change the document type at the time of export.

Anyway, I did a quick test on the existing CSL styles using two very simple documents, one of type webpage and one of type dataset. For each type, I counted how many CSL styles don’t output the URL in the bibliography. The results:

Number of tested styles: 2273
URL missing from webpage: 360
URL missing from dataset: 1159
URL missing from dataset, exists in webpage: 801
URL missing from webpage, exists in dataset: 2 ['embo-press.csl', 'bursa-uludag-universitesi-fen-bilimleri-enstitusu.csl']
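For anyone wanting to reproduce the tally, the four counts reduce to set operations over per-style results. The sketch below assumes a hypothetical `results` dict mapping each style filename to two booleans (True meaning the URL appeared in the rendered bibliography entry). It does not show how the styles were actually rendered. As a sanity check on the numbers above, 1159 - 801 and 360 - 2 both give 358 styles that show the URL for neither type.

```python
def summarize(results):
    """Tally URL coverage per item type.

    `results` is a hypothetical structure (not from the original test script):
    {style_filename: {"webpage": bool, "dataset": bool}}, where True means
    the URL was present in the rendered bibliography entry.
    """
    missing_webpage = {s for s, r in results.items() if not r["webpage"]}
    missing_dataset = {s for s, r in results.items() if not r["dataset"]}
    return {
        "tested": len(results),
        "missing_webpage": len(missing_webpage),
        "missing_dataset": len(missing_dataset),
        # URL missing for dataset but present for webpage
        "dataset_only": sorted(missing_dataset - missing_webpage),
        # URL missing for webpage but present for dataset
        "webpage_only": sorted(missing_webpage - missing_dataset),
    }
```

The two "only" lists are kept as sorted filenames rather than counts, so the affected styles can be inspected directly.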

I was even wondering whether something should be done proactively, such as editing these 801 CSL styles so that “dataset” is treated like “webpage” whenever the style doesn’t mention the “dataset” type. Checking manually, I often find that the CSL file doesn’t mention the “dataset” type at all. It would not be perfect; I’m just wondering.
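The manual check for whether a style mentions a type could be automated roughly as follows. This is a minimal sketch using only the Python standard library. The function name `mentions_type` is made up for illustration, and it only inspects the `type` attribute of `if` and `else-if` elements.

```python
import xml.etree.ElementTree as ET


def mentions_type(csl_text, item_type):
    """Return True if any <if>/<else-if> condition in the style tests `item_type`."""
    root = ET.fromstring(csl_text)
    for el in root.iter():
        # Strip the "{namespace}" prefix that ElementTree puts on tag names.
        local_name = el.tag.rsplit("}", 1)[-1]
        if local_name in ("if", "else-if"):
            if item_type in el.get("type", "").split():
                return True
    return False
```

Styles for which `mentions_type(text, "webpage")` is true and `mentions_type(text, "dataset")` is false would be the candidates for such an edit.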

PS: I used a document such as:
[
  {
    "title": "Test dataset",
    "id": "Test01",
    "author": [
      {
        "given": "John",
        "family": "Smith"
      }
    ],
    "type": "dataset",
    "URL": "https://zenodo.org/sometest"
  }
]
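A pair of such test documents could be generated like this. This is only a sketch: the webpage document is not shown in the post, so it is assumed here to be identical apart from its type and title, and `make_test_item` is a hypothetical helper name.

```python
import json


def make_test_item(item_type, url="https://zenodo.org/sometest"):
    """Build a one-item CSL-JSON document of the given type.

    Assumption: the webpage and dataset test documents differ only in
    "type" and "title"; the post shows only the dataset one.
    """
    return [
        {
            "title": f"Test {item_type}",
            "id": "Test01",
            "author": [{"given": "John", "family": "Smith"}],
            "type": item_type,
            "URL": url,
        }
    ]


if __name__ == "__main__":
    for item_type in ("webpage", "dataset"):
        print(json.dumps(make_test_item(item_type), indent=2))
```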


That’s a good programmatic solution for improving dataset citations in the absence of manual updates.

For fun, and to see how it would look, I’ve made a branch:

A diff to citation-style-language/master:

Before:
URL missing from webpage: 360
URL missing from dataset: 1161
URL missing from dataset, exists in webpage: 803
URL missing from webpage, exists in dataset: 2

After:
URL missing from webpage: 360
URL missing from dataset: 649
URL missing from dataset, exists in webpage: 291
URL missing from webpage, exists in dataset: 2

Running rake locally reports no errors.

I’ve added “dataset” only to the if/else-if conditions that already have match="any".

I suspect that another batch of styles would improve if I had converted lines like:
<if type="webpage">

to:
<if type="webpage dataset" match="any">
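Both kinds of edit (adding “dataset” where match="any" is already present, and converting a bare webpage condition) could be scripted along these lines. This is a regex-based sketch, not the script used for the branch. It deliberately leaves alone any condition that has other attributes without match="any", since adding match="any" there would change what the condition means.

```python
import re

# Opening <if ...> or <else-if ...> tags, and double-quoted attributes inside them.
COND_TAG = re.compile(r"<(if|else-if)\b([^<>]*)>")
ATTR = re.compile(r'([\w-]+)="([^"]*)"')


def add_dataset_to_webpage(csl_text):
    """Add "dataset" to conditions that test for "webpage" (hypothetical helper)."""

    def rewrite(m):
        name, raw = m.group(1), m.group(2)
        attrs = dict(ATTR.findall(raw))
        types = attrs.get("type", "").split()
        if "webpage" not in types or "dataset" in types:
            return m.group(0)
        if attrs.get("match") == "any":
            # Case 1: match="any" is already there, so just extend the type list.
            new_type = " ".join(types + ["dataset"])
            old = 'type="%s"' % attrs["type"]
            new_raw = raw.replace(old, 'type="%s"' % new_type, 1)
            return "<%s%s>" % (name, new_raw)
        if list(attrs) == ["type"] and types == ["webpage"]:
            # Case 2: a bare type="webpage" test with no other attributes.
            return '<%s type="webpage dataset" match="any">' % name
        # Anything else (e.g. extra conditions with the default match) is untouched.
        return m.group(0)

    return COND_TAG.sub(rewrite, csl_text)
```

Working on the raw text rather than a parsed tree keeps the rest of the file byte-for-byte identical, which should keep the resulting diff small.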

If there is real interest in this, I would cross-check the styles that have been edited against the styles that have been fixed. Technically, a style that has been edited could still fail to show the URL after the edit. I haven’t checked this yet.
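That cross-check could look something like the sketch below. It assumes three hypothetical inputs that the thread does not provide: the set of edited style filenames, and the sets of styles missing the dataset URL before and after the edit.

```python
def cross_check(edited, missing_before, missing_after):
    """Compare edited styles with styles whose dataset URL got fixed.

    All three arguments are iterables of style filenames (assumed inputs).
    """
    edited = set(edited)
    missing_before = set(missing_before)
    missing_after = set(missing_after)
    fixed = missing_before - missing_after
    return {
        # Edited, and the URL now shows up for datasets.
        "edited_and_fixed": sorted(edited & fixed),
        # Edited, but the URL is still missing afterwards.
        "edited_still_missing": sorted(edited & missing_after),
        # Fixed without having been edited (would need a closer look).
        "fixed_not_edited": sorted(fixed - edited),
    }
```

The "edited_still_missing" list corresponds to the case mentioned above, where an edited style still does not show the URL.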