CSL JSON input

Sylvester_Keil · November 5, 2011, 3:50pm

Reading Frank’s post on abbreviations, I noticed the ‘isInstitution’ predicate on a name. Looking at

citation-style-language/schema/blob/master/csl-data.json

{
    "description": "JSON schema (draft 3) for CSL input data",
    "id": "https://github.com/citation-style-language/schema/raw/master/csl-data.json",
    "type": "array",
    "items": {
        "type": "object",
        "properties": {
            "type": {
                "type": "string",
                "required": true,
                "enum" : [
                    "article",
                    "article-journal",
                    "article-magazine",
                    "article-newspaper",
                    "bill",
                    "book",
                    "broadcast",
                    "chapter",
                    "dataset",

This file has been truncated.

I also noticed ‘journalAbbreviation’ and ‘shortTitle’, even though most other terms use a minus to connect words (this includes predicates, e.g., ‘comma-suffix’, ‘static-ordering’). Implementing the format and providing a consistent API therefore requires a ridiculous amount of converting back and forth between taxonomies and trying to sanitize input.

Now, I know that religious wars have been fought about this; however, it would make all our lives much easier if there were a standard rule for attribute names. Barring perhaps Lisp, minuses in names are not very practical, but since a majority of the terms use a minus (and the CSL attributes do, too), I would suggest sticking with that convention.

Thoughts?

Sylvester

Bruce_D_Arcus1 · November 5, 2011, 4:04pm

The problem is there is no “CSL JSON”: there’s just some ad hoc stuff
that different people (mostly Frank, working apart from CSL per se)
have worked on. That schema, for example, merely documents what’s in
the test suite for purposes of validation. It is not really meant to
be normative.

I agree it would be nice to fix this.

Bruce

Bruce_D_Arcus1 · November 5, 2011, 4:05pm

Oh, and now that I see it, I really dislike the journalAbbreviation
predicate. Why did we include this???

Sylvester_Keil · November 5, 2011, 4:06pm

Since I’m going through this right now, I’ll post a list with all inconsistencies I notice (should I also post this somewhere on github?).

Bruce_D_Arcus1 · November 5, 2011, 4:13pm

One possible idea (don’t know if it’s a good one):

Fork the repo and just make the proposed changes on csl-data.json,
with a separate commit, and pull request, for each change?

But this assumes we all agree we need to rationalize this now. Am not
sure what Rintze and Frank think about this.

Otherwise, you could just itemize them on the wiki, or on a markdown gist?

Maybe start with the latter?

Bruce

Sylvester_Keil · November 5, 2011, 4:15pm

Right, I’ll do the latter and post the link here later on.

Bruce_D_Arcus1 · November 5, 2011, 4:43pm

Great.

FWIW, IF we want the possibility to use CSL JSON more widely (say as
microdata in HTML5?), we might want to adopt camel casing as the
preferred convention. If we did that, it would be easy enough to map
to CSL terms, where upper letters just get lower-cased and prepended
with a dash.

Bruce

Sylvester_Keil · November 5, 2011, 5:08pm

I started an informal Wiki page on the schema repository here:

Create new page · citation-style-language/schema Wiki · GitHub

On Nov 5, 2011, at 5:43 PM, Bruce D’Arcus wrote:

FWIW, IF we want the possibility to use CSL JSON more widely (say as
microdata in HTML5?), we might want to adopt camel casing as the
preferred convention. If we did that, it would be easy enough to map
to CSL terms, where upper letters just get lower-cased and prepended
with a dash.

Yes absolutely – although I personally, irrationally, dislike camelCase (PascalCase is fine)

But as long as there is consistency, it’s very easy to convert back and forth between different conventions.

Because we have names like ISBN, we can’t just lower-case, but it’s still easy enough to map camel-cased words using regular expressions.
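
A minimal sketch of the regular-expression mapping discussed here, assuming Python; the function names `camel_to_hyphen` and `hyphen_to_camel` are hypothetical and not part of any CSL tool. It treats a run of capitals such as ISBN as one word instead of putting a dash before every upper-case letter.

```python
import re

# Hypothetical helpers for illustration only; not taken from any CSL processor.
# A run of capitals followed by a capitalised word, e.g. "ISBNNumber".
_ACRONYM_BOUNDARY = re.compile(r"([A-Z]+)([A-Z][a-z])")
# A lower-case letter or digit followed by a capital, e.g. "shortTitle".
_WORD_BOUNDARY = re.compile(r"([a-z0-9])([A-Z])")


def camel_to_hyphen(name: str) -> str:
    """Convert a camelCase or PascalCase name to lower-case hyphenated form."""
    name = _ACRONYM_BOUNDARY.sub(r"\1-\2", name)
    name = _WORD_BOUNDARY.sub(r"\1-\2", name)
    return name.lower()


def hyphen_to_camel(name: str) -> str:
    """Convert a hyphenated name to camelCase (acronym casing is not restored)."""
    head, *rest = name.split("-")
    return head + "".join(part.capitalize() for part in rest)


for term in ("journalAbbreviation", "shortTitle", "isInstitution", "ISBN"):
    print(term, "->", camel_to_hyphen(term))
```

Going from hyphenated back to camel case cannot recover the original casing of acronyms such as ISBN without a lookup table, so a real implementation would probably need a short list of exceptions.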

Bruce_D_Arcus1 · November 5, 2011, 5:41pm

Where camel casing is used, pascal casing is often used to distinguish
classes. This is common in the RDF world, where you might have an item
with a class like “EditedBook”, but properties like “isPartOf”.

But CSL is an awfully simple model. Still, that might be the way to go
in distinguishing types and properties.

Bruce

Bruce_D_Arcus1 · November 5, 2011, 5:42pm

Er …

On Sat, Nov 5, 2011 at 1:41 PM, Bruce D’Arcus wrote:

But CSL is an awfully simple model. Still, that might be the way to go
in distinguishing types and properties.

… variables.

Bruce

Sylvester_Keil · November 5, 2011, 5:48pm

I’m confused. Are we talking about the processor input now or about styles?

Bruce_D_Arcus1 · November 5, 2011, 6:47pm

The former.

Bruce

Frank_Bennett · November 5, 2011, 7:39pm

Since I’m going through this right now, I’ll post a list with all inconsistencies I notice (should I also post this somewhere on github?).

One possible idea (don’t know if it’s a good one):

Fork the repo and just make the proposed changes on csl-data.json,
with a separate commit, and pull request, for each change?

But this assumes we all agree we need to rationalize this now. Am not
sure what Rintze and Frank think about this.

Readability is a good thing, I’ll be very happy to follow suit.

Rintze_Zelle · November 5, 2011, 11:49pm

I don’t have a strong opinion on this. csl-data.json is purely based on
what I found in the citeproc-js documentation, and I agree it could be
cleaned up a bit. But I do wonder how much hassle it is to change things
now that Zotero and Mendeley embed metadata in Word/LibreOffice documents
using this format. How would existing documents be dealt with?

Rintze

Frank_Bennett · November 6, 2011, 12:09am

I don’t have a strong opinion on this. csl-data.json is purely based on what
I found in the citeproc-js documentation, and I agree it could be cleaned up
a bit. But I do wonder how much hassle it is to change things now that
Zotero and Mendeley embed metadata in Word/LibreOffice documents using this
format. How would existing documents be dealt with?

We can always add a mapping layer to the implementations. Would it
make sense to add a version field to the input schema?
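
One way such a mapping layer might look, as a rough Python sketch. Everything named here is an assumption for illustration: the `version` key, the version numbers, the `upgrade_item` function, and the hyphenated target names are not defined by any existing schema.

```python
# Hypothetical sketch of a mapping layer keyed on an assumed "version" field.
# Old spelling -> proposed hyphenated spelling (target names are assumptions).
LEGACY_KEYS = {
    "journalAbbreviation": "journal-abbreviation",
    "shortTitle": "short-title",
}

CURRENT_VERSION = 2  # assumed; items without a "version" key count as version 1


def upgrade_item(item: dict) -> dict:
    """Return a copy of an input item with legacy keys renamed."""
    if item.get("version", 1) >= CURRENT_VERSION:
        return dict(item)  # already in the new form, nothing to rename
    upgraded = {LEGACY_KEYS.get(key, key): value for key, value in item.items()}
    upgraded["version"] = CURRENT_VERSION
    return upgraded


old_item = {"type": "article-journal", "shortTitle": "Example"}
print(upgrade_item(old_item))
```

With a layer like this, data embedded in existing documents would be upgraded on read, and the stored data would not have to change until the document is next saved.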

Carles_Pina_i_Estany · November 6, 2011, 12:04am

Hi,

On Nov/05/2011, Rintze Zelle wrote:

I don’t have a strong opinion on this. csl-data.json is purely based on
what I found in the citeproc-js documentation, and I agree it could be
cleaned up a bit. But I do wonder how much hassle it is to change things
now that Zotero and Mendeley embed metadata in Word/LibreOffice documents
using this format. How would existing documents be dealt with?

New Mendeley versions could read both the old and the new JSON format. Not ideal
(development time, testing time, etc.), but it could be done.

Old Mendeley versions would not be able to read the new JSON format,
unless the new Mendeley writes the old JSON alongside the new JSON;
this is possible…
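
A small Python sketch of the "write both" idea, under stated assumptions: the function `with_legacy_aliases` and the key pair in `ALIASES` are hypothetical, and nothing here reflects what Mendeley or Zotero actually write.

```python
# Hypothetical sketch: emit each renamed field under both spellings so that
# older readers, which only know the old keys, can still find their data.
ALIASES = {
    # new spelling (assumed) -> old spelling
    "short-title": "shortTitle",
    "journal-abbreviation": "journalAbbreviation",
}


def with_legacy_aliases(item: dict, aliases: dict = ALIASES) -> dict:
    """Return a copy of item that also carries the old key spellings."""
    out = dict(item)
    for new_key, old_key in aliases.items():
        if new_key in item:
            # Keep an existing old-style value if one is already present.
            out.setdefault(old_key, item[new_key])
    return out


print(with_legacy_aliases({"type": "book", "short-title": "Example"}))
```

The cost is duplicated data in every embedded item, which is the trade-off against breaking older versions.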
