Citeproc json data input specs

Hello,

I am currently working on integrating CSL/CiteProc into my web application,
starting with citeproc-php (and looking forward to the new release of
citeproc-hs!). Now I have run into a very basic question: what is the structure
of the JSON data that is fed into citeproc-js / citeproc-php? The test
cases only cover tiny bits of it, but I would need a complete specification
of the structure. Is it MODS converted to JSON?

Thanks for a clarification,

Christian
--
View this message in context: http://xbiblio-devel.2463403.n2.nabble.com/Citeproc-json-data-input-specs-tp5135372p5135372.html
Sent from the xbiblio-devel mailing list archive at Nabble.com.

Yeah, this keeps coming up. We probably need to do something about
this. Rintze, Frank, what do you think is the easiest, quickest way
to put this together?

One option I’d thought about earlier was to just define it in RNC, and
perhaps create an XSLT to convert the XML version of it to some kind
of JSON schema and documentation. Does that seem feasible?

Bruce

No, MODS is only read by citeproc-hs.

While it is true that the JSON data structure is not documented in one
place, you can still find all the needed information in the available
documentation.

The input object - the list of references - is modelled on the CSL
list of variables:
http://bitbucket.org/bdarcus/csl-schema/src/tip/csl-variables.rnc

The cs-names object (author, etc) is documented in the citeproc-js
documentation:
http://gsl-nagoya-u.net/http/pub/citeproc-doc.html#id25

Dates are documented here:
http://gsl-nagoya-u.net/http/pub/citeproc-doc.html#input-dates

The type object (a string) can have any of the values listed here:
http://bitbucket.org/bdarcus/csl-schema/src/tip/csl-types.rnc

The citation-items object, and the citations one, which store the
list of cites (or groups of citations, as you prefer) and provide some
commands for the processor, are described in the citeproc-js
documentation and the tests’ documentation.
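
Put together, a single reference might look roughly like the sketch below. It is written as a Python dictionary that serializes to JSON. The values are invented for illustration, and the field names reflect my reading of the documents linked above, so please check them against that documentation before relying on them.

```python
import json

# Hypothetical reference item; the title and names are made up.
item = {
    "id": "doe2010",                       # item identifier (a string here)
    "type": "article-journal",             # one of the CSL types
    "title": "An Example Article",
    "container-title": "Journal of Examples",
    "author": [                            # names are a list of name objects
        {"family": "Doe", "given": "Jane"},
        {"family": "Roe", "given": "Richard"},
    ],
    "issued": {"date-parts": [[2010, 6, 1]]},  # year, month, day
    "page": "1-10",
}

# The processor input is a collection of such items.
references_json = json.dumps([item], indent=2)
print(references_json)
```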

Hope this helps.

Andrea

PS: the Haskell implementation of the JSON data structure is based on
that documentation. Still, Haskell is a strongly typed language, and I
would like to have a specification of the JSON data types in order to
avoid the lack of uniformity you may find in the test suite (sometimes
an “id” is an integer, sometimes a string; some values range from
numbers to strings or even booleans). But PHP is well known for not
suffering from such a problem… ;-)

Andrea, thank you for this very helpful information!

My own data model is very primitive - it is basically an extended BibTeX
schema, which I now need to map to this more sophisticated data model.

Thanks again,

Christian

Could you see if you can document the bibtex --> csl mapping while you
do it? We’re going to need to document this, and others.

Bruce

Sure, I’ll be happy to…

C.

Here’s the first installment:

$this->type_data = array (
    "article" => array (
      "label"         => _("Article"),
      "bibtex"        => true,
      "citeProcType"  => "article-journal"
    ),
    "book" => array (
      "label"         => _("Book (Monograph)"),
      "bibtex"        => true,
      "citeProcType"  => "book"
    ),
    "booklet" => array (
      "label"         => _("Booklet"),
      "bibtex"        => true,
      "citeProcType"  => "pamphlet"
    ),
    // non-standard
    "collection" => array (
      "label"         => _("Book (Edited)"),
      "bibtex"        => false,
      "citeProcType"  => "book"
    ),
    // non-standard use: normally same as "proceedings"
    "conference" => array (
      "label"         => _("Conference Paper"),
      "bibtex"        => true,
      "citeProcType"  => "paper-conference"
     ),
    "inbook" => array (
      "label"         => _("Book Chapter"),
      "bibtex"        => true,
      "citeProcType"  => "chapter"
    ),
    // non-standard
    "incollection"  => array (
      "label"         => _("Chapter in Edited Book"),
      "bibtex"        => true,
      "citeProcType"  => "chapter"
    ),
    "inproceedings" => array (
      "label"         => _("Paper in Conference Proceedings"),
      "bibtex"        => true,
      "citeProcType"  => "chapter"
    ),
    // non-standard
    "journal" => array (
      "label"         => _("Journal Issue"),
      "bibtex"        => false,
      "citeProcType"  => "???" // => type: periodical?
      ),
    // non-standard use
    "manual" => array (
      "label"         => _("Handbook"),
      "bibtex"        => true,
      "citeProcType"  => "book"
    ),
    "mastersthesis" => array (
      "label"         => _("Master's Thesis"),
      "bibtex"        => true,
      "citeProcType"  => "thesis"
      ),
    "misc" => array (
      "label"         => _("Miscellaneous"),
      "bibtex"        => true,
      "citeProcType"  => "manuscript" // ????
    ),
    "phdthesis" => array (
      "label"         => _("Ph.D. Thesis"),
      "bibtex"        => true,
      "citeProcType"  => "thesis"
    ),
    "proceedings" => array (
      "label"         => _("Conference Proceedings"),
      "bibtex"        => true,
      "citeProcType"  => "book"
    ),
    "techreport" => array (
      "label"         => _("Report/Working Paper"),
      "bibtex"        => true,
      "citeProcType"  => "report"
    ),
    "unpublished" => array (
      "label"         => _("Unpublished Manuscript"),
      "bibtex"        => true,
      "citeProcType"  => "manuscript"
    )
  );

As you can see, there are far fewer BibTeX types than CSL types, but I
still have a couple of problems mapping my own version of the BibTeX types.

  • There isn’t a notion of “periodical” or “serial” (like journals) in
    BibTeX, but I also couldn’t find one in the CSL specs.
  • Is a “booklet” a “pamphlet” or a “book”?
  • What would be the equivalent of “misc”, i.e. stuff that doesn’t really fit
    anywhere else? But then, I have a hard time thinking of anything that
    doesn’t fit ;-)
  • I guess the difference between edited books and monographs does not need
    to be expressed at the level of the reference type, but is figured out from
    whether there is an author or an editor?
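
As a rough illustration of how the type table above could be applied, here is a small Python sketch. The dictionary copies the type pairs from the table; the two types whose mapping is still an open question ("misc" and "journal") are deliberately left out. The function name and the `fallback` parameter are made up for this example.

```python
# BibTeX (or app-specific) entry type -> CSL type, copied from the table above.
BIBTEX_TO_CSL_TYPE = {
    "article": "article-journal",
    "book": "book",
    "booklet": "pamphlet",
    "collection": "book",
    "conference": "paper-conference",
    "inbook": "chapter",
    "incollection": "chapter",
    "inproceedings": "chapter",
    "manual": "book",
    "mastersthesis": "thesis",
    "phdthesis": "thesis",
    "proceedings": "book",
    "techreport": "report",
    "unpublished": "manuscript",
}


def csl_type(bibtex_type, fallback=None):
    """Return the CSL type for an entry type, or `fallback` if it is unmapped."""
    return BIBTEX_TO_CSL_TYPE.get(bibtex_type.lower(), fallback)


print(csl_type("Article"))  # article-journal
print(csl_type("misc"))     # None
```

Returning a fallback instead of raising keeps the decision about unmapped types in the calling code, where it can be changed once the open questions are settled.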

More later,

C.

> • There isn’t a notion of “periodical” or “serial” (like journals) in
>   BibTeX, but I also couldn’t find one in the CSL specs.

container-title

> • Is a “booklet” a “pamphlet” or a “book”?

I’d probably say the former. Not really sure what a “booklet” is :-)

> • What would be the equivalent of “misc”, i.e. stuff that doesn’t really fit
>   anywhere else? But then, I have a hard time thinking of anything that
>   doesn’t fit ;-)

Yeah, I always hated the “misc” type, so there is none. I believe we
have a “document” type (?) that might work.

> • I guess the difference between edited books and monographs does not need
>   to be expressed at the level of the reference type, but is figured out from
>   whether there is an author or an editor?

Yes.

Bruce

Thanks. Here is the mapping data for “fields”. A “csl” key set to false
doesn’t always mean that there is no equivalent, just that a simple 1:1
translation is not possible. The more complex fields are parsed
separately.

  /**
   * all fields and their metadata
   */
  $this->field_data = array (
    'reftype' => array(
      'label'     => _("Bibliographic Type"),
      'type'      => "string",
      "csl"       => "type"

    ),
    'citekey' => array(
      'label'     => _("Citation Key"),
      'type'      => "string",
      "csl"       => "id"
    ),
    'abstract' => array(
      'label'     => _("Abstract"),
      'type'      => "string",
      'bibtex'    => true,
      "csl"       => "abstract"
    ),
    // this is used for publisher-place or for author address
    'address' => array(
      'label'     => _("Place"),
      'type'      => "string",
      'bibtex'    => true,
      "csl"       => "publisher-place"
    ),
     // author affiliation
    'affiliation' => array(
      'label'     => _("Affiliation"),
      'type'      => "string",
      'bibtex'    => true,
      "csl"       => false // ???
    ),
    'annote' => array(
      'label'     => _("Annotation"),
      'type'      => "string",
      'bibtex'    => true,
      "csl"       => "annote"
    ),
    'author' => array(
      'label'     => _("Authors"),
      'type'      => "string",
      'bibtex'    => true,
      'separator' => ";",
      "csl"       => "author"
    ),
    'booktitle' => array(
      'label'     => _("Book Title"),
      'type'      => "string",
      'bibtex'    => true,
      "csl"       => "container-title"
    ),
    'contents' => array(
      'label'     => _("Contents"),
      'type'      => "string",
      'bibtex'    => true,
      "csl"       => false
    ),
    'copyright' => array(
      'label'     => _("Copyright"),
      'type'      => "string",
      'bibtex'    => true,
      "csl"       => null
    ),
    'crossref' => array(
      'label'     => _("Cross Reference"),
      'type'      => "string",
      'bibtex'    => true,
      "csl"       => "references"
    ),
    'date'   => array(
      'label'     => _("Date"),
      'type'      => "date",
      'bibtex'    => true,
      "csl"       => false
    ),
    'edition' => array(
      'label'     => _("Edition"),
      'type'      => "string",
      'bibtex'    => true,
      "csl"       => "edition"
    ),
    'editor' => array(
      'label'     => _("Editors"),
      'type'      => "string",
      'bibtex'    => true,
      "csl"       => "editor"
    ),
    'howpublished' => array(
      'label'     => _("Published As"),
      'type'      => "string",
      'bibtex'    => true,
      "csl"       => false
    ),
    'institution' => array(
      'label'     => _("Institution"),
      'type'      => "string",
      'bibtex'    => true,
      "csl"       => false
    ),
    'isbn'   => array(
      'label'     => _("ISBN"),
      'type'      => "string",
      'bibtex'    => true,
      "csl"       => "ISBN"
    ),
    'issn'   => array(
      'label'     => _("ISSN"),
      'type'      => "string",
      'bibtex'    => true,
      "csl"       => "ISSN"
    ),
    'journal' => array(
      'label'     => _("Journal"),
      'type'      => "string",
      'bibtex'    => true,
      "csl"       => "container-title"
    ),
    // don't know what this is for, anyways
    'key' => array(
      'label'     => _("Key"),
      'type'      => "string",
      'bibtex'    => true,
      "csl"       => false
    ),
    'keywords' => array(
      'label'     => _("Keywords"),
      'type'      => "string",
      'bibtex'    => true,
      "csl"       => "keyword"
    ),
    'language' => array(
      'label'     => _("Language"),
      'autocomplete'  => array('separator' => null ),
      'type'      => "string",
      'bibtex'    => true,
      "csl"       => false // ???
    ),
    'lccn' => array(
      'label'     => _("Call Number"),
      'type'      => "string",
      'bibtex'    => true,
      "csl"       => "call-number"
    ),
    // field to store where the book is kept
    'location' => array(
      'label'     => _("Location"),
      'type'      => "string",
      'bibtex'    => true,
      "csl"       => false
    ),
    'month' => array(
      'label'     => _("Month"),
      'type'      => "string",
      'bibtex'    => true,
      "csl"       => false
    ),
    'note' => array(
      'label'     => _("Note"),
      'type'      => "string",
      'bibtex'    => true,
      "csl"       => "note"
    ),
    'number' => array(
      'label'     => _("Number"),
      'type'      => "string",
      'bibtex'    => true,
      "csl"       => "number"
    ),
    'organization' => array(
      'label'     => _("Organization"),
      'type'      => "string",
      "csl"       => false
    ),
    'pages' => array(
      'label'     => _("Pages"),
      'type'      => "string",
      'bibtex'    => true,
      "csl"       => "page"
    ),
    'price' => array(
      'label'     => _("Price"),
      'type'      => "string",
      'bibtex'    => true,
      "csl"       => false
    ),
    'publisher' => array(
      'label'         => _("Publisher"),
      'type'          => "string",
      'bibtex'        => true,
      "csl"       => "publisher"
    ),
    'school' => array(
      'label'         => _("University"),
      'type'          => "string",
      'bibtex'        => true,
      "csl"       => false
    ),
    'series' => array(
      'label'         => _("Series"),
      'type'          => "string",
      'bibtex'        => true,
      "csl"       => "collection-title"
    ),
    'size'   => array(
      'label'     => _("Size"),
      'type'      => "string",
      'bibtex'    => true,
      "csl"       => false
    ),
    'subtitle' => array(
      'label'     => _("Subtitle"),
      'type'      => "string",
      'bibtex'    => true,
      "csl"       => false
    ),
    'title' => array(
      'label'     => _("Title"),
      'type'      => "string",
      'bibtex'    => true,
      "csl"       => "title"
    ),
    'url' => array(
      'label'     => _("Internet Link"),
      'type'      => "link",
      'bibtex'    => true,
      "csl"       => "URL"
    ),
    'volume' => array(
      'label'     => _("Volume"),
      'type'      => "string",
      'bibtex'    => true,
      "csl"       => "volume"
    ),
    'year'     => array(
      'label'     => _("Year"),
      'type'      => "int",
      'bibtex'    => true,
      "csl"       => "issued"
    )
  );
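
To make the difference between simple 1:1 fields and separately parsed fields concrete, here is a minimal Python sketch. It repeats only a few of the "csl" pairs from the table, and it assumes that authors are stored as "Family, Given" entries separated by ";" (the separator comes from the table; the name format is an assumption). The function names are hypothetical.

```python
# Simple 1:1 renames, taken from the "csl" keys in the table above
# (only a few of them are repeated here).
SIMPLE_FIELDS = {
    "title": "title",
    "journal": "container-title",
    "booktitle": "container-title",
    "publisher": "publisher",
    "address": "publisher-place",
    "volume": "volume",
    "pages": "page",
    "url": "URL",
}


def parse_names(value, separator=";"):
    """Split 'Family, Given; Family, Given' into name objects (assumed format)."""
    names = []
    for raw in value.split(separator):
        raw = raw.strip()
        if not raw:
            continue
        family, _, given = raw.partition(",")
        name = {"family": family.strip()}
        if given.strip():
            name["given"] = given.strip()
        names.append(name)
    return names


def to_csl(record):
    """Convert one flat record (a dict of strings) to a CSL-style item."""
    item = {}
    for field, variable in SIMPLE_FIELDS.items():
        if record.get(field):
            item[variable] = record[field]
    # Fields without a 1:1 equivalent are handled individually.
    if record.get("author"):
        item["author"] = parse_names(record["author"])
    if record.get("year"):
        item["issued"] = {"date-parts": [[int(record["year"])]]}
    return item


example = to_csl({"title": "An Example", "author": "Doe, Jane; Roe, Richard",
                  "journal": "Journal of Examples", "year": "2010"})
print(example)
```

Note that "journal" and "booktitle" both land in "container-title" here, so a record carrying both would silently lose one of them.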


>> • What would be the equivalent of “misc”, i.e. stuff that doesn’t really fit
>>   anywhere else? But then, I have a hard time thinking of anything that
>>   doesn’t fit ;-)
>
> Yeah, I always hated the “misc” type, so there is none. I believe we
> have a “document” type (?) that might work.

There is no “document” type in CSL. There is a “Document” item type
in the Zotero UI, but it maps to vanilla CSL “article”, (in both
Zotero 2.0 and 2.1a1). It is the only Zotero item type that maps to
“article” (as opposed to “article-journal”, et cetera), so for Zotero
purposes, for the present, at least, “document” == “article”. But you
wouldn’t want to specify that generally, as it’s pretty clearly an ad
hoc solution.

One possibility would be to explicitly permit nil or unrecognized
values for “type” in the CSL specification. Styles that discriminate
between item types would then pick up and process such a creature on
the cs:else branch of a condition statement. That might require a
modification of the CSL schema though, I’m not sure.

> Thanks. Here is the mapping data for “fields”. A “csl” key set to false
> doesn’t always mean that there is no equivalent, just that a simple 1:1
> translation is not possible. The more complex fields are parsed
> separately.

It’s great to see this.

Taking things one step further, I think it’s important to adopt two
rules for app->CSL mappings.

(1) No app-side field should map to more than one CSL variable; and

(2) No two fields available in an app-side item type should map to the
same CSL variable.

This would prevent a situation where a CSL style developed on one
system is intractably broken on another. We’re already headed for
that difficulty, as the current Zotero mappings violate both rules.
For example:

The “series” and “seriesTitle” fields available on Zotero
“journalArticle” both map to collection-title …

… and the Zotero “place” field maps to both “event-place” and
“publisher-place”.

We can avoid potential chaos by requiring two things of applications
that integrate with a CSL processor:

(a) A description of the app-side fields available for each app-side
item type; and

(b) A description of the mappings of app-side fields to CSL variables.

If the descriptions are expressed in JSON (say), a small script can
check whether the rules are broken anywhere. It’s a simple thing, but
not necessarily obvious on the surface; Zotero has been running
happily for years with issues concerning (1) and (2) above; but when
other applications enter the mix and we start swapping CSL styles
between them, it’s going to be a problem.
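
A small script of the kind meant here could look like the following Python sketch. The JSON layout ("fields" for description (a), "mappings" for description (b)) is only an assumption, as is the item type "exampleType"; the sample data is a toy built from the two Zotero examples above, not a real dump of the Zotero mappings.

```python
import json
from collections import defaultdict

# Hypothetical description format:
#   "fields":   app-side item type -> list of app-side fields   (a)
#   "mappings": app-side field -> list of CSL variables         (b)
DESCRIPTION = json.loads("""
{
  "fields": {
    "journalArticle": ["title", "series", "seriesTitle"],
    "exampleType": ["title", "place"]
  },
  "mappings": {
    "title": ["title"],
    "series": ["collection-title"],
    "seriesTitle": ["collection-title"],
    "place": ["event-place", "publisher-place"]
  }
}
""")


def check_rules(description):
    """Return a list of human-readable violations of rules (1) and (2)."""
    problems = []
    mappings = description["mappings"]
    # Rule (1): no field maps to more than one CSL variable.
    for field, variables in sorted(mappings.items()):
        if len(set(variables)) > 1:
            problems.append(
                f"(1) field '{field}' maps to {sorted(set(variables))}")
    # Rule (2): within an item type, no two fields share a CSL variable.
    for item_type, fields in sorted(description["fields"].items()):
        users = defaultdict(list)
        for field in fields:
            for variable in mappings.get(field, []):
                users[variable].append(field)
        for variable, sharing in sorted(users.items()):
            if len(sharing) > 1:
                problems.append(
                    f"(2) in '{item_type}', {sharing} all map to '{variable}'")
    return problems


for problem in check_rules(DESCRIPTION):
    print(problem)
```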

Frank

Perhaps we should add “document”?

Bruce

Returning to the original topic of this thread, and moving a discussion that
I started on a different thread to here: I think what would really help
the integration of the different processors into applications would be
some sort of “schema validation” for the JSON input data, like the one that
exists for XML. This would make it possible to enforce a uniform schema across
implementations, and would help application developers to correctly map the
application-internal data to the data that is expected by the CSL formatting
engines.

After some research, it seems to me that there are multiple projects which
try to do some sort of JSON schema validation, but no standard has emerged
yet (correct me if I am wrong). I found this one, which has implementations in
JavaScript and PHP:

Json Schema PHP Validator (SourceForge.net)

and which uses a very lightweight syntax to check the schema of a json data
structure:

{
  "type": "object",
  "properties": {
    "a": {"type": "number", "properties": { … }},
    "b": {"type": "string"}
  },
  "additionalProperties": false
}

Basically, only three terms (type, properties, and additionalProperties) are
used to describe the json schema. Seems pretty easy to implement in other
languages, and we get the javascript and PHP for free.
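
To give an idea of how little code those three terms need, here is a minimal Python sketch of a validator. It is not a port of the linked project and covers nothing beyond "type", "properties" and "additionalProperties"; the function name and the error message format are made up for this example.

```python
# Schema type names -> Python types (only the ones handled by this sketch).
TYPES = {
    "object": dict,
    "array": list,
    "string": str,
    "number": (int, float),
    "boolean": bool,
}


def validate(data, schema, path="$"):
    """Return a list of error strings; an empty list means the data conforms."""
    errors = []
    expected = schema.get("type")
    if expected in TYPES:
        ok = isinstance(data, TYPES[expected])
        # bool is a subclass of int in Python, so exclude it explicitly.
        if expected == "number" and isinstance(data, bool):
            ok = False
        if not ok:
            return [f"{path}: expected {expected}"]
    if isinstance(data, dict):
        properties = schema.get("properties", {})
        for key, subschema in properties.items():
            if key in data:
                errors += validate(data[key], subschema, f"{path}.{key}")
        if schema.get("additionalProperties", True) is False:
            for key in data:
                if key not in properties:
                    errors.append(f"{path}.{key}: unexpected property")
    return errors


schema = {
    "type": "object",
    "properties": {"a": {"type": "number"}, "b": {"type": "string"}},
    "additionalProperties": False,
}
print(validate({"a": 1, "b": "x"}, schema))     # []
print(validate({"a": "1", "c": True}, schema))
```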

What do you think?

C.

> After some research, it seems to me that there are multiple projects which
> try to do some sort of JSON schema validation, but no standard has emerged
> yet (correct me if I am wrong). I found this one, which has implementations in
> JavaScript and PHP:
>
> Json Schema PHP Validator download | SourceForge.net

Rintze and I chatted about this off-list awhile back. Here’s some of
that discussion …

Thanks Bruce. So there IS an emerging standard, http://json-schema.org/, good
to know. I would vote for creating a reference validation schema in
JSONSchema, then. From there, the maintainers of the different
implementations can go ahead and include/write validators. I’d volunteer to
contribute the PHP one for citeproc-php, if Ron doesn’t want to do it
himself.

Thanks,
Christian

> Thanks Bruce. So there IS an emerging standard, http://json-schema.org/, good
> to know. I would vote for creating a reference validation schema in
> JSONSchema, then.

A question, though: can JSONSchema represent choices? E.g. can it
represent this in RNC?

name-attributes =
  attribute name { text }
  | (attribute given { text }?,
     attribute family { text }?,
     attribute prefix { text }?,
     attribute suffix { text }?,
     attribute particle { text }?)

I think the answer is “no” but am not sure. There is an “enum”
datatype, but this is for string values.

If it really can’t represent choices, then it’s arguably too
lightweight to be all that useful.
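
For comparison, the choice itself is easy to state in plain code. The Python sketch below reads the pattern above as "either a single literal name, or any subset of the structured parts". The function name is hypothetical, and this is a hand-written check rather than anything JSON Schema provides.

```python
STRUCTURED_PARTS = {"given", "family", "prefix", "suffix", "particle"}


def check_name(name):
    """Accept either {"name": ...} alone or any subset of the structured parts."""
    if not isinstance(name, dict):
        return False
    if not all(isinstance(value, str) for value in name.values()):
        return False
    keys = set(name)
    if "name" in keys:
        # First branch of the choice: a literal name and nothing else.
        return keys == {"name"}
    # Second branch: only the optional structured parts are allowed.
    return keys <= STRUCTURED_PARTS


print(check_name({"name": "World Health Organization"}))  # True
print(check_name({"family": "Doe", "given": "Jane"}))      # True
print(check_name({"name": "Jane Doe", "family": "Doe"}))   # False
```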

Bruce

Not sure, but this thread might be related:

http://groups.google.com/group/json-schema/browse_thread/thread/1603919da6c4cb2e?pli=1

Bruce

Also, just for reference, here’s the schema draft in RNC:

http://bitbucket.org/bdarcus/csl-schema/src/tip/csl-data.rnc

Another thing that is probably hard or impossible to represent in JSON
Schema is the stuff at the bottom for the HTML subset “rich text”.

Bruce

Hi,

I am a bit late to this thread but would like to add my 2 cents.

One advantage of CSL is that its bibliographic input is not bound
to a specific data format but defined in the form of a data model.
Unfortunately this data model is not defined in one place but
implied by multiple documents. The best ones to start with are:

and

The Relax NG schema defines an XML format as serialization of
the data model (by the way the same format could and should also
be defined by an XML Schema document).

The JSON input format is not explicitly defined by a schema
because there is no widely adopted schema language for JSON.
I tried some of the JSON schema languages which were mentioned
in this thread but they all seem impractical: either they are too complex
or they cannot express everything needed. Moreover, a schema
without a validator is of little use. I think it is more practical
to directly implement validators in several programming languages.

By the way I would name the JSON format CSL/JSON and the XML
format CSL/XML. Other CSL input formats that could be useful
are CSL/RDF for CSL as Linked Data and CSL/Microformat to embed
CSL data in HTML.

Cheers
Jakob
--
Verbundzentrale des GBV (VZG)
Digitale Bibliothek - Jakob Voß
Platz der Goettinger Sieben 1
37073 Goettingen - Germany
+49 (0)551 39-10242
http://www.gbv.de
@Jakob_Voss

> The Relax NG schema defines an XML format as serialization of
> the data model (by the way the same format could and should also
> be defined by an XML Schema document).

Yes, except that a) I HATE XSD, and b) CSL is authored in RNG, and so
it’s easy to pull in patterns to avoid duplication.

> The JSON input format is not explicitly defined by a schema
> because there is no widely adopted schema language for JSON.
> I tried some of the JSON schema languages which were mentioned
> in this thread but they all seem impractical: either they are too complex
> or they cannot express everything needed. Moreover, a schema
> without a validator is of little use. I think it is more practical
> to directly implement validators in several programming languages.

OK, so you’re suggesting to define the JSON schema in code?

> By the way I would name the JSON format CSL/JSON and the XML
> format CSL/XML. Other CSL input formats that could be useful
> are CSL/RDF for CSL as Linked Data and CSL/Microformat to embed
> CSL data in HTML.

Makes sense, except I’m not interested in doing CSL/RDF; that’s what
BIBO is for.

Bruce