CSL deployment idea

I don’t know a ton about web servers and such, but in deploying CSL,
would it make sense to recommend that all CSL files have at the
location their URI resolves to …

a) the XML (text/xml) style itself
b) an HTML representation with output examples

…? So if given a URI of “http://zotero.org/styles/pyschology/apa”, a
browser would display the web page, while a citation processor would
just download (and cache) the style directly?

Bruce

This could be done with HTTP/1.1 Content Negotiation:
http://httpd.apache.org/docs/2.0/content-negotiation.html

There would be two files in the directory, apa.html and apa.csl.
Browsers, which send text/html in the Accept header, would get the HTML.
Citation processors would send text/xml (or application/x-csl, or
whatever) in the Accept header (say, using
XMLHttpRequest.setRequestHeader(), as Zotero would do) and get the CSL
file, assuming the server had been configured to serve .csl files with
the appropriate content type.
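
A minimal sketch of both sides of that exchange, in Python rather than the XMLHttpRequest call mentioned above. The function names, the example URI, and the simplified matching are assumptions for illustration only; Apache's real negotiation also weighs q-values and wildcards, which this ignores.

```python
from urllib.request import Request


def build_style_request(style_uri: str, media_type: str = "text/xml") -> Request:
    """Build (but do not send) a request that asks for the CSL variant."""
    return Request(style_uri, headers={"Accept": media_type})


def pick_variant(accept_header: str) -> str:
    """Simplified server-side choice between apa.html and apa.csl.

    Assumption: q-values and wildcards are ignored; a real server
    would rank the offered types instead of just testing membership.
    """
    offered = [part.split(";")[0].strip().lower()
               for part in accept_header.split(",")]
    if "text/html" in offered:
        return "apa.html"
    if "text/xml" in offered:
        return "apa.csl"
    return "apa.html"  # fall back to the human-readable page


# Illustrative URI only.
request = build_style_request("http://zotero.org/styles/psychology/apa")
```

A browser-style Accept header selects the HTML page, while a processor sending only text/xml gets the style file.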

Of course, citation processors could also just append “.csl” to the URI
automatically, and if the idea was for CSL files to be available in a
distributed manner, that’d probably have to be the standard behavior, as
configuring MIME types and content negotiation settings might not be
possible for all authors (depending on the server setup).
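
The "just append .csl" convention is simple enough to sketch. The helper name and the example URI are hypothetical, and stripping a trailing slash is an assumption about how such URIs would be normalized.

```python
from urllib.parse import urlsplit, urlunsplit


def csl_url(style_uri: str, extension: str = ".csl") -> str:
    """Derive the download URL for a style from its stable URI."""
    parts = urlsplit(style_uri)
    path = parts.path.rstrip("/")
    if not path.endswith(extension):
        path += extension
    return urlunsplit(
        (parts.scheme, parts.netloc, path, parts.query, parts.fragment)
    )
```

With this, no server configuration is needed: the bare URI can serve the HTML page and the processor always requests the ".csl" sibling.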

Hi Dan,

This could be done with HTTP/1.1 Content Negotiation:
Content Negotiation - Apache HTTP Server

Yes, that’s what I had in mind, having known about it through the RDF
stuff. I just wasn’t sure about the practical deployment issues. So

[…]

Of course, citation processors could also just append “.csl” to the URI
automatically, and if the idea was for CSL files to be available in a
distributed manner, that’d probably have to be the standard behavior, as
configuring MIME types and content negotiation settings might not be
possible for all authors (depending on the server setup).

… what would be the precise upshot? Are you suggesting we really
can’t suggest a human-readable (HTML) representation be resolvable
using the URI for the style?

Bruce

No, I’m not suggesting that. If given an HTML file without an extension,
Apache, at least, will still send the file as text/html, thanks to
mod_mime_magic, even without using content negotiation. So we could just
have the HTML file be “apa” and the CSL file be “apa.csl”, and as long
as citation processors knew to append “.csl” to the URI when requesting
the CSL file, it’d be fine.

The content negotiation approach would be a bit more elegant, but it
doesn’t provide much benefit that I can think of and might limit who
could host a CSL file.

OK, sounds good.

BTW, apropos of this, I just bcc-ed you on a reply to a colleague of
mine who is loving Zotero and now wants to use the style for his
particular journal (he got tripped up because he wanted to manually
correct the formatting in the bibliography fields).

What I’d like to be able to do sometime soon is:

  1. load a CSL for his journal onto the web
  2. send him the URI for it, which he pastes into a field in Zotero

The rest (downloading, caching, etc.) would happen automatically.

Or something like that ;-)
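
A rough sketch of the "download and cache" step, assuming a cache keyed by a hash of the style URI. The function names, the cache layout, and the injected `fetch` callable are all hypothetical; this is not how Zotero does or would do it.

```python
import hashlib
from pathlib import Path
from typing import Callable


def cache_path(cache_dir: Path, style_uri: str) -> Path:
    """Map a style URI to a stable file name inside the cache directory."""
    digest = hashlib.sha256(style_uri.encode("utf-8")).hexdigest()
    return cache_dir / f"{digest}.csl"


def install_style(style_uri: str, cache_dir: Path,
                  fetch: Callable[[str], bytes]) -> Path:
    """Download the style once; later calls reuse the cached copy."""
    target = cache_path(cache_dir, style_uri)
    if not target.exists():
        cache_dir.mkdir(parents=True, exist_ok=True)
        target.write_bytes(fetch(style_uri))
    return target
```

Passing `fetch` in as a parameter keeps the sketch independent of any particular HTTP client.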

Bruce

Some questions/thoughts:

  1. How do you see updates being handled? Via an XML-based index file
    that contains version/timestamp information for all CSL files on a site?

  2. Are the location and name of the index file specified in the CSL, or
    are they implicit (say, a default file name in the same directory of any
    already downloaded files, or a walk up the paths until one is found (to
    handle a repository with a hierarchical directory structure))? For
    various reasons, some method of determining the index file from a given
    CSL file would be helpful.

  3. What’s the mechanism for downloading all styles from a repository? I
    imagine you’d be able to specify either a single CSL file or an entire
    repository (which might be the URL of the index file). If you point to
    the index file, can you pick and choose, or do you get them all? (This
    is a client implementation question, and I can’t guarantee that one
    approach or another would be implemented in Zotero, but I’m curious how
    you see it working.)

  4. How are duplicates handled? If you post a style for your colleague to
    test, and he installs it via the URL you provide, and then the style is
    moved into the main Zotero repository (to which my Zotero client is
    subscribed by default), what happens to the installed style on his
    computer? He can just delete the old one manually, of course, but
    citation processors should probably be expected to handle 301 redirects
    correctly for both index files and CSL files and switch to the
    redirected URL (at least if it resolves to a valid index/CSL file). If
    the index file exists and specifies a CSL file URL not below its path,
    that should probably also be treated as a permanent redirect for
    existing subscribers.
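
The redirect handling in point 4 could be sketched roughly as follows. `fetch_head` is a hypothetical callable returning a status code and an optional Location header; the hop limit and the rule "only switch if the chain ends in a 200" are assumptions standing in for "resolves to a valid index/CSL file".

```python
from typing import Callable, Optional, Tuple
from urllib.parse import urljoin

# Hypothetical probe: url -> (status code, Location header or None).
FetchHead = Callable[[str], Tuple[int, Optional[str]]]


def resolve_permanent_url(url: str, fetch_head: FetchHead,
                          max_hops: int = 5) -> str:
    """Return the URL a subscription should be stored under."""
    current = url
    for _ in range(max_hops):
        status, location = fetch_head(current)
        if status == 301 and location:
            # Location may be relative, so resolve it against the current URL.
            current = urljoin(current, location)
            continue
        if status == 200:
            return current
        break
    # Broken or over-long redirect chain: keep the original URL.
    return url
```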

Dan

What I’d like to be able to do sometime soon is:

  1. load a CSL for his journal onto the web
  2. send him the URI for it, which he pastes into a field in Zotero

The rest (downloading, caching, etc.) would happen automatically.

Some questions/thoughts:

  1. How do you see updates being handled? Via an XML-based index file
    that contains version/timestamp information for all CSL files on a
    site?

Probably. This is why we were talking earlier about using Atom. So a
user would subscribe to a repository (though in a large repository, it
might be just the feed for their field).

I suppose a simple way to start is to use the time-stamp in the file,
though?

  2. Are the location and name of the index file specified in the CSL, or
    are they implicit (say, a default file name in the same directory of any
    already downloaded files, or a walk up the paths until one is found (to
    handle a repository with a hierarchical directory structure))? For
    various reasons, some method of determining the index file from a given
    CSL file would be helpful.

Agreed. And I would say the latter. So we take the APA example. Let’s
say we move it to your repository, and we give it the stable URI
http://zotero.org/styles/psychology.

I would expect to see something like this in the styles directory:

index.xml (probably an Atom file)
history/
psychology/
    index.xml
    apa.csl
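
With a layout like this, the "walk up the paths" lookup could be sketched as below. The function name is hypothetical, and the default file name "index.xml" is taken from the listing above rather than from any agreed convention.

```python
import posixpath
from typing import List
from urllib.parse import urlsplit, urlunsplit


def candidate_index_urls(csl_url: str,
                         index_name: str = "index.xml") -> List[str]:
    """List index URLs to try, from the style's own directory upward."""
    parts = urlsplit(csl_url)
    directory = posixpath.dirname(parts.path)
    candidates = []
    while True:
        index_path = posixpath.join(directory, index_name)
        candidates.append(
            urlunsplit((parts.scheme, parts.netloc, index_path, "", ""))
        )
        if directory in ("/", ""):
            break  # reached the site root
        directory = posixpath.dirname(directory)
    return candidates
```

A client would request each candidate in order and stop at the first one that exists.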

Note: categorizing styles can sometimes be tricky, so I’m not sure
that’s a good idea. I’m just imagining the goal, which is that a
repository might have hundreds, if not thousands, of styles. It might
be better to just do:

apa.csl
chicago.csl
generic.xml
history.xml
mla.csl
psychology.xml

So one feed per subject/field, where each style file is also tagged
with its subjects/fields (and the index files are generated from those
tags).
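
If the index were an Atom feed, an update check might look roughly like this. The sample feed, its entry ids, and the function names are invented for illustration; only the Atom 1.0 namespace and the id/updated elements come from the Atom format itself.

```python
import xml.etree.ElementTree as ET
from datetime import datetime
from typing import Dict, List

ATOM = "{http://www.w3.org/2005/Atom}"

# Hypothetical feed content, for illustration only.
SAMPLE_FEED = """<feed xmlns="http://www.w3.org/2005/Atom">
  <entry>
    <id>http://example.org/styles/apa</id>
    <updated>2006-10-02T09:00:00Z</updated>
  </entry>
  <entry>
    <id>http://example.org/styles/mla</id>
    <updated>2006-09-01T09:00:00Z</updated>
  </entry>
</feed>"""


def parse_timestamp(text: str) -> datetime:
    """Parse an Atom timestamp; 'Z' is rewritten for fromisoformat."""
    return datetime.fromisoformat(text.strip().replace("Z", "+00:00"))


def styles_needing_update(feed_xml: str,
                          cached: Dict[str, datetime]) -> List[str]:
    """Return ids of styles that are new or newer than the cached copy."""
    stale = []
    for entry in ET.fromstring(feed_xml).findall(f"{ATOM}entry"):
        style_id = entry.findtext(f"{ATOM}id")
        updated = entry.findtext(f"{ATOM}updated")
        if style_id is None or updated is None:
            continue  # skip malformed entries
        known = cached.get(style_id.strip())
        if known is None or parse_timestamp(updated) > known:
            stale.append(style_id.strip())
    return stale
```

The `cached` mapping is assumed to hold timezone-aware timestamps recorded when each style was last downloaded.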

  3. What’s the mechanism for downloading all styles from a repository? I
    imagine you’d be able to specify either a single CSL file or an entire
    repository (which might be the URL of the index file). If you point to
    the index file, can you pick and choose, or do you get them all?

I think ideally both.

I’d imagine a user would subscribe to a repository or repository
section just as they would a news feed. Perhaps they get a list of all
styles in that feed/repo, and can either select all or choose
individual styles to … I don’t know the verb … activate (?).

(This is a client implementation question, and I can’t guarantee that one
approach or another would be implemented in Zotero, but I’m curious how
you see it working.)

Right.

  4. How are duplicates handled? If you post a style for your colleague to
    test, and he installs it via the URL you provide, and then the style is
    moved into the main Zotero repository (to which my Zotero client is
    subscribed by default), what happens to the installed style on his
    computer? He can just delete the old one manually, of course, but
    citation processors should probably be expected to handle 301 redirects
    correctly for both index files and CSL files and switch to the
    redirected URL (at least if it resolves to a valid index/CSL file). If
    the index file exists and specifies a CSL file URL not below its path,
    that should probably also be treated as a permanent redirect for
    existing subscribers.

I agree this is a detail that’s important. You probably have a better
sense than I. And perhaps the Atom spec might be helpful?

Bruce

Ah, sure enough; API support for Atom feed reading:

http://developer.mozilla.org/en/docs/Feed_content_access_API

Bruce

Was just remembering that the Atom world had to deal with this. See:

<http://www.dehora.net/journal/2004/07/atom_and_cool_uris_dogma_idealism_expediency.html>
http://diveintomark.org/archives/2004/05/28/howto-atom-id

The right answer isn’t that straightforward, though we’d want to decide
on the right policy: either one always-resolvable URI, or multiple.

Bruce