That’s a big one.
I had an idea today, though, that might catch both objectives. Here’s
the pitch. I think it’s as original as casting the CSL editor. See
what you think about the idea, though.

On Sat, Apr 20, 2013 at 4:58 PM, Sebastian Karcher <@Sebastian_Karcher> wrote:
Hi,
last Thursday, Rintze, Frank, and I had a conference call with Alex
Garnett and Juan Pablo Alperin of the Public Knowledge Project
http://pkp.sfu.ca .
We wanted to explore whether (and, if so, how) CSL could find an
institutional host at the PKP and what that would entail. Generally
the conversation was very positive; the PKP folks know CSL and
actually have started using it in one of their projects. They seemed
quite positive about the general prospect of providing a home to CSL.
They don’t have much in terms of developer time to offer, but said
that in the short term some advice and time for grant writing would be
possible. They said they would want to be included in some way in the
CSL decision-making process, though more in terms of knowing what’s
going on than to influence decisions (we did describe said process as
open and consensus-based, which they seemed fine with). As for grants,
as others have said, they said that it’s basically impossible to get
grants to cover day-to-day operations. Grant institutions want to fund
something specific and new, so we’d have to think about that. Rintze
and I came up with three areas on the spot:
- Specifications - while the syntax is well specified, all the little
things like eliminating double spaces/punctuation etc. that the
processors do (or don’t) aren’t. They should be.
- Legal CSL - incorporating Frank’s modification for legal support
- Other CSL 1.1/2.0 developments including field updates, potential
multilingual improvements etc.
Perhaps the biggest concern in all of this is that Rintze and I don’t
see how this is going to reduce our work (which, after all, was one of
the original reasons we started talking about this).
CSL is a carefully designed language. The potential for CSL to become
a de facto standard for defining and automating document referencing
formats has been proven through performance: several implementations
of the language are running in the wild, and user-contributed styles
have brought the CSL Style Repository to 800+ styles covering 4000+
journals. Major projects, including Mendeley, Papers and Zotero, rely
upon the language to serve a large user community, many working in
research or at the PhD level.
In the community’s drive to satisfy user needs, the focus has been on
individual styles. This has spread attention across an expanding
codebase, slowing efforts to refine and improve styles across the
archive as a whole.
This challenge can be addressed by drawing upon a latent potential for
modularity in CSL that has not heretofore played a part in style
maintenance and distribution. At the most basic level, CSL cleanly
separates four elements of style design:
- Citation formats
- Citation format parameters
- Bibliography formats
- Bibliography format parameters
Although each style in the CSL Style Repository is currently stored as
an atomic unit, each is composed of these four elements, and they can
easily be separated and remixed, resulting in a smaller base of code,
higher quality in many styles, and potential for more rapid coverage
of remaining publisher and university styles. There is deeper
potential for modularity in CSL (through a shared macro library).
Implementing this simple modular break-out in the current repository
infrastructure will make it possible to explore those avenues in
future.
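
To make the four-way split concrete, here is a rough sketch in Python. It
assumes that the "format" is the cs:layout child of cs:citation or
cs:bibliography, and that the "parameters" are the attributes set on those
two elements; that mapping is my reading of the pitch, not something the
repository defines. The sample style and the split_style helper are
hypothetical, and macros and cs:sort are ignored for brevity.

```python
import xml.etree.ElementTree as ET

CSL_NS = "http://purl.org/net/xbiblio/csl"

# A tiny made-up style, just enough to exercise the split.
SAMPLE_STYLE = """
<style xmlns="http://purl.org/net/xbiblio/csl" class="in-text" version="1.0">
  <info>
    <title>Example Style</title>
    <id>http://www.zotero.org/styles/example-style</id>
  </info>
  <citation et-al-min="3" et-al-use-first="1">
    <layout prefix="(" suffix=")" delimiter="; ">
      <text variable="title"/>
    </layout>
  </citation>
  <bibliography hanging-indent="true">
    <layout suffix=".">
      <text variable="title"/>
    </layout>
  </bibliography>
</style>
"""


def split_style(xml_text):
    """Return format and parameters for the citation and bibliography parts."""
    root = ET.fromstring(xml_text)
    parts = {}
    for name in ("citation", "bibliography"):
        element = root.find(f"{{{CSL_NS}}}{name}")
        if element is None:
            # Some styles have no bibliography at all.
            parts[name] = None
            continue
        layout = element.find(f"{{{CSL_NS}}}layout")
        parts[name] = {
            # Assumption: attributes on the element are the "parameters".
            "parameters": dict(element.attrib),
            # Assumption: the serialized cs:layout is the "format".
            "format": (
                ET.tostring(layout, encoding="unicode")
                if layout is not None
                else None
            ),
        }
    return parts


parts = split_style(SAMPLE_STYLE)
```

A real break-out tool would also have to carry along the macros that each
layout calls, which is where the shared macro library would come in.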
Moving to a modular archive design would require the following:
- Style-level test suites to confirm current style behaviour;
- Tools for breaking out the current code base:
  - Separating current styles into citation-format and
    bibliography-format elements for separate validation;
  - Extracting and storing bibliography and citation format IDs and
    parameters on a per-style basis;
- Tools for exploring commonalities between citation and
bibliography formats, and merging IDs;
- A middle layer for recombining styles from modular code and
testing the result.
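
One cheap way to explore commonalities between formats is sketched below:
canonicalize each extracted layout and hash the result, so that layouts
differing only in whitespace or attribute order end up with the same ID.
The format_id and group_by_format helpers and the 12-character SHA-1 prefix
are illustrative choices, not a proposal for the actual ID scheme, and
ET.canonicalize needs Python 3.8 or later. This only catches exact
structural matches; the heuristic merging of near-identical formats would
need more than this.

```python
import hashlib
import xml.etree.ElementTree as ET
from collections import defaultdict


def format_id(layout_xml):
    """Derive a short ID from the canonical form of a layout fragment."""
    # C14N sorts attributes; strip_text drops whitespace-only text nodes.
    canonical = ET.canonicalize(layout_xml, strip_text=True)
    return hashlib.sha1(canonical.encode("utf-8")).hexdigest()[:12]


def group_by_format(layouts):
    """Map each format ID to the names of the styles that share it."""
    groups = defaultdict(list)
    for style_name, layout_xml in layouts.items():
        groups[format_id(layout_xml)].append(style_name)
    return dict(groups)


# Hypothetical extracted bibliography layouts, keyed by style name.
layouts = {
    "style-a": '<layout suffix="." prefix="("><text variable="title"/></layout>',
    "style-b": '<layout prefix="(" suffix=".">\n  <text variable="title"/>\n</layout>',
    "style-c": '<layout suffix=";"><text variable="title"/></layout>',
}
groups = group_by_format(layouts)
```

Here style-a and style-b should land in the same group, with style-c on its
own.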
For simplicity, this back-office functionality should be masked from
users and style designers, who understand CSL styles (either when
using the CSL editor, or when directly editing style XML) as
integrated units. Accordingly, archive modularisation should be
accompanied by a maintenance layer providing the following:
- Automated pre-flight checks for schema validity and correct and
complete style metadata;
- Arbitration with the modular repo back-end, with heuristic
identification and merger of citation and bibliography formats; and
- User-facing and maintainer-facing UI to drive these facilities.
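
As a rough idea of what the metadata half of the pre-flight checks could
look like, here is a minimal sketch. It only checks well-formedness and the
presence of a few cs:info children; the list of required fields (title, id,
updated) is my assumption about what "complete metadata" should mean, and
the preflight function is hypothetical. Proper schema validation against
the CSL RELAX NG schema is not attempted here, since the Python standard
library has no RELAX NG validator.

```python
import xml.etree.ElementTree as ET

CSL_NS = "http://purl.org/net/xbiblio/csl"

# Assumption: these are the cs:info children we insist on.
REQUIRED_INFO_FIELDS = ("title", "id", "updated")


def preflight(xml_text):
    """Return a list of human-readable problems; empty means it passed."""
    try:
        root = ET.fromstring(xml_text)
    except ET.ParseError as err:
        return [f"not well-formed XML: {err}"]

    problems = []
    if root.tag != f"{{{CSL_NS}}}style":
        problems.append("root element is not cs:style")

    info = root.find(f"{{{CSL_NS}}}info")
    if info is None:
        problems.append("missing cs:info")
        return problems

    for field in REQUIRED_INFO_FIELDS:
        child = info.find(f"{{{CSL_NS}}}{field}")
        # Treat an empty element the same as a missing one.
        if child is None or not (child.text or "").strip():
            problems.append(f"missing or empty cs:{field}")
    return problems
```

A submission front-end could run this before a style ever reaches the
modular back-end and show the returned list to the submitter.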
Frank