csl-training - ideas for methods, good practice etc.

Hello everyone,

I have come to source people’s ideas for a ‘codesprint’ I have suggested
for next week’s THATCamp Paris. The potted description in French is here:
http://barcamp.org/w/page/54813952/Codesprint%20et%20Booksprint%2027-28%20septembre%20(cliquer%20ici)
The general idea is to explain the basics of the CSL style language and
then see how many styles we can turn out for French humanities and social
sciences. This takes up the initiative explained here
http://www.boiteaoutils.info/p/csl-france-styles-pour-zotero.html and
hosted here https://trello.com/board/csl-france/4e8f4ee92adc2a00009616d3 which
never really took off.

I originally suggested this a long time ago and then started wondering
whether it was still a good idea when I saw the progress that had been made
on the visual style editor. In the end I decided to maintain the
codesprint, including actual code, because I reckon with the fairly
tech-literate public at the THATCamp it makes sense and it would be an
excellent opportunity to get more good French styles into the repository.

So far my plan is to assemble links to all the available documentation on
the page mentioned above with the necessary explanations in French, to
start the codesprint with a walk-through of adapting an existing style
(located using the visual editor tool), while explaining the structure of
csl-styles. I may well produce a very basic style with in-line comments in
French explaining what happens at each point.

So, this is where my questions come in :

  • has anyone ever held a similar initiative - and have any useful hints
    to share?
  • What guidelines for good csl-practice would you want to teach a bunch
    of beginners?
  • finally, more specifically, when writing French styles, I got used to
    including codes like &#160; for non-breaking spaces and &#232; for è - in
    good part because I got tired of people opening/saving styles on different
    operating systems and breaking accented character encodings. I remember
    that being discouraged at some point on this list. What are people’s
    opinions on this?

I’d be very grateful for any advice you might have.
Thanks in advance,
Franziska Heimburger

I originally suggested this a long time ago and then started wondering
whether it was still a good idea when I saw the progress that had been made
on the visual style editor. In the end I decided to maintain the codesprint,
including actual code, because I reckon with the fairly tech-literate public
at the THATCamp it makes sense and it would be an excellent opportunity to
get more good French styles into the repository.

That makes sense. Hand coding, for people comfortable with it, will yield better styles.

Great idea, BTW! More below …

So, this is where my questions come in :

has anyone ever held a similar initiative - and have any useful hints to
share?
What guidelines for good csl-practice would you want to teach a bunch of
beginners?

I only have one, two-part, suggestion here:

  1. good styles exploit macros heavily to avoid duplication; you can
    see evidence of this by the relative percentage of code dedicated to
    macros vs. the citation and bibliography templates.
  2. use an example style that is well-written to demonstrate. Rintze
    and Sebastian may have good suggestions, but I tend to think a widely
    used style like APA or Chicago is a good bet.
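
As a rough illustration of the first point, here is a minimal sketch of a style in which the citation and bibliography layouts are thin and the formatting lives in macros. It is not an excerpt from any repository style: the title, the id URL and the macro names (`author`, `author-short`, `issued-year`) are hypothetical, and the `cs:info` block is trimmed to the bare minimum.

```xml
<?xml version="1.0" encoding="utf-8"?>
<style xmlns="http://purl.org/net/xbiblio/csl" class="in-text" version="1.0">
  <info>
    <!-- hypothetical metadata, for illustration only -->
    <title>Macro sketch</title>
    <id>http://www.zotero.org/styles/macro-sketch</id>
    <updated>2012-09-19T00:00:00+00:00</updated>
  </info>
  <!-- Each macro is defined once and reused below -->
  <macro name="author">
    <names variable="author">
      <name name-as-sort-order="first" and="text" initialize-with=". "/>
      <substitute>
        <names variable="editor"/>
      </substitute>
    </names>
  </macro>
  <macro name="author-short">
    <names variable="author">
      <name form="short" and="text"/>
      <substitute>
        <names variable="editor"/>
      </substitute>
    </names>
  </macro>
  <macro name="issued-year">
    <date variable="issued">
      <date-part name="year"/>
    </date>
  </macro>
  <citation>
    <layout prefix="(" suffix=")" delimiter="; ">
      <group delimiter=" ">
        <text macro="author-short"/>
        <text macro="issued-year"/>
      </group>
    </layout>
  </citation>
  <bibliography>
    <!-- the same macros double as sort keys -->
    <sort>
      <key macro="author"/>
      <key macro="issued-year"/>
    </sort>
    <layout suffix=".">
      <group delimiter=". ">
        <text macro="author"/>
        <text macro="issued-year"/>
        <text variable="title"/>
      </group>
    </layout>
  </bibliography>
</style>
```

Changing how years are rendered then means editing one macro rather than every place a date appears.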

Bruce

has anyone ever held a similar initiative?

I’m not aware of any. Sebastian (adamsmith) interacts with a lot of
Zotero users via the Zotero forums, and Charles Parnot offers support
to users of Papers (Mekentosj, the makers of Papers, also has a
program to reward people for contributing styles:
http://news.mekentosj.com/2012/01/a-serial-for-a-style/ ).

What guidelines for good csl-practice would you want to teach a bunch of
beginners?

We have a set of requirements for styles that are submitted to the
official style repository (see


), although those mostly concern the style metadata and style
validation. Sebastian might have some additional tips on coding the
actual styles, but one relatively recent development is that we try
to rely much more on the cs:group element to specify delimiting
punctuation and whitespace. This prevents superfluous punctuation in
case not all metadata is present (see an example at

).
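
To make the cs:group point concrete, here is a small sketch. It is not taken from a repository style, and the choice of `container-title` and `volume` is only for illustration. Affixes are printed whenever their own element produces output, whereas a group delimiter is only printed between children that actually produce output.

```xml
<!-- Fragile: if "volume" is missing, the ", " suffix is still printed -->
<text variable="container-title" suffix=", "/>
<text variable="volume"/>

<!-- More robust: ", " only appears when both children produce output -->
<group delimiter=", ">
  <text variable="container-title"/>
  <text variable="volume"/>
</group>
```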

Other than that, I’ve started some work to rewrite the CSL Primer
(http://citationstyles.org/downloads/primer.html). It’s a work in
progress, but there is some new material that might be useful as an
introduction to CSL at
http://citation-style-language.readthedocs.org/en/latest/primer.html
(as an aside: I’ve been suffering a bit from a writer’s block; if
anybody has any topics they feel should be addressed in the CSL
primer, let me know).

finally, more specifically, when writing French styles, I got used to
including codes like &#160; for non-breaking spaces and &#232; for è - in
good part because I got tired of people opening/saving styles on different
operating systems and breaking accented character encodings. I remember that
being discouraged at some point on this list. What are people’s opinions on
this?

I think escape codes make a lot of sense for those cases where a style
author cannot (easily) visually identify a character, as is e.g. the
case for a non-breaking space. Every now and then I reindent the
styles in the style repository, and the code I use automatically
unescapes these codes. The characters that I re-escape can be found at
https://github.com/citation-style-language/utilities/blob/master/csl-indent.py#L30
(currently the “no-break space”, “non-breaking hyphen”, “en-dash”,
“em-dash”, and a superscript “e” that didn’t render properly on my
system). I can easily add more, but I don’t think we need to escape
relatively common characters such as “è”, or characters that should
render just fine in any modern plain text editor, like French
quotation marks (http://en.wikipedia.org/wiki/Guillemet).
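
A short sketch of how this convention might look inside a style (the fragment is illustrative, not copied from an existing style): characters that are invisible or easy to confuse are written as numeric character references, while characters that are plainly visible in a UTF-8 editor stay literal.

```xml
<!-- &#160; is a no-break space: invisible in an editor, so escaped -->
<group delimiter="&#160;">
  <text term="page" form="short"/>
  <text variable="page"/>
</group>

<!-- &#8211; is an en-dash: easy to confuse with a hyphen, so escaped -->
<date variable="issued">
  <date-part name="year" range-delimiter="&#8211;"/>
</date>

<!-- Guillemets are visible, so they stay literal; the no-break
     spaces inside them are still escaped -->
<text variable="title" prefix="«&#160;" suffix="&#160;»"/>
```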

Rintze

Hi Franziska,

This is an awesome initiative. I don’t know if equivalent things have been done in the past, but I applaud the French :)

(I am French myself).

I don’t know about the XML escape codes, but I would instead recommend also educating style developers to use UTF-8 encoding and to be conscious of that issue. Things have evolved a lot in the past few years and, fortunately, UTF-8 has become more and more standard in many text editors (and is certainly fully supported in any modern editor). In the end, paying attention to the file encoding might be less trouble than dealing with confusing character codes. There is still the need to escape a number of characters of course, as you probably know (in particular, & must be written as &amp;).

Regarding French terms (and thus those accents), be sure to explain that a number of terms are already localized for you. This is maybe also a good opportunity, with all those French brains, to check the locale file for French (here: https://github.com/citation-style-language/locales/blob/master/locales-fr-FR.xml). Any of those terms can be inserted with, for instance:

<text term="in press" text-case="capitalize-first" suffix=" "/>

where “in press” is the term you’d like to see; in French, “sous presse” will be output.

Then the style element itself should set the default locale so the correct translation is used:
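
As a minimal sketch of what this looks like, the attribute in question is `default-locale` on the root `cs:style` element. The `class="note"` value below is only a placeholder assumption; use whichever class the style actually needs.

```xml
<style xmlns="http://purl.org/net/xbiblio/csl" class="note" version="1.0"
       default-locale="fr-FR">
  <!-- info, macros, citation and bibliography go here -->
</style>
```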

If you also have Canadian styles, well, you can use fr-CA! Here is also very fresh info on locales, a very detailed writeup just created by Rintze: http://citation-style-language.readthedocs.org/en/latest/translating-locale-files.html (I am sorry if all the above is already known to you, I just thought it would be important to stress for a French-centric effort.)

Now, regarding the structure of the styles. I would agree hand-coding might be a good idea in this particular context. As Bruce said, use as many macros as possible, and get inspired by e.g. APA or Chicago. I don't have that much experience writing styles from scratch, but I notice that using suffix/prefix in the wrong way can lead to issues. You typically want to reserve their use for `` elements, or for when you use both on the same element. For instance, a prefix and a suffix set together on the element that renders the year are safe, because if the year is absent, neither will show. However, consider a ':' attached as an affix ahead of the page: if no page is present, then you end up with an extra ':'. In this context, the ':' is really a delimiter, so it is better to declare it as one. In general, using group elements can really help make things more robust and flexible.

Before submission, you should at least validate the style. Here is a more extensive list of things to check: https://github.com/citation-style-language/styles/wiki/Style-Requirements

If several publications or publishers use the exact same output, we really encourage you to create more styles using 'dependent' styles. The reason is that users looking for the right style usually just search by the name of the journal or publisher, not by the style the journal/publisher adheres to. Dependent styles are very easy to create, and are better than duplicating a whole style, as they function more like aliases / shortcuts pointing to a parent style; any modification to the parent affects them all. And with CSL 1.0.1, the default-locale can be set on a dependent style, which allows creating identical styles for 2 different languages set explicitly.

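
For readers who have not seen one, a dependent style is little more than an `info` block with a link to its parent. The sketch below assumes a made-up journal and a made-up parent: the title and both URLs are hypothetical placeholders, not real repository entries.

```xml
<?xml version="1.0" encoding="utf-8"?>
<style xmlns="http://purl.org/net/xbiblio/csl" class="in-text" version="1.0"
       default-locale="fr-FR">
  <info>
    <!-- hypothetical journal name and id -->
    <title>Revue hypothétique d'histoire</title>
    <id>http://www.zotero.org/styles/revue-hypothetique-dhistoire</id>
    <link href="http://www.zotero.org/styles/revue-hypothetique-dhistoire" rel="self"/>
    <!-- the parent style does all the actual formatting -->
    <link href="http://www.zotero.org/styles/parent-style-hypothetique" rel="independent-parent"/>
    <updated>2012-09-19T00:00:00+00:00</updated>
  </info>
</style>
```

There are no macros and no citation or bibliography sections; everything is inherited from the style named in the `independent-parent` link.
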
Finally, not trying to push any agenda here, but we (at Papers) have this reward program in place for CSL contributors: http://news.mekentosj.com/2012/01/a-serial-for-a-style/ The spirit of this reward is to really encourage the growth of CSL, which has been a great resource for Papers, and which we want to make even more useful.

Let us know how things went!

Charles

This is maybe also a good opportunity with all those French brains to
check the locale file for French (here:
https://github.com/citation-style-language/locales/blob/master/locales-fr-FR.xml).

One comment: CSL 1.0.1 was released earlier this month, and includes
some features that are helpful for French styles (in particular, an
improved way to define ordinal suffixes, and support for
gender-specific ordinals (e.g. “1er janvier”, see
http://citation-style-language.readthedocs.org/en/latest/translating-locale-files.html#ordinals
)). However, Zotero doesn’t yet ship with CSL 1.0.1 locale files, and
Frank Bennett, the author of citeproc-js, the CSL processor used by
Zotero, recently fixed a few problems concerning the handling of
ordinals. Zotero hasn’t been updated yet to use the latest version of
citeproc-js with these fixes. Gracile, another French user, might be
able to give you a few pointers on how to deal with this situation (he
has been testing the new CSL 1.0.1 features with Zotero 3.0.8).
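
As a hedged sketch of the CSL 1.0.1 mechanism mentioned above, the fragment below shows how gender-specific ordinals could be declared in a locale file. The suffix strings are illustrative and may differ from what the shipped locales-fr-FR.xml contains, so check the actual file before relying on them.

```xml
<locale xmlns="http://purl.org/net/xbiblio/csl" version="1.0" xml:lang="fr-FR">
  <terms>
    <!-- default suffix for every ordinal: 2e, 3e, ... -->
    <term name="ordinal">e</term>
    <!-- overrides for the number 1 only, one per gender -->
    <term name="ordinal-01" gender-form="masculine" match="whole-number">er</term>
    <term name="ordinal-01" gender-form="feminine" match="whole-number">re</term>
    <!-- the gender of the noun selects the ordinal form: "1er janvier" -->
    <term name="month-01" gender="masculine">janvier</term>
  </terms>
</locale>
```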

Rintze

Hi Franziska,

just some brief additions:

  • As for “good CSL” I want to echo what the others said: Prefer groups over
    affixes and use macros extensively. One other point: If you’re working off
    an existing style, don’t assume that it’s perfect. Many styles on the
    repository are only so-so.

  • Example styles - APA and Vancouver are pretty good (though not perfect).
    Elsevier’s Harvard is nice because it’s pretty clean and it’s a very simple
    style.
    Chicago Manual is a great style, but it’s also a complex mess, with
    something like 8 layers of nested groups. I really wouldn’t use it as a
    model.

Where I somewhat disagree is that coding by hand is preferable. I think
teaching how to work the visual editor effectively might be more useful and
lend itself better to spreading the word, and the code it puts out is pretty
clean.
I’ll include a part on CSL in my Zotero workshops this fall and what I’ll
do is to give a general overview over the way CSL styles are built up, but
then focus on using the editor, which still requires a good understanding
of the underlying mechanics, but no attention to syntax.

What I like about the editor is that it will do a lot of things right
automatically - e.g. you don’t have to worry about most of the conventions
for the info section - and that you don’t have to remember/look-up all the
terminology (is it given-name-disambiguation-rule or
name-disambiguation-rule? is the value “by-cite” “minimal” or something
else? etc.) - if you code dozens of styles that’s not a problem, but with
casual contributors that’s a very real issue. Also, the visual editor saves
the style correctly in utf-8, indented correctly, and with a .csl
extension, another set of issues you don’t have to worry about. It doesn’t
validate, though, so careful with that.
It does have a code editor that works pretty nicely, too - very similar to
the Zotero test panel - if you want to work with the code directly.
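
For reference, the attribute in the CSL specification is spelled `givenname-disambiguation-rule`, and `by-cite` is one of its allowed values (alongside `all-names`, `all-names-with-initials`, `primary-name` and `primary-name-with-initials`). A minimal sketch of where it goes, assuming a macro named `author-short` is defined elsewhere in the style:

```xml
<citation disambiguate-add-givenname="true"
          givenname-disambiguation-rule="by-cite">
  <layout prefix="(" suffix=")" delimiter="; ">
    <!-- "author-short" is a hypothetical macro name -->
    <text macro="author-short"/>
  </layout>
</citation>
```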

If you’re interested, some overview slides on CSL by me are at the bottom
of this Prezi:

Hope that helps,
Sebastian

Chicago Manual is a great style, but it’s also a complex mess, with something like 8 layers of nested groups. I really wouldn’t use it as a model.

Out of curiosity, how much of this is down to the inherent
requirements of the style itself vs. the gradual evolution of the
style as CSL was developed?

Regards,
Rob.

it has nothing to do with the evolution of the CSL style in this case
(though the general concern is certainly valid - this used to be the case
for the Vancouver styles, e.g., before I rewrote them from scratch) - Frank
actually rewrote most of it pretty recently. I’d say it’s a mix of two
things. One are the style requirements, the other one is that Frank went
out to code a style with essentially no affixes at all - everything is done
by groups and delimiters (hence the nesting).
My use of “Complex mess” might be a bit misleading here. For the initiated,
this is actually very convenient, because it’s very hard to break things
because of that structure - I’m more confident editing CMoS now than I was
before. But for someone who has never seen CSL it’s going to be confusing
and intimidating.

On Wed, Sep 19, 2012 at 10:58 AM, Robert Knight wrote:

I have to admit that in its current form, even I find the CMoS styles
confusing and intimidating.

Rintze

Where I somewhat disagree is that coding by hand is preferable. I think
teaching how to work the visual editor effectively might be more useful and
lend itself better to spreading the word, and the code it puts out is pretty
clean.

This is an interesting strategic question. I’m open to having my mind
changed on this.

Bruce

Same here. The editor has evolved a lot and might indeed be a good idea. How about A/B testing, Franziska? :P

charles

While that does help with Frank’s and my plans for world dominance
(mwahahahahaaa…), that’s obviously not good. We might want to rethink the
approach to CMoS - Chicago Manual is never going to be a simple style, but
trying to make the code simpler - not least so that it can serve as a
useful template - seems like a good idea.

On Wed, Sep 19, 2012 at 11:33 AM, Rintze Zelle wrote: