ok, need help

OK, so Johan is working on a python port of citeproc.

I was thinking that I could help him with the unit tests and help set a
kind of framework for where we want to be with an API.

The question is, what’s the best way to set up citeproc-py (or the ruby
version) for smooth integration with different code?

My thought is that it’s really clear (to me!) that we need a
ReferenceList class. User then creates it like so:

list = ReferenceList()

… loads it up with Reference objects (each of which is mostly a
dictionary/hash to store the variables, probably based on a list of
citation ids), and then can just do:

list.to_xhtml
list.to_rtf

… etc.

Right?
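
A minimal sketch of that shape, assuming a Reference is little more than a dict of variables keyed by a citation id. The rendering is a placeholder rather than real citeproc output, and every name other than ReferenceList, Reference and to_xhtml (add, the id attribute, the internal dict) is hypothetical:

```python
from html import escape


class Reference(dict):
    """Mostly a dictionary of variables (title, author, year, ...)."""

    def __init__(self, ref_id, **variables):
        super().__init__(variables)
        self.id = ref_id  # hypothetical: the citation id used as a key


class ReferenceList:
    def __init__(self):
        self._refs = {}  # citation id -> Reference, in insertion order

    def add(self, ref):
        self._refs[ref.id] = ref

    def __iter__(self):
        return iter(self._refs.values())

    def __len__(self):
        return len(self._refs)

    def to_xhtml(self):
        # Placeholder rendering only: no citation style is applied here.
        items = "".join(
            '<li id="%s">%s</li>' % (escape(r.id), escape(str(r.get("title", ""))))
            for r in self
        )
        return "<ol>%s</ol>" % items


refs = ReferenceList()
refs.add(Reference("doe99", title="A Title", year=1999))
print(refs.to_xhtml())
```

A to_rtf method would follow the same pattern with a different serializer.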

In the Ruby version I have, I also have a CitationStyle class, where
the objects are created from the CSL file.

I had originally thought of that as separately generated, but it seems
to me it makes sense to have the CSL object as an attribute of
ReferenceList.

So you’d do:

ReferenceList(style="apa")

By default that would be empty I guess (because sometimes you might not
want to output formatted references). If the option is there, the
CitationStyle object gets generated there.
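
Roughly, and only as a sketch: the constructor below builds a CitationStyle when a style name is given and leaves it empty otherwise. CSL parsing is omitted, and the references attribute, the error raised and the placeholder output are assumptions for illustration:

```python
class CitationStyle:
    """Stand-in for the object built from a CSL file (parsing omitted)."""

    def __init__(self, name):
        self.name = name


class ReferenceList:
    def __init__(self, style=None):
        # No style by default: the list can be used purely as a data store.
        self.style = CitationStyle(style) if style is not None else None
        self.references = []

    def to_xhtml(self):
        if self.style is None:
            raise ValueError("no citation style set; cannot format references")
        # Real formatting would be driven by self.style; this is a placeholder.
        return "<!-- %d references in %s style -->" % (
            len(self.references), self.style.name)


plain = ReferenceList()              # data only, no formatting possible
styled = ReferenceList(style="apa")  # CitationStyle is created here
print(styled.to_xhtml())
```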

What I suggest, then, following David Wilson’s suggestion, is that the
Reference objects can also return three different kinds of citations:

full (default)
short (for subsequent citations, say)
ibid

[not sure how to deal with local citation modifications like
suppressing author names]

In other words, whatever other code is dealing with citations is
responsible for knowing what kind of output it needs at any given point
in a document (because it’s contextual, so something that citeproc
can’t really know about).
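
To make that division of labour concrete, here is a hedged sketch. The Reference only knows how to render a requested form, and a hypothetical caller (cite_in_order) keeps track of document context. The cite method, its form argument and the output strings are all illustrative assumptions, not any real style:

```python
class Reference:
    FORMS = ("full", "short", "ibid")

    def __init__(self, author, year, title):
        self.author, self.year, self.title = author, year, title

    def cite(self, form="full"):
        # The caller picks the form; the Reference has no document context.
        if form not in self.FORMS:
            raise ValueError("unknown citation form: %r" % (form,))
        if form == "ibid":
            return "Ibid."
        if form == "short":
            return "%s, %s" % (self.author, self.title)
        return "%s, %s (%s)" % (self.author, self.title, self.year)


def cite_in_order(refs):
    """Hypothetical caller-side logic: track what has been cited already."""
    seen, previous, out = set(), None, []
    for ref in refs:
        if ref is previous:
            form = "ibid"       # same source as the citation just before
        elif id(ref) in seen:
            form = "short"      # cited earlier, but not immediately before
        else:
            form = "full"       # first citation of this source
        out.append(ref.cite(form))
        seen.add(id(ref))
        previous = ref
    return out


a = Reference("Doe", 1999, "First Book")
b = Reference("Roe", 2001, "Second Book")
print(cite_in_order([a, a, b, a]))
```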

Does this all sound right?

Bruce

Does this all sound right?

Well it sounds OK :slight_smile:

My concern would be that the Reference and ReferenceList objects are
really about the semantics of the citation, right? And you happen to
have various ways of representing that semantic content using a
style. How about you have various Formatter classes that are
responsible for rendering a ReferenceList (or Reference) using a
particular style?

references = ReferenceList()

# references added here

formatter = APAFormatter()
print(formatter.format(references))

I imagine I’m revealing my ignorance about citeproc here.

//Ed

How about you have various Formatter classes that are
responsible for rendering a ReferenceList (or Reference) using a
particular style?

references = ReferenceList()

# references added here

formatter = APAFormatter()
print(formatter.format(references))

Sounds alright, except that I don’t want to write a class for every
style out there. So Formatter will be a more generic class. So it
would be like this:

formatter = Formatter(style="APA")  # or:
formatter = Formatter(stylefile="apa.csl")
print(formatter.format(references))
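
A sketch of what such a generic class might look like. The mapping from a style name to a '<name>.csl' file and the output layout are assumptions made up for this example, and no CSL file is actually read:

```python
class Formatter:
    """One generic class; the style is data, not a subclass."""

    def __init__(self, style=None, stylefile=None):
        if (style is None) == (stylefile is None):
            raise ValueError("pass exactly one of style= or stylefile=")
        # Assumption: a style name maps to '<name>.csl' in lower case.
        if stylefile is None:
            stylefile = style.lower() + ".csl"
        self.stylefile = stylefile

    def format(self, references):
        # Placeholder: a real implementation would parse self.stylefile (CSL)
        # and apply its rules; here each reference is just joined naively.
        lines = ["%s (%s). %s." % (r["author"], r["year"], r["title"])
                 for r in references]
        return "\n".join(lines)


references = [{"author": "Doe, J.", "year": 1999, "title": "A Title"}]
by_name = Formatter(style="APA")
by_file = Formatter(stylefile="apa.csl")
print(by_name.format(references))
```

Both constructors end up pointing at the same style file, so adding a style means adding a CSL file rather than a class.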

Johan

Sorry Bruce for not responding earlier. Very busy.

I am copying Ron Ward, who is the developer who will be working on our bibliographic support for OpenOffice.org and Word.

The only comment I have about this is that the CSL style is not an attribute of the referenceList, but of the output of the reference list.

So instead of ReferenceList(style="apa") I would think you want list.to_xhtml(style="apa")
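
As a rough illustration of that variant, the list below stays style-free and the style is passed at output time. The references attribute and the placeholder markup are assumptions; a real to_xhtml would apply the CSL rules selected by the style argument:

```python
class ReferenceList:
    def __init__(self):
        self.references = []  # style-free: only the reference data lives here

    def to_xhtml(self, style):
        # Placeholder output; 'style' would select the CSL rules to apply.
        items = "".join("<li>%s</li>" % r["title"] for r in self.references)
        return '<ol class="%s">%s</ol>' % (style, items)


refs = ReferenceList()
refs.references.append({"title": "A Title"})
print(refs.to_xhtml(style="apa"))
```

The same list can then be rendered under several styles without being rebuilt.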

Peter Sefton
Technical Manager, RUBRIC, University of Southern Queensland

@Peter_Sefton
p: +61 (0)7 4631 1640
m: +61 (0)410 326 955

-----Original Message-----
From: Bruce D’Arcus [mailto:@Bruce_D_Arcus1]
Sent: Sun 2006-07-23 5:49 AM
To: Simon Kornblith; Edward Summers; Peter Sefton
Cc: Matthias Steffens; development discussion for xbiblio; Johan Kool
Subject: ok, need help

Hi Peter,

Did you see my e-mail to xbiblio-devel about this?

How about you have various Formatter classes that are
responsible for rendering a ReferenceList (or Reference) using a
particular style?

references = ReferenceList()

# references added here

formatter = APAFormatter()
print(formatter.format(references))

Sounds alright, except that I don’t want to write a class for every
style out there. So Formatter will be a more generic class. So it
would be like this:

formatter = Formatter(style="APA")  # or:
formatter = Formatter(stylefile="apa.csl")
print(formatter.format(references))

This has the benefit that it keeps the Reference class rather dumb,
just storing the data, and that the logic of the formatting
implementation is kept separate in the Formatter class (or whatever it
will be named).

I am copying Ron Ward, who is the developer who will be working on our bibliographic support for OpenOffice.org and Word.

What language is going to be used for that? Bruce mentioned something
about OOo being able to use a Python bridge. Is what I have written so
far useful for this? I’ve restructured the code quite a bit since your
last mail btw.

Is Ron Ward subscribed to the xbiblio-devel list already btw? That
might be very useful.

Cheers,

Johan

Hmm … but even basic things like how the ReferenceList is sorted and
subsequently processed are determined by the style. It’s not possible
to say that it’s just an attribute of the output; it’s fundamental.

Bruce

Just to explain the difficulty here, let’s take an author-year style
like APA. And let’s say you have two references from the Economist,
both from the same year (1999), and neither of which has authors
listed.

Your bibliography output is:

Economist (1999a) …
Economist (1999b) …

Your citation in the document is then …

(Economist, 1999a).

… because the citation is just like a plain-text link.

The style says how to sort (author-date), and how to substitute when
there is, for example, no author. That substitution logic varies
depending on your reference type.

Output formatting for both citations and bibliographies is thus
dependent on the style.
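
The Economist case can be sketched in a few lines. This is an illustration under assumptions, not how any citeproc does it: the dict keys (id, type, periodical), the substitution rule in sort_name and the label format are invented for the example.

```python
from collections import defaultdict
from string import ascii_lowercase


def sort_name(ref):
    # Substitution rule (assumed for this sketch): with no author, an
    # article falls back to its periodical title, anything else to its title.
    if ref.get("author"):
        return ref["author"]
    if ref.get("type") == "article":
        return ref["periodical"]
    return ref["title"]


def assign_year_suffixes(refs):
    """Sort author-date and return {ref id: 'Name (1999a)'-style label}."""
    ordered = sorted(refs, key=lambda r: (sort_name(r), r["year"], r["title"]))
    groups = defaultdict(list)
    for ref in ordered:
        groups[(sort_name(ref), ref["year"])].append(ref)
    labels = {}
    for (name, year), group in groups.items():
        for i, ref in enumerate(group):
            # Only add a letter when two entries would otherwise collide.
            suffix = ascii_lowercase[i] if len(group) > 1 else ""
            labels[ref["id"]] = "%s (%s%s)" % (name, year, suffix)
    return labels


refs = [
    {"id": "e2", "type": "article", "periodical": "Economist",
     "year": 1999, "title": "Second piece"},
    {"id": "e1", "type": "article", "periodical": "Economist",
     "year": 1999, "title": "First piece"},
]
labels = assign_year_suffixes(refs)
print(labels["e1"], "/", labels["e2"])
```

The suffix for one entry depends on the whole sorted list, so an in-text citation cannot be rendered from a single Reference in isolation.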

The only way I see to abstract the two (formatting and sorting
basically) is to have a fixed list of sort-algorithms that also cover
the substitution logic (if book then use title, if article then use
periodical title, etc.).

Then in CSL you could just do:

...

… and if there were variants, allow “author-date-apa.”

That would then allow formatting to be decoupled from the reference
list, and would simplify CSL. But it would also bring with it
limitations (namely that you lose the ability to configure
sorting/substitution, beyond choosing from a list of algorithms).
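
A hedged sketch of that trade-off: a fixed registry of named sort algorithms, each with its substitution logic built in, from which a style could only choose by name. The registry, the key functions and the dict keys are all hypothetical.

```python
def author_date_key(ref):
    # Built-in substitution: article -> periodical title, otherwise -> title.
    name = ref.get("author")
    if not name:
        name = ref["periodical"] if ref.get("type") == "article" else ref["title"]
    return (name.lower(), ref["year"])


def title_key(ref):
    return (ref["title"].lower(),)


# Hypothetical fixed list; a style could only pick one of these by name.
SORT_ALGORITHMS = {
    "author-date": author_date_key,
    "title": title_key,
}


def sort_references(refs, algorithm):
    try:
        key = SORT_ALGORITHMS[algorithm]
    except KeyError:
        raise ValueError("unknown sort algorithm: %r" % (algorithm,))
    return sorted(refs, key=key)


refs = [
    {"type": "book", "title": "Zebra Atlas", "year": 2001},
    {"type": "article", "periodical": "Economist", "title": "A piece", "year": 1999},
    {"type": "book", "author": "Doe", "title": "My Book", "year": 2000},
]
print([r["title"] for r in sort_references(refs, "author-date")])
```

Anything a style needs beyond these named entries would require a new algorithm in the processor itself, which is the limitation described above.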

Bruce