objects and processing

Quick followup:

This is the internal XML representation in citeproc-xsl, where you
can see the parameters I’m adding to properly process the output:

<biblio:Reference rdf:about="urn:isbn:1844670058"
                  cp:shorten-author="false"
                  cp:refclass="monograph"
                  cp:reftype="book"
                  cp:use-reftype="book"
                  cp:sort-on="Butler,Judith"
                  cp:biblist-position="1">

I would expect Reference objects to be valuable, if for nothing else
than to be able to easily store and look up those parameters during
processing.

But then you can also do stuff like write different output methods:
to_xhtml, to_markdown, to_ris, etc.
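
A minimal sketch of what such a Reference object might look like in
Python. The class layout, the params dictionary, and the output formatting
are assumptions made for illustration; only the parameter names (taken from
the cp:* attributes above) and the method names to_xhtml and to_markdown
come from this thread. The title is a placeholder.

```python
from dataclasses import dataclass, field
from html import escape


@dataclass
class Reference:
    """Hypothetical holder for bibliographic data plus processing parameters."""

    about: str
    title: str
    authors: list
    # Processing parameters, keyed like the cp:* attributes in the XML above.
    params: dict = field(default_factory=dict)

    def to_markdown(self):
        # Illustrative formatting only, not a real citation style.
        return f"{'; '.join(self.authors)}. *{self.title}*."

    def to_xhtml(self):
        css_class = escape(self.params.get("reftype", ""))
        names = escape("; ".join(self.authors))
        title = escape(self.title)
        return f'<p class="{css_class}">{names}. <em>{title}</em>.</p>'


ref = Reference(
    about="urn:isbn:1844670058",
    title="Placeholder Title",  # placeholder, not the real title
    authors=["Butler, Judith"],
    params={
        "shorten-author": False,
        "refclass": "monograph",
        "reftype": "book",
        "use-reftype": "book",
        "sort-on": "Butler,Judith",
        "biblist-position": 1,
    },
)

print(ref.to_markdown())
print(ref.to_xhtml())
```

The processing parameters travel with the data, so each output method can
consult them without going back to the XML.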

Bruce

Originally, I was just assuming there’d be a data set and a CSL file for
each test, and I’d have to format the data by hand, although this approach
doesn’t scale very well if we decide we need more than 5-10 test cases.

Same here. Plus, I don’t quite consider the scaling very problematic.

Put differently, why would you not create objects?

Well, to be exact, I do actually use objects, only these are not my
own custom objects, but of class Element (from xml.dom.minidom). The
reason I chose this was that RDF and MODS in particular store stuff in a
hierarchical manner (you know this right, Bruce? ;-)). I would have to
invent a way to do that in a custom object too in order not to lose
data. I didn’t quite feel like inventing an object that could combine
RDF and MODS because I felt it was quite hard to do.

Right now my code simply fetches the data by calling a method: “For
this reference, I need to know who the author is (give it to me
according to these options).” That method is dataForMatch(self, match,
keyElement, useEtAlStyle, biblioref=None), which I will probably rename
to dataForRDFMatch.

This function is pretty straightforward, and if I want to support
another storage format, I only need to change it (perhaps name the
method something like dataForModsMatch).
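
A rough sketch of that idea, with one accessor per storage format behind a
single entry point. This is not the actual dataForMatch: the simplified
signatures, the Dublin Core vocabulary, and the sample data are all
assumptions made for illustration. Only xml.dom.minidom and its Element
class come from the description above.

```python
from xml.dom import minidom

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
DC_NS = "http://purl.org/dc/elements/1.1/"  # assumed vocabulary

SAMPLE_RDF = """<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:dc="http://purl.org/dc/elements/1.1/">
  <rdf:Description rdf:about="urn:example:1">
    <dc:creator>Doe, Jane</dc:creator>
    <dc:creator>Roe, Richard</dc:creator>
    <dc:title>Placeholder Title</dc:title>
  </rdf:Description>
</rdf:RDF>"""


def data_for_rdf_match(reference, key):
    """Return the text of every dc:<key> element below a minidom Element."""
    values = []
    for node in reference.getElementsByTagNameNS(DC_NS, key):
        text = "".join(
            child.data for child in node.childNodes
            if child.nodeType == child.TEXT_NODE
        )
        values.append(text.strip())
    return values


# Supporting another format means adding one accessor here,
# e.g. a hypothetical data_for_mods_match.
ACCESSORS = {"rdf": data_for_rdf_match}


def data_for_match(reference, key, source_format="rdf"):
    """Single entry point; the formatting code never touches the XML itself."""
    return ACCESSORS[source_format](reference, key)


doc = minidom.parseString(SAMPLE_RDF)
reference = doc.getElementsByTagNameNS(RDF_NS, "Description")[0]

print(data_for_match(reference, "creator"))
print(data_for_match(reference, "title"))
```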

OK, but you are assuming files then. What happens if you change your
mind and want to store the data in an SQL database? Or the XML source
changes? Isn’t it easier to just change some input code and keep the
internal stuff constant?

Of course I could write a dataForSQLMatch that queries data whenever
it is asked for (with some caching obviously). My input code is in the
dataFor…Match function.
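
As a sketch only, a dataForSQLMatch along those lines could look like the
following. The table layout, the class name, and the cache strategy are
invented for the example; sqlite3 stands in for whatever database would
actually be used.

```python
import sqlite3


class SQLSource:
    """Hypothetical SQL-backed data source with a simple in-memory cache."""

    def __init__(self, connection):
        self.connection = connection
        self._cache = {}
        self.queries = 0  # counts real database hits, for demonstration

    def data_for_sql_match(self, ref_id, key):
        cache_key = (ref_id, key)
        if cache_key not in self._cache:
            self.queries += 1
            rows = self.connection.execute(
                "SELECT value FROM fields "
                "WHERE ref_id = ? AND key = ? ORDER BY position",
                (ref_id, key),
            ).fetchall()
            self._cache[cache_key] = [row[0] for row in rows]
        return self._cache[cache_key]


connection = sqlite3.connect(":memory:")
connection.execute(
    "CREATE TABLE fields (ref_id TEXT, key TEXT, position INTEGER, value TEXT)"
)
connection.executemany(
    "INSERT INTO fields VALUES (?, ?, ?, ?)",
    [
        ("ref1", "author", 1, "Doe, Jane"),
        ("ref1", "author", 2, "Roe, Richard"),
        ("ref1", "title", 1, "Placeholder Title"),
    ],
)

source = SQLSource(connection)
authors = source.data_for_sql_match("ref1", "author")
authors_again = source.data_for_sql_match("ref1", "author")  # from cache
print(authors, source.queries)
```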

That’s a possibility. My only concern is this: what are you testing
then, the citation processing or the input code?

It’s usually not very hard to see where it goes wrong. Plus, shouldn’t
both be tested anyway?

Johan

Put differently, why would you not create objects?

Well, to be exact, I do actually use objects, only these are not my
own custom objects, but of class Element (from xml.dom.minidom).

So your objects are about the XML, rather than the data.

With XSLT, I’m stuck with doing that pretty much, but it seems like a
bad approach when you have a more powerful language like Python to work
with.

The reason I chose this was that RDF and MODS in particular store stuff
in a hierarchical manner (you know this right, Bruce? ;-)). I would
have to invent a way to do that in a custom object too in order not to
lose data. I didn’t quite feel like inventing an object that could
combine RDF and MODS because I felt it was quite hard to do.

Not really. Think more of the CSL model, and of a dictionary that has
something like the following keys:

type
author*
editor*
translator*
publisher*
container*
collection*
title
short_title
volume
issue
pages

The ones with * are then arrays.

It’s pretty easy to map different formats to that, and then to hook up
to CSL. E.g.:

<title type="container"/>  =  ref.data["container"]["title"]
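
A small sketch of that dictionary model in Python. The helper names and the
sample values are made up for illustration. Because the starred keys are
arrays, this sketch reads the first container entry, which differs slightly
from the ref.data["container"]["title"] shorthand above; that choice is an
assumption.

```python
# Keys marked with * in the list above hold arrays.
ARRAY_KEYS = ("author", "editor", "translator",
              "publisher", "container", "collection")
SCALAR_KEYS = ("title", "short_title", "volume", "issue", "pages")


def new_reference_data(ref_type):
    """Build an empty data dictionary with every key present."""
    data = {"type": ref_type}
    for key in ARRAY_KEYS:
        data[key] = []
    for key in SCALAR_KEYS:
        data[key] = None
    return data


def resolve_title(data, title_type=None):
    """Resolve <title/> or <title type="..."/> against the dictionary."""
    if title_type is None:
        return data["title"]
    related = data[title_type]
    # Assumption: the first related item is the one that is wanted.
    return related[0]["title"] if related else None


data = new_reference_data("article")
data["title"] = "Placeholder Article"
data["author"].append("Doe, Jane")
data["container"].append({"title": "Placeholder Journal"})

print(resolve_title(data))
print(resolve_title(data, "container"))
```

Any input format (RDF, MODS, SQL) then only has to fill in this
dictionary, and the CSL side only has to read from it.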

Bruce