API

I was just modifying the Python test for the reference list a bit, and
wondering again about an API.

What I’m imagining is a developer of some unknown application who:

imports the module/library
creates a ReferenceList object
loads it with data
uses a simple method to get back formatted output for citations and/or bibliographies

If ReferenceList stores the CSL object, then you do:

ReferenceList.new("apa")

… and something like:

ReferenceList.bibliography

If you need output for a particular bibliographic entry, it might be:

r = ReferenceList.item.find("urn:isbn:345465667")
print r.reference

If you need to format a particular citation, maybe it’s:

r = ReferenceList.item.find("urn:isbn:345465667")
print r.citation

If you change styles in a running process, I guess you need a method to
destroy and recreate the object.
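
To make the proposal above concrete, here is a minimal Python sketch. Everything in it is an assumption made for illustration: the `Item` class, the `add` and `set_style` methods, and the hard-coded formatting strings are hypothetical, and the style name is only stored, not interpreted. It also collapses `ReferenceList.item.find(...)` into a plain `find(...)` method.

```python
class Item:
    """One bibliographic record (hypothetical shape)."""

    def __init__(self, uri, author, year, title):
        self.uri = uri
        self.author = author
        self.year = year
        self.title = title

    @property
    def citation(self):
        # Placeholder output; a real processor would apply the CSL style here.
        return "(%s, %s)" % (self.author, self.year)

    @property
    def reference(self):
        # Placeholder output for a single bibliography entry.
        return "%s (%s). %s." % (self.author, self.year, self.title)


class ReferenceList:
    """Holds the items and the style they are formatted with."""

    def __init__(self, style):
        self.style = style  # stands in for a loaded CSL object
        self._items = {}

    def add(self, uri, author, year, title):
        self._items[uri] = Item(uri, author, year, title)

    def find(self, uri):
        return self._items[uri]

    def set_style(self, style):
        # One alternative to destroying and recreating the whole object.
        self.style = style

    @property
    def bibliography(self):
        entries = sorted(self._items.values(), key=lambda i: (i.author, i.year))
        return "\n".join(item.reference for item in entries)
```

With this shape, `ReferenceList("apa")` followed by `add(...)` calls gives access to `refs.bibliography`, and `refs.find("urn:isbn:345465667").citation` returns the text for one citation.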

In any case, just wondering what people think about the basics of a
cross-language API. What are the most critical exposed classes, and
their most essential methods?

I think we should try to answer these questions and adjust the unit
tests as needed.

Bruce

Hi Bruce,

Thanks for the links about Python and testing. I’ll have a look at
them this week. I got sidetracked a bit, working on a somewhat related
project, but I hope to put some more work into CiteProc-py
this week.

About your API proposal: it doesn’t make much sense to me yet. So far
I have stored my references in a simple list, not an object. An object
might be handy for some purposes, especially for methods that search
through and sort the list.

I was thinking more of (and implementing) an object that does the actual
formatting (see citation_style.py). The references list is just a
“dumb” list. The actual “intelligent” stuff happens in citation_style.
My main part (see citeproc.py) handles file input, obtains the needed
citations/references, and outputs them again
through the various drivers.

So:

    CitationStyle.new("apa")
    CitationStyle.bibliography

Or actually it’s named at the moment:

    CitationStyle(file="apa.csl")
    CitationStyle.textForBibliography
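
As a rough illustration of this split (a formatter object that holds the intelligence, applied to a plain list), here is a minimal Python sketch. It is not the code in citation_style.py: the dictionary keys, the template string, and the sort order are assumptions, and the `file` argument is only stored, never read or parsed.

```python
class CitationStyle:
    """Formatter object; the reference list itself stays a plain list."""

    def __init__(self, file):
        self.file = file  # path to a CSL file; not opened in this sketch
        # Hypothetical stand-in for the rules a parsed CSL file would supply.
        self.template = "{author} ({year}). {title}."

    def textForBibliography(self, references):
        # Sort and format here, so the list needs no methods of its own.
        ordered = sorted(references, key=lambda r: (r["author"], r["year"]))
        return [self.template.format(**r) for r in ordered]


# The "dumb" list: plain dictionaries, no behaviour attached.
references = [
    {"author": "Smith", "year": 2001, "title": "Second Title"},
    {"author": "Doe", "year": 2006, "title": "First Title"},
]
style = CitationStyle(file="apa.csl")
entries = style.textForBibliography(references)
```

Switching styles in this design only means constructing another `CitationStyle`; the list is untouched.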

You really want to synchronize the API across the different
implementations? Perhaps we should, but I am not yet convinced. :-)

Btw, I am thinking of switching to lxml, which is ElementTree
compatible but has XPath support and RelaxNG validation, both of which
I think are very handy. I still have to figure out how we can
distribute the final code for CiteProc-py without requiring the user
to install lxml first. That isn’t something the average
computer user can do. But if I am not mistaken it should be possible
to wrap it all up into easy installer packages, at least for OS X and
Windows. The Linux people will likely know how to get lxml installed,
but it should be possible there too. I just don’t yet know exactly
how.

Any objections against using lxml?

Johan

About your API proposal: it doesn’t make much sense to me yet. So far
I have stored my references in a simple list, not an object. An object
might be handy for some purposes, especially for methods that search
through and sort the list.

I was thinking more of (and implementing) an object that does the actual
formatting (see citation_style.py). The references list is just a
“dumb” list. The actual “intelligent” stuff happens in citation_style.

OK, but I think we still need to think about the API.

My main part (see citeproc.py) handles file input, obtains the needed
citations/references, and outputs them again
through the various drivers.

OK.

So:

    CitationStyle.new("apa")
    CitationStyle.bibliography

Or actually it’s named at the moment:

    CitationStyle(file="apa.csl")
    CitationStyle.textForBibliography

Peter was suggesting method calls like:

list.to_xhtml(style='apa')

… and Ed:

references = ReferenceList()
# references added here
formatter = APAFormatter()
print formatter.format(references)

Am not sure what I think, but in any case, it makes sense to me that
one formats a list of references; list.format, or format(list).
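
The two shapes, list.format and format(list), need not be exclusive. In the minimal sketch below the method simply delegates to the function, so both calls return the same text. The names `format_references` and `TEMPLATES`, the dictionary keys, and the template strings are hypothetical placeholders, not real style definitions.

```python
# Hypothetical per-style templates; real output would come from CSL files.
TEMPLATES = {
    "apa": "{author} ({year}). {title}.",
    "plain": "{author}, {title}, {year}",
}


def format_references(references, style="apa"):
    """Function form: format(list)."""
    template = TEMPLATES[style]
    return "\n".join(template.format(**ref) for ref in references)


class ReferenceList(list):
    """Method form: list.format. It delegates to the function above."""

    def format(self, style="apa"):
        return format_references(self, style)
```

Because `ReferenceList` subclasses `list`, code that already keeps references in a plain list can still call `format_references` on it directly.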

You really want to synchronize the API across the different
implementations? Perhaps we should, but I am not yet convinced. :-)

I think it’s enough to do a good one for the Python version and then
worry about others later.

Thinking about how people would actually use the code tends to result
in better design.

It might be worth thinking about three use scenarios:

  1. the simple script, commandline batch processing (a la BibTeX,
    Markdown, etc.)
  2. a web app (say Django-based bib app, PyULike, etc.)
  3. a desktop app (Word or OpenOffice integration)

I think it’s possible to design things in a way that makes it easy to
support any of them.
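
One way to serve all three scenarios is to keep a single core call that returns text and put thin wrappers around it. The sketch below is only an illustration under assumptions: the function names, the JSON input format, and the `--style` option are all hypothetical, and the formatting is a placeholder that ignores the style name.

```python
import argparse
import json
import sys


def render_bibliography(references, style="apa"):
    """Core call shared by every front end; returns plain text."""
    # Placeholder formatting: `style` is accepted but not interpreted here.
    entries = sorted(references, key=lambda r: (r["author"], r["year"]))
    return "\n".join(
        "%s (%s). %s." % (r["author"], r["year"], r["title"]) for r in entries
    )


def main(argv=None):
    """Scenario 1: batch filter, JSON references on stdin, text on stdout."""
    parser = argparse.ArgumentParser(description="Format references (sketch).")
    parser.add_argument("--style", default="apa")
    args = parser.parse_args(argv)
    print(render_bibliography(json.load(sys.stdin), args.style))
    return 0


def handle_request(body, style="apa"):
    """Scenario 2: a web view receives a JSON body and returns the text."""
    return render_bibliography(json.loads(body), style)


# Scenario 3: a desktop integration would call render_bibliography()
# directly with references it has collected from the open document.
```

A script would finish with `sys.exit(main())`; the web and desktop cases never touch `argparse` or standard input at all.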

Btw, I am thinking of switching to lxml, which is ElementTree
compatible but has XPath support and RelaxNG validation, both of which
I think are very handy.

Why would RNG validation be “handy” (here; obviously it’s valuable for
CSL stuff per se)?

I still have to figure out how we can distribute the final code
for CiteProc-py without requiring the user to install
lxml first. That isn’t something the average
computer user can do.

Right, and ElementTree is part of the standard library, isn’t it?

But if I am not mistaken it should be possible to wrap it all up into
easy installer packages, at least for OS X and Windows. The Linux
people will likely know how to get lxml installed, but it should be
possible there too. I just don’t yet know exactly
how.

Any objections against using lxml?

Ideally, you’d use standard libraries, but if there’s a good case for
using something else, then you probably should. I don’t know Python
well enough to comment much on the best choice.
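
A common middle ground is to treat lxml as optional: use it when it is installed and fall back to the standard-library ElementTree otherwise. The sketch below assumes the code sticks to the API subset both share. The helper names `style_title` and `validate` are hypothetical, and the namespace URI is the one I understand CSL styles to use.

```python
try:
    from lxml import etree  # optional third-party dependency
    HAVE_LXML = True
except ImportError:
    import xml.etree.ElementTree as etree  # standard library since Python 2.5
    HAVE_LXML = False

CSL_NS = "http://purl.org/net/xbiblio/csl"  # assumed CSL namespace


def style_title(csl_text):
    """Return the style's title using only calls both libraries provide."""
    root = etree.fromstring(csl_text)
    return root.findtext("{%s}info/{%s}title" % (CSL_NS, CSL_NS))


def validate(csl_doc, rng_doc):
    """Validate against a RELAX NG schema when lxml is present.

    Returns True or False with lxml, and None when validation is unavailable.
    """
    if not HAVE_LXML:
        return None
    return etree.RelaxNG(rng_doc).validate(csl_doc)
```

With this arrangement, validation becomes an extra check for people who have lxml, while basic style parsing works on a stock Python install.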

Bruce

BTW, going back to this:

But if I am not mistaken it should be possible
to wrap it all up into easy installer packages, at least for OS X and
Windows. The Linux people will likely know how to get lxml installed,
but it should be possible there too. I just don’t yet know exactly
how.

I think what you should be thinking of is not standalone installer
packages, but rather:

port install py-citeproc
fink install py-citeproc

… and so forth.

Those packaging systems are widely supported on tons of OSes (though
maybe not on Windows), and they manage dependencies for you (including
for Python itself).

So just include the lxml dependency if you need it, and no worries.

Note, though: I’m not sure if darwin ports has such a package?

py-libxml2   python/py-libxml2   2.6.16  Python bindings for libxml2
py-texml     python/py-texml     1.20    XML vocabulary for TeX
py-xml       python/py-xml       0.8.4   XML Tools for Python
py-xmldiff   python/py-xmldiff   0.6.6   diff for xml files as command line tool and python module
py-xmlsec    python/py-xmlsec    0.2.1   a set of Python bindings for the XML Security Library.
py-xmltramp  python/py-xmltramp  2.16    easy-to-use python API for XML documents

Granted, those packages typically just download the archive, and then
run the “python setup.py” command (maybe with options), so having a
working setup.py would be the first step.

Bruce