testing

Simon, you mentioned testing, and I was mentioning it to Johan. Is
there a JS unit testing framework that’s decent that you use?

I got to thinking, as I was working on a Python test, that if we used
YAML/JSON (i.e. YAML that is also valid JSON) we could share the same
test data across implementations. It may not be necessary, though (?).
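
A rough sketch of the shared-data idea, assuming the fixture is written as plain JSON (which a YAML parser should also accept, since JSON is very nearly a subset of YAML). The fixture contents and the `load_fixture` name are only illustrative; the field names are the ones used in the test below.

```python
import json

# Hypothetical shared fixture: plain JSON, so the same text could be read
# by a JSON parser in JavaScript and by a JSON or YAML parser in Python.
FIXTURE = """
[
  {"type": "Article", "title": "Some Title", "year": "1999",
   "author": [{"given_name": "Jane", "family_name": "Doe"}]},
  {"type": "NewsArticle", "title": "News Title", "year": "2002",
   "periodical": {"title": "Newsweek"}}
]
"""


def load_fixture(text):
    """Parse the shared test data into plain lists and dictionaries."""
    return json.loads(text)


references = load_fixture(FIXTURE)
print(references[0]["author"][0]["family_name"])
```

Each implementation would then only need its own thin loader; the expected results could live in the same file.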

Here, BTW, is a good test for whether you can get correct grouping and
sorting, and substitution, for author-year citations.

#!/usr/bin/env python

# ReferenceList is the proposed interface; it does not exist yet.
from citeproc import ReferenceList

class TestReferenceList:

    # pytest-style setup hook (assumed test runner); runs before each test
    def setup_method(self):
        data = [
          {
            "type": "Article",
            "title": "Some Title",
            "year": "1999",
            "author": [{"given_name":"Jane", "family_name":"Doe"}]
          },
          {
            "type": "NewsArticle",
            "title": "News Title",
            "year": "2002",
            "periodical": {"title":"Newsweek"}
          },
          {
            "type": "Article",
            "title": "XYZ",
            "year": "1999",
            "author": [
                        {"given_name":"Jane", "family_name":"Doe"},
                        {"given_name":"Susan", "family_name":"Smith"}
                      ]
          },
          {
            "type": "NewsArticle",
            "title": "Second News Title",
            "year": "2002",
            "periodical": {"title":"Newsweek"}
          },
          {
            "type": "Article",
            "title": "Another Title",
            "year": "1999",
            "author": [{"given_name":"Jane", "family_name":"Doe"}]
          }
        ]
        # avoid shadowing the built-in name "list"
        self.references = ReferenceList(style="apa", data=data)

    # first test basic author-year grouping and sorting
    def test_reference_grouping_sorting(self):
        # use itertools groupby?
        assert self.references[1].suffix == "b"
        assert self.references[2].suffix is None

    # test that substitution works properly where there is no author
    def test_reference_grouping_sorting_substitution(self):
        assert self.references[3].suffix == "a"
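
To make the expected results concrete, here is a minimal sketch of how the grouping, sorting, and substitution could work with `itertools.groupby`. It is not the CiteProc implementation: `sort_name` and `assign_suffixes` are hypothetical helpers, the sort order (name, year, title) is an assumption about the style, and the periodical title stands in for a missing author.

```python
from itertools import groupby
from string import ascii_lowercase

# Same references as in the test, trimmed to the fields used here.
DATA = [
    {"title": "Some Title", "year": "1999",
     "author": [{"family_name": "Doe"}]},
    {"title": "News Title", "year": "2002",
     "periodical": {"title": "Newsweek"}},
    {"title": "XYZ", "year": "1999",
     "author": [{"family_name": "Doe"}, {"family_name": "Smith"}]},
    {"title": "Second News Title", "year": "2002",
     "periodical": {"title": "Newsweek"}},
    {"title": "Another Title", "year": "1999",
     "author": [{"family_name": "Doe"}]},
]


def sort_name(ref):
    """Author family names, or the periodical title when there is no author."""
    authors = ref.get("author")
    if authors:
        return " ".join(a["family_name"] for a in authors)
    return ref.get("periodical", {}).get("title", "")


def group_key(ref):
    """References collide when they share both name and year."""
    return (sort_name(ref), ref["year"])


def assign_suffixes(data):
    """Sort by name, year, title; letter references that share name and year."""
    ordered = sorted(data, key=lambda ref: group_key(ref) + (ref["title"],))
    result = []
    for _, group in groupby(ordered, key=group_key):
        group = list(group)
        for i, ref in enumerate(group):
            # Only lettered when the group has more than one member;
            # this sketch supports at most 26 collisions per group.
            suffix = ascii_lowercase[i] if len(group) > 1 else None
            result.append((ref["title"], suffix))
    return result


suffixes = assign_suffixes(DATA)
for title, suffix in suffixes:
    print(title, suffix)
```

Under these assumptions, positions 1, 2, and 3 of the sorted list carry the suffixes "b", none, and "a", which is what the test above asserts.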

Hello all,

About the testing… I am not yet sure whether we should go the
way described. Don’t get me wrong, I do understand the need to
write tests and to use them to make sure everything works as expected.
However, the tests that Bruce has shown me so far all imply that
objects are created in the code to represent the data; in the example
given in this thread, a reference.

I am not sure whether this is right. The code that I have
written so far doesn’t actually do that; instead I use
functions/methods (whichever those are called in Python) that fetch
the desired data from the files. I therefore only fetch data from the
files at the moment I need it. Is it really necessary to create
objects from the references, the CSL style, etc. for CiteProc? It might
be handy when writing a GUI editor or such, but CiteProc as I see it
is more of a script that runs, does its thing and exits. It seems much
simpler to implement it in this script-like style than to go and
objectify everything.

Anyway… dinner is ready now… I’ll get back to this later. I also
still need to catch up on reading some mails due to the server
backlog.

Cheers,

Johan

Originally, I was just assuming there’d be a data set and a CSL file for
each test, and I’d have to format the data by hand, although this approach
doesn’t scale very well if we decide we need more than 5-10 test cases.

If we used either Biblio/RDF or MODS as the test data format, that would
work fine for my purposes. Anyone writing a CSL-capable application that
didn’t support the data format could always convert it him/herself. We could
even supply the data in multiple formats.

Simon

> About the testing… I am not yet sure whether we should go the
> way described. Don’t get me wrong, I do understand the need to
> write tests and to use them to make sure everything works as expected.
> However, the tests that Bruce has shown me so far all imply that
> objects are created in the code to represent the data; in the example
> given in this thread, a reference.

Yes, because I think when you start getting into the details you’ll
find it quite hard to avoid. I think you need, for example, to attach
methods to manipulate the references.

Note, I am not assuming anything in particular about how the data is
stored internally, but it seems to me the object can be quite simple,
where the data itself is just a dictionary.

That way you can also do things like attach parameters to the objects
during processing. In the XSLT, I do something like this (though it’s
more awkward in XSLT) in order to do substitution, adding of the year
suffixes, etc. I think you need this because, for example, to process a
citation you need to look this information up: citation formatting
depends on the processing of the reference list.

At its simplest, then, you have an object with two primary dictionary
properties: data and parameters.
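
As a minimal sketch of what I mean (the class name, the `year_suffix` key, and `cite_year` are all just placeholders, not an agreed interface):

```python
class Reference:
    """Hypothetical wrapper: source data plus parameters set during processing."""

    def __init__(self, data):
        self.data = data        # fields as read from the input, left untouched
        self.parameters = {}    # values computed while processing the list


def cite_year(ref):
    """Year as it would appear in a citation, with any suffix attached."""
    return ref.data["year"] + ref.parameters.get("year_suffix", "")


ref = Reference({"type": "Article", "title": "Some Title", "year": "1999"})
plain = cite_year(ref)

# Set while processing the reference list, then looked up when
# formatting a citation to the same reference.
ref.parameters["year_suffix"] = "b"
suffixed = cite_year(ref)

print(plain, suffixed)
```

The input data is never modified; everything derived during processing lives in `parameters`.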

Put differently, why would you not create objects?

> I am not sure whether this is right. The code that I have
> written so far doesn’t actually do that; instead I use
> functions/methods (whichever those are called in Python) that fetch
> the desired data from the files. I therefore only fetch data from the
> files at the moment I need it.

OK, but you are assuming files then. What happens if you change your
mind and want to store the data in an SQL database? Or the XML source
changes? Isn’t it easier to just change some input code and keep the
internal stuff constant?
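
A rough illustration, assuming two hypothetical loaders (`load_from_json`, `load_from_sql`) and a stand-in processing function; the table layout is made up for the example. The point is only that the processing code sees the same dictionaries either way.

```python
import json
import sqlite3


def load_from_json(text):
    """Input code for file-like sources: returns a list of dictionaries."""
    return json.loads(text)


def load_from_sql(connection):
    """Input code for a database: returns the same shape of data."""
    rows = connection.execute(
        "SELECT type, title, year FROM refs ORDER BY id")
    return [{"type": t, "title": title, "year": year}
            for t, title, year in rows]


def titles_by_year(references):
    """Stand-in for the internal processing; it only ever sees dictionaries."""
    return sorted((ref["year"], ref["title"]) for ref in references)


json_refs = load_from_json(
    '[{"type": "Article", "title": "Some Title", "year": "1999"}]')

connection = sqlite3.connect(":memory:")
connection.execute(
    "CREATE TABLE refs (id INTEGER PRIMARY KEY, type TEXT, title TEXT, year TEXT)")
connection.execute(
    "INSERT INTO refs (type, title, year) VALUES (?, ?, ?)",
    ("Article", "Some Title", "1999"))
sql_refs = load_from_sql(connection)

print(titles_by_year(json_refs) == titles_by_year(sql_refs))
```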

> Is it really necessary to create objects from the references, the
> CSL style, etc. for CiteProc? It might be handy when writing a GUI
> editor or such, but CiteProc as I see it
> is more of a script that runs, does its thing and exits. It seems much
> simpler to implement it in this script-like style than to go and
> objectify everything.

My concern most fundamentally is the interface. Citation processing
gets complicated very quickly.

Bruce

That’s a possibility. My only concern is this: what are you
testing then, the citation processing or the input code?

Bruce

Hmm. I guess it barely matters whether we go with the object format or a
different data format. I need to convert whatever format we use into the
Scholar object format anyway, and once that’s done, I can apply any of the
Scholar export translators to provide it in an alternative (XML-based)
format.

Simon

OK, just asking ;-)

Bruce