If I’m understanding what you’re describing,
chrome://zotero/content/tools/csledit.xul and/or
chrome://zotero/content/tools/cslpreview.xul already do most of this. It
would only take a few minutes to change those to wrap the output in XML
instead.

  • Dan

Hi All,

Yes, but:

a) reformatting the fragments to be valid JSON would make them much
harder to read visually (and so harder to “figure out what a test
actually does”), and …

With this route, there would be two top-level hierarchies, with
identical content in different forms, say ./humans and ./machines.
The ./humans area would be easy for us to handle, the ./machines area
would be easy for our silicon friends.

Something like this seems reasonable to me, if there is a good way to
write the tests human-readably first and then automatically process
them to be easily machine-usable (which, as Frank suggests, I think is
pretty reasonable - it’s essentially what I’ve been trying to do at
present, except in an official, controlled, and less ad hoc way).

From my perspective, a) I don’t want to have to parse a custom format,
especially one un- or lightly documented, if a standard format with
widely available parsers will do (I don’t much care which one - JSON,
XML, YAML, whatever) and b) if possible I want to get something that
can be fed directly into the machinery, or at least require minimal
massaging before doing so (e.g. either valid CSL or the knowledge of
what I can expect to need to add to get valid CSL). And I think this
would be useful for anyone who decides they want to take a shot at
implementing this.

On the other hand, I certainly don’t want to create a burden for people
like Frank who are kindly making such tests available for everyone,
which I greatly appreciate.

Howard

Yes; I agree on the goals.

I also think we ought, following Frank’s note earlier, to separate UI
issues (how to create tests) from test processing issues.

I still think XML is likely to work the best in this context for the
latter, and that the former can be solved in a number of ways, the
simplest being a little script (e.g. rather than writing a script to
create a normalized JSON, do it for XML instead).

I have maybe a crazy idea that I’ll just throw out: what about a
special “test” variant of a CSL style, where the style includes the
expected output and input data in the cs:info header? That still
doesn’t solve the input issue above, but it does mean a single,
self-contained test (or test setup). It also means we can validate
all of it, including the resulting output.

But this is just a thought; am certainly open to other ideas.

Bruce

Actually …

I have maybe a crazy idea that I’ll just throw out: what about a
special “test” variant of a CSL style, where the style includes the
expected output and input data in the cs:info header?

This probably IS a stupid idea :wink:

It’s easy enough to tweak a schema to validate CSL embedded in any
XML, so the earlier example I posted (or something similar) has the
same advantages, but is simpler.

Bruce

I suppose the fixtures could be rewritten as XML. E.g.:

blah, blah, blah [...] [... csl fragment ...] [... expected result ...]

Here’s a thought. Casting the tests in XML would be clean, I just
worry that the amount of typing involved might discourage people from
writing them. It needs a UI – and we have one, in Zotero. If there
were a plugin to allow a developer to select a set of items from his
local database, select a style, and select whether they should be
rendered as bibliography or citations, and (possibly) enter a notional
output string, the plugin could dump out XML that’s ready to go.

I wouldn’t be up to writing that myself, but it would be really nice
to have. If there is a prospect of that emerging in the medium term,
the minor pain of refactoring the tests in XML by hand would
definitely be worth it. You could validate the bejeebers out of a
processor with very little effort.
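
As a rough illustration of what such a plugin might dump out, here is a minimal Python sketch that wraps the input items, a CSL fragment and the expected result in one XML document. Nothing in this thread fixes a schema, so the element names (`test`, `input`, `csl`, `result`), the `mode` attribute, the `build_fixture` helper and the sample values are all hypothetical.

```python
import json
import xml.etree.ElementTree as ET


def build_fixture(name, mode, items, csl_fragment, expected):
    """Wrap one test case in a single XML element (hypothetical layout)."""
    root = ET.Element("test", {"name": name, "mode": mode})
    # Input items are kept as a JSON string here purely for brevity.
    ET.SubElement(root, "input").text = json.dumps(items)
    # The CSL fragment is stored as escaped text, not as nested elements.
    ET.SubElement(root, "csl").text = csl_fragment
    ET.SubElement(root, "result").text = expected
    return ET.tostring(root, encoding="unicode")


# Sample values are made up for illustration only.
fixture_xml = build_fixture(
    "name_Example",
    "citation",
    [{"id": "item-1", "type": "book"}],
    '<text variable="title"/>',
    "Some Title",
)
print(fixture_xml)
```

Storing the CSL fragment as escaped text keeps the sketch short; a real fixture format would more likely embed it as child elements so that the whole file can be validated against a schema.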

If I’m understanding what you’re describing,
chrome://zotero/content/tools/csledit.xul and/or
chrome://zotero/content/tools/cslpreview.xul already do most of this. It
would only take a few minutes to change those to wrap the output in XML
instead.

That’s the idea! What I have in my mind’s eye is a round-trip setup
in which non-programmers can file complaints about formatting issues
by submitting test data themselves. My description above was
complicated (too close to the trees). More simply, the user would
call up something like csledit.xul, and the tool would offer the
option of entering prefix, suffix and locator for each of multiple
selected items (i.e. same as in a word processor plugin – that would
capture issues related to joins and sorting). They hit preview to get
a rendered version of the citation and bibliography in a “CSL
formatting request” form. They enter some text to describe the issue,
hit submit, and the test is filed.

There would be some work in setting it up, but a workflow like that
might lighten things quite a bit going forward – when you get to
hierarchical relations, it’s easy to imagine a flood of queries and
complaints about the new functionality and its relation (no pun
intended) to CSL processing. Submitted tests could be displayed
online for comment and reference. Communicating a formatting issue as
a verbal description is cumbersome, and seems to generate significant
traffic (and occasional rancour) on the forum. A tool for direct
submission of test reports would save developers a lot of hassle, it
would provide unambiguous illustrations of issues for CSL development
discussion, and it would be a handy tool for quickly building a
repository of usage notes. Something for everyone.

That’s the sales pitch, anyway. (I’m not sure whether it would be
better implemented in Zotero itself, or as a page on the web, drawing
from the user’s data store on the server. When the processor is
ready, you could easily go either way, I guess.)

Frank

I think the trick would be just to distinguish issues related to style
design and style bugs, on the one hand, from those that are truly
implementation bugs, on the other. The former ought to go on a nice web
repo app, where each style gets a page with comments. The latter, I’m
not sure, but it might extend the existing reports feature?

So on the test fixture stuff, I’m not hearing any consensus, so I will
just recommend, Frank, that you give your approach a try. My
understanding is that you’re going to write a script to transform the
stuff into proper JSON. Once that’s ready, both Howard and I* can try
to see if that will work for us.

If we have problems, it ought to be easy enough to tweak the script to
output something different.

Bruce

* I am SLOWLY working on the Python version, but now would be a good
  time to put the test infrastructure in place.

Will do. Should have this ready later in the day.

Frank

The test objects are now ready in the rough. I haven’t run them
through citeproc-js yet and I haven’t done any validity checking on
the files, but they should be close. The file README.txt in the ./std
directory gives the rundown.

The one feature of this that might give rise to comment is the way
names are written. I have expressed them in a parseable plain text
syntax that works for everything I’ve seen, and shouldn’t be difficult
to parse into an internal representation. This isn’t any sort of
standard, but I’d like to stick with this, or something similar, if
possible, rather than getting into fixing a JSON structure on names.
Applications may need to remap the sub-elements anyway, and this form
is easy for people to read and write.

Anyway, hope you enjoy them!

Frank

On Sun, Apr 5, 2009 at 1:40 AM, Frank Bennett <@Frank_Bennett> wrote:

The test objects are now ready in the rough. I haven’t run them
through citeproc-js yet and I haven’t done any validity checking on
the files, but they should be close.

A minimal Python file like this:

====
import json
import glob
import os

# Collect every machine-readable test fixture.
TESTS = glob.glob(os.path.expanduser('~/xbiblio/citeproc-js/branches/fbennett/std/machines/*.json'))

def run_tests():
    for test_path in TESTS:
        # json.loads raises ValueError on malformed JSON.
        test = json.loads(open(test_path).read())
        print(test)

run_tests()
====

… fails after the first two, with the following which probably
suggests some problems with the JSON:

Traceback (most recent call last):
  File "tests/csl_test.py", line 20, in <module>
    run_tests()
  File "tests/csl_test.py", line 13, in run_tests
    test = json.loads(open(test_path).read())
  File "/opt/local/Library/Frameworks/Python.framework/Versions/2.6/lib/python2.6/json/__init__.py", line 307, in loads
    return _default_decoder.decode(s)
  File "/opt/local/Library/Frameworks/Python.framework/Versions/2.6/lib/python2.6/json/decoder.py", line 319, in decode
    obj, end = self.raw_decode(s, idx=_w(s, 0).end())
  File "/opt/local/Library/Frameworks/Python.framework/Versions/2.6/lib/python2.6/json/decoder.py", line 336, in raw_decode
    obj, end = self._scanner.iterscan(s, **kw).next()
  File "/opt/local/Library/Frameworks/Python.framework/Versions/2.6/lib/python2.6/json/scanner.py", line 55, in iterscan
    rval, next_pos = action(m, context)
  File "/opt/local/Library/Frameworks/Python.framework/Versions/2.6/lib/python2.6/json/decoder.py", line 183, in JSONObject
    value, end = iterscan(s, idx=end, context=context).next()
  File "/opt/local/Library/Frameworks/Python.framework/Versions/2.6/lib/python2.6/json/scanner.py", line 55, in iterscan
    rval, next_pos = action(m, context)
  File "/opt/local/Library/Frameworks/Python.framework/Versions/2.6/lib/python2.6/json/decoder.py", line 217, in JSONArray
    value, end = iterscan(s, idx=end, context=context).next()
  File "/opt/local/Library/Frameworks/Python.framework/Versions/2.6/lib/python2.6/json/scanner.py", line 55, in iterscan
    rval, next_pos = action(m, context)
  File "/opt/local/Library/Frameworks/Python.framework/Versions/2.6/lib/python2.6/json/decoder.py", line 183, in JSONObject
    value, end = iterscan(s, idx=end, context=context).next()
  File "/opt/local/Library/Frameworks/Python.framework/Versions/2.6/lib/python2.6/json/scanner.py", line 55, in iterscan
    rval, next_pos = action(m, context)
  File "/opt/local/Library/Frameworks/Python.framework/Versions/2.6/lib/python2.6/json/decoder.py", line 219, in JSONArray
    raise ValueError(errmsg("Expecting object", s, end))
ValueError: Expecting object: line 5 column 152 (char 823)

BTW, notwithstanding what are likely minor problems with the JSON
output, this looks promising; thanks!

Bruce

There were a couple of stray commas in one of the files. I’ve fixed
that, and citeproc-js now runs them okay. I just checked in the
changes; try again against an update and see what happens.

Frank

Still getting the error on my end. Do you have Python (I’m using 2.6
here, which includes the JSON parser by default) installed?

Bruce

The copy on this machine claims to be 2.5. Checking the distro, it
doesn’t seem to offer 2.6. Checked the mythtv box in the house, same
thing there.

Is this on all files, or just one of them? It could be some simple
thing common to all of them. I didn’t check the JSON spec; they’re
just anonymous blobs that JS will read without complaint. It’s kind
of late here, so I’m going to call it a day; but if you can find what’s
causing the error, I can follow up tomorrow.

Frank

Installed cjson, and a quick shot at ./std/machines/name_Delimiter.json
parsed it without complaint. Just one file, but it looks like it’s not a
uniform failure. If you can pin down the files that are causing the
error, I’ll fix whatever is causing the glitch.

Kinda tired from an evening of rattling the keyboard, but relieved to
get this in place; it’s been a good bout of back and forth over the way
forward, and a good result I think. Should make everyone’s life a little
easier, and that’s what it’s about!

Frank

Here’s a revised test file:

====
import json
#import citeproc
import glob
import os

TESTS = glob.glob(os.path.expanduser('~/xbiblio/citeproc-js/branches/fbennett/std/machines/*.json'))

def run_tests():
    for test_index, test_path in enumerate(TESTS):
        try:
            # Parse each fixture; malformed JSON raises ValueError.
            test = json.loads(open(test_path).read())
            print(test_index + 1)
        except ValueError:
            # Report the offending file and carry on with the rest.
            print("oops, %s failed." % test_path)

run_tests()
====

The results show two files being the problem:

  1. name_CollapseRoleLabels.json

  2. name_TwoRolesSameRenderingSeparateRoleLabels.json

Bruce

OK, here’s what JSLint reports on the first of them:

Error:

Problem at line 5 character 161: Unexpected comma.

"input": [{ "id":"editor-translator-2", "type": "book", "editor": [ { "name…

Problem at line 5 character 276: Unexpected comma.

"input": [{ "id":"editor-translator-2", "type": "book", "editor": [ { "name…

I can’t figure out how to get it to validate, frankly.

Bruce

Oh, there are trailing commas at the end of a few arrays. Remove
those, and they validate.
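
The mismatch comes down to parser strictness: a JavaScript engine will read an array literal with a trailing comma, whereas Python's `json` module rejects it. The following is a minimal sketch of that behaviour plus a naive repair. The `BROKEN` sample is a made-up fragment shaped like the failing fixtures, and `strip_trailing_commas` is a hypothetical helper, not part of any tool mentioned in this thread.

```python
import json
import re

# Hypothetical fragment shaped like the failing fixtures: note the comma before "]".
BROKEN = '{"input": [{"id": "editor-translator-2", "type": "book"},]}'


def strip_trailing_commas(text):
    """Drop commas that directly precede a closing ] or } (naive)."""
    return re.sub(r",\s*([\]}])", r"\1", text)


try:
    json.loads(BROKEN)
    strict_parse_failed = False
except ValueError:
    # Strict JSON does not allow the trailing comma.
    strict_parse_failed = True

repaired = json.loads(strip_trailing_commas(BROKEN))
print(strict_parse_failed, repaired)
```

The regular expression does not understand JSON strings, so it would also alter a comma followed by a closing bracket inside a string value. Fixing the script that generates the files is the safer long-term route.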

Bruce

Hi Frank,
Just got back to this after a while and the new test format looks great. I’m
going to try to fold it into the Ruby implementation before too long.

I think that our parser will still have a problem with the CSL because of
the missing info fields, but I think it’s just as well for me to augment the
CSLs after reading them in and before sending them to the CSL parser as to
clutter the tests.
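
A minimal sketch of that kind of augmentation, written in Python to match the other scripts in this thread rather than in Ruby: it adds an empty `info` element to a style that lacks one before the style is handed to a parser. The `ensure_info` helper and the sample fragment are hypothetical, and an empty element is only a placeholder, since which info fields a given parser actually requires is not stated here.

```python
import xml.etree.ElementTree as ET

CSL_NS = "http://purl.org/net/xbiblio/csl"
# Serialize CSL elements without a namespace prefix.
ET.register_namespace("", CSL_NS)


def ensure_info(csl_text):
    """Return the style with an empty <info> element added if it has none."""
    root = ET.fromstring(csl_text)
    info_tag = "{%s}info" % CSL_NS
    if root.find(info_tag) is None:
        # Insert the placeholder as the first child of the style element.
        root.insert(0, ET.Element(info_tag))
    return ET.tostring(root, encoding="unicode")


# Made-up test-style fragment with no info block.
fragment = '<style xmlns="%s"><citation/></style>' % CSL_NS
augmented = ensure_info(fragment)
print(augmented)
```

Doing this at load time keeps the test files themselves free of boilerplate, which is the trade-off described above.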

Also, grind.sh on my Mac doesn’t work exactly correctly - it preserves the
===CSL lines, etc. I think perhaps sed works a bit differently. This is no
real problem and not worth any work as I’m just going to use the tests as
is, but I wanted to note it. Someone else might know how to fix it
trivially.

Thanks for all the work you’ve done on this,
Howard

Just got back to this after a while and the new test format looks great. I’m
going to try to fold it into the Ruby implementation before too long.
I think that our parser will still have a problem with the CSL because of
the missing info fields, but I think it’s just as well for me to augment the
CSLs after reading them in and before sending them to the CSL parser as to
clutter the tests.

Right, and perhaps allowing an empty info attribute on the style object?

Also, grind.sh on my Mac doesn’t work exactly correctly - it preserves the
===CSL lines, etc. I think perhaps sed works a bit differently. This is no
real problem and not worth any work as I’m just going to use the tests as
is, but I wanted to note it. Someone else might know how to fix it
trivially.

Install gsed via, say, MacPorts. It will then work correctly.

Bruce

Hi,

Right, and perhaps allowing an empty info attribute on the style object?

I’d be perfectly happy with that too. :slight_smile:

Install gsed via, say, MacPorts. It will then work correctly.

Thanks, that got it.

Howard