Hi
I decided to develop a simple CSL processor to convert Zotero JSON strings
to APA citations. The code will be used in ZotPad, and once the processor
works, I will publish the code as a separate project on GitHub.
I am using JSON strings from the Zotero server as data and validating the
output against formatted citations from the Zotero server. The citations are
formatted using the APA style (styles/apa.csl on the master branch of the
citation-style-language/styles repository on GitHub) and the CSL 1.0.1
specification.
I am using the following bibliography item as my test data:
Cadogan, J. W., & Lee, N. (Forthcoming). Improper Use of Endogenous
Formative Variables. Journal of Business Research.
There is one thing that I do not understand. In the APA style (lines
429-434) there is a group:
<group delimiter=". ">
<text macro="author"/>
<text macro="issued"/>
<text macro="title" prefix=" "/>
<text macro="container"/>
</group>
The macro "author" has a names element with initialize-with=". " and the
macro "issued" contains a group with prefix " (". Now to my understanding,
this means that
- The "author" macro will end with ". " [Cadogan, J. W., & Lee, N.]
- The "issued" macro will start with " (" [ (Forthcoming)]
- The macros are delimited with ". "
This results in a bibliographic item that starts with
Cadogan, J. W., & Lee, N..  (Forthcoming).
This is obviously not correct. There should not be a double period followed
by a double space, but I do not understand which part of the formatting
logic is incorrect.
Mikko
Mikko,
Below I’ve assumed that the output is from your project code. If I
have it backwards, let me know.
You have the logic right. That’s the literal result you will get from
flattening the structure without anything more:
[author ending in "."] + ". "{delimiter} + " ("{prefix} + [issued]
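To make that concrete, here is a minimal Python sketch of the naive flattening. The helper name flatten_group is made up for illustration, and the ")" suffix on the issued group is an assumption (only the " (" prefix is quoted in this thread). The strings come from the test item above.

```python
def flatten_group(parts, delimiter):
    """Join the non-empty rendered children of a group with its delimiter.

    Hypothetical helper: no culling of punctuation or spaces is attempted.
    """
    return delimiter.join(p for p in parts if p)


# Rendered "author" macro, already ending in a period.
author = "Cadogan, J. W., & Lee, N."
# Rendered "issued" macro: prefix " (" from the style, ")" suffix assumed.
issued = " (" + "Forthcoming" + ")"

naive = flatten_group([author, issued], ". ")
print(repr(naive))  # 'Cadogan, J. W., & Lee, N..  (Forthcoming)'
```

The double period and the double space both fall straight out of the concatenation, so nothing is wrong with the reading of the style itself.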
Double punctuation needs to be culled by the processor. It’s a little
tricky, since formatting (italics etc) might lie between the two
periods, depending on the style. There is also potential interaction
with quote marks, depending on whether or not the style has
punctuation-in-quotes set true or false. For those reasons, the cull
function can’t work on the output string: it needs to analyse the
nested structure before collapsing to identify “adjacent” punctuation.
With content strings, delimiters and affixes in the mix, it’s pretty
hair-raising. The citeproc-js code for this is heavily tested and
seems to work quite well, but I would be hard-pressed to explain
exactly how it works.
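As a rough illustration of why the cull has to see the structure, here is a toy Python sketch. It is not how citeproc-js does it; Node, tokens and render are hypothetical names, the placement of affixes outside the markup is an assumption of this model, and only repeated periods are handled (quote marks and spaces are ignored). The idea is to keep formatting markup separate from visible text while flattening, so that two periods count as adjacent even when a closing tag sits between them.

```python
from dataclasses import dataclass, field
from typing import List, Union


@dataclass
class Node:
    """Toy rendering node (hypothetical, not a CSL data structure)."""
    children: List[Union["Node", str]] = field(default_factory=list)
    prefix: str = ""
    suffix: str = ""
    delimiter: str = ""
    markup: tuple = ("", "")  # e.g. ("<i>", "</i>")


def tokens(node):
    """Yield (kind, text) pairs, where kind is "text" or "markup"."""
    if isinstance(node, str):
        if node:
            yield ("text", node)
        return
    rendered = [list(tokens(child)) for child in node.children]
    rendered = [r for r in rendered if r]  # empty children are dropped
    if not rendered:
        return
    if node.prefix:
        yield ("text", node.prefix)
    if node.markup[0]:
        yield ("markup", node.markup[0])
    for i, r in enumerate(rendered):
        if i and node.delimiter:
            yield ("text", node.delimiter)
        yield from r
    if node.markup[1]:
        yield ("markup", node.markup[1])
    if node.suffix:
        yield ("text", node.suffix)


def render(node):
    """Flatten to a string, skipping a period that follows a visible period."""
    out = []
    last = ""  # last visible character emitted; markup does not count
    for kind, text in tokens(node):
        if kind == "text":
            kept = []
            for ch in text:
                if ch == "." and last == ".":
                    continue  # cull the duplicate period
                kept.append(ch)
                last = ch
            text = "".join(kept)
        out.append(text)
    return "".join(out)


entry = Node(
    children=[
        Node(children=["Cadogan, J. W., & Lee, N."], markup=("<i>", "</i>")),
        Node(children=["Forthcoming"], prefix=" (", suffix=")"),
    ],
    delimiter=". ",
)
print(repr(render(entry)))  # '<i>Cadogan, J. W., & Lee, N.</i>  (Forthcoming)'
```

In the flattened string the two periods are separated by "</i>", which is why a check for ".." on the output would miss them. The double space is deliberately left in place here, since it is a separate problem.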
On the subject of spaces, there was a long discussion a couple of years
back about whether extraneous spaces added by affixes should be
considered style bugs:
http://xbiblio-devel.2463403.n2.nabble.com/how-much-bugged-a-style-may-be-tt5784767.html#none
That thread does not reflect well on me, I’m afraid. The point made by
Andrea (and, I think, Bruce) is perfectly valid: double-space issues
can be eliminated by more careful construction of CSL code, and
should be. It is also true that masking double spaces in the processor
gives a green light to sloppy coding. That said, the amount of work
required to eliminate all potential extra spaces from the CSL
repository would be pretty staggering. At the end of the day, we’re
kind of stuck with this problem.
Double spaces are hard to catch in the processor for the same reason:
you have to work on the nested structure before it is flattened into
an output string. It’s a little simpler because you can assume input
strings will not have leading or trailing spaces; but tracking spaces
contributed by affix and delimiter attributes across multiple nested
layers is still a challenge.
If you are only going to process one style in one output format and a
single locale, you may be able to fix things up by running a regular
expression over the output string. That wouldn’t work as a general
solution, though.
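For the single-style, plain-text case, a cleanup pass might look like the Python sketch below. The function name tidy and the two patterns are assumptions for illustration. They would mangle a legitimate ellipsis in a title and know nothing about markup or quote marks, which is the reason this does not generalise.

```python
import re


def tidy(citation: str) -> str:
    """Collapse repeated periods and repeated spaces in a plain-text citation.

    Hypothetical helper for one style and one output format only.
    """
    citation = re.sub(r"\.{2,}", ".", citation)  # ".." -> "."
    citation = re.sub(r" {2,}", " ", citation)   # "  " -> " "
    return citation


print(tidy("Cadogan, J. W., & Lee, N..  (Forthcoming)."))
# Cadogan, J. W., & Lee, N. (Forthcoming).
```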
Sorry for the long response. Hope it helps!
Frank