Updates for citeproc-hs

If I understand it correctly:

  • “name-as-sort-order” means: Doe, John, instead of John Doe;
  • “sort-separator” is the comma after Doe in “Doe, John”;
  • “initialize-with” means “J. ” instead of “John”.

If we put everything together we end up with:
Doe, J. .(start macro=“issued”)…and the rest

Is that right?

Yes … unless you have a non-Western name with different sort rules
(like “Mao Zedong”).
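
As a rough illustration of how the three attributes combine for a Western name, here is a small Python sketch. The function `format_name` and its parameters are hypothetical names chosen for this example, not part of citeproc-hs or any CSL processor, and the output is rendered literally, with no punctuation cleanup.

```python
def format_name(given, family, name_as_sort_order=False,
                sort_separator=", ", initialize_with=None):
    """Render one Western-style name (hypothetical helper, illustration only)."""
    if initialize_with is not None:
        # Each given name becomes its first letter followed by the
        # initialize-with string, e.g. "John" -> "J. " for ". ".
        given = "".join(part[0] + initialize_with for part in given.split())
    if name_as_sort_order:
        # Family name first, then the sort separator, then the given part.
        return family + sort_separator + given
    return given + " " + family


print(format_name("John", "Doe"))                           # John Doe
print(format_name("John", "Doe", name_as_sort_order=True))  # Doe, John
# Adding a "." suffix after the literal result gives the "Doe, J. ." above.
print(format_name("John", "Doe", name_as_sort_order=True,
                  initialize_with=". ") + ".")              # Doe, J. .
```

A name like “Mao Zedong” would need different ordering rules, which this sketch does not attempt.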

Bruce

so, may I say that the apa style is incorrect and that, probably, a
better version of that style would be:

...

with

... ...

which would result in:
Laumann, E.O., Gagnon, J.H., Michael, R.T. (1994). The social organization of sexuality: Sexual practices in the United States. Chicago: University of Chicago Press.
Smith, J.M. (1998). The origin of altruism. Nature, (393), 639-640.
Doe, J., Smith, J. (2000). Introduction: A Chapter Title. In J. Doe, J. Smith (Cur.), Edited Book Title, Series Title… New York: ABC Books.

This presents another problem: there’s no space between initials. I
think this could just be left to the implementation, which has to deal
with name suffixes and prefixes anyhow. I find it a bit confusing and
counter-intuitive that “initialize-with” so often comes with an extra
space.

Actually I’ve just pushed a patch that fixes this in citeproc-hs, so
now the result would be:

Smith, J. M. (1998). The origin of altruism. Nature, (393), 639-640.

And harvard would correctly produce:
Lemley, M. A., Lessig, L., 2001. The End-to-End…

I had chosen the previous behaviour more to conform to the apa, while
I now think that the apa style should be corrected in this regard.

What do you think?

Andrea

so, may I say that the apa style is incorrect and that, probably, a
better version of that style would be:

...

with

... ...

I’m not really sure what it is in the current style that you’re correcting here?

which would result in:
Laumann, E.O., Gagnon, J.H., Michael, R.T. (1994). The social organization of sexuality: Sexual practices in the United States. Chicago: University of Chicago Press.
Smith, J.M. (1998). The origin of altruism. Nature, (393), 639-640.
Doe, J., Smith, J. (2000). Introduction: A Chapter Title. In J. Doe, J. Smith (Cur.), Edited Book Title, Series Title… New York: ABC Books.

This presents another problem: there’s no space between initials. I
think this could just be left to the implementation, which has to deal
with name suffixes and prefixes anyhow. I find it a bit confusing and
counter-intuitive that “initialize-with” so often comes with an extra
space.

To get a space between initials, use a value like ". "; is that
really not intuitive?

Actually I’ve just pushed a patch that fixes this in citeproc-hs, so
now the result would be:

Smith, J. M. (1998). The origin of altruism. Nature, (393), 639-640.

And harvard would correctly produce:
Lemley, M. A., Lessig, L., 2001. The End-to-End…

I had chosen the previous behaviour more to conform to the apa, while
I now think that the apa style should be corrected in this regard.

What do you think?

Again, I’m not quite following here. It’s been a long day for me, though ;-)

Bruce

so, may I say that the apa style is incorrect and that, probably, a
better version of that style would be:

I’m not really sure what it is in the current style that you’re correcting here?

See the attached diff.

This presents another problem: there’s no space between initials. I
think this could just be left to the implementation, which has to deal
with name suffixes and prefixes anyhow. I find it a bit confusing and
counter-intuitive that “initialize-with” so often comes with an extra
space.

To get a space between initials, use a value like ". "; is that
really not intuitive?

You mean that the default behaviour of initialize-with should be this:

D.R. Ballman

and that, to get “D. R. Ballman” I must use initialize-with=". " ?

It is counter-intuitive because, if I have to specify spaces between
given names, I would expect initialize-with=“” to produce:

RobertDavid Ballman

Or, in other words, should the style harvard.csl produce “Ballman, D.
R.,” or “Ballman, D.R.,”?

I’d think the second is the wrong one.

Andrea

style.diff (820 Bytes)

You mean that the default behaviour of initialize-with should be this:

D.R. Ballman

There is no default per se. To get that, I’d expect the value of the
attribute to be “.”.

and that, to get “D. R. Ballman” I must use initialize-with=". " ?

Correct.

It is counter-intuitive because, if I have to specify spaces between
given names, I would expect initialize-with=“” to produce:

RobertDavid Ballman

No, you’d have:

RD Ballman

Similarly, initialize-with=" " would yield:

R D Ballman
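
To make the four cases concrete, here is a minimal Python sketch of the behaviour described above. The helper `initials` is a hypothetical name, and trimming the trailing space before the family name is an assumption made so that the results match the examples in this thread.

```python
def initials(given, initialize_with):
    """Turn given names into initials, each followed by initialize_with.

    Hypothetical helper for illustration; not taken from any CSL processor.
    Trailing whitespace is trimmed (an assumption) so the family name can
    be appended with a single space.
    """
    joined = "".join(part[0] + initialize_with for part in given.split())
    return joined.rstrip()


for value in [".", ". ", "", " "]:
    print(repr(value), "->", initials("Robert David", value) + " Ballman")
# '.'  -> R.D. Ballman
# '. ' -> R. D. Ballman
# ''   -> RD Ballman
# ' '  -> R D Ballman
```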

Or, in other words, should the style harvard.csl produce “Ballman, D.
R.,” or “Ballman, D.R.,”?

Which “harvard.csl”?

Bruce

I’m not sure. The existing style says “J. D. Smith”; yours says
“J.D. Smith”. Which is correct?

Bruce

It is counter-intuitive because, if I have to specify spaces between
given names, I would expect initialize-with=“” to produce:

RobertDavid Ballman

No, you’d have:

RD Ballman

Similarly, initialize-with=" " would yield:

R D Ballman

To check whether I understood correctly, does that mean that:
initialize-with=". " name-as-sort-order="all" suffix="."
should produce:

Ballman, R. D. .

Or should this be:

Ballman, R. D…

Or, in other words, should the style harvard.csl produce “Ballman, D.
R.,” or “Ballman, D.R.,”?

Which “harvard.csl”?

http://www.zotero.org/styles/harvard1

Andrea

To check whether I understood correctly, does that mean that:
initialize-with=". " name-as-sort-order="all" suffix="."
should produce:

Ballman, R. D. .

Or should this be:

Ballman, R. D…

Well, this – the general rules for handling trailing punctuation – I
think we have to decide. Has anyone made any progress on this?

Or, in other words, should the style harvard.csl produce “Ballman, D.
R.,” or “Ballman, D.R.,”?

Which “harvard.csl”?

http://www.zotero.org/styles/harvard1

So in that case, it would yield the latter (“Ballman, D.R.,”).

Bruce

I pushed a patch to my implementation so that both formats can be
produced: if requested to evaluate the style strictly, the first output
will be produced.

The default behaviour, instead, is to remove leading and trailing
white space and to remove duplicates of the following five
characters: ‘.’ ‘ ’ ‘;’ ‘,’ and ‘:’. So far I haven’t added the option
to produce strict output to the test suite.

What do you think?

Andrea

Unfortunately I’m waylaid with other demands at the moment. However I’m
happy to try to support the preferred behaviour once I can get back to it.

Regards,

Liam.

So by default you remove any duplicate occurrences of those
punctuation characters, and you strip trailing whitespace, but one can
optionally choose to turn off this handling?

If yes, that sounds good to me.

Bruce

Yes, this is it. I adopted and generalized the Zotero approach: when a
variable is concatenated with something (delimiters, prefixes,
suffixes, other variables), I check whether the last character of the
variable is a punctuation mark (or a space): if it is, and it is the same
as the first character of whatever it is being concatenated with, it is
removed.

But with the “strict” flag you will get exactly what the style
produces.
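
A minimal Python sketch of that concatenation rule, assuming the five characters listed earlier in the thread. The names `join_piece` and `strict` are hypothetical, and this is only an approximation of the idea, not the citeproc-hs code.

```python
# The five characters whose duplicates are collapsed by default.
COLLAPSIBLE = set(". ;,:")


def join_piece(variable, piece, strict=False):
    """Concatenate a rendered variable with a delimiter, prefix or suffix.

    In the default mode, if the variable ends with one of the collapsible
    characters and the piece starts with the same character, the
    variable's last character is dropped. With strict=True the two
    strings are joined exactly as the style asks.
    """
    if (not strict and variable and piece
            and variable[-1] in COLLAPSIBLE
            and variable[-1] == piece[0]):
        variable = variable[:-1]
    return variable + piece


print(join_piece("Ballman, R. D.", "."))               # Ballman, R. D.
print(join_piece("Ballman, R. D.", ".", strict=True))  # Ballman, R. D..
print(join_piece("Nature, ", " (393)"))                # Nature, (393)
```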

Andrea

I’ve just pushed a few patches that add a Pandoc output filter, so that
it is now possible to see some formatting:

  • first compile test.hs (you can run it with runghc if you have
    ghc-6.8.3, otherwise a bug in the interpreter will introduce a final
    character to the output and pandoc will complain):

cd test; ghc --make test.hs -i…/src

  • then run something like:

./test -s -l locales-it-IT.xml -c test.xml -o pandoc styles/apa.csl | pandoc -r native -t html

A sample MODS collection file can be grabbed here:
http://gorgias.mine.nu/tmp/mods_test.xml

Andrea

  • then run something like:

./test -s -l locales-it-IT.xml -c test.xml -o pandoc styles/apa.csl | pandoc -r native -t html

Making some good progress! And the thing is fast too!

Bruce