Citation collapsing question

We have a query that arises from this thread:

http://forums.zotero.org/discussion/16284/citations-not-collapsing/

The issue is whether, when collapse=“year” is set, citations should
collapse when the visual appearance of two successive name lists is
identical after et al. truncation, but the underlying name lists
differ.
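
To make the question concrete, here is a minimal sketch in Python. The `render_names` helper and its `et_al_min` / `et_al_use_first` parameters are illustrative assumptions, loosely modelled on the CSL attributes et-al-min and et-al-use-first; this is not code from any CSL processor. It only shows how two different name lists can look identical once truncated.

```python
def render_names(families, et_al_min=2, et_al_use_first=1):
    """Render family names, truncating with 'et al.' (hypothetical helper)."""
    if len(families) >= et_al_min:
        return " ".join(families[:et_al_use_first] + ["et al."])
    return " and ".join(families)

# Two successive cites whose underlying name lists differ.
first = ["Doe", "Smith"]
second = ["Doe", "Jones"]

same_rendering = render_names(first) == render_names(second)
same_names = first == second

print(render_names(first), "|", render_names(second))
print("identical after truncation:", same_rendering)
print("identical underlying lists:", same_names)
```

Whether collapsing should key on the rendered string or on the underlying list is exactly the point being polled here.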

I think the user and adamsmith are probably right on this one, but I’d
like to poll folks for views before making the necessary change. What
does everyone think about this one?

Frank

You have perfect timing. A post from today:

http://blog.apastyle.org/apastyle/2011/02/et-al-when-and-how.html

I’d suggest pinging the APA people.

Rintze

On Fri, Feb 4, 2011 at 10:28 PM, Frank Bennett <@Frank_Bennett> wrote:

Great. Unfortunately, signing into their typepad system via Google or
Twitter sends my browser into a weird race condition. The thing keeps
losing keystrokes until the browser threatens to consume all of the
memory in my machine. Odd, but fatal.

Sorry to be reliant on this one. Pointers (on the style issue) welcome.

Frank

As I just posted on that thread, I disagree. I don’t think this is a bug.

Bruce

Based on subsequent discussion on that thread, I think we probably
need to amend the spec to include more detail on the processing
expectations, add a new attribute (“sort-author-names-as” -->
“full-name-string” | “shortened form”?), and update the test suite
accordingly.

Ugh … I can imagine this is a PITA to program.

Bruce

We have a query that arises from this thread:

http://forums.zotero.org/discussion/16284/citations-not-collapsing/

I don’t see anything in that thread about sorting behavior.

The above should have read, “that the grouped references be sorted in
the order of the earliest-dated item in each.”

Frank

Awesome!

So where did we get off the rails here? E.g. are we all OK, or is
there still something to resolve?

Bruce

Sorry I’m joining in so late… well, Bruce’s supposed output is not
entirely correct. It should instead be:

John Doe and Steve Jones, 2000a, Three Title
John Doe and Steve Jones, 2000b, Four Title
John Doe and Jane Smith, 2000c, One Title
John Doe and Jane Smith, 2000d, Two Title

In a citation, year-suffix collapsing would occur: Doe et al 2000a-d.
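
A rough Python sketch of this reading follows. It assumes suffixes are assigned to every item whose short rendered form is the same, and that a run of consecutive suffixes is collapsed into a range. The data model and the helper names (`short_form`, `add_year_suffixes`, `collapse_suffixes`) are hypothetical and are not taken from citeproc-hs or citeproc-js.

```python
from collections import defaultdict
from string import ascii_lowercase

# Bibliography order as given in the message above (hypothetical data model).
items = [
    {"authors": ["Doe", "Jones"], "year": 2000, "title": "Three Title"},
    {"authors": ["Doe", "Jones"], "year": 2000, "title": "Four Title"},
    {"authors": ["Doe", "Smith"], "year": 2000, "title": "One Title"},
    {"authors": ["Doe", "Smith"], "year": 2000, "title": "Two Title"},
]

def short_form(item):
    """First author's family name, 'et al', and the year."""
    return f"{item['authors'][0]} et al {item['year']}"

def add_year_suffixes(items):
    """Assign a, b, c, ... to items whose rendered short form is identical."""
    by_rendering = defaultdict(list)
    for item in items:
        by_rendering[short_form(item)].append(item)
    suffixes = {}
    for members in by_rendering.values():
        if len(members) > 1:
            # zip stops after 26 items; enough for a sketch.
            for letter, item in zip(ascii_lowercase, members):
                suffixes[item["title"]] = letter
    return suffixes

def collapse_suffixes(label, letters):
    """Collapse a run of consecutive suffixes into a range such as 'a-d'."""
    letters = sorted(letters)
    consecutive = all(ord(b) - ord(a) == 1 for a, b in zip(letters, letters[1:]))
    if len(letters) > 2 and consecutive:
        return f"{label}{letters[0]}-{letters[-1]}"
    return label + ", ".join(letters)

suffixes = add_year_suffixes(items)
citation = collapse_suffixes(short_form(items[0]), suffixes.values())
print(suffixes)
print(citation)
```

Because all four items share the short form "Doe et al 2000", they fall into one group and receive the suffixes a through d.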

I’m attaching a couple of tests to show this behavior. I didn’t test
them with citeproc-js but I believe they should pass with it too.

Andrea

collapse_YearSuffixCollapseDifferentAuthors.txt (3.54 KB)

collapse_YearSuffixCollapseDifferentAuthorsBibliography.txt (3.83 KB)

Andrea,

Standard author-date style in my view:

  1. sort and group (by author name string, and then year) reference
    list (where collapsing does not apply)
  2. attach an index to each item within the year sub-group
  3. dump the reference list to formatted string
  4. render citation reference suffix by referencing 2; et al collapsing
    happens here, based on 2
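
The four steps above can be sketched in Python as follows. This is only an illustration under stated assumptions: the data model, the helper names (`author_key`, `group_key`, `full_names`, `cite`) and the choice to sort on "Family, Given" strings are hypothetical, and the sample data is the four-item example discussed in this thread.

```python
from itertools import groupby
from string import ascii_lowercase

items = [
    {"authors": [("Doe", "John"), ("Smith", "Jane")], "year": 2000, "title": "One Title"},
    {"authors": [("Doe", "John"), ("Smith", "Jane")], "year": 2000, "title": "Two Title"},
    {"authors": [("Doe", "John"), ("Jones", "Steve")], "year": 2000, "title": "Three Title"},
    {"authors": [("Doe", "John"), ("Jones", "Steve")], "year": 2000, "title": "Four Title"},
]

def author_key(item):
    """Full author name string used for sorting and grouping (assumed format)."""
    return "; ".join(f"{family}, {given}" for family, given in item["authors"])

def group_key(item):
    return (author_key(item), item["year"])

# Step 1: sort by author name string, then year.
# sorted() is stable, so input order breaks ties within a group.
ordered = sorted(items, key=group_key)

# Step 2: attach an index to each item within its (author, year) sub-group.
for _, group in groupby(ordered, key=group_key):
    members = list(group)
    for letter, item in zip(ascii_lowercase, members):
        item["suffix"] = letter if len(members) > 1 else ""

# Step 3: dump the reference list to formatted strings.
def full_names(item):
    return " and ".join(f"{given} {family}" for family, given in item["authors"])

bibliography = [
    f"{full_names(i)}, {i['year']}{i['suffix']}, {i['title']}" for i in ordered
]

# Step 4: render the citation, reusing the suffix from step 2.
def cite(item):
    return f"{item['authors'][0][0]} et al {item['year']}{item['suffix']}"

citations = [cite(i) for i in ordered]
print("\n".join(bibliography))
print(citations)
```

In this sketch the suffix is keyed on the full author string, so each author group starts again at "a". As a result the short citation form comes out the same for "Three Title" and "One Title".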

Admittedly CSL has evolved considerably since then, but it seems to me
that’s the fundamental logic of how citations work.

We have, let’s say, four items, all with the same first author, but
some with different second authors.

John Doe and Jane Smith, 2000, One Title
John Doe and Jane Smith, 2000, Two Title
John Doe and Steve Jones, 2000, Three Title
John Doe and Steve Jones, 2000, Four Title

And let’s say we have a style that always says to use et al for
everything but the first author.

Questions:

a) how are those sorted?
b) what are their suffix values?

Aside: there may be a secondary question about what’s a subsequent
author item in this list, but let’s leave that aside.

My argument is:

John Doe and Steve Jones, 2000a, Three Title
John Doe and Steve Jones, 2000b, Four Title
John Doe and Jane Smith, 2000a, One Title
John Doe and Jane Smith, 2000b, Two Title

Not at all, that’s exactly what the citeproc-js will produce. Andrea
can confirm, but from past discussions I’m sure that the same is true
of citeproc-hs.

So a bright spot in the day. :-)

What do you mean by “not … correct”?

I am telling you that a test created according to the scenario I
presented here should fail with the example output you provided.

“John Doe and Steve Jones” and “John Doe and Jane Smith” are two
different “authors” (really author groups).

So if we have a disagreement here, then my worry that we have a
problem is in fact true.

Bruce

Bruce,

you seem to be right, there must be some disagreement, and I
understand it because in citeproc-hs-0.2 year-suffix disambiguation
worked the way you described above. So I had to change it in the
transition to 0.3 in order to comply with the specification, with
citeproc-js, and with a specific test in the test suite:

https://bitbucket.org/bdarcus/citeproc-test/src/1a2a481c8130/processor-tests/humans/disambiguate_YearSuffixAndSort.txt

The spec is also silent on the details, and so leaves room for these
different understandings.

You are wrong, I think. And I think Frank implemented it in strict
adherence to the specification wording:

«[...] year-suffix is added to cites that are otherwise identical
(e.g. "Doe 2007, Doe 2007" becomes "Doe 2007a, Doe 2007b").»

As you can see, no mention is made of cites having the same
contributor list; the example even seems to refer to John Doe
2007 and Jane Doe 2007 (see “disambiguate-add-givenname” above, in the
same box).

In a style using short names, different authors could be disambiguated
with year-suffixes. The same applies to different author groups when
et-al is used without name and givenname disambiguation.

I changed citeproc-hs to reflect the new behavior because it makes
sense to me. I thought it was a deliberate decision…

Andrea