The order of disambiguation: failing test

It seems churlish to mention this one, given that I currently pass barely 50 percent of the disambiguation tests, mostly through fair-and-square, fall-on-your-face failure. However:

disambiguate_ByCiteDisambiguateCondition.txt FAILED
-------- EXPECTED --------
Doe et al., Book A (2000); Doe et al., Book B (2000)
----------- GOT -----------
Doe et al. (2000a); Doe et al. (2000b)

See disambiguate_ByCiteDisambiguateCondition.txt

We have disambiguation rules to add-givenname, add-names and add-year-suffix. The givenname disambiguation rule is by-cite, but both add-givenname and add-names must fail, because both works are by the same two (prolific) authors, John Doe and Jane Roe.

We also have a <choose> element that will render the title if disambiguate is true.

The test seems to assume that disambiguate will be true, and that the title will therefore get printed rather than a year suffix. I’m having trouble matching that to the spec. Per the spec:

Disambiguation methods are activated with the following optional attributes, and are always tried in the listed order

Which is a) add more names, b) expand names to initials and given names, c) add a suffix. It is only if those methods fail that we get to the generalised disambiguation conditional:

If ambiguous cites remain after applying the selected disambiguation methods described above, a final disambiguation attempt is made by rendering these cites with the disambiguate condition testing “true” [Step (4)].

So in this case we should:

  • Try step 1 and fail (adding names gets us “Doe and Roe”, twice)
  • Try step 2 and fail (expansion gets us “John Doe and Jane Roe”, twice: neither name is ambiguous, but the cites remain ambiguous)
  • Try step 3 and succeed (adding a year suffix gets us “2000a” and “2000b”): we now have unambiguous citations (though not unambiguous names, which we cannot have any which way here).

We should therefore never get to the “final” step 4, so the disambiguate condition is never triggered and we don’t get the titles that the test expects.
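To make the walk-through above concrete, here is a minimal Python sketch of the spec order as I read it. It is a toy, not code from any real CSL processor: `Item`, `render`, `ambiguous` and `disambiguate_spec_order` are made-up names, and the renderer hard-codes an "et al." style that merely mimics the fixture output.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Item:
    authors: tuple  # hypothetical shape: ((given, family), ...)
    title: str
    year: int


def render(item, names_shown=1, given=False, suffix="", with_title=False):
    """Toy cite renderer; a stand-in for a real style, not CSL itself."""
    shown = item.authors[:names_shown]
    names = [f"{g} {f}" if given else f for g, f in shown]
    text = " and ".join(names)
    if names_shown < len(item.authors):
        text += " et al."
    if with_title:
        text += f", {item.title}"
    return f"{text} ({item.year}{suffix})"


def ambiguous(cites):
    """Cites are ambiguous if any two render identically."""
    return len(set(cites)) < len(cites)


def disambiguate_spec_order(items):
    """Try the methods in the order the spec lists them; stop at the first success."""
    n = max(len(i.authors) for i in items)

    # Step 1: add names.
    cites = [render(i, names_shown=n) for i in items]
    if not ambiguous(cites):
        return cites, "add-names"

    # Step 2: expand to given names.
    cites = [render(i, names_shown=n, given=True) for i in items]
    if not ambiguous(cites):
        return cites, "add-givenname"

    # Step 3: year suffix. The failed expansions from steps 1 and 2 are dropped.
    # This always succeeds, so step 4 below is unreachable whenever it is enabled.
    cites = [render(i, suffix=chr(ord("a") + k)) for k, i in enumerate(items)]
    if not ambiguous(cites):
        return cites, "add-year-suffix"

    # Step 4: render with the disambiguate condition testing true (adds the title).
    return [render(i, with_title=True) for i in items], "disambiguate-condition"


authors = (("John", "Doe"), ("Jane", "Roe"))
items = [Item(authors, "Book A", 2000), Item(authors, "Book B", 2000)]
cites, step = disambiguate_spec_order(items)
print("; ".join(cites), "<-", step)
# prints: Doe et al. (2000a); Doe et al. (2000b) <- add-year-suffix
```

Under these assumptions the sketch lands on the GOT output, not the EXPECTED one, which is the mismatch I am asking about.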

What wrinkle am I missing here?

Quite right. The test fixture follows citeproc-js, not the spec. I remember thinking about this one during the drafting, and I think year-suffix should be the last listed (as the last to be tried), because it always succeeds. (This may be another that should be moved out or amended in the interest of faithfulness to the specification.)

FWIW, this one has very few, if any, implications in practice, since the if disambiguate test typically adds a title, which is almost always unique.

I agree with Frank that the citeproc-js, rather than the spec, version allows for more flexibility (I could tell a style to use year suffixes only when the if disambiguate action also fails to disambiguate, e.g. for items with the same title, author, and year) but this is exceedingly rare.
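For illustration only, the flipped order could be sketched in Python like this. The function name and the input shape are hypothetical, the names-based steps are assumed to have failed already, and dropping the title again when it does not help is my assumption, not something taken from citeproc-js.

```python
def disambiguate_title_first(works):
    """works: list of (short_names, title, year) tuples whose plain cites collide.

    Hypothetical sketch of trying the disambiguate condition before year-suffix.
    """
    # Try the disambiguate condition first: render each cite with its title.
    with_titles = [f"{names}, {title} ({year})" for names, title, year in works]
    if len(set(with_titles)) == len(with_titles):
        return with_titles

    # Titles collide too: fall back to year suffixes (assumed to drop the title).
    return [
        f"{names} ({year}{chr(ord('a') + k)})"
        for k, (names, _title, year) in enumerate(works)
    ]


print(disambiguate_title_first([
    ("Doe et al.", "Book A", 2000),
    ("Doe et al.", "Book B", 2000),
]))
# prints: ['Doe et al., Book A (2000)', 'Doe et al., Book B (2000)']

print(disambiguate_title_first([
    ("Doe et al.", "Book A", 2000),
    ("Doe et al.", "Book A", 2000),
]))
# prints: ['Doe et al. (2000a)', 'Doe et al. (2000b)']
```

The first call reproduces what the fixture expects; the second is the rare same-title case where the year suffix is still needed.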

Thanks for the clarification. It sounds as if this is one of those tests where I can more or less please myself (in the sense that it tests an edge case that is unlikely to occur with any sanely coded style). FWIW it always seemed to me unlikely that anyone would ever use a year suffix and a disambiguation rule, because the only circumstance I can imagine where that combination would make any sense is if one was dealing with a set of ambiguous cites that never printed a year.

Since the spec is at least clear on this, I’ll probably follow the spec, though I agree with Frank that flipping the order of year-suffix and disambiguation-condition rules would actually make good sense.

I’ve adjusted a couple of disambiguation tests that touch both year-suffix and disambiguate="true" to conform to the spec, and moved the remainder out to the citeproc-js repo.

citeproc-js will now conform to spec by default, with a sys option to restore its former behavior. Will flag that change in a release note at the next update.

Edit: not -> now