More fun with disambiguation

citeproc-js has a few problems with year-suffix disambiguation, and I
need some guidance on one case that I hadn’t tested before. This one
is mercifully simple to explain.

The options are:

et-al-min = 3
et-al-use-first = 1
disambiguate-add-names = true
disambiguate-add-year-suffix = true

After some debugging, I’m getting this test result:

[0] Smith, Jones & Brown (1986a); Smith, Jones & Brown (1986b); Smith, Jones, Brown & Green (1986c); Smith, Jones, Brown & Green (1986d)

My question is whether it should instead look like this:

[0] Smith, Jones & Brown (1986a); Smith, Jones & Brown (1986b); Smith, Jones, Brown & Green (1986a); Smith, Jones, Brown & Green (1986b)

That is, when cites have the same base form (“Smith et al.”), should
year-suffix disambiguation be applied in sequence across the set, or
should the sequence reset for each ambiguous grouping?

I think I know what the answer will be (i.e. that the sequence should
be per-group, and I have a little more work to do), but I thought I’d
ask before taking the next plunge into the code.
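
To make the two candidate behaviours concrete, here is a minimal sketch of the difference. It is not citeproc-js code: the helper names (suffix, suffixAcrossSet, suffixPerGroup) and the cite objects are made up for illustration, and the names are assumed to have been expanded already.

```javascript
// Hypothetical helper: 0, 1, 2, ... becomes "a", "b", "c", ...
// (only the first 26 suffixes, which is enough for a sketch).
function suffix(n) {
    return String.fromCharCode("a".charCodeAt(0) + n);
}

// First result: one counter runs across the whole set of cites.
function suffixAcrossSet(cites) {
    return cites.map(function (cite, i) {
        return cite.names + " (" + cite.year + suffix(i) + ")";
    });
}

// Second result: the counter restarts for each group of cites that
// still share the same names-plus-year form after name expansion.
function suffixPerGroup(cites) {
    var totals = {};
    cites.forEach(function (cite) {
        var key = cite.names + "|" + cite.year;
        totals[key] = (totals[key] || 0) + 1;
    });
    var seen = {};
    return cites.map(function (cite) {
        var key = cite.names + "|" + cite.year;
        var n = seen[key] || 0;
        seen[key] = n + 1;
        // A cite that is alone in its group needs no suffix at all.
        var mark = totals[key] > 1 ? suffix(n) : "";
        return cite.names + " (" + cite.year + mark + ")";
    });
}

// Assumed input: the four cites from the test, names already expanded.
var cites = [
    { names: "Smith, Jones & Brown", year: "1986" },
    { names: "Smith, Jones & Brown", year: "1986" },
    { names: "Smith, Jones, Brown & Green", year: "1986" },
    { names: "Smith, Jones, Brown & Green", year: "1986" }
];

console.log(suffixAcrossSet(cites).join("; ")); // ends in (1986c) and (1986d)
console.log(suffixPerGroup(cites).join("; "));  // ends in (1986a) and (1986b)
```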

Frank

I’m working on something like this too… I’m trying to get the second
result.

Andrea

I agree.

Bruce

After many blissful months of happy disinterest, I spent an afternoon
fighting with yet another “last” bug in the citeproc-js disambiguation
code. Maybe revision is the soul of learning; this time around, the
requirements finally came into focus. Although I managed to convince
myself that I did at last have the problem by the short hairs, the
existing code was such a mess of cruft and misdirection that I kept
losing the thread.

So … I spent most of Saturday reimplementing cite-level
disambiguation from scratch, and for once I’m able to understand how
the thing works, without that extra cup of morning coffee. For Andrea
and other implementers, in particular, the code is here:

http://bitbucket.org/fbennett/citeproc-js/src/tip/src/disambig_cites.js#cl-66

The runDisambig() loop is the core of it. The key adjustment, which I
had thought about when building the previous module (but was too weary
to implement), was to encase the list of cites to be disambiguated in a
further list wrapper, so that sublists can be extracted and queued for
processing independently of the original set.

I was pleasantly surprised to discover how simple the processing flow
becomes with this loop structure. Non-clashing cites identified at
any increment are removed from the current list, composed into a
separate list with its own “base” expansion parameters, and the
freshly composed list bundle is appended to the wrapper queue, for
processing after the current list clears.
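
For implementers who would rather see the shape of the loop than read a description, here is a stripped-down sketch of the queue idea. It is not the code in disambig_cites.js: render(), runQueue(), the bundle fields, and the reduction of the "expansion parameters" to a single name count are all assumptions made to keep the example small, and given-name expansion is left out entirely.

```javascript
// Hypothetical renderer: show the first `count` names, then "et al."
// if any names were left off.
function render(cite, count) {
    const shown = cite.names.slice(0, count);
    let text = shown.join(", ");
    if (count < cite.names.length) {
        text += " et al.";
    }
    return text + " (" + cite.year + ")";
}

// Sketch of the wrapper queue: each bundle is a list of cites plus the
// name count at which that list starts processing.
function runQueue(cites, baseCount) {
    const queue = [{ cites: cites, count: baseCount }];
    const resolved = [];   // cites made unique by adding names
    const needSuffix = []; // groups that names alone cannot separate
    while (queue.length > 0) {
        const bundle = queue.shift();
        let list = bundle.cites;
        let count = bundle.count;
        while (list.length > 1) {
            const headForm = render(list[0], count);
            const clashing = [];
            const cleared = [];
            list.forEach(function (cite) {
                (render(cite, count) === headForm ? clashing : cleared).push(cite);
            });
            // Cites that no longer clash with the head leave the current
            // list and are queued as a bundle with their own base count.
            if (cleared.length > 0) {
                queue.push({ cites: cleared, count: count });
            }
            list = clashing;
            const exhausted = list.every(function (cite) {
                return count >= cite.names.length;
            });
            if (list.length === 1 || exhausted) {
                break;
            }
            count += 1;
        }
        if (list.length === 1) {
            resolved.push({ id: list[0].id, form: render(list[0], count) });
        } else {
            needSuffix.push({
                form: render(list[0], count),
                ids: list.map(function (cite) { return cite.id; })
            });
        }
    }
    return { resolved: resolved, needSuffix: needSuffix };
}

// Assumed input: the four cites from the earlier test case.
const result = runQueue([
    { id: "c1", names: ["Smith", "Jones", "Brown"], year: "1986" },
    { id: "c2", names: ["Smith", "Jones", "Brown"], year: "1986" },
    { id: "c3", names: ["Smith", "Jones", "Brown", "Green"], year: "1986" },
    { id: "c4", names: ["Smith", "Jones", "Brown", "Green"], year: "1986" }
], 1);
console.log(JSON.stringify(result.needSuffix));
```

With this input the sketch ends up with two separate groups awaiting a year-suffix (c1 with c2, and c3 with c4), which is the per-group behaviour discussed earlier in the thread.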

The only slightly annoying bit is the need (shared with the old code)
to “rewind” cites that clear in the givennames phase in by-cite mode,
to eliminate trailing names belatedly found not to be necessary for
disambiguation. (That part is in the decrementNames() function, at
the end of the file.)
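
The general idea of the rewind can be sketched as follows. This is only an illustration of trimming trailing names, not a transcription of decrementNames(): the function trimTrailingNames(), its parameters, and the string comparison against other rendered forms are assumptions, and given-name handling is ignored.

```javascript
// Hypothetical renderer: show the first `count` names, then "et al."
// if any names were left off.
function render(cite, count) {
    const shown = cite.names.slice(0, count);
    let text = shown.join(", ");
    if (count < cite.names.length) {
        text += " et al.";
    }
    return text + " (" + cite.year + ")";
}

// Drop trailing names one at a time for as long as the shorter form
// does not collide with any of the other rendered cites.
function trimTrailingNames(cite, count, minCount, otherForms) {
    while (count > minCount) {
        const shorter = render(cite, count - 1);
        if (otherForms.indexOf(shorter) !== -1) {
            break; // one fewer name would reintroduce the clash
        }
        count -= 1;
    }
    return count;
}

// Assumed example: the cite cleared while showing three names, but the
// third turns out not to be needed to keep it distinct.
const cite = { names: ["Smith", "Jones", "Brown"], year: "1986" };
const kept = trimTrailingNames(cite, 3, 1, ["Smith et al. (1986)"]);
console.log(render(cite, kept)); // Smith, Jones et al. (1986)
```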

So anyway, hurray, a bit of progress, and two cheers for slightly more
maintainable code. The basic approach might be useful to others,
hence this post.

Enjoy!

Frank