Disambiguation in note styles

A note on some fresh developments in citeproc-js land that affect the CSL
test suite.

In response to feedback on the MLZ Bluebook style, I put in some work in
citeproc-js to get backreference glosses working. The form implemented for
Bluebook support looks like this:

Smith, His Very Long Book Title (2000) [hereinafter Smith, His Book]

The tricky bits are that (a) the gloss should be applied only if there are
subsequent back-references; and (b) the note number should be included for
disambiguation purposes. A test that captures the behaviour is here:

https://bitbucket.org/bdarcus/citeproc-test/src/737afd7171005f9d53cf221f8a71f21007d10386/processor-tests/humans/disambiguate_BasedOnSubsequentFormWithBackref2.txt

A question for the list is whether first-reference-note-number should
always be included for disambiguation purposes, or whether it should be
discretionary. In the current implementation, it is included only if
givenname-disambiguation-rule=“by-cite” (the default). When another rule is
used, the cite to Roe in the test fixture linked above would have the
gloss, and the backreference would show the title.

While the name of givenname-disambiguation-rule suggests that it affects
only given names, the general effect of the “by-cite” rule is to make
citations as compact as possible; and dropping the gloss where it is not
strictly necessary has that effect.

While testing the implementation, I found it necessary, in styles that use
disambiguate=“true”, to force a rerun of disambiguation for first
references that are moved in the document, together with all
back-references that point to them, to ensure that the document reflects the
actual disambiguation state of each reference in the set. This change in
behaviour affected three tests in the test suite:

Finally, to control the appearance of the gloss on first references, I had
to introduce a test condition (which I’ve added to the CSL-m schema) that
returns true only if there are subsequent references to the item:

A condition that tests for subsequent back-references is needed to
implement back-reference glosses, regardless of whether note numbers are
included for disambiguation purposes.

Frank

sorry, I feel like I’m missing things here.
Why would the appearance of first-reference-note-number be contingent
on a disambiguation preference unless it is in an if loop testing for
disambiguation?
Also, not all styles will have givenname disambiguation at all, so
it’s very much possible to have a style without a givenname
disambiguation rule. Generally I’m not happy using that rule for
anything but givennames—that’s just going to create chaos.

I’m probably missing something, but I’m going to guess that I’ve spent
more time on this than most others, so if I don’t get it, probably a
lot of others won’t, either. So maybe you could step back a bit and
try to explain again why the disambiguation rule and the
first-note number are involved here?

Disambiguation is just hard, and I may have misstated some things (as well
as being unclear). Let’s take the points in order.

Disambiguation settings are a property of the disambiguation “pool” of
which the item (not the specific cite) is a member. Members of a pool are
those that render identically with all disambiguation settings turned off.

In the current processor version running in Zotero, ambiguity can be
determined by comparing bare items, without regard to their position in the
document. As a result, everything works fine.
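As a rough illustration of the pool idea, here is a minimal sketch. It assumes a hypothetical `render_bare` function that stands in for rendering an item with all disambiguation settings turned off (here, surname only). The item fields and function names are invented for the example and are not citeproc-js's API.

```python
from collections import defaultdict


def render_bare(item):
    # Hypothetical stand-in for rendering with all disambiguation
    # settings turned off: the author's surname only.
    return item["author"]


def build_pools(items):
    """Group item ids by bare rendering; keep only groups with more than one member."""
    pools = defaultdict(list)
    for item in items:
        pools[render_bare(item)].append(item["id"])
    return {key: ids for key, ids in pools.items() if len(ids) > 1}


items = [
    {"id": "smithA", "author": "Smith", "title": "Book A"},
    {"id": "smithB", "author": "Smith", "title": "Book B"},
    {"id": "jones", "author": "Jones", "title": "Other Book"},
]
pools = build_pools(items)
print(pools)
```

Note that the grouping looks only at the items themselves, never at where they are cited, which is why position in the document does not matter in this mode.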

When note numbers are included in the comparison, the relative position of
two cites becomes relevant to the disambiguation comparison. If cites are
in separate notes, they are always unambiguous:

  1. Smith, Book A (1999)
  2. Smith, Book B (2000)
  3. Jones, Other Book (2014)
  4. Smith, supra note 2.

In this example, we know which work by Smith is intended in note 4, because
it points to a note that contains only one item by Smith.

On Thu, Jun 12, 2014 at 10:34 AM, Sebastian Karcher wrote:

sorry, I feel like I’m missing things here.
Why would the appearance of first-reference-note-number be contingent
on a disambiguation preference unless it is in an if loop testing for
disambiguation?


Scenario 1

If note 2 in the example above is deleted, and the Book B reference is
added to note 1, the title must be added to the subsequent reference:

  1. Smith, Book A (1999); Smith, Book B (2000)
  2. Jones, Other Book (2014)
  3. Smith, Book B, supra note 1.

In the current version of the processor in Zotero, disambiguation
evaluation is performed only when items are inserted or (entirely) removed.
Therefore, the insertion of Book B to note 1 does not trigger reevaluation
of disambiguation parameters, and the title (“Book B”) is not added; only
the note number will change, as a result of rerendering the cite, with the
same disambig parameters, using refreshed input data.

The solution in this case is to rerun disambiguation of all partners in the
disambiguation set (i.e. the Book A and Book B items).
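To make the expected output concrete, the sketch below renders a back-reference and adds the title only when a pool partner has its first reference in the same note. This is one reading of the rule, not the processor's actual code: the data layout (a list of notes, each a list of item ids) and the function names are assumptions made for illustration.

```python
def first_reference_notes(notes):
    """Map each item id to the 1-based number of the note that first cites it."""
    first = {}
    for number, cited in enumerate(notes, start=1):
        for item_id in cited:
            first.setdefault(item_id, number)
    return first


def render_backref(item, pool, notes):
    """Render a supra cite, adding the title only if the note number alone is ambiguous."""
    first = first_reference_notes(notes)
    target = first[item["id"]]
    # Ambiguous if another pool member is first cited in the same note.
    ambiguous = any(
        first.get(partner) == target for partner in pool if partner != item["id"]
    )
    parts = [item["author"]]
    if ambiguous:
        parts.append(item["title"])
    parts.append(f"supra note {target}.")
    return ", ".join(parts)


book_b = {"id": "smithB", "author": "Smith", "title": "Book B"}
pool = ["smithA", "smithB"]

separate = [["smithA"], ["smithB"], ["jones"], ["smithB"]]
merged = [["smithA", "smithB"], ["jones"], ["smithB"]]

print(render_backref(book_b, pool, separate))  # Smith, supra note 2.
print(render_backref(book_b, pool, merged))    # Smith, Book B, supra note 1.
```

Because the answer depends on where every pool partner is first cited, moving one first reference can change the rendering of cites to a different item.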

You are right that changes below the first reference will not affect the
cites in note 1, in this example. Avoiding an update to first-position
cites when it is not strictly necessary is not quite so simple as it may
sound, however. A further example may help to illustrate.


Scenario 2

Let’s start with the same arrangement as Scenario 1, but with a hereinafter
gloss following the ambiguous first-reference cite, as required by some
styles:

  1. Smith, Book A (1999); Smith, Book B (2000) [hereinafter Smith, Book B]
  2. Jones, Other Book (2014)
  3. Smith, Book B, supra note 1.

If note 3 is edited to remove Book B, the gloss becomes unnecessary. The
cites should look like this:

  1. Smith, Book A (1999); Smith, Book B (2000)
  2. Jones, Other Book (2014)
  3. Ibid.

In this case, an update to note 1 is required when the Book B reference is
removed. This is not required in the previous case, but to determine when
it is and is not required, we must identify whether a disambiguate=“true”
condition will be encountered when it is rerendered. Given the potential
complexity of condition statements, the simplest way to do that is to rerun
all cites in the pool.
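A minimal sketch of that approach follows, under simplifying assumptions: the gloss is added whenever the item is cited again later, and every pool member's first reference is rerendered after an edit, with no attempt to skip unaffected cites. The names `has_subsequent_reference`, `render_first` and `rerender_pool` are hypothetical and do not come from citeproc-js or the CSL-m schema.

```python
def has_subsequent_reference(item_id, notes):
    """True if the item is cited more than once in the document."""
    return sum(cited.count(item_id) for cited in notes) > 1


def render_first(item, notes):
    """Render a first reference, adding a hereinafter gloss only if it will be used."""
    text = f'{item["author"]}, {item["title"]} ({item["year"]})'
    if has_subsequent_reference(item["id"], notes):
        text += f' [hereinafter {item["author"]}, {item["title"]}]'
    return text


def rerender_pool(pool, items, notes):
    """Rerender the first reference of every member of the pool."""
    return {item_id: render_first(items[item_id], notes) for item_id in pool}


items = {
    "smithA": {"id": "smithA", "author": "Smith", "title": "Book A", "year": 1999},
    "smithB": {"id": "smithB", "author": "Smith", "title": "Book B", "year": 2000},
}
pool = ["smithA", "smithB"]

before = [["smithA", "smithB"], ["jones"], ["smithB"]]
after = [["smithA", "smithB"], ["jones"], ["jones"]]  # Book B removed from note 3

print(rerender_pool(pool, items, before)["smithB"])
print(rerender_pool(pool, items, after)["smithB"])
```

Rerendering the whole pool trades a little extra work for not having to analyse the style's condition statements.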

There may be some scope for reducing this small overhead without breaking
things, but it seems sensible to start with a procedure that is known to
work.

Anyway, that’s the thinking there.

(oops. the second example should be Scenario 2, of course.)

sorry, I feel like I’m missing things here.
Why would the appearance of first-reference-note-number be contingent
on a disambiguation preference unless it is in an if loop testing for
disambiguation?
Also, not all styles will have givenname disambiguation at all, so
it’s very much possible to have a style without a givenname
disambiguation rule. Generally I’m not happy using that rule for
anything but givennames—that’s just going to create chaos.

“Chaos” is a little strong, surely; but I understand your reservation, and
that’s why I explained the rationale for the choice: it’s certainly not
carved in stone or anything.

Since givenname-disambiguation-rule=“by-cite” is the default behaviour, the
default behaviour here would be to include first-reference note numbers
when disambiguating. That would be toggled off when
givenname-disambiguation-rule is set to some other value.

Alternatively, the setting could easily be given its own attribute. All
that would be needed is to decide what it should be called, and what its
default value would be. A total of nine independent repository styles use
first-reference-note-number and perform disambiguation of some sort, so the
set is pretty limited – mostly legal styles. From a quick look at
available documentation, I’d say that guides are generally unclear on what
exactly is meant by “ambiguity”, but flexibility is a good thing, so I’ll
revise and suggest a solo attribute
“disambiguate-on-first-reference-note-number”, with a default value of
“false”.
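The difference between the two designs can be sketched as follows. This is only an illustration of the defaults described above: style attributes are modelled as a plain dictionary and both function names are hypothetical.

```python
def note_number_disambiguates_current(style_attrs):
    # Current behaviour: tied to givenname-disambiguation-rule,
    # whose default value is "by-cite".
    rule = style_attrs.get("givenname-disambiguation-rule", "by-cite")
    return rule == "by-cite"


def note_number_disambiguates_proposed(style_attrs):
    # Proposed behaviour: a dedicated attribute, defaulting to "false".
    value = style_attrs.get("disambiguate-on-first-reference-note-number", "false")
    return value == "true"


plain_style = {}
opted_in = {"disambiguate-on-first-reference-note-number": "true"}

print(note_number_disambiguates_current(plain_style))
print(note_number_disambiguates_proposed(plain_style))
print(note_number_disambiguates_proposed(opted_in))
```

With a dedicated attribute, a style that sets no given-name rule at all is unaffected unless it opts in.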

I’m probably missing something, but I’m going to guess that I’ve spent
more time on this than most others, so if I don’t get it, probably a
lot of others won’t, either. So maybe you could step back a bit and
try to explain again why the disambiguation rule and the
first-note number are involved here?

The test case shows what the code needs to accomplish.

As CSL revisions are not an issue at the moment, I’m just posting this so
that the details will be on file when the design cycle rolls around again.
There isn’t any pressure to make decisions about it in the short term.

Frank

I get confused by long messages :wink:

Can we start with big picture? Please confirm the following:

  1. this question is driven by the idiosyncrasies of supra referencing?*

  2. CSL doesn’t currently support supra referencing, and so this is an
    extension in MLZ?

If both are true, perhaps you should go with your suggestion (which seems
reasonable), see how it works, and use that experience to suggest possible
additions or changes to CSL proper?

Bruce

  • I do know this is an important feature, but man I hate it; not only is it
    a PITA to implement, it’s hostile to readers (me!). One additional wrinkle
    here related to both: what’s the scope for back referencing in a 600-page
    book? The book? The chapter? The page? Do you need to allow this to be
    configured? If yes, how would you even implement it? :wink:

I get confused by long messages :wink:

Can we start with big picture? Please confirm the following:

  1. this question is driven by the idiosyncrasies of supra referencing?*

Yes.

  2. CSL doesn’t currently support supra referencing, and so this is an
    extension in MLZ?

It does, actually: first-reference-note-number is one of the standard CSL
variables, from CSL 1.0:

http://citationstyles.org/downloads/specification.html#standard-variables

CSL also supports the “five-footnote rule” imposed by some legal styles:

http://citationstyles.org/downloads/specification.html#note-distance

If both are true, perhaps you should go with your suggestion (which seems
reasonable), see how it works, and use that experience to suggest possible
additions or changes to CSL proper?

Yep!

Bruce

  • I do know this is an important feature, but man I hate it; not only is
    it a PITA to implement, it’s hostile to readers (me!). One additional
    wrinkle here related to both: what’s the scope for back referencing in a
    600-page book? The book? The chapter? The page? Do you need to allow this
    to be configured? If yes, how would you even implement it? :wink:

It’s used almost exclusively in article-length works, scoped to the
individual article.

sorry if “Chaos” sounded rough–it was just short-hand for a concern
that it might be confusing. I think I understand the issue now and
that makes sense. I’d much rather have that in a separate attribute,
not just because of the naming, but also because I think
systematically the givenname-disambiguation-rule should only apply
when givenname disambiguation is turned on, which may or may not be
the case in the references you’re dealing with.

Sebastian