disambiguation, one more

Hi,

one more on disambiguation, because I do not understand the rationale
behind the following test:

disambiguate_AllNamesBaseNameCountOnFailureIfYearSuffixAvailable

Since ITEM-3 and ITEM-4 fail with disambiguate-add-names, I would then
think that the correct result should be:

Dropsy, Edward Enteritis, X. Fever (2000); Dropsy, Ernie Enteritis, Y. Fever (2000)

and not

Dropsy, Edward Enteritis, et al. (2000); Dropsy, Ernie Enteritis, et al. (2000)

What am I missing?

Andrea

We’ve been here before. Does a declaratory judgement have force of
precedent? :slight_smile:

http://sourceforge.net/mailarchive/forum.php?thread_name=53208a5f0903291905n7e3904a6k6901fb0976f1f55b%40mail.gmail.com&forum_name=xbiblio-devel

Seriously, though, this pushes beyond the limits of precision in the
style guides. I wonder what a copy editor would say?

Frank

I belong to a civil law system and so I do not feel committed to any
kind of stare decisis… :slight_smile:

Thread: [xbiblio-devel] One last disambiguation wrinkle | XBib

Yes, I remember that discussion. Still I had the feeling that the new
disambiguate-add-names treatment (if it fails all names should be
displayed, and, according to yesterday’s decision, this applies even
to the successive disambiguate-add-year-suffix rule) should apply to
this case too.

That is to say, if disambiguate-add-names fails, but the citation may
be disambiguated with given-names, then all names should be displayed
(without “et al.”). Then the minimum effort to disambiguate with given
names should be used.

In other words, what is the exact meaning of the specification when it
reads: “names that would otherwise be hidden as a result of et-al
abbreviation are added one by one, until either the target reference
is uniquely identified, or all names are shown”? Does this apply when
the citation CANNOT be disambiguated with names but CAN with
given-names (because we already decided it applies when a year-suffix
is needed)?
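
Read literally, the quoted sentence describes a simple loop. The Python
sketch below only illustrates that reading; it is not code from any
processor. `render` and `add_names` are hypothetical helpers, names are
compared as bare family names, and the other cites are assumed to be
expanded in step with the target. What to display when the loop fails is
exactly the open question here, so the sketch just reports the failure.

```python
def render(names, shown):
    """Render the first `shown` names, adding "et al." if any are hidden."""
    text = ", ".join(names[:shown])
    return text + (", et al." if shown < len(names) else "")


def add_names(target, others, use_first=1):
    """Add hidden names one by one until `target` is unique or all are shown.

    Returns (names_shown, disambiguated). Assumption: every other cite is
    rendered with the same number of names as the target.
    """
    for shown in range(use_first, len(target) + 1):
        mine = render(target, shown)
        if all(render(o, min(shown, len(o))) != mine for o in others):
            return shown, True
    # All names are shown and the cite is still ambiguous.
    return len(target), False
```

With family names only, the two Dropsy/Enteritis/Fever cites never clear,
so the loop ends with all three names shown and a failure flag; the caller
then has to decide between keeping all names and falling back to the base
form.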

Andrea

True, it is a bit inconsistent, isn’t it, and the same reasoning (ease
of locating the source in the reference list) applies. If the full
list of names is retained, only one name needs to be expanded for
disambiguation to succeed, though. I would be happy with either of
these two results for that test case:

[1] Dropsy, Edward Enteritis, Fever (2000); Dropsy, Ernie Enteritis,
Fever (2000)

[2] Dropsy, Enteritis, X. Fever (2000); Dropsy, Enteritis, Y. Fever (2000)

Of the two, [1] seems the more logical choice because it gives greater
“information weight” to earlier-listed authors. It also seems to
follow the specification description most closely. What’s your
feeling?

Frank

Another option would be to roll back the initials on Fever in the
Dropsy-Enteritis-Fever test case. They’re only there because that was
the last attempted

Hold that thought. The test is run with the “all-names” rule, so it
has to be X. Fever and Y. Fever. So your preferred result is the
right one, if we opt to retain all names. The processor should
produce [1] if the “by-cite” rule is used (at least that would be my
suggestion, following the reasoning above).

Andrea and I seem to be in the same space on this; if anyone has
objections, feel free to jump in.

Let’s reboot this, and take another look. I think the test is right
as it stands. In fact, the test passes before all names have been
added at the disambiguate-add-names stage.

We have works authored by the following persons:

Book A (2000) authors
Devon Dropsy
Edward Enteritis
Xavier Fever

Book B (2000) authors
Devon Dropsy
Ernie Enteritis
Yves Fever

We’re using the following options:
et-al-min="3"
et-al-use-first="1"
disambiguate-add-names="true"
disambiguate-add-givenname="true"
disambiguate-add-year-suffix="true"
givenname-disambiguation-rule="all-names"

So we start with one name:

Book A (2000) authors
Dropsy

Book B (2000) authors
Dropsy

For disambiguate-add-names, we’ll add names until the cites clear, if at all.

With “all-names” disambiguation, the base form of the names, before
the cites are run through disambiguation, will be:

Book A (2000) authors
Dropsy
Edward Enteritis
X. Fever

Book B (2000) authors
Dropsy
Ernie Enteritis
Y. Fever

This is because, under all-names disambiguation, the names must
first be distinguished, throughout the document.

So we add one name, producing:

Book A (2000) authors
Dropsy
Edward Enteritis

Book B (2000) authors
Dropsy
Ernie Enteritis

… and the cites clear.
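
The walk-through above can be replayed in a few lines of Python. This is
a rough sketch under simplifying assumptions, not the logic of an actual
processor: `name_forms`, `disambiguate_names`, `render` and `add_names`
are hypothetical helpers, a name has only three forms (family name, with
initial, with full given name), and the year is left out of the rendered
cite.

```python
BOOK_A = [("Devon", "Dropsy"), ("Edward", "Enteritis"), ("Xavier", "Fever")]
BOOK_B = [("Devon", "Dropsy"), ("Ernie", "Enteritis"), ("Yves", "Fever")]


def name_forms(given, family):
    """Progressive forms: family only, with initial, with full given name."""
    return [family, f"{given[0]}. {family}", f"{given} {family}"]


def disambiguate_names(cites):
    """Document-wide pass: give each person the shortest form that no
    other person shares at the same expansion level."""
    people = {person for cite in cites for person in cite}
    chosen = {}
    for person in people:
        rivals = [p for p in people if p != person]
        form = person[1]
        for level, form in enumerate(name_forms(*person)):
            if all(name_forms(*r)[level] != form for r in rivals):
                break
        chosen[person] = form
    return chosen


def render(cite, chosen, shown):
    """Render the first `shown` names of a cite, with "et al." if truncated."""
    text = ", ".join(chosen[p] for p in cite[:shown])
    return text + (", et al." if shown < len(cite) else "")


def add_names(cites, chosen, use_first=1):
    """Add names one at a time until all cites differ or names run out."""
    rendered = []
    longest = max(len(c) for c in cites)
    for shown in range(use_first, longest + 1):
        rendered = [render(c, chosen, min(shown, len(c))) for c in cites]
        if len(set(rendered)) == len(rendered):
            break
    return rendered


chosen = disambiguate_names([BOOK_A, BOOK_B])
result = add_names([BOOK_A, BOOK_B], chosen)
```

Here `chosen` maps the names to Dropsy, Edward Enteritis, Ernie Enteritis,
X. Fever and Y. Fever, and `result` holds "Dropsy, Edward Enteritis, et al."
and "Dropsy, Ernie Enteritis, et al.": the second name already separates
the cites, so Fever is never added.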

The processor should
produce [1] if the “by-cite” rule is used (at least that would be my
suggestion, following the reasoning above).

There are other issues with “by-cite”. Sorry for raising it here;
let’s hold that one for later, and a separate thread. But this test
result is correct as written, I think.

Sorry for the confused early response; it took me a while to get back
into the zone.

Frank

This seems a clear rule: before adding names to an “et-al” citation
the name must be first disambiguated if the add-given-name is set.

I will have to change the code significantly, even though the job
should not be hard, since this basically changes the disambiguation
algorithm. I have the feeling this could have some side effects I’m not
grasping yet, though. But I could be plainly wrong. I’m also wondering
whether this is actually consistent with the wording of the
specification.

I’m working on the by-cite implementation (which is just a special
case of the given-name disambiguation routine), so I will see what
happens.

Andrea

On Thu, Jun 3, 2010 at 10:17 PM, Andrea Rossato wrote:

On Thu, Jun 03, 2010 at 05:54:38PM +0900, Frank Bennett wrote:

This is because, under all-names disambiguation, the names must
first be distinguished, throughout the document.

This seems a clear rule: before adding names to an “et-al” citation
the name must be first disambiguated if the add-given-name is set.
[…]
I’m also wondering whether this is actually consistent with the
wording of the specification.

This is meant to explain the document-wide discrimination of names.
Seems clear, maybe it can be improved:

The scope of names transformation

With a value of “all-names”, “all-names-with-initials”,
“primary-name”, or “primary-name-with-initials”, disambiguation is
performed for all relevant names, without regard to ambiguity in
individual cites. Transformations governed by these rules apply to all
cites throughout the document. Disambiguation of cites is in this case
incidental to the disambiguation of names.


This explains the specific effect of “all-names”:


“all-names”
The default value. If a name is rendered the same in different
cites (e.g. “Doe 2000” and “Doe 2001”), the name is progressively
transformed until it can be distinguished from the others (e.g. “A.
Doe 2000” and “B. Doe 2001”), or until the transformation steps are
exhausted.


Do you find the reference to “cites” misleading? Or would you suggest
other amendments?

This is because, under all-names disambiguation, the names must
first be distinguished, throughout the document.

This seems a clear rule: before adding names to an “et-al” citation
the name must be first disambiguated if the add-given-name is set.
[…]
I’m also wondering whether this is actually consistent with the
wording of the specification.

This is meant to explain the document-wide discrimination of names.
Seems clear, maybe it can be improved:
[cut]
Do you find the reference to “cites” misleading? Or would you suggest
other amendments?

I actually find the add-givenname rule description perfectly clear.
What may be misleading is the fact that, in a five-step procedure, you
ask me to perform steps 2 and 3 while executing step 1.

I agree that this leads to the best possible result, but I wonder if
the five-step description of the disambiguation algorithm is helpful
for an implementer. I’ve been misled.

Andrea

I’ll take another look at the language.