SRU

For reference, there’s a long thread in the comments on this:

http://netapps.muohio.edu/blogs/darcusb/darcusb/archives/2004/11/27/citation-ids

Bruce

Date: Tue, 17 May 2005 20:05:00 +0200
From: Matthias Steffens <@Matthias_Steffens>

Perhaps. We could have the database remember which was the first
Janensch 1929 paper it was told about, and have that one remain
in use whenever plain “Janensch1929” is used?

Yep, that would be a good rule. Still, it would only work within
the scope of one database. What if there are two or more
databases where different Janensch articles from 1929 were
identified as “Janensch1929”?
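A minimal sketch of the "first paper wins" rule and of why it stops at the database boundary. The class name and the record IDs (`record-A1`, `record-A2`) are hypothetical placeholders, not anything from an existing system.

```python
class CiteKeyRegistry:
    """Maps a plain cite key to the first record registered under it."""

    def __init__(self):
        self._first = {}

    def register(self, key, record_id):
        # setdefault leaves an existing entry alone, so the first paper wins.
        return self._first.setdefault(key, record_id)

    def resolve(self, key):
        return self._first[key]


# Database A hears about paper A1 first; the later A2 does not displace it.
db_a = CiteKeyRegistry()
db_a.register("Janensch1929", "record-A1")
db_a.register("Janensch1929", "record-A2")

# Database B hears about A2 first, so the same key means a different paper.
db_b = CiteKeyRegistry()
db_b.register("Janensch1929", "record-A2")
```

Within one registry the key is stable; across the two registries `Janensch1929` resolves to different records, which is exactly the objection raised above.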

So are you saying we need to use “Janensch1929p39W” from the get-go?

(If so, the “JanenschW1929p39” would be less offensive.)

Well, I fear that there are cases where even “Janensch1929p39W” isn’t
unique. In fact, any cite key may fail that does not include all of
the important bibliographic source info.

Absolutely true. This is why I am calling these keys “nearly unique”.
We have to judge where the trade-off is between a tiny bit more
uniqueness and the extra verbosity/opacity that it requires. My own
feeling is that Janensch1929p39W is at, or maybe beyond, the point
where it’s worth adding more stuff in.

Here’s what I’d like to be able to say:

\cite{Janensch1929}
\cite{Janensch1929p39}
\cite{Janensch1929p39W}

in ascending order of whether-I-need-it-ness, and if that fails:
\cite{doi:1234.5678/abcdefg}
for articles that have DOIs, and
\cite{openurl:rft.aulast=Janensch&…}
for those that don’t.
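The escalation above can be sketched as "pick the shortest key that no other record in the collection would also produce". This is only an illustration under assumptions: the record fields (`surname`, `year`, `page`, `initial`, `doi`) and the second Janensch page number are made up for the example, and the OpenURL fallback is left out because its full form is not given here.

```python
from collections import Counter


def candidate_keys(rec):
    """Shortest-first keys: Janensch1929, Janensch1929p39, Janensch1929p39W."""
    base = f"{rec['surname']}{rec['year']}"
    paged = f"{base}p{rec['page']}"
    return [base, paged, f"{paged}{rec['initial']}"]


def assign_keys(records):
    """Give each record its shortest key that no other record also produces."""
    counts = Counter(k for rec in records for k in candidate_keys(rec))
    assigned = []
    for rec in records:
        key = next((k for k in candidate_keys(rec) if counts[k] == 1), None)
        if key is None and rec.get("doi"):
            # Last resort for records that have a DOI.
            key = "doi:" + rec["doi"]
        assigned.append(key)
    return assigned


records = [
    {"surname": "Janensch", "year": 1929, "page": 39, "initial": "W"},
    {"surname": "Janensch", "year": 1929, "page": 1, "initial": "W"},  # hypothetical
    {"surname": "WilsonSereno", "year": 1998, "page": 1, "initial": "J"},  # hypothetical
]
keys = assign_keys(records)
```

With this data the two Janensch papers need the page suffix, while `WilsonSereno1998` stays short. Uniqueness is still only relative to the records the collection knows about.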

In our database we have the same problem with naming of files that
are associated with a given database entry. We use the DOI if
available, otherwise we use file names like

Angel1994Nature367p126.pdf
Thomas+Dieckmann2004Science276p394.pdf
AdamsEtal2001MarBiol138p281.pdf

However, while these names may be unique (who knows if they really
are!?) they are ugly when used as cite keys within a document – but
they are still better than a DOI number, IMHO.
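For illustration, the three file names above are consistent with a rule like the following. The author-count convention (one surname alone, two joined with `+`, three or more as first surname plus `Etal`) is inferred from the examples rather than stated, and the function name and the co-author placeholders are hypothetical.

```python
def attachment_name(surnames, year, journal_abbrev, volume, first_page):
    """Build a file name such as Angel1994Nature367p126.pdf."""
    if len(surnames) == 1:
        who = surnames[0]
    elif len(surnames) == 2:
        who = "+".join(surnames)
    else:
        # Assumed convention for three or more authors.
        who = surnames[0] + "Etal"
    return f"{who}{year}{journal_abbrev}{volume}p{first_page}.pdf"


one = attachment_name(["Angel"], 1994, "Nature", 367, 126)
two = attachment_name(["Thomas", "Dieckmann"], 2004, "Science", 276, 394)
# "CoauthorB" and "CoauthorC" are placeholders; only the first surname is used.
many = attachment_name(["Adams", "CoauthorB", "CoauthorC"], 2001, "MarBiol", 138, 281)
```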

Yes. To me, they are past the point of being easily typable. I can
throw in a quick \cite{WilsonSereno1998} without thinking about it, and
the goal has to be to make that possible in 95% of cases where the
data allows.

/| ___________________________________________________________________
/o ) / Mike Taylor <@Mike_Taylor> http://www.miketaylor.org.uk
)v_/\ “It became necessary to destroy the village in order to save it”
-- Attributed to an anonymous senior US military officer.

--
Listen to free demos of soundtrack music for film, TV and radio
http://www.pipedreaming.org.uk/soundtrack/

Date: Tue, 17 May 2005 13:53:33 -0400
From: Bruce D’Arcus <@Bruce_D_Arcus1>

So are you saying we need to use “Janensch1929p39W” from the get-go?

(If so, the “JanenschW1929p39” would be less offensive.)

For reference, there’s a long thread in the comments on this:

http://netapps.muohio.edu/blogs/darcusb/darcusb/archives/2004/11/27/citation-ids

Thanks, there are some interesting ideas here.

(I’ve got to stop getting involved in these things! :-)

/| ___________________________________________________________________
/o ) / Mike Taylor <@Mike_Taylor> http://www.miketaylor.org.uk
)v_/\ “St. Augustine […] came up with the conclusion that the story
in Genesis 1 and Genesis 2 was not a simple historical sequence
of events. It just couldn’t be. It’s not what the words meant.
It just wasn’t” -- Robert Bakker.

--
Listen to free demos of soundtrack music for film, TV and radio
http://www.pipedreaming.org.uk/soundtrack/

Look at it this way: the time you wasted now may be saved many times
over in the future if you end up with better tools ;-)

Bruce

How would that “map” be implemented?

Bruce

Date: Tue, 17 May 2005 18:42:36 -0400
From: Bruce D’Arcus <@Bruce_D_Arcus1>

Each user has a map between their own citation key (eg Sanderson05) and
the globally unique identifier for the article
(doi://dlib.org/sanderson/05/03/sandersonLevanYoung03 or whatever)
and the system does the mapping at search time.

How would that “map” be implemented?

DBM?

(I am only half joking. The point about the map is that it’s so
trivial. It really is just a hash-table, nothing more.)
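To show how little is involved, here is a rough sketch of the per-user map using Python's standard `dbm` module. The function names and file location are hypothetical, and the identifier is the illustrative one quoted above, not a real DOI.

```python
import dbm
import os
import tempfile


def remember(path, cite_key, global_id):
    """Store one cite key -> global identifier pair in the user's map."""
    with dbm.open(path, "c") as db:
        db[cite_key.encode("utf-8")] = global_id.encode("utf-8")


def lookup(path, cite_key):
    """Return the global identifier for a cite key, or None if unknown."""
    with dbm.open(path, "c") as db:
        key = cite_key.encode("utf-8")
        return db[key].decode("utf-8") if key in db else None


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "citekeys")  # one such file per user
    remember(path, "Sanderson05",
             "doi://dlib.org/sanderson/05/03/sandersonLevanYoung03")
    found = lookup(path, "Sanderson05")
    missing = lookup(path, "Taylor03")
```

The whole thing is a persistent hash table: one write function, one read function, and a miss returns nothing.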

/| ___________________________________________________________________
/o ) / Mike Taylor <@Mike_Taylor> http://www.miketaylor.org.uk
)v_/\ “NASA Delays Shuttle Launch Out Of Sheer Habit” – headline
from www.theonion.com

--
Listen to free demos of soundtrack music for film, TV and radio
http://www.pipedreaming.org.uk/soundtrack/

Each user has a map between their own citation key (eg Sanderson05)
and the globally unique identifier for the article and the system does
the mapping at search time.

How would that “map” be implemented?

DBM?
(I am only half joking. The point about the map is that it’s so
trivial. It really is just a hash-table, nothing more.)

Right. In the CQL you’d do something like:

xbib.userCiteKey any "sandy05 taylor03 bdarcus04"

And the system would know that userCiteKey words need to be looked up in a
table somewhere.
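A sketch of what that lookup could look like on the server side, assuming the clause is rewritten into a search on global identifiers before it runs. The target index `rec.identifier`, the function name, and the `id:...` values in the table are all assumptions for illustration; a real server would work on the parsed CQL tree rather than on a regular expression.

```python
import re

# Matches a clause of the form: xbib.userCiteKey any "key1 key2 ..."
CLAUSE = re.compile(r'xbib\.userCiteKey\s+any\s+"([^"]*)"')


def expand_user_keys(query, key_table, target_index="rec.identifier"):
    """Replace user cite keys with the global identifiers from key_table."""
    def rewrite(match):
        # Keys missing from the user's table are silently dropped here.
        ids = [key_table[k] for k in match.group(1).split() if k in key_table]
        joined = " ".join(ids)
        return f'{target_index} any "{joined}"'

    return CLAUSE.sub(rewrite, query)


# Hypothetical per-user table; identifiers containing spaces would need
# extra quoting that this sketch does not handle.
table = {"sandy05": "id:1001", "taylor03": "id:1002", "bdarcus04": "id:1003"}
rewritten = expand_user_keys(
    'xbib.userCiteKey any "sandy05 taylor03 bdarcus04"', table
)
```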

Rob

       ,'/:.          Dr Robert Sanderson (@Dr_Robert_Sanderson)
     ,'-/::::.        http://www.csc.liv.ac.uk/~azaroth/
   ,'--/::(@)::.      Dept. of Computer Science, Room 805
 ,'---/::::::::::.    University of Liverpool
____/:::::::::::::.
I L L U M I N A T I   Cheshire3 IR System: http://www.cheshire3.org/