I think we’ve got the basics of that: it’s the new “macro” element.
Bruce
Yes, I think macro is the core. I think we just need a way to
(1) include in a macro some variables that are automatically
altered as a bibliography is sorted, scanned, or referenced,
(2) give the macro-writer some predefined variables, and
(3) allow a macro’s results to look to a citation like variables in
a bibliographic record.
Starting with (3):
While or after sorting, number each bibliographic item. Make the
result a predefined variable, say, sort-order-number. The style’s
author writes a simple macro
Why create a separate variable here? What does this accomplish beyond
the current element?
[…]
Now maybe for a particular style, the key is more complicated.
I think an APA key can always be determined by looking at the
fields in a biblio item and the same fields in the previous and
next biblio items (previous and next in the sort order). For now
let me assume this is true of all styles.
As far as I know, this is a safe assumption.
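To make the "look at the previous and next items" idea concrete, here is a minimal Python sketch, not anything from CSL itself. It assumes each item can be reduced to an (author, year) key and that the list is already in sort order; the function name `assign_year_suffixes` is hypothetical.

```python
from string import ascii_lowercase


def assign_year_suffixes(items):
    """Return one suffix per item ('' when no disambiguation is needed).

    items: list of (author, year) tuples, assumed to be already sorted.
    Only the immediate neighbours in sort order are inspected.
    Assumption: no more than 26 items share the same key.
    """
    suffixes = []
    run_index = 0  # position of the current item within a run of equal keys
    for i, key in enumerate(items):
        same_as_prev = i > 0 and items[i - 1] == key
        same_as_next = i + 1 < len(items) and items[i + 1] == key
        run_index = run_index + 1 if same_as_prev else 0
        if same_as_prev or same_as_next:
            suffixes.append(ascii_lowercase[run_index])
        else:
            suffixes.append("")
    return suffixes


bibliography = [("Doe", 1987), ("Doe", 1987), ("Jones", 2004), ("Jones", 2005)]
print(assign_year_suffixes(bibliography))  # ['a', 'b', '', '']
```

Whether the letters come from a macro or a parser option, the comparison with the neighbouring items is the same.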
So then we need to give the macro-writer access to all fields for
the two surrounding items. He writes a macro like this:
Maybe that’s done differently, but the result is a variable,
disambiguating-letter, that the citation can now use.
There doesn’t seem to be much of a point to explicitly coding a
disambiguating-letter macro. We will need specialized rules to do it
(e.g., your increment=“abc” attribute) no matter what approach we take.
yields (Jones 2004; Jones 2005; Doe 1987a; Doe 1987b) assuming
that’s the order it was entered. Now change the sort order and the
grouping:
yields (Doe 1987a, b; Jones 2004, 2005).
(I might have the spaces and prefixes wrong on that.)
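As a rough illustration of the sort-then-group behaviour, here is a Python sketch under some loud assumptions: cites are plain (author, year, suffix) tuples, every same-author, same-year cite already carries a suffix, and the delimiters are hard-coded. The name `render_citation` is made up for this example.

```python
from itertools import groupby


def render_citation(cites):
    """Group adjacent cites by author, then by year, and join them.

    cites: list of (author, year, suffix) tuples, in display order.
    Assumption: cites sharing author and year all have a non-empty suffix.
    """
    groups = []
    for author, by_author in groupby(cites, key=lambda c: c[0]):
        years = []
        for year, by_year in groupby(by_author, key=lambda c: c[1]):
            suffixes = [c[2] for c in by_year]
            # First suffix is attached to the year, the rest follow bare.
            years.append(f"{year}{suffixes[0]}" + "".join(f", {s}" for s in suffixes[1:]))
        groups.append(f"{author} " + ", ".join(years))
    return "(" + "; ".join(groups) + ")"


entered = [("Jones", 2004, ""), ("Jones", 2005, ""), ("Doe", 1987, "a"), ("Doe", 1987, "b")]
print(render_citation(sorted(entered)))  # (Doe 1987a, b; Jones 2004, 2005)
```

Sorting and grouping are separate steps here, which is exactly where a sort-by order that differs from the group-by order would start to complicate things.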
XML doesn’t allow multiple attributes with the same name on an element,
for one thing, and doesn’t care about attribute order, for another.
You’d want to model this as:
...
or something of that sort. However, this allows users to do some
complicated things, which a parser would have to support, but which
might never get used (e.g., sort-by order different from group-by
order).
You can jigger the sortbys and groupbys to get any combination you
need. The macro decides whether it comes out as 1987A, 1987i,
1987-a, 1987 with a small-caps A, etc.
If you’re worried about the above cases, I’d suggest be a built-in variable. However, to
my knowledge, there are no styles that use something besides 2001a/b/
c for disambiguation of years, so the old option would probably be fine.
Some styles might require two or three macros, maybe disambiguated-
author, disambiguated-short-title, and disambiguating-letter. Some
would require none.
disambiguated-author is a different animal. I don’t know whether
there’s any need for extensibility here (my hunch is no), and I don’t
know whether it’s possible to create an extensible way of creating
this disambiguated-author macro that doesn’t require a host of new
variables.
My main complaints about the scheme you propose are as follows:
- There’s more logic than I feel is necessary here. We might handle
disambiguation better in some strange fringe cases, but the current
approach would work for almost everything. How many cases will there
be where: 1) you’re using an obscure bibliographic style that
handles disambiguation in some strange way, 2) this new syntax
handles disambiguation, but the old syntax does not, and 3) you are
using multiple sources by the same author published in the same year?
We provide the same disambiguation power EndNote does. Only BibTeX
might do it better, and that’s because its styles are actual
programming code (and you’ll probably have to run the style
formatting script 5 times to get it right). Word 2007 doesn’t even
support disambiguation, or didn’t as of the beta. More powerful
disambiguation is probably unimportant to 99% of our prospective user
base.
- Style authors might simply avoid this logic, because it’s
confusing to code, and because it’s very easy to complete your style
and not realize these things don’t work. Try adding this to apa.csl
and see how many lines you need to do it. We shouldn’t require that
all author-date styles replicate some complex series of sorting/
grouping rules. We should just implement these rules in the parser so
styles can easily enable/disable them.
You’re right, however, that we might want to provide more powerful
syntax for grouping. The two cases I can think of are:
(Doe 1987a; Doe 1987b) → (Doe 1987a, b) (your second example)
[1; 2; 3] → [1-3]
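For the second case, a parser-side rule could look something like this Python sketch. It is only an illustration: the name `collapse_numbers`, the "; " delimiter, and the choice to collapse only runs of three or more consecutive numbers are all assumptions, not anything a style specifies.

```python
def collapse_numbers(numbers):
    """Collapse citation numbers so consecutive runs print as 'first-last'.

    Assumption: only runs of three or more numbers are collapsed;
    a pair such as 1, 2 is left as two separate entries.
    """
    runs = []  # each run is [first, last]
    for n in sorted(set(numbers)):
        if runs and n == runs[-1][1] + 1:
            runs[-1][1] = n  # extend the current run
        else:
            runs.append([n, n])  # start a new run
    parts = []
    for first, last in runs:
        if last - first >= 2:
            parts.append(f"{first}-{last}")
        elif last > first:
            parts.extend([str(first), str(last)])
        else:
            parts.append(str(first))
    return "[" + "; ".join(parts) + "]"


print(collapse_numbers([1, 2, 3]))     # [1-3]
print(collapse_numbers([3, 1, 2, 7]))  # [1-3; 7]
```

Note that nothing here depends on author or year, which is part of why it is hard to express as a variation on the author-year grouping.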
Right now the first is handled by disambiguate-year-suffix-collapse,
which you dislike. I wouldn’t mind replacing it with something more
extensible, but the approach you describe here is, to me, excessively
complicated. Preferably, no one should have to define new macros
simply to handle disambiguation. Besides, it’s not intuitively clear
to me how to handle the latter case with your approach. We could
add a “group-by” option to replace disambiguate-year-suffix-collapse,
but then we’d either have to hard-code the options (perhaps author-
year, author, and cited-number) or come up with some other way of
specifying the syntax.
At least in Zotero, your first example:
[4:12-14; 4:45-47; 8:34] → [4:12-14, 45-47; 8:34]
is irrelevant, since to do this, you’d put “12-14, 45-47” into the
locator field. I’ve never seen a style that requires the author’s
name twice when specifying two page ranges, so this is probably safe.
I would not be opposed to changing to
a more general , where the value is the
name of a macro that describes the sort order, e.g.:
...
That’s five more lines of code, but eliminates the dirty “magic”
author macro.
Ultimately, there’s no question that disambiguation is a tough
problem to deal with. However, it’s easier for the programmer to
implement and harder for the style author to f**k up when all the
logic is in the parser and all the author has to do is set a few
options to “true”.
Simon