page number collapsing

Bruce_D_Arcus1 · May 26, 2009, 2:55pm

So do we want to support this?

http://forums.zotero.org/discussion/7185?page=1

If we do, it probably suggests a global (or maybe citation or bib)
option something like “page-number-collapse” with a value that is the
algorithm to use (the only one I implemented in my XSLT code was
Chicago’s).

Bruce

Rintze_Zelle · May 26, 2009, 3:12pm

A big yes. But why not use a number attribute instead of an option? That
seems a bit more flexible.

Rintze

Bruce_D_Arcus1 · May 26, 2009, 3:15pm

Makes sense.

Bruce_D_Arcus1 · May 26, 2009, 5:04pm

So I guess the RNC code would be something like:

# to be optional on cs:number
attribute collapse-range { "chicago" }

We’d thus want to document the chicago algorithm (and any others we’d include).
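
As a starting point for that documentation, here is a minimal Python sketch of the Chicago rules as I understand them from the manual's inclusive-numbers table: keep all digits when the first number is below 100 or a multiple of 100; give only the changed part for 101-109, 201-209, and so on; otherwise give at least two digits, more if needed to cover every changed digit. The function name `collapse_chicago` is hypothetical, the output uses a plain hyphen, and I believe editions of the manual differ on four-digit numbers, so the rules should be checked against the edition a style targets.

```python
def collapse_chicago(first: int, last: int) -> str:
    """Abbreviate the second number of an ascending range (assumed Chicago rules)."""
    if last <= first:
        raise ValueError("expected an ascending range")
    a, b = str(first), str(last)
    # Keep all digits: first number below 100, a multiple of 100,
    # or a second number with more digits than the first.
    if first < 100 or first % 100 == 0 or len(b) != len(a):
        return f"{a}-{b}"
    # Count the leading digits the two numbers share; the rest is the changed part.
    common = 0
    while a[common] == b[common]:
        common += 1
    changed = len(a) - common
    if first % 100 < 10:
        # 101-109, 201-209, ...: changed part only.
        keep = changed
    else:
        # 110-199, 210-299, ...: two digits, or more if more digits changed.
        keep = max(2, changed)
    return f"{a}-{b[-keep:]}"


for first, last in [(3, 10), (101, 108), (321, 328), (498, 532), (1087, 1089)]:
    print(collapse_chicago(first, last))
```

Under these assumptions the loop prints 3-10, 101-8, 321-28, 498-532 and 1087-89.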

Of course, we also have to figure out that other problem that Frank
noted with locators.

Bruce

Rintze_Zelle · May 26, 2009, 7:34pm

I guess there also should be an (explicit) option so that a page range gets
expanded if it is supplied in a collapsed form.

Rintze

Bruce_D_Arcus1 · May 26, 2009, 8:34pm

Blah.

Maybe …

Bruce

Frank_Bennett · May 26, 2009, 10:40pm

citeproc-js has a range mechanism (used for citation-number and
year-suffix collapsing) that could be extended to support this, but it
requires clean integers and formatting hints to work from. The
problem is how to get the data into structured form without adding
significantly to the hassle of data entry, and I think that’s going to
be a real problem, unfortunately. It’s date parsing on steroids, with
roman numerals (pages xi-xxv or XI-XXV), prefixed sequence numbers
(sections N23-N25), and combined locators with labels (ch. 3, pp. 3-7,
chs. 4-9). There’s very little structure to work from.

I agree that it would be really nice to have this, for a bunch of
reasons (I can imagine a world where clicking on pages while taking
notes on a document sets the correct locators for a cite tied to a
note), but the UI challenges would be formidable.

Bruce_D_Arcus1 · May 26, 2009, 11:11pm

I’m not sure it’s that big a problem. The practical use case for this
is page numbers for the source, not so much locators. In most apps,
this will be a single field (as in Zotero), or maybe even two.

Bruce

Frank_Bennett · May 27, 2009, 12:04am

Even for simple page numbers you’d need to do something to handle
upper and lowercased roman numerals, I suppose, so the application
would need to cope with that and deliver a number and a hint. Apart
from that, so long as all that is needed is range collapsing against a
list of numbers, it’s no real problem at the implementation end.
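
To make "a number and a hint" concrete, here is a rough Python sketch of what the application side might deliver for a single page number. The function name `parse_page_number` and the hint labels (`arabic`, `roman-lower`, `roman-upper`) are made up for illustration and are not part of citeproc-js or CSL.

```python
import re

_ROMAN_VALUES = {"i": 1, "v": 5, "x": 10, "l": 50, "c": 100, "d": 500, "m": 1000}
# Roman numerals in standard form, 1 to 3999, lowercase.
_ROMAN_RE = re.compile(r"m{0,3}(cm|cd|d?c{0,3})(xc|xl|l?x{0,3})(ix|iv|v?i{0,3})")


def parse_page_number(text: str) -> tuple[int, str]:
    """Return (value, hint) for an arabic or roman page number."""
    text = text.strip()
    if re.fullmatch(r"[0-9]+", text):
        return int(text), "arabic"
    lowered = text.lower()
    # Accept all-lowercase or all-uppercase numerals, but not mixed case.
    if text and text in (lowered, text.upper()) and _ROMAN_RE.fullmatch(lowered):
        values = [_ROMAN_VALUES[ch] for ch in lowered]
        total = 0
        for i, value in enumerate(values):
            following = values[i + 1] if i + 1 < len(values) else 0
            # A smaller value before a larger one is subtracted (iv, ix, xl, ...).
            total += -value if value < following else value
        hint = "roman-lower" if text == lowered else "roman-upper"
        return total, hint
    raise ValueError(f"not a plain page number: {text!r}")


print(parse_page_number("xi"))   # (11, 'roman-lower')
print(parse_page_number("XXV"))  # (25, 'roman-upper')
```

With the value and the hint in hand, the processor can collapse against plain integers and then render the result back in the original numeral style.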

But doesn’t the distinction between locators and page specifiers begin
to evaporate with the introduction of hierarchical relations?

Bruce_D_Arcus1 · May 27, 2009, 2:14am

> …
>
> Even for simple page numbers you’d need to do something to handle
> upper and lowercased roman numerals, I suppose, so the application
> would need to cope with that and deliver a number and a hint. Apart
> from that, so long as all that is needed is range collapsing against a
> list of numbers, it’s no real problem at the implementation end.

Well, and we can also define what’s allowed. For example, we can start
by saying only page numbers can get collapsed, and only if the input
is an integer range.
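
A restriction like that is cheap to check. The sketch below is only an illustration (the name `parse_integer_range` is hypothetical): it returns the two integers when the whole field is a plain ascending integer range, and `None` for anything else, in which case the processor would pass the field through untouched.

```python
import re

# Two runs of ASCII digits separated by a hyphen or an en dash, nothing else.
_INT_RANGE = re.compile(r"\s*([0-9]+)\s*[-\u2013]\s*([0-9]+)\s*")


def parse_integer_range(pages: str):
    """Return (first, last) for a plain ascending integer range, else None."""
    match = _INT_RANGE.fullmatch(pages)
    if match is None:
        return None
    first, last = int(match.group(1)), int(match.group(2))
    # Input that is already collapsed (e.g. "321-28") is not ascending, so skip it.
    return (first, last) if first < last else None


print(parse_integer_range("321-328"))  # (321, 328)
print(parse_integer_range("xi-xxv"))   # None
print(parse_integer_range("E53-E56"))  # None
```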

> But doesn’t the distinction between locators and page specifiers begin
> to evaporate with the introduction of hierarchical relations?

No. By “locators” here I meant the details of the citation, not
the source.

Bruce

Rintze_Zelle · May 27, 2009, 5:22am

FWIW, I was annoyed by PubMed serving collapsed page ranges in its XML, so I
wrote a bit of translator code some time ago to expand page ranges. I think
it already should handle your last two cases: prefixed sequence numbers,
which are becoming common with electronic-only journals (e.g. E53-E56), and
multiple number ranges in a single string (also important for non-continuous
page ranges). Roman numerals shouldn’t be much of a problem either, as long
as you (reliably) can use hyphens as indicators that a range is present in
the string.
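
To illustrate the idea (this is not the translator patch itself, just a rough Python sketch with a hypothetical function name), expansion can be done by treating each hyphenated pair of numbers as a range and borrowing the missing leading digits, and any missing prefix, from the first number:

```python
import re

# Optional letter prefix + digits, a hyphen, optional letter prefix + digits.
_RANGE = re.compile(r"([A-Za-z]*)([0-9]+)\s*-\s*([A-Za-z]*)([0-9]+)")


def expand_page_ranges(pages: str) -> str:
    """Expand every collapsed range in the string, e.g. '321-8' to '321-328'."""

    def expand(match: re.Match) -> str:
        prefix, first, prefix2, last = match.groups()
        if len(last) < len(first):
            # Borrow the missing leading digits from the first number.
            candidate = first[: len(first) - len(last)] + last
            if int(candidate) < int(first):
                return match.group(0)  # ambiguous input, leave it untouched
            last = candidate
        return f"{prefix}{first}-{prefix2 or prefix}{last}"

    return _RANGE.sub(expand, pages)


print(expand_page_ranges("E53-6"))         # E53-E56
print(expand_page_ranges("101-8, 321-28"))  # 101-108, 321-328
```

The sketch relies on the same assumption as above: a hyphen reliably marks a range. Roman numerals and en dashes pass through unchanged.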

(original patch)

(plus a minor bug-fix)

Rintze

Frank_Bennett · May 27, 2009, 5:33am

Great stuff! I’m not against supporting parsed ranges per se; I just
think that maybe the CSL processor isn’t the best place for string
parsing code to reside.
