Input: page field

Frank_Bennett · August 10, 2013, 11:39pm

From comments posted by Brecht Machiels.

Brecht writes:

Which values are allowed for the “page” input field? I see multiple
ranges can also be specified. I think the CSL spec should, in general,
also define the format of the input fields. Personally, I would opt for a
structured format (like the date fields) as opposed to a string-format
(the page field). Individual CSL processors can still convert a
string-formatted field to the structured data. This would require changes
to the tests.

There is a similar issue with the locator field (and, in MLZ/CSL-m,
the section field on legal item types). Configured for MLZ,
citeproc-js currently parses out the content of all three (page,
section, locator), to extract label overrides and embedded labels,
where appropriate to combine locator and section (a hard-coded
pinpoint available on things like statute items), and to suss out
whether the top-level label should or should not be pluralized.

The logic works, and it addresses some show-stopping issues affecting
legal resources (i.e. label overrides and embedded labels), but it is
completely off-specification.

If there is a move to specify structured input for these fields, I can
provide use cases from the legal side. It would be good to have them
covered in the specification, although it would take a fair amount of
work to pin the behaviour down.

Frank

Brecht_Machiels1 · August 11, 2013, 10:16am

Hello,

It’s these kinds of things in the test suite that I run into now and then
that feel like black magic. Since it’s not part of the spec, it’s hard to
support that behavior. For now, I’m focusing on supporting what’s
described in the spec.

I understand that the behavior is hard to describe due to its very nature.
I believe the first step is to define a clear structured input data format.

I assume that in citeproc-js, you are parsing the string input data into
some kind of structured format? This could serve as a starting point.

The page field could be a list of page ranges, where a “page range”
doesn’t necessarily need to have an end page specified (to indicate a
single page).

page: [
  ['ii', 'vi'],
  [5],
  [8, 9]
]

Does this cover all use cases?
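
To make the proposal concrete, here is a rough sketch of how a processor might convert a string-formatted page field into the list-of-ranges structure above. It is only an illustration under assumptions: `parse_pages` and `_to_page` are hypothetical names, not part of citeproc-js or the CSL spec, and the accepted separators (comma, semicolon, ampersand between ranges; hyphen or en dash inside a range) are a guess at common input rather than anything the spec defines.

```python
import re

# Hypothetical helpers for illustration only; not taken from any CSL processor.
# Assumed range separator: one or more hyphens or en dashes.
_RANGE_SEP = re.compile(r"\s*[-\u2013]+\s*")


def _to_page(token):
    """Return an int for plain decimal digits, otherwise keep the string (e.g. 'ii')."""
    return int(token) if token.isdecimal() else token


def parse_pages(raw):
    """Convert a string such as 'ii-vi, 5, 8-9' into a list of page ranges.

    Each range is a list with one element (single page) or two (first, last).
    """
    ranges = []
    # Assumed list separators between ranges: comma, semicolon, ampersand.
    for part in re.split(r"\s*[,;&]\s*", raw.strip()):
        if not part:
            continue
        bounds = [b for b in _RANGE_SEP.split(part, maxsplit=1) if b]
        ranges.append([_to_page(b) for b in bounds])
    return ranges


print(parse_pages("ii-vi, 5, 8-9"))  # [['ii', 'vi'], [5], [8, 9]]
```

A sketch like this would mis-split page identifiers that themselves contain a hyphen (for example "A-12"), and it says nothing about labels, which is part of why an agreed structured format seems preferable to every processor guessing at the string.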

Regards,
Brecht

Sylvester_Keil · August 11, 2013, 1:27pm

Hello,

A while ago we started a list of observations that should be considered
when standardizing the input format. Perhaps you could add the page field
there as well for future reference:

Sylvester
