Parallel cites

Parallel citation is a show-stopper issue for law that I have so far
shoved aside in the work on citeproc-js. It is required by most legal
citation styles for law cases, legislation and treaties, and looks like
this:

Chaoulli v. Quebec (Attorney General), 2005 SCC 35, [2005] 1 S.C.R. 791.
Green v. State, 334 Ark. 484, 978 S.W.2d 300 (1998).

I’ve been assuming that this will need to be handled somehow through
the “fabled” hierarchical data model, but I’m now playing with the
idea that, in the short term at least, it can be done in the processor
as a special case of collapsing behavior. That is, users would have
two entries for each of the cases above in their database, and when
they are inserted into a citation in sequence, they can be collapsed
into the form shown above. As far as I can tell, there are just two
cases that such simple output collapsing would not handle correctly.
One is “ibid” backreferencing:

  1. Green v. State, 334 Ark. 484, 978 S.W.2d 300 (1998).
  2. Id. at 485, 978 S.W.2d at 301.

This would not work. The cite to S.W.2d would be seen as the
immediately preceding cite, causing the Id. backreference to fail. In
the absence of some sort of workaround in the position machinery, you
would get something like this instead:

  1. Green v. State, 334 Ark. 484, 978 S.W.2d 300 (1998).
  2. Green, 334 Ark. at 485, 978 S.W.2d at 301.

The user would need to manually change the first reference in cite (2)
to “Id. at 485”. I’m thinking that this is preferable to the current
situation, where the output would look like this:

  1. Green v. State, 334 Ark. 484 (1998), Green v. State, 978 S.W.2d 300 (1998).
  2. Green, 334 Ark. at 485, Green, 978 S.W.2d at 301.
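
To make the backreference problem concrete, here is a rough Python sketch of one possible workaround in the position machinery. It is only an illustration, not how citeproc-js does it: the function name `assign_positions`, the `group_of` mapping and the item ids are all made up for the example. The idea is to treat each run of parallel items as a single unit when deciding whether a cite is "first", "ibid" or "subsequent".

```python
def assign_positions(notes, group_of):
    """Label each cite unit in each note as 'first', 'ibid' or 'subsequent'.

    notes:    list of notes, each a list of item ids in citation order
    group_of: hypothetical mapping from item id to a parallel-group key;
              items missing from the mapping form a group of their own
    """
    result = []
    seen = set()   # groups cited anywhere so far
    prev = None    # group of the immediately preceding unit
    for note in notes:
        # Merge consecutive items of the same group into one unit.
        units = []
        for item_id in note:
            group = group_of.get(item_id, item_id)
            if not units or units[-1] != group:
                units.append(group)
        labelled = []
        for group in units:
            if group == prev:
                position = "ibid"
            elif group in seen:
                position = "subsequent"
            else:
                position = "first"
            labelled.append((group, position))
            seen.add(group)
            prev = group
        result.append(labelled)
    return result


notes = [["green-ark", "green-sw"], ["green-ark", "green-sw"]]

# With the two reporters grouped, note 2 is an "ibid" to note 1.
grouped = assign_positions(notes, {"green-ark": "green", "green-sw": "green"})

# Without grouping, the S.W.2d cite intervenes and "ibid" is never reached.
ungrouped = assign_positions(notes, {})
```

In the ungrouped run, both cites in note 2 come out as "subsequent", which corresponds to the "Green, 334 Ark. at 485, ..." form described above.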

The second place where this would break is in bibliographies; until
relations support comes down the pike, these would produce a separate
entry for each of the items.

With this interim approach, collapsing would only occur where the
rendered cites in a sequence differ only in their volume,
container-title, page and locator. As far as I can tell, this would
be safe to impose generally, without the need for a controlling option
in CSL. It’s only a provisional solution, but the intention is to at
least get us off the ground for legal support, without complicating
things for other, better solutions down the road.
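
As a minimal sketch of the collapsing rule, assuming items are plain dictionaries: two adjacent items are treated as parallel when they agree on everything except volume, container-title, page and locator. The field names, `is_parallel` and `render_citation` are invented for illustration, and the sketch compares item fields rather than rendered output, which is a simplification of what is proposed above. It also hard-codes the Green v. State layout and does not attempt forms like the Chaoulli example.

```python
# Fields that are allowed to differ between parallel cites (assumed names).
VARYING = ("volume", "container_title", "page", "locator")


def is_parallel(a, b):
    """Return True if two items agree on every field outside VARYING."""
    keys = (set(a) | set(b)) - set(VARYING)
    return all(a.get(k) == b.get(k) for k in keys)


def render_reporter(item):
    """Render the part of a cite that is specific to one reporter."""
    return f"{item['volume']} {item['container_title']} {item['page']}"


def render_citation(items):
    """Render items in order, collapsing each run of parallel cites."""
    cites = []
    i = 0
    while i < len(items):
        run = [items[i]]
        # Extend the run while the next item is parallel to its first member.
        while i + len(run) < len(items) and is_parallel(run[0], items[i + len(run)]):
            run.append(items[i + len(run)])
        reporters = ", ".join(render_reporter(x) for x in run)
        cites.append(f"{run[0]['title']}, {reporters} ({run[0]['year']})")
        i += len(run)
    return "; ".join(cites) + "."


green = [
    {"title": "Green v. State", "volume": "334",
     "container_title": "Ark.", "page": "484", "year": "1998"},
    {"title": "Green v. State", "volume": "978",
     "container_title": "S.W.2d", "page": "300", "year": "1998"},
]
print(render_citation(green))
```

Items that differ in any other field, such as the title or year, fall into separate runs and are rendered as separate cites.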

Thoughts and reactions? Andreas?

Before we get further into this, can you please stop and explain the
above to the legal laypeople here? Are these two separate legal
opinions, each published in more than one place, so that you only
needed to include one to illustrate the point? Or something else?

Bruce

Quite right. Court judgements, legislation and international
agreements are often published in multiple reporters. The same text
will be contained in each reporter, the only difference being the
pagination (or other locator information). Courts and law reviews
often require that several commonly available reporters be cited for a
single judgement, as a convenience to readers who have access to only
one of the reporters in which the text appears.

This is a different problem from properly hierarchical relations such
as cites to appeals judgements or amending legislation, although it is
somewhat related in the implementation.

Parallel citation is a pain, but it’s extremely important in the
trade. There’s even a lawsuit that turns on it, the (non-parallel)
citation for which is: West Pub. Co. v. Mead Data Cent., Inc., 616 F.
Supp. 1571 (D. Minn. 1985), aff’d, 799 F.2d 1219 (8th Cir.), cert.
denied, 479 U.S. 1070 (1986).

I have a feeling this is going to be a problem that’s going to get
solved slowly, with some pain.

The data modeling could actually be a major PITA for BIBO RDF, as well
as for how we’ve previously discussed the new model in Zotero.

Bruce

I’ll reveal my ignorance by asking, but does this need to be reflected
in BIBO? The relation of the entries can be deduced from their field
content, and if it’s just a matter of output formatting, that can be
handled reliably and fairly simply in the implementation.

I’m not sure how relevant the relationship would be to data exchange,
since the “parallel” items do refer to different documents: even
though the bulk of their content consists of identical text, they
differ in pagination, and in some publisher-supplied metadata such as
commentary and other annotations.

Frank

I don’t understand you above. What “field content”? And “it’s just a
matter …” for what agent?

So then how do you store all this in the database, and represent it in the RDF?

Bruce

I should have said “variable content”, the agent being the CSL processor.

I think I’m suggesting that it may be sufficient to describe the items
individually. In CSL output, we collapse to (Smith 1990a, 1990b) on
the basis of variable content seen by the processor. I’m wondering
whether parallel cites are closer to that case than to hierarchical
items like a translation, a republication or an appellate judgement.
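
For comparison, the author-date collapse mentioned here can be sketched in a few lines of Python. This is a toy illustration with a made-up function name, not the processor's actual code, and it assumes the year suffixes have already been assigned. The point is that the grouping is decided purely from the variable content of adjacent cites, with no relation recorded in the data.

```python
def collapse_author_year(cites):
    """Collapse adjacent cites that share an author.

    cites: list of (author, year) pairs in citation order, where year
           already carries any disambiguating suffix such as "1990a".
    """
    groups = []
    for author, year in cites:
        if groups and groups[-1][0] == author:
            # Same author as the previous cite: append the year only.
            groups[-1][1].append(year)
        else:
            groups.append((author, [year]))
    parts = [f"{author} {', '.join(years)}" for author, years in groups]
    return "(" + "; ".join(parts) + ")"


print(collapse_author_year([("Smith", "1990a"), ("Smith", "1990b")]))
```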

Further to this item, I have some parallel cite detection code brewing
in citeproc-js that will be able to reliably format parallel cites as
used in legal styles. Working up the code has convinced me that this
really can and should be treated more as a matter of output formatting
than of relationships within the underlying data.

Cites that collapse in this way have to have a fairly narrow set of
visual characteristics if the collapsed form is going to make any
sense, so even if two items were flagged as “siblings” in the
database, we can’t really take that on faith; elaborate validation is
still required, in the processor, to determine whether it is safe to
collapse them. Because the formatting constraints are tight, the
result of parallel-collapsing-is-okay validation can be taken as a
reliable proxy for a these-items-are-siblings hint in the data. We
don’t really need such a hint, and if it were available, it wouldn’t
gain us anything.

I know that may sound cryptic, but I’ll leave it there for now. The
main thing is that I’ve figured out how to handle the use case for
citation purposes, and there is no immediate or pressing need to worry
about how to describe parallel or sibling items in RDF, or to flag
them in the database. All’s well.

Frank