Citation Style Language

Cite collapse test failure

Reminds me we could use description metadata in these tests.

I never feel confident making firm declarations on tests, but I think you’re right. I can’t see anything that would justify that expected output.

But maybe we’re just missing something.

@Frank_Bennett?

Well, I don’t think it’s a massive deal in any event. With things like cite collapsing I am inclined to think that so long as the output is reasonable one lives with a little uncertainty in corner cases. Neither cite could lead to any ambiguity.

There is way more CSL code in that test than necessary, which is distracting. It’s been a long time since I worked on sorting, and I had to dig into it myself to figure out what was going on. So … yeah, a comment would have been helpful. The result is correct, though.

Cite grouping is applied only if the citations are sorted in some way, otherwise they should be rendered in entry order. If you add this to the test, the seminal works of Aalto will be grouped:

<sort>
  <key variable="author"/>
</sort>

(Edit: Cite grouping isn’t actually exercised in this one, since each author’s refs are all from the same year.)

Thanks Frank.

I added your text to a new description field on that test.

We can probably do that piecemeal as questions come up.

Well that explains the thinking behind the test, but I’m still not sure it matches the Spec.

The Spec makes it clear that the various collapse rules “Activate[] cite grouping and collapsing”. So you don’t collapse without grouping. And grouping doesn’t depend on prior sorting; it is itself a limited kind of sorting.

On grouping, the Spec says:

With cite grouping, cites in in-text citations with identical rendered names are grouped together, e.g. the year-sorted “(Doe 1999; Smith 2002; Doe 2006; Doe et al. 2007)” becomes “(Doe 1999; Doe 2006; Smith 2002; Doe et al. 2007)”
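The rule in that passage can be sketched in a few lines of Python. This is only an illustration of the spec wording, not code from any processor: the `Cite` tuple and the `group_cites` function are hypothetical names, and a cite is reduced to its rendered names plus a year.

```python
from typing import Dict, List, Tuple

# Hypothetical simplified cite: (rendered names, year).
Cite = Tuple[str, int]


def group_cites(cites: List[Cite]) -> List[Cite]:
    """Gather cites with identical rendered names at the position of the
    first cite of each group, keeping relative order within the group."""
    # dicts preserve insertion order, so groups stay in first-mention order
    groups: Dict[str, List[Cite]] = {}
    for cite in cites:
        groups.setdefault(cite[0], []).append(cite)
    return [cite for group in groups.values() for cite in group]


year_sorted = [("Doe", 1999), ("Smith", 2002), ("Doe", 2006), ("Doe et al.", 2007)]
grouped = group_cites(year_sorted)
print("; ".join(f"{names} {year}" for names, year in grouped))
# Doe 1999; Doe 2006; Smith 2002; Doe et al. 2007
```

Note that nothing here depends on a prior sort: the input order alone decides where each group lands.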

So I guess … I’m digging my heels in. I think my behaviour is per spec :) Frank’s is perfectly sensible in its way, if anything is sensible about collapsing (!), but it ignores the grouping step which is implicit in any style that sets collapsing.

ETA: Frank’s addendum suggests grouping only applies when each author’s works are from different years. But the spec contains no such limitation. It simply depends on the output of the first rendered name being the same.

All fair points. A couple of factors probably led to making group sorting dependent on an explicit sort. One relates to laziness, the other to UX.

On the laziness front, Zotero offers an option to render citations unsorted in styles that have a cs:sort element under cs:citation, to give users control over ordering when they need it. In a processor that supports that, the flag exposed to the client would need to be raised when collapsing is invoked as well as when cs:sort is present, and a client request to disable sorting would need to affect both the explicit and the implicit group sorts. That’s easy enough to implement, and not a likely vector for bugs even in the ratty citeproc-js code base.

Regarding UX, a group sort not backed up by explicit sort terms will cluster cites by author in the order they are first mentioned. That probably seemed a trap for the unwary at the time (i.e. a user hacking an existing style removes the cs:sort element, and then is surprised that cites do not appear in the listed order). (The effect of first-mention grouping can be accomplished, in the odd event that it is desired, by sorting on the citation-number variable, so nothing is lost.)

Anyway, I don’t have any strong feelings one way or the other. But those are probably the factors that shaped the test in its current form.

Understood. I doubt it matters. I’m inclined to leave my code as it is, since I think the result is … at least reasonable and suffix collapsing is a terrible trial, so I’m reluctant to introduce yet another special case.

Do you think we should clarify anything on this in the spec for 1.1, @PaulStanley?

This case is important though, for example:

(Aalto, 2010a–c; but cf. Jones, 2010; and the rejoinder Aalto, 2010d)

We should make the unsort flag on-spec if it’s not.

@bwiernik: that seems a different case to me, because there’s text between the cites, so they are really different citation groups: the “rejoinder” reference would have to be in a suffix or maybe a totally different cite (depending on whether you allow cites in suffixes). I don’t think that raises the same point.

After all, even if there was sorting on in that case, you wouldn’t expect the final reference to be sorted alongside the early ones, would you?

So I agree with your instinct, but I don’t think it’s relevant here.

In general I think the spec is right and, though for very understandable reasons, Frank is wrong. Absent the suffix use that you mention, grouping should mean grouping, and is not dependent on prior sorting. But I think it’s such a corner case, I doubt it’s worth changing anything.

I agree with @PaulStanley: while sorting and grouping may be related, they’re not the same thing, and I don’t think this test matches the spec, which I also agree is right.

I wonder if for questionable or controversial tests like this we should move them to an “archive” subdir or some such, in the same way we now have an “experimental” one?

I think it would be sufficient just to include some narrative in the commentary explaining what is happening. There will always be tests where the results are open to argument. In practice any processor-developer is going to have to look, in the end, at a bunch of failing tests and decide for themselves whether they are going to be corrected. I think we do all understand that in the end the test suite is piggybacked off Frank’s labours, so it has to work for him and the users of citeproc-js, and we gratefully live with that because it’s such a marvellous resource. The odd case where we disagree is not worth a whole different directory.

I quite disagree. If there is no sorting, then the order of the cites in the citation should be taken as the intended order given by the author. It would be incorrect to change that order, and the reason for that is (1) cites may have affixes where the order matters, or (2) even if there are no affixes, the cites are presented in order of relevance to the argument.

What is incorrect here is an ambiguity in the spec, not the test.

If there is no sorting, then the order of the cites in the citation should be taken as the intended order given by the author.

There really isn’t ambiguity in the Spec here. On this point at least, the spec is clear.

Grouped cites maintain their relative order, and are moved to the original location of the first cite of the group.

When grouped cites are moved, works by the same author are gathered together. It can change the order. It’s supposed to.

It’s really not a problem for the user, any more than sorting (which also disturbs the order of citations compared to the order in which they have been entered). The moving only happens within a single citation group. So if I do

[Prefix][A, B, C][Suffix]

I may end up with [Prefix][A, C, B][Suffix] if cites are grouped.

But if I do

[Prefix][A,B][, ] [][C][Suffix] (i.e. I separate into two groups)

I won’t get that.

A style that provides for sorting, or grouping, or collapsing cannot guarantee that cites will appear in the same order they are entered by the user within a single citation-group, but there is always an easy way for the user to set them straight.

The alternative is worse, because it requires the user to know what order the cites should be in and if they are wrong to fiddle with the text.

If any change were required here to meet the logic of your concern, it wouldn’t be to demand sorting as a precondition to grouping, it would be to say that BOTH sorting AND grouping should be suspended iff there are non-empty prefixes or suffixes to the citation group (just as collapsing is suspended if there is a locator). That might make sense, but my hunch is it would as often do a bad job of reading the author’s mind as a good one.

(I think Paul is right that the spec text is currently contradicting the test unambiguously, so this is about how CSL should ideally work, not what the current spec says)

It’s really not a problem for the user, any more than sorting (which also disturbs the order of citations compared to the order in which they have been entered).

I think that’s the crucial point here, though. Sorting can absolutely be a major issue. Zotero (at least) has an option to disable sorting for styles that do normally sort, and having such an option is, imo, crucial for when citations are ordered by some sort of logic. This is often associated with affixes such as in @bwiernik’s example but need not be.

Most author-date styles do sort; Chicago (author-date), the style in the test, does not. The reason is that the style guide specifically states that the order of works in a citation may be meaningful (e.g. most important work first, etc.). So there absolutely needs to be a way for authors using Chicago style to maintain the original order of citations, including in citations that might otherwise collapse.

The current test behavior is maximally flexible, if possibly tedious for an author: if the author wants the citations collapsed, they can move the item within the cite. Unless I’m missing something (and that’s possible), absent further modifications or flags, your implementation makes it impossible to get (Aalto 2015a,b; Bartleby 2010; Aalto 2015c–e) in Chicago style. I think that’s a bad idea.

Is that last group not analogous to a “see also” list; so a citation list within a citation list?

Frequently – I guess I’d try to keep this as flexible as possible without trying to predict all logics used. Once people manually sort citations (which again, overall, is comparatively rare), they do all sorts of things: see also lists, order by importance, depict exchanges in the literature, and likely many more.

In most cases, it will be the spec that is wrong. So in these cases, contradictions should be brought up and corrected.

Does grouping as such then ever apply?

E.g.

Suppose I have three cites: Doe 1985, Doe 2001, and Smith 1990.

If I sort by author I will naturally produce a group (Doe 1985, Doe 2001, Smith 1990). No “moving” is required. Collapsing can then apply simply to contiguous cites with the same names (year collapse) or years (year-suffix collapse).

But suppose I have a reverse sort by year. That gives me Doe 2001, Smith 1990, Doe 1985. Do I say “I have a sort so group” (Doe 2001, Doe 1985, Smith 1990), or do I say “don’t group because you are messing with my carefully constructed sequence”?

It seems to me arguable that one could simply dispense with grouping, as such, altogether. If it happens, it happens. If it happens, then collapsing may remove redundant portions of any group which happens to be there. That’s all. It’s up to a style author to use sorting to create the groups, if that is what is intended (which, after all, is not that hard to do: a sort by name / year / year-suffix will do the job if you want aggressive grouping).

If that’s right one can simply remove all reference to grouping from the spec, and explain collapsing alone.
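To make the “collapsing alone” idea concrete, here is a rough Python sketch of year collapsing that only ever merges cites that are already adjacent. It is an assumption-laden illustration, not any processor’s behaviour: `collapse_contiguous` and the `Cite` tuple are hypothetical names, and the “, ” and “; ” delimiters are placeholders for whatever the style defines.

```python
from itertools import groupby
from typing import List, Tuple

# Hypothetical simplified cite: (rendered names, year).
Cite = Tuple[str, int]


def collapse_contiguous(cites: List[Cite]) -> str:
    """Year-collapse runs of adjacent cites that share rendered names.
    No cite is moved; non-adjacent cites by the same author stay separate."""
    parts = []
    # groupby only merges consecutive items with equal keys
    for names, run in groupby(cites, key=lambda cite: cite[0]):
        years = ", ".join(str(year) for _, year in run)
        parts.append(f"{names} {years}")
    return "; ".join(parts)


author_sorted = [("Doe", 1985), ("Doe", 2001), ("Smith", 1990)]
reverse_year_sorted = [("Doe", 2001), ("Smith", 1990), ("Doe", 1985)]
print(collapse_contiguous(author_sorted))        # Doe 1985, 2001; Smith 1990
print(collapse_contiguous(reverse_year_sorted))  # Doe 2001; Smith 1990; Doe 1985
```

Under this reading the reverse-year sequence is left exactly as sorted, and any grouping has to come from the style’s own sort keys.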