Nested names nodes

While looking into an error report from Sebastian, I found that
citeproc-js was not coping gracefully with nested full-form cs:names
nodes.

While a cs:names node without children should inherit the labels from
the preceding superior cs:name node on its parent, a full-form node
should (I think) completely override all superior cs:label nodes.

I suspect that other CSL implementations are likely to have cleaner
internals than citeproc-js, and already behave in this way; but I
thought I would mention it because I was a little surprised to find
such a fundamental error in code that has been in service for so long.
Here is a test that covers the issue:

http://bitbucket.org/bdarcus/citeproc-test/src/ab136a6aa8f2/processor-tests/humans/substitute_SuppressOrdinaryVariable.txt

(If I’m wrong and other implementations are tripped up on these nested
structures, it might be worth mentioning the difference in behaviour
explicitly in the specification.)

Frank

I have finally hit upon the issue noted by Frank below and, indeed, I
could use a little help understanding the specification in this case.

Here I am referring to test case:

https://bitbucket.org/bdarcus/citeproc-test/src/ab136a6aa8f2/processor-tests/humans/substitute_SuppressOrdinaryVariable.txt

When rendering the second item we have a book with an editor and a
title. Because there are no translators, the editor will be rendered
instead using the editor macro. But this macro is tricky.

Since we have a book in this case, we could simplify it like this:

Now, this presents three questions to me:

First, how should we deal with names nodes (when used as substitutions)
that have child nodes? (If I am not mistaken this is the question to
which Frank wanted to alert us) — I do agree that this should be stated
explicitly in the spec. My initial approach was to not inherit anything
from the original names node, thinking that if style authors added a
node there specifically, it should always be used. But then I thought of
macros and realized that macros are very likely to be used in
substitutions, and in that case I think it makes more sense for the
formatting of the node being substituted for to always take precedence.
I am undecided about which makes more sense here, and just wanted to
know what the consensus is on this point.

The second question I have is about the label node. The spec says this
in the Substitute section:

“A shorthand version of cs:names without child elements, which inherits
the attributes values set on the cs:name and cs:et-al child elements of
the original cs:names element, may also be used.”

Does that mean that the label node is never inherited?

But there is a third issue with this test case. When rendering Item-2,
which is a book, the editor macro looks like the one I printed above
(unfolding the choose node and editor-short-label macro).

citeproc-ruby renders this as “John-boy Doe ed.John-boy Doe editor.” and
I can’t say that I find any fault there. Here is what the spec says
about substitution and subsequent suppression:

“If cs:substitute contains multiple child elements, the first element to
return a non-empty result is used for substitution. Substituted
variables are suppressed in the rest of the output to
prevent duplication.”

I have implemented it this way: when rendering substitutions, I start
rendering each child. Before the rendering starts, an observer object is
attached to the citation item that keeps track of all variables being
accessed. Now, if the child renders a non-empty result, this will be
used as the substitution. At this point I can mark all variables that
were used as suppressed (they are not deleted, but marked as suppressed,
so that conditional tests etc. still find the values, but at the same
time I can make sure they will not be printed again etc.). For this
reason, the output above makes a lot of sense to me.
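The approach described above can be sketched roughly as follows. This is only an illustration in Python, not the citeproc-ruby code: the names ObservedItem, render_substitute, translator and editor_macro are hypothetical, and editor_macro is an assumed stand-in for a macro that contains two names elements for the same variable.

```python
class ObservedItem:
    """Hypothetical citation item that can record which variables are read."""

    def __init__(self, data):
        self.data = dict(data)
        self.suppressed = set()
        self.accessed = None  # a set while an observer is attached

    def has(self, name):
        # Conditional tests still see suppressed variables.
        return bool(self.data.get(name))

    def read(self, name):
        # Rendering reads are recorded; suppressed variables print nothing.
        if self.accessed is not None:
            self.accessed.add(name)
        if name in self.suppressed:
            return ""
        return self.data.get(name, "")


def render_substitute(item, children):
    """Return the output of the first child that renders non-empty text."""
    for child in children:
        item.accessed = set()  # attach the observer
        try:
            output = child(item)
        finally:
            accessed, item.accessed = item.accessed, None  # detach it
        if output:
            # Mark (do not delete) every variable the winning child used.
            item.suppressed |= accessed
            return output
    return ""


def translator(item):
    return item.read("translator")


def editor_macro(item):
    # Assumed shape: two names elements that both render the editor.
    parts = []
    for label in (" ed.", " editor."):
        name = item.read("editor")
        if name:
            parts.append(name + label)
    return "".join(parts)
```

Because the variables are only marked once the whole child has been rendered, a child that reads the same variable twice still prints it twice, which is the behaviour described above.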

However, the expected test result is just: “John-boy Doe ed.” — but this
does not seem right to me. The spec says, the first element to return a
non-empty result is used for substitution. In this case, this is the
editor macro which actually prints the editor twice!

Obviously, this is a quite complicated example, so if I’ve made a
mistake, please do point me to it. In any case, I would be very
interested in your opinion on this, because as it stands I am reluctant
to change the way citeproc-ruby handles this.

Thanks!

Sylvester

In reverse order:

Like Sylvester I’m confused about 3) as well and would have expected this
to produce duplicate editors (as the editor macro by itself does).

FWIW, I just tested this in my Zotero install and it doesn’t actually
produce the test result. I always get “Jane-girl Doe editor” even for
books.

Re 2.) We’ve been writing all styles assuming that the label node is
inherited. I assume that is an omission in the specs; I’d suggest just
adding that.

Re 1) I’d go with your original hunch and not inherit anything that’s
explicitly set in the child node. If people use macros in cs:substitute,
that’s presumably precisely because they want to prevent inheriting
attributes. Otherwise they could just use cs:names. Again, making this
explicit in the specs would be a good idea.

Sebastian

On Fri, Jan 24, 2014 at 10:21 AM, Sylvester Keil <@Sylvester_Keil> wrote:

Sylvester writes: “The spec says, the first element to return a
non-empty result is used for substitution. In this case, this is the
editor macro which actually prints the editor twice!”

The citeproc-ruby behaviour seems correct, I think. Shall we amend the test?

I’ll probably leave the citeproc-js code as it is (and call it a bug),
unless this comes up as an issue in a running style.

For 3, the spec says “If cs:substitute contains multiple child
elements, the first element to return a non-empty result is used for
substitution.”

I’m pretty sure “the first element” is meant to refer to the first
direct child element of cs:substitute that returns a non-empty result.
So it should return the output of the entire “editor” macro, process
both cs:names elements, and print the “editor” name variable twice.

For 2, I agree with Sebastian. I can think of no reason to treat
cs:label differently, and we should fix the spec here. It was probably
an oversight.


For 1, the spec says “When an inheritable name attribute is set on
cs:style, cs:citation or cs:bibliography, its value is used for all
cs:names elements within the scope of the element carrying the
attribute.”

So when cs:substitute calls a macro that uses cs:names, I think it
would be most consistent if the cs:names element in the called macro
would not inherit any attribute settings from the cs:names element
that is the parent of cs:substitute. However, it probably should
respect any inheritable name attributes set on cs:style, cs:citation
or cs:bibliography.

Rintze

Thanks everybody for your replies!

Frank, I doubt that this is a configuration that would occur in a real
style anyway (why would anyone want to print the same names twice?), so
it’s probably okay not to touch it — I was mostly concerned that I might
have overlooked a subtlety in how to interpret the test case so thanks
for your clarification.

Going back to the first question, I do see some merit in allowing
substitution cs:names to inherit the child nodes from the original
cs:names even when they are called by a macro; not allowing it is easier
to implement, but by allowing it, style authors could effectively
override the cs:name, cs:et-al and cs:label settings of a macro. In any
case, I think it would be great for both developers and style authors if
we could make the following issues very clear in the spec:

A ‘shorthand’ cs:names version uses the cs:name, cs:et-al, and cs:label
nodes of the cs:names node it substitutes (even when called from a
macro / or only when it is a direct descendant of a cs:substitute node).

What about if the cs:names node has children? I would think it best to
be very strict: either nested cs:names nodes always use the children
of the parent of cs:substitute (in this case style authors could
override the configuration of a macro when used inside a cs:substitute)
— or, alternatively, if cs:names has child nodes it never uses any of
the children of the node it substitutes.

Rintze mentioned inheritance; substitution and inheritance are two
parallel processes: the question of substitution is about which nodes to
select; inheritance determines the attributes of the selected node. Is
that correct?

In all these cases, I do not have a strong preference, really. When
rendering cs:names citeproc-ruby currently checks to see if the node is
being rendered as a substitute (even when part of a macro). If it is
being rendered as a substitute the cs:name, cs:et-al and cs:label of the
cs:names being substituted are used (even if the current cs:names has
child nodes of its own!). The cs:names and the selected cs:name node
inherit from cs:citation/cs:bibliography and cs:style.

I could easily change this to: if cs:names is rendered as a substitute
it takes the original cs:names’ child nodes only if it does not have
child nodes of its own.

What I would advise against is a mixed approach where each of the
cs:name, cs:et-al and cs:label must be checked individually.
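To make the difference concrete, here is a small Python sketch of the two strategies. It is only an illustration under simplifying assumptions: the child nodes of a cs:names are modelled as a plain dictionary keyed by element name, and the function names all_or_nothing and mixed are made up for this example.

```python
def all_or_nothing(own, original):
    """Use the nested node's children if it has any, else the original's."""
    return dict(own) if own else dict(original)


def mixed(own, original):
    """Resolve each child element separately (the approach advised against)."""
    return {**original, **own}


# Hypothetical settings: the substituted cs:names has all three children,
# the nested cs:names only defines its own cs:name.
original = {"name": "long", "et-al": "et al.", "label": "short"}
own = {"name": "short"}

strict_result = all_or_nothing(own, original)
mixed_result = mixed(own, original)
```

With the strict rule the nested node ends up with its own cs:name and nothing else; with the mixed rule it silently picks up the cs:et-al and cs:label of the original node, which is harder to predict for both implementers and style authors.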

Sylvester

Having cs:names in a called macro inherit the attribute settings from
the parent cs:names of cs:substitute makes CSL less expressive. E.g.,
if the parent cs:names defines et-al abbreviation, there will be no
way to get rid of it in the called macro. This isn’t an issue for
inheritable name attributes, since there, if some name variables need
to be rendered without et-al abbreviation, the style author can always
abstain from defining the et-al attributes on cs:style, cs:citation,
and cs:bibliography.

It also makes the style harder to understand, since the output of the
called macro will depend not only on cs:style, cs:citation, and
cs:bibliography (which are relatively easy to locate), but also on the
calling cs:names element.

Rintze

Right, I’ve changed the processor to use the cs:names parent of
cs:substitute during the substitution process only:

  • When the current cs:names does not have children of its own
  • And when the cs:names that is being substituted is an ancestor of the
    current cs:names in the XML tree

Does that sound good?
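For reference, the two conditions can be sketched like this in Python. This is not the citeproc-ruby implementation; the function name inherits_from and the tiny namespace-free style fragment are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET


def inherits_from(current, substituted, root):
    """True if `current` should reuse the children of `substituted`."""
    if len(current) > 0:
        # The current names node has children of its own.
        return False
    # ElementTree keeps no parent pointers, so build a child-to-parent map.
    parents = {child: parent for parent in root.iter() for child in parent}
    node = current
    while node in parents:
        node = parents[node]
        if node is substituted:
            return True
    return False


# Hypothetical fragment (namespaces omitted for brevity).
STYLE = """
<style>
  <macro name="editor">
    <names variable="editor"/>
  </macro>
  <names variable="translator">
    <name/>
    <label form="short"/>
    <substitute>
      <names variable="editor"/>
      <names variable="author"><name form="short"/></names>
      <text macro="editor"/>
    </substitute>
  </names>
</style>
"""

root = ET.fromstring(STYLE)
outer = root.find("names")
shorthand, full_form = outer.findall("substitute/names")
in_macro = root.find("macro/names")
```

Under these assumptions only the childless names node written directly inside the substitute qualifies; the full-form node and the childless node inside the called macro do not.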

Sylvester

Yes. By the way, looking at the schema, I just noticed how much
flexibility there is in defining elements within cs:substitute
(basically any content allowed within macros can be used within
cs:substitute). CSL 1.0.1 also made cs:name an optional child element
of cs:names, which makes it impossible to distinguish between the
"shorthand" cs:names and a regular cs:names. Perhaps we should consider
limiting cs:substitute in future versions of CSL, and e.g. only allow
the cs:names shorthand and cs:text (to call variables and macros) as
children. Anything complex can then be moved out into a macro.

Rintze

I’d say that in the case of placing cs:names inside cs:substitute, style
authors must be aware of the inheritance rules; therefore, if they do
not want to inherit the cs:name node from above, they would put an empty
cs:name node inside the nested cs:names to make this clear, don’t you
think?

Yes, we could say that in the spec. We could also limit the
"shorthand" to direct children of cs:substitute.

Rintze