Handling main/sub title splits (citeproc-js)

A user from Brazil has posted a request for main/sub title splits to the Zotero forums.

As I note in the thread, citeproc-js is capable of doing this. While I mentioned in the thread that “the default could be changed,” I won’t be changing the default in the processor source, since the variables “title-main” and “title-sub” that it makes available are outside the CSL specification. The feature would need to be enabled in the processor by the calling client. [1]

In response to a note on zotero-dev, @Dan_Stillman suggested that the issue should be discussed in CSL, hence this post.

I’m putting this up because the feature would solve a problem for Brazilian users. If eventually adopted in CSL, it should be safe to mandate in the specification and deploy as default behavior in processors—but whether to do so is of course not my call.

The splitting logic is the same as that used for subtitle capitalization in APA and a few other styles, and is illustrated in this test fixture.


[1] As an implementation-specific detail, the option would be enabled via the sys object used to instantiate the processor, with something like:

sys.main_title_from_short_title = true;

(And yeah, the option is badly named. It dates from the time when main/sub splits required matching content in the shortTitle field, which is no longer the case.)
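To make the footnote a little more concrete, here is a minimal sketch of a calling client setting the flag on its sys object. The `items` and `locales` stores, the `item-1` id, and the `styleXml` name are placeholders invented for illustration; only `retrieveLocale`, `retrieveItem`, and the flag itself are what the processor would look for.

```javascript
// Hypothetical in-memory stores; a real client would load these from disk or a database.
const items = {
  "item-1": { id: "item-1", type: "book", title: "A long title: with a subtitle" },
};
const locales = { "en-US": "<locale>...</locale>" }; // placeholder, not a valid locale file

// The sys object handed to the processor at instantiation.
const sys = {
  retrieveLocale: (lang) => locales[lang],
  retrieveItem: (id) => items[id],
  // Opt in to the non-standard title-main / title-sub variables.
  main_title_from_short_title: true,
};

// With citeproc-js loaded, the processor would then be created roughly as:
// const engine = new CSL.Engine(sys, styleXml);
```

The engine line is left commented out so the snippet runs without citeproc-js being loaded.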

FB

I think that would definitely be a useful feature. If it is technically possible, I don’t really see why this shouldn’t be adopted. With this, it should also be possible to change the punctuation between title and subtitle, right? If yes, that would solve the “colon vs. dot” problem that scholars writing in languages other than English sometimes have.

While I cannot speak for @John_MacFarlane, I also have the impression that this might be useful for pandoc-citeproc. Right now, pandoc-citeproc merges biblatex’s title and subtitle into the CSL JSON title using hard-coded delimiters, which could be avoided if we had corresponding variables in CSL. (Of course, biblatex’s titleaddon will still need individual treatment, but having distinct fields for title and subtitle would be a good start.)

This is the concern I have, with the possibility (I don’t know) that there are more options than just those two.

I confess to not thinking about CSL details much in a while, and am super tired, but don’t we already handle split titles of sorts with the “short” variants?

So we just don’t handle the subtitle?

If yes, we could just add the latter variable in a way that is consistent with the former?

I’m not sure if that’s what you mean but I know of at least four variants of punctuation between title and subtitle:

  • colon space
  • space colon space
  • period space
  • space dash space
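
As a rough illustration of what splitting on these four variants could look like, here is a small sketch. It is not the citeproc-js logic; the `DELIMITERS` list and the `splitTitle` helper are made up for this example, and "dash" is assumed to be a plain hyphen.

```javascript
// Hypothetical delimiter list based on the four variants listed above.
const DELIMITERS = [" : ", ": ", ". ", " - "];

// Split a title at the earliest delimiter found; return the whole title as
// "main" when no delimiter is present.
function splitTitle(title) {
  let best = null;
  for (const delim of DELIMITERS) {
    const idx = title.indexOf(delim);
    // idx > 0 so that a title cannot have an empty main part.
    if (idx > 0 && (best === null || idx < best.idx)) {
      best = { idx, delim };
    }
  }
  if (best === null) {
    return { main: title, sub: "", delimiter: "" };
  }
  return {
    main: title.slice(0, best.idx),
    sub: title.slice(best.idx + best.delim.length),
    delimiter: best.delim,
  };
}

console.log(splitTitle("A long title: with a subtitle"));
console.log(splitTitle("Titre : sous-titre"));
```

Note that "period space" is the risky one: a naive split like this would also fire on abbreviations such as "Dr. ", so a real implementation would need extra guards.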

Well, the test fixtures indicate that most of those punctuation variants should be covered.

Well, at least with Juris-M I get this behaviour with a title “A long title: with a subtitle”

<text variable="title" form="long"/> => A long title: with a subtitle
<text variable="title" form="short"/> => A long title

With the short title field in the GUI I can even shorten the first part of the title:

<text variable="title" form="short"/> => Title

In the specs I have found this:
“[A variable] may be accompanied by the form attribute to select the “long” (default) or “short” form of a variable (e.g. the full or short title). If the “short” form is selected but unavailable, the “long” form is rendered instead.”
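
The fallback described in that passage can be sketched as follows. This only models the behaviour quoted from the spec, not the implicit splitting seen in Juris-M; `renderVariable` is a hypothetical helper, and the item is assumed to be CSL JSON with the short form stored under `title-short`.

```javascript
// Render a variable in the requested form, falling back to the long form
// when the short form is not available.
function renderVariable(item, name, form = "long") {
  if (form === "short") {
    const shortValue = item[name + "-short"];
    if (shortValue) {
      return shortValue;
    }
  }
  return item[name] || "";
}

const withShort = { title: "A long title: with a subtitle", "title-short": "Title" };
const withoutShort = { title: "A long title: with a subtitle" };

console.log(renderVariable(withShort, "title", "short"));
console.log(renderVariable(withoutShort, "title", "short"));
```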

But I haven’t found details about this implicit handling of the title variable.

(I cannot test with Zotero at the moment.)

Right, I’m just suggesting adding a “sub” or similar option to that form attribute.

I don’t have time to dig this up, but we’ve had a longish exchange on the main/subtitle issue before, and I’m in favor of adding them to the CSL specifications, yes.

So we just don’t handle the subtitle?
If yes, we could just add the latter variable in a way that is consistent with the former?

No, I don’t think we should do that. While Frank (reasonably given the Zotero data model) uses the short title as a proxy for the main title, those two are not the same. E.g. in the Chicago Manual’s discussion of short titles, two out of three examples don’t fit that model:

  • The Literature of Harlem → Literature of Harlem
  • Poverty and Inequality in Latin America: The Impact of Adjustment and Recovery → Poverty and Inequality
  • Nationals and Nationalism: Adultery in the House of David → Nationals and Nationalism

There are many more examples on the Zotero forums of the need for a short title that differs from the main title. Given this, introducing the new variables title-main and title-sub while keeping title and title-short/title form="short" seems right to me for the CSL data model. Implementation details could be left to the processors/reference managers (and using title-short as a proxy for title-main may be a reasonable first stab; I just don’t want to lock us into that).


To avoid possible confusion, title-main and title-short are not tightly linked in the citeproc-js implementation. If the title contains a splitting delimiter (like a colon-space), title-main and title-sub will be derived from it. There are a few loose connections to title-short, though:

  • If the case-normalized values of title-main and title-short match, the latter value is overwritten with the former, to avoid user confusion over a discrepancy.
  • If title-main would end in a single character, the split will not be performed unless the value of title-main matches title-short. This avoids false-positive splits for things like: “Chapter A: Emerging Issues.” (This constraint might be extended in the future, if users report further problems with false positives.)

Currently, if no explicit value is set on title-short, it remains empty. It could be implicitly set to the value of title-main in that case, but forcing that change on user data might cause frustration.
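
For readers who prefer code, the rules above can be approximated as below. This is a sketch of the described behaviour, not the actual citeproc-js source: `deriveTitleParts` is a hypothetical name, only colon-space is handled, and "ends in a single character" is interpreted here as "the last word of the main part is one character long".

```javascript
// Derive title-main / title-sub from item.title, following the rules above.
function deriveTitleParts(item, delimiter = ": ") {
  const result = { ...item };
  const title = item.title || "";
  const idx = title.indexOf(delimiter);
  if (idx === -1) {
    return result; // no splitting delimiter: nothing to derive
  }
  const main = title.slice(0, idx);
  const sub = title.slice(idx + delimiter.length);
  const shortTitle = item["title-short"];
  const matchesShort =
    typeof shortTitle === "string" &&
    shortTitle.toLowerCase() === main.toLowerCase();

  // Guard against false positives such as "Chapter A: Emerging Issues".
  // Assumption: the check is on the last word being a single character.
  const endsInSingleChar = /(^|\s)\S$/.test(main);
  if (endsInSingleChar && !matchesShort) {
    return result;
  }

  result["title-main"] = main;
  result["title-sub"] = sub;
  if (matchesShort) {
    // Overwrite title-short with title-main to remove case discrepancies.
    result["title-short"] = main;
  }
  // If title-short was never set, it is left unset.
  return result;
}

console.log(deriveTitleParts({ title: "Chapter A: Emerging Issues" }));
console.log(deriveTitleParts({ title: "A long title: with a subtitle", "title-short": "a long title" }));
```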


Absolutely! At least four forms are necessary: title, title-short, title-main, title-sub. The Chicago Manual of Style is only one example out of many; MLA also asks for shortened titles that do not match title-main. Ideally, you would be able to set title-short on a per-style or per-document basis, but that’s of course a question to be left to reference managers.

Just an idea, but would it be an option to treat title-main and title-sub not as regular variables but to use those in cs:locale to determine the rendering of the title variable? Something like this:

<locale xml:lang="en">
    <variables>
      <variable name="title">
        <long>
          <group delimiter=": ">
            <text variable="title-main"/>
            <text variable="title-sub"/>
          </group>
        </long>
      </variable>
    </variables>    
</locale>
<locale xml:lang="de">
    <variables>
      <variable name="title">
        <long>
          <group delimiter=". ">
            <text variable="title-main"/>
            <text variable="title-sub"/>
          </group>
        </long>
      </variable>
    </variables>    
</locale>

With this, we could just keep using <text variable="title"> in cs:citation and cs:bibliography. Of course, we can already achieve this with macros, but macros aren’t language aware…
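
In processor terms, the effect I have in mind would be roughly the following. This is purely a sketch of the proposal: neither the `TITLE_DELIMITERS` table nor `renderTitle` exists anywhere, and the two delimiters are just the ones from the locale snippets above.

```javascript
// Hypothetical per-locale delimiters, mirroring the two cs:locale blocks above.
const TITLE_DELIMITERS = { en: ": ", de: ". " };

// Join title-main and title-sub with the locale's delimiter, skipping empty
// parts (as cs:group does), and fall back to the plain title otherwise.
function renderTitle(item, lang) {
  const delimiter = TITLE_DELIMITERS[lang] || ": ";
  const parts = [item["title-main"], item["title-sub"]].filter(Boolean);
  if (parts.length === 0) {
    return item.title || "";
  }
  return parts.join(delimiter);
}

const item = { "title-main": "A long title", "title-sub": "with a subtitle" };
console.log(renderTitle(item, "en"));
console.log(renderTitle(item, "de"));
```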

Are you sure that’s a real-world requirement? I’d strongly hesitate to implement such a locale if it weren’t explicitly required in the original style guide. What I see is that an institutional style guide claims to follow a certain style (e.g. APA), with the exception of substituting a period for the colon between title and subtitle (well, usually other exceptions are formulated as well…). So adding such localization would accommodate this particular institutional style guide but might well annoy lots of other users. This would be better solved in a separate CSL style.

What might be necessary to localize is title orthography as some styles require English item titles to follow different rules from other language titles (title vs. sentence casing but also casing of the first word of the subtitle following a colon) but IIRC this is already present somehow?

On a different note: the same logic is necessary at least for container-title. container-title-main and container-title-sub are needed for book chapters. I’m not sure yet about the other titles in CSL.

You might be right here concerning my suggestion.

Concerning title orthography: that’s indeed already there (see here).

Thanks for the reference.

After thinking a bit more about it, I think it would make sense to add an inheritable attribute title-sub-delimiter (or something similar). The delimiter should presumably be identical in title, container-title and collection-title. Setting it globally as a style attribute or on cs:bibliography would render macros for setting the delimiter unnecessary, and existing styles wouldn’t have to be changed too much.
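
To illustrate what "inheritable" would mean here, a minimal sketch: look for the attribute on the element itself, then walk up through its parents. The node objects, the `resolveInherited` helper, and the attribute name are all hypothetical; nothing like this is in the current spec.

```javascript
// Walk from a node up through its parents and return the first value found
// for the attribute, or the fallback if no ancestor sets it.
function resolveInherited(node, attr, fallback) {
  for (let n = node; n; n = n.parent) {
    if (n.attrs && attr in n.attrs) {
      return n.attrs[attr];
    }
  }
  return fallback;
}

// Toy style tree: the delimiter is set globally and overridden on cs:bibliography.
const style = { name: "style", attrs: { "title-sub-delimiter": ": " }, parent: null };
const bibliography = { name: "bibliography", attrs: { "title-sub-delimiter": ". " }, parent: style };
const citation = { name: "citation", attrs: {}, parent: style };

console.log(resolveInherited(bibliography, "title-sub-delimiter", ": "));
console.log(resolveInherited(citation, "title-sub-delimiter", ": "));
```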

That sounds like a very lean solution.

Since I just came across this thread: FYI, we are addressing this in v1.1, in part with a “sub” and “main” @form for title variables.