Sentence case variants

Hope everyone had a nice weekend,

To recap the status quo on sentence case: Currently, CSL
recommendations are to store titles in sentence case and to not use
the text-case=“sentence” attribute on titles.

Something that had escaped me is that “sentence case” isn’t actually
uniform in the US. In particular, styles that derive broadly from the
APA manual (i.e. a lot of the social sciences & education) capitalize
the first letter of the subtitle (i.e. after the colon) as in “Cooked:
A natural history of transformation,” while styles derived from the
NLM’s “Citing medicine” don’t, as in “Cooked: a natural history of
transformation”.

There is currently no good way to handle this via CSL, though I think
it’s clear that there should be.
I see two possibilities, but there may be more:

  1. Format titles and subtitles separately. In that case we could
    simply add a “capitalize-first” for the subtitle in APA style. This
    may also be nice as different languages have different title-subtitle
    delimiters. This is also what BibLaTeX is doing, so it’d work nicely
    with citeproc-hs. The big downside, of course, is that the demands on
    data are significant and very few data sources apart from MARC records
    separate titles and subtitles, so I’m not sure this is going to be
    feasible.

  2. I think this is what I’d prefer, because it’s simple and easy: We
    recommend users store titles in “NLM” style sentence case, which is
    also what MARC and PubMed have. Then we re-purpose/redefine
    text-case=“sentence” to no longer force lowercase, but instead
    capitalize the first letter after the colon (and quotation/exclamation
    mark) and apply that for APA and related styles. The pro here is the
    ease with which we could make this work. The downsides are a) that I’m
    not sure that using text-case=“sentence” that way makes a lot of
    intuitive sense (so we could also add a different option) and b) it’s
    probably the less thorough and systematic solution, so it’ll only work
    98% of the time.
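
To make option 2 concrete, here is a minimal sketch of the transformation in Python. The function name capitalize_after and its marks parameter are made up for illustration; nothing here is part of CSL or of any citeproc implementation. It assumes the title is stored in "NLM" style sentence case and only uppercases the first letter following a colon plus whitespace, leaving everything else alone.

```python
import re


def capitalize_after(title: str, marks: str = ":") -> str:
    """Uppercase the first letter that follows any character in `marks`.

    Hypothetical helper for illustration only. It never lowercases
    anything, so text that is already capitalized passes through unchanged.
    """
    # One of the marks, then whitespace, then a single letter.
    # [^\W\d_] matches a Unicode letter (a word character that is
    # neither a digit nor an underscore).
    pattern = re.compile("([" + re.escape(marks) + r"]\s+)([^\W\d_])")
    return pattern.sub(lambda m: m.group(1) + m.group(2).upper(), title)


stored = "Cooked: a natural history of transformation"
print(capitalize_after(stored))
# Cooked: A natural history of transformation
```

Because the function only ever uppercases, a title that already has a capital (or a digit) after the colon comes out unchanged. Passing marks=":?!" would extend the same rule to other punctuation, if that is what a style wants.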

Thoughts?

–Sebastian Karcher

Option #2 sounds like the more appealing approach – working 98% of the time
is much better than <50% of the time if subtitle data isn’t usually
available from most sources.

I’d think it would be cleaner to introduce an attribute that handles this
specific difference in formatting. Something like
capitalize-after-colon=“true” applied to a element would practically
explain itself. (Or “capitalize-after-punctuation” if the APA rule is more
general.)

But there is always the danger that whatever rule for detecting where to
capitalize post-colon words in the APA style (the algorithm for which would
presumably be specified in the spec?) might backfire for certain edge
cases. The APA blog itself suggests that this rule may require semantic
interpretation (
http://blog.apastyle.org/apastyle/2011/06/capitalization-after-colons.html),
and things like chemical symbols (
http://www.ncbi.nlm.nih.gov/pubmed/24562861) or quoted single words in
titles might make things complicated.

And there’s nothing more frustrating for a user than seeing something
incorrectly auto-capitalized and having no control over it. (Yes,
citeproc-js supports and Papers has its own syntax
for protecting case, but neither is a documented part of CSL.)

So overall I’d say my feelings are mixed. I like the idea, but the question
is whether the overall improvement is worth the potentially frustrating,
albeit rare, situation of over-capitalization in APA styles.

–greg

I appreciate the concern, but I think the overcapitalization risk is
smaller than you make it out to be: The examples in the APA blog where
semantic context matters refer to sentences in the body of the text,
not to the title, where the rule to capitalize after colons is
categorical and without exceptions. Chemical elements all start with a
capital letter (or a number in a formula), so, as in your pubmed
example, capitalizing after the colon wouldn’t change anything (also,
generally speaking journals with chemical formulas in the title are
more likely to follow the Vancouver version of sentence case). In
other words, I don’t see an exception to this so far.

Just to restate why I believe this is quite important to solve: Right
now users would have to manually capitalize the first letter after
every colon for correct
APA style. If a client builds a function to automate that (Zotero e.g.
has a right-click option to convert to pseudo sentence-case) it works
incorrectly for all users using Vancouver-type styles. So you end up
requiring people who’re using APA and related styles to manually
uppercase the first letter after the colon in every title, and even
with that, if you have, e.g., a clinical psychiatrist publishing in
both psychology and medicine, that person is screwed. There is no way
for them to input the title that will work in both APA and Vancouver.

Thanks for the extra clarification. The case for making this change is
strong – if there really are few or no exceptions, it seems like it would
be a great improvement.

For what it’s worth, citeproc-js offers the option (enabled in
citeproc-js) of casting two virtual variables, title-main and
title-sub, derived from the title and shortTitle (title-short) fields
using logic originally suggested by Dan Stillman.
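
As a rough illustration of how a main title and subtitle could be derived from those two fields, here is a Python sketch. It is not the citeproc-js code and may differ from the logic discussed on the Zotero forum; the function name split_title and the rule it applies (the short title must be a prefix of the full title, followed by a colon) are assumptions made for the example.

```python
from typing import Optional, Tuple


def split_title(title: str, short_title: Optional[str]) -> Tuple[str, Optional[str]]:
    """Return (main, sub); sub is None when no split can be inferred.

    Hypothetical prefix-based rule: the short title has to open the
    full title and be followed directly by a colon.
    """
    if not short_title or not title.startswith(short_title):
        return title, None
    rest = title[len(short_title):].lstrip()
    if not rest.startswith(":"):
        # The short title is not followed by a delimiter, so do not guess.
        return title, None
    sub = rest[1:].strip()
    return (short_title, sub) if sub else (title, None)


print(split_title("Cooked: a natural history of transformation", "Cooked"))
# ('Cooked', 'a natural history of transformation')
```

When the short title is missing, or is not a clean prefix of the title, the sketch returns the full title with no subtitle rather than guessing.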

Frank

I think https://forums.zotero.org/discussion/8077 is the Zotero forum
thread Frank just referenced (“using logic originally suggested by Dan
Stillman.”).

Repurposing text-case=“sentence” feels like too much of a hack to me.
If the title-subtitle delimiter and subtitle capitalization should be
localizable, we could add two ‘localized options’
(http://citationstyles.org/downloads/specification.html#localized-options).
Otherwise, I prefer to assume we have structured main-title and
sub-title metadata, or can do a reasonable job parsing them, and
offering options to format these independently.
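
A small Python sketch of what formatting the two parts independently might look like, assuming the main title and subtitle are already available as separate strings. The TitleOptions class and its two fields (delimiter, capitalize_subtitle) are hypothetical stand-ins for whatever options a style or locale would carry; they are not existing CSL options.

```python
from dataclasses import dataclass


@dataclass
class TitleOptions:
    """Hypothetical per-style or per-locale settings, for illustration only."""
    delimiter: str = ": "
    capitalize_subtitle: bool = False


def format_title(main: str, sub: str, opts: TitleOptions) -> str:
    """Join main title and subtitle according to the given options."""
    if sub and opts.capitalize_subtitle:
        # Uppercase only the first character; the rest stays as stored.
        sub = sub[0].upper() + sub[1:]
    return main + opts.delimiter + sub if sub else main


apa_like = TitleOptions(capitalize_subtitle=True)
nlm_like = TitleOptions()
print(format_title("Cooked", "a natural history of transformation", apa_like))
# Cooked: A natural history of transformation
print(format_title("Cooked", "a natural history of transformation", nlm_like))
# Cooked: a natural history of transformation
```

The same stored data then serves both conventions, and a locale with a different title-subtitle delimiter would only need a different delimiter value.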

Rintze