Upcoming CSL meetup context

I’ve only thought about this just now, so may be missing important details (like substitution and formatting), but …

What about ditching XML entirely, in favor of one of the new cross-language template languages?

Here’s Liquid:

Hello {{ 'tobi' | upcase }}
Hello tobi has {{ 'tobi' | size }} letters!
Hello {{ '*tobi*' | textilize | upcase }}
Hello {{ 'now' | date: "%Y %h" }}

So the | pipes the value through one or more transformation functions (filters), which can take arguments.
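To make the pipe idea concrete, here is a minimal Python sketch of a filter chain. The registry, the `apply_chain` helper, and the tiny `name: arg` syntax are all assumptions for illustration; this is not how Liquid itself is implemented.

```python
from typing import Callable

# Hypothetical filter registry; "upcase" and "size" mirror the Liquid
# examples above, "append" is added only to show a filter with an argument.
FILTERS: dict[str, Callable[..., object]] = {
    "upcase": lambda value: str(value).upper(),
    "size": lambda value: len(str(value)),
    "append": lambda value, suffix: f"{value}{suffix}",
}


def apply_chain(value: object, chain: str) -> object:
    """Pipe `value` through filters written as 'name' or 'name: arg'.

    Deliberately naive: an argument containing '|' or ':' would break it.
    """
    for step in chain.split("|"):
        name, _, raw_arg = step.strip().partition(":")
        # At most one argument per filter; surrounding quotes are dropped.
        args = [raw_arg.strip().strip("\"'")] if raw_arg.strip() else []
        value = FILTERS[name.strip()](value, *args)
    return value
```

With this, `apply_chain("tobi", "upcase | append: '!'")` gives `"TOBI!"`, and each filter only ever sees the output of the one before it.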

For CSL, the obvious filters, or filter groups:

  • Names
  • Dates
  • Titles

Could have a default filter for each group, and then variants that encapsulate arguments (the logic in existing style macros) for ease of authoring.

A simple hypothetical style fragment:

{{ author.names | shorten-names-apa }} ({{ issued | date-year | date-suffix }}). {{ title }}.
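One way the "default filter plus named variants" idea could look, sketched in Python. The `format_names` function, its parameters, and the values frozen into `shorten_names_apa` are hypothetical and not checked against the APA manual; the point is only that a variant is the default filter with its arguments pre-filled.

```python
from functools import partial


def format_names(names, *, et_al_min=None, et_al_use_first=1, and_word="&"):
    """Default names filter: join family names, abbreviating with 'et al.'."""
    families = [n["family"] for n in names]
    if et_al_min is not None and len(families) >= et_al_min:
        return ", ".join(families[:et_al_use_first]) + " et al."
    if len(families) <= 1:
        return "".join(families)
    if len(families) == 2:
        return f"{families[0]} {and_word} {families[1]}"
    return ", ".join(families[:-1]) + f", {and_word} {families[-1]}"


# A named variant simply freezes arguments on the default filter.
# The values here are illustrative assumptions only.
shorten_names_apa = partial(format_names, et_al_min=3, et_al_use_first=1)
```

A style author would then only write the variant name in the template, as in the fragment above, and never touch the arguments.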

And then combine the template strings in a larger YAML file?

citation:
  mode: intext
  et-al-min: 4
  et-al-use-first: 1
  disambiguate-add-year-suffix: true
  disambiguate-add-names: true
  disambiguate-add-givenname: true
  givenname-disambiguation-rule: primary-name
  collapse: year
  after-collapse-delimiter: "; "
  template: ({{ author.name }}, {{ date | fmt-year }})

EDIT: it occurred to me that the more complex logic in CSL is actually already described in the simple attributes, so the template above can be pretty simple. I don’t know how substitution would work elegantly ATM, however.
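A rough Python sketch of that split, where a parameter drives the logic and the template stays a one-liner. The `CITATION` dict mirrors only a couple of keys from the YAML above, and Python's `str.format` placeholders stand in for the template syntax; both are assumptions for illustration.

```python
from collections import defaultdict
from string import ascii_lowercase

# Hypothetical parameter block mirroring a few keys from the YAML above.
CITATION = {
    "disambiguate-add-year-suffix": True,
    "template": "({author}, {year})",
}


def render_all(items, config=CITATION):
    """Render every item with the one-line template.

    Year-suffix logic lives here and is switched by a parameter, not by
    anything in the template. Handles at most 26 clashes per author-year.
    """
    groups = defaultdict(list)
    for index, item in enumerate(items):
        groups[(item["author"], item["year"])].append(index)

    rendered = []
    for index, item in enumerate(items):
        clashes = groups[(item["author"], item["year"])]
        year = str(item["year"])
        if config["disambiguate-add-year-suffix"] and len(clashes) > 1:
            year += ascii_lowercase[clashes.index(index)]
        rendered.append(config["template"].format(author=item["author"], year=year))
    return rendered
```

Two items by Doe from 2020 come out as `(Doe, 2020a)` and `(Doe, 2020b)` with the flag on, and both as `(Doe, 2020)` with it off, without any change to the template string.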

Any future development work would come down to describing the filters, much like we do now for rendering elements, and adding input/output examples to the test suite.
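As a sketch of what such input/output examples might look like, here is a tiny fixture table in Python. The two filters and the fixture format are hypothetical; a real test suite would presumably be language-neutral data rather than Python code.

```python
def date_year(date):
    """Hypothetical filter: reduce a date mapping to its year as a string."""
    return str(date["year"])


def upcase_title(title):
    """Hypothetical filter: upper-case a title."""
    return title.upper()


FILTERS = {"date-year": date_year, "upcase-title": upcase_title}

# Each fixture: filter name, input value, expected output.
FIXTURES = [
    ("date-year", {"year": 2020, "month": 5}, "2020"),
    ("upcase-title", "on citing", "ON CITING"),
]


def run_fixtures(filters, fixtures):
    """Return the fixtures whose actual output differs from the expected one."""
    failures = []
    for name, given, expected in fixtures:
        actual = filters[name](given)
        if actual != expected:
            failures.append((name, given, expected, actual))
    return failures
```

An empty list from `run_fixtures(FILTERS, FIXTURES)` means every filter matches its documented examples.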

And in refactoring, can consider things like multilingual from the beginning.

Finally, this would be:

  • easier for users to edit
  • maybe possible to convert to this from existing styles.

This assumes most of the logic can be contained in the filters, so the actual templates are pretty simple. Am not sure about that.

Am I crazy, or might this be a way to address all of the above issues?

If you think about it, all the collective knowledge of citation formatting is embedded in that style repo; maybe we can use that to facilitate a more radical break (and also do other cool things)?

Or maybe the basic idea of chained filter transformation could be implemented in the XML syntax; not sure. As I said, this is just a very tentative idea.

Can you explain why that should solve these issues?

Getting ahead of myself there, given this is a hunch more than a fully formed idea, but the essence is:

Maybe it’s possible to put more logic in named “filters” and parameters so as to simplify the actual template syntax, which could make compatibility issues easier to manage going forward?

If still XML, it might mean heavier use of attributes (which are easy to ignore in processing).
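As a rough illustration of what "easy to ignore" might mean in practice, here is a Python sketch in which a processor keeps the attributes it understands and skips the rest. The set of known attributes and the `future-option` attribute are made up for the example.

```python
import xml.etree.ElementTree as ET

# Attributes this hypothetical processor version understands.
KNOWN_ATTRS = {"variable", "prefix", "suffix"}


def read_element(xml_text):
    """Split an element's attributes into understood and ignored ones."""
    element = ET.fromstring(xml_text)
    known = {k: v for k, v in element.attrib.items() if k in KNOWN_ATTRS}
    ignored = sorted(set(element.attrib) - KNOWN_ATTRS)
    return known, ignored
```

Given `<text variable="title" suffix="." future-option="x"/>`, the processor would work with `variable` and `suffix` and report `future-option` as ignored instead of failing on it.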

EDIT: me playing a bit:

RNC:

## A single `template` element, that can be used at top-level, or
## within `citation` and `bibliography`
template = element template { template.atts,
                              ( template.list | template.render )+
                              }

template.atts = attribute context { text }?,
                attribute name { text }?,
                attribute description { text }?

## A rendering element.
template.render = element render { render.atts }

## And a list element.
template.list = element list { template.affixes, template.render+ }

## A general conditional.
template.cond = attribute when { text }

template.affixes =  attribute suffix { text }?,
                    attribute prefix { text }?,
                    attribute delimiter { text }?

render.atts = attribute variable { render.vars }, render.fmt
render.vars = "author" | "editor" | "issued" | "title" | "cited-locators"
render.fmt =
    ## a template to call for partial rendering
    attribute template { text }?,
    attribute bold { xsd:boolean }?,
    ## an attribute that takes a list of filters, which transform the input
    attribute filters { list { xsd:NMTOKEN+ } }?,
    template.affixes,
    ## a conditional and substitute; can we do these with attributes?
    template.cond?,
    attribute substitute { list { render.vars+ } }?

Example XML:

<template name="citation-apa-paren"
          description="For default rendering of parenthetical APA citations.">
  <list prefix="(" suffix=")" delimiter="; ">
    <render variable="author" suffix=" "
            substitute="editor"
            filters="names"/>
    <render variable="issued" filters="fmt-date-apa"/>
    <!-- unclear filters vs other templates -->
    <render variable="cited-locators" filters="fmt-locators-apa"/>
  </list>
</template>

Aside: there are one or more problems with this example, since list here isn’t exactly equivalent to cs:group, and I’m not actually clear on the logic (though CSL 1 has the same issue). But it’s enough to illustrate the idea for now.

Basically, in this scenario, we merge cs:macro, cs:text, cs:group, etc. into a consistent set of template, render, and list elements, where template can be used in different places.
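To get a feel for how small a processor for template, render, and list could be, here is a Python sketch. It uses a trimmed variant of the example template (delimiter changed to ", ", locators dropped), and the filter bodies are placeholders; only the element and attribute names come from the example, everything else is an assumption.

```python
import xml.etree.ElementTree as ET

TEMPLATE = """
<template name="citation-apa-paren">
  <list prefix="(" suffix=")" delimiter=", ">
    <render variable="author" substitute="editor" filters="names"/>
    <render variable="issued" filters="fmt-date-apa"/>
  </list>
</template>
"""

# Hypothetical filter implementations; only the names come from the example.
FILTERS = {
    "names": lambda names: " & ".join(names),
    "fmt-date-apa": lambda date: str(date["year"]),
}


def render(node, item):
    """Render one <render> element, falling back to `substitute` variables."""
    candidates = [node.get("variable"), *node.get("substitute", "").split()]
    value = next((item[v] for v in candidates if item.get(v)), None)
    if value is None:
        return ""
    for name in node.get("filters", "").split():
        value = FILTERS[name](value)
    return f"{node.get('prefix', '')}{value}{node.get('suffix', '')}"


def render_list(node, item):
    """Render a <list>: join non-empty children, then wrap in the affixes."""
    parts = [p for p in (render(child, item) for child in node) if p]
    if not parts:
        return ""
    body = node.get("delimiter", "").join(parts)
    return f"{node.get('prefix', '')}{body}{node.get('suffix', '')}"


def render_template(xml_text, item):
    """Render every top-level <list> or <render> child of a <template>."""
    root = ET.fromstring(xml_text)
    return "".join(
        render_list(child, item) if child.tag == "list" else render(child, item)
        for child in root
    )
```

For an item with only an editor, `render_template(TEMPLATE, {"editor": ["Doe", "Roe"], "issued": {"year": 2020}})` gives `(Doe & Roe, 2020)`, so substitution here is just an ordered fallback over variables. Whether that is expressive enough for real styles is exactly the open question.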

There’s also a consistent way to signal when to transform the input, and when to output it as is.

This could include potentially loading templates from a separate file.

I can imagine extracting and converting, for example, the most widely used and developed styles (the chicagos, APA, etc.) from the styles repo and making them available in one or more files from a CSL NEXT styles repo, so new styles wouldn’t often have to include new template “macros”.

Given the huge technical debt, the question is not just whether this could work, but whether a clean break is the best path forward.

But given the conversation last summer, I felt like we were stuck, and so thought this worth proposing.

My hunch (again) is that for existing processors, if we did this right, it would simplify the parsing and processing code (potentially a lot?), and allow reuse of much of the existing logic (since the key processing-related attributes would stay).

It should also simplify style updating and such, and schema maintenance (since the schema itself should be much simpler, and less often in need of updating).

I’ve been discussing the ideas with @Denis_Maier off-forum the past few days, and decided instead to just ask a question over here.

To come back to this, a possible way to achieve this may be to simplify the template part of the language, and move most (all?) configurable logic to extensible parameters that are set independently of the templates.

Let’s get CSL 1.1 out this year. Given our experience working on updates over the past years, I think we should approach future updates with the perspective of maintaining backwards compatibility as a baseline and making updates as modular as possible.

With that in mind, I suggest we review the currently implemented 1.1 changes and remaining TODOs and determine which deprecations should be removed and if any new features need to be adjusted.

I guess that brings us back to the process question: who is “we”, and how would that work?

Maybe one of us (you or I) can create a tracking issue at the repo for feedback, with some deadline?

On a related note, I added this issue a while back when I couldn’t rebase the branch. The git history is kind of screwed up.