A second try at an alternative using LLMs

A few years ago, I started experimenting with a new approach to evolving CSL in csln (GitHub: bdarcus/csln, "Reimagining CSL").

But I’m busy, and I’m an amateur programmer with aspirations that (far) exceed my time or skills.

In the past week, however, I’ve been doing a deep dive into new agentic coding tools, notably the latest Claude Opus and Google Gemini models.

After I got more comfortable with how to exploit these tools, I threw this new project together in less than 24 hours, and I have to say: I am super impressed. It already does much more than what I achieved with the earlier project (though it borrows code from it).

I should add, however, that a huge question for me remains whether the 100% fidelity claim mentioned in the README is even possible. I aim to figure this out over the coming weeks (though early progress is slow, so it may take many weeks!).

Basically, I had these tools analyze how to extend my earlier experiments (which got pretty far actually) in order to bring the vision to completion.

Perhaps the most interesting possibility this opens up, I think, is reflected in the contributing section of the README, which you can see if you create a new issue and select “domain expert.”

The prior art analysis is also super interesting. It’s the result of me asking how to synthesize all the information in the respective code bases, the csln repo issue tracker, and the spec documents for CSL/M 1.0. That’s now incorporated into the roadmap (this file, while human-readable, is aimed at the LLM tools).

Here’s a parallel project with the start of a Rust-based server. I intend to build a client UI on top of it, based on an idea I’ve previously talked about, and I will make sure the core code supports it (for live-previewing and such). Here’s the browsing UI I imagine:

And this is a representation of the creation wizard I’ve previously discussed, with the idea being it has live previewing.

I’ve added a couple of design docs to address some long-standing issues and questions:

  1. How to deal with style reuse and duplication; here there will be no dependent styles, but a more composable alternative. Notably, iterative development is also now driven by the priorities and knowledge reflected in actually-existing 1.0 styles.
  2. How to make finding and creating styles easier.

I got a basic demo of the Rust server + SvelteKit client front-end working. The previews are (mostly) generated dynamically on the server.

As I said above, I’m trying to develop these in parallel so that they are fully complementary, though I will now turn back to the core code.

For the last few weeks I’ve focused on improving the XML-based migration of styles to these new, quite different models. When I saw multiple high-end models fail, I decided to pull the plug on the approach, which was wasting a lot of time and resources.

Instead, I had a strong hunch that an earlier idea of mine, inferring templates from formatted citeproc-js output, would be simpler and more reliable.

So, I had the new Opus 4.6 model run an analysis of the two approaches, and had an architect agent propose a plan.

That plan is here; it uses the XML for what it’s good at, and a new JS inferrer script for the templates.

That plan includes initial experiments that justify the change in approach:

The inferrer validates that the hard problem (template structure) is better solved by observing output than by parsing XML. The XML compiler’s 0% bibliography match was not a bug — it was evidence that procedural-to-declarative translation via macro flattening is fundamentally harder than reverse-engineering from rendered output.
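
To make "observing output" concrete, here is a minimal sketch of the idea in Rust. It is not the project's inferrer (that is a JS script); `Part` and `infer_template` are hypothetical names I made up for illustration. The assumption is that we know the input field values for a reference, so we can find where each value lands in the rendered string and treat everything in between as literal delimiters.

```rust
/// One piece of an inferred template: a variable slot or literal text.
/// (Hypothetical type, for illustration only.)
#[derive(Debug, PartialEq)]
enum Part {
    Variable(String),
    Literal(String),
}

/// Infer a flat template from one rendered entry by locating each known
/// field value in the output. `fields` is a list of (name, value) pairs.
fn infer_template(rendered: &str, fields: &[(&str, &str)]) -> Vec<Part> {
    // Record (start, end, name) for every field value found in the output.
    // Fields that are empty or do not appear verbatim are skipped.
    let mut hits: Vec<(usize, usize, &str)> = fields
        .iter()
        .filter(|(_, value)| !value.is_empty())
        .filter_map(|(name, value)| {
            rendered
                .find(*value)
                .map(|start| (start, start + value.len(), *name))
        })
        .collect();
    // Order the hits by where they occur in the rendered string.
    hits.sort();

    let mut parts = Vec::new();
    let mut cursor = 0;
    for (start, end, name) in hits {
        if start < cursor {
            // Overlaps a field already placed; ignore it in this sketch.
            continue;
        }
        if start > cursor {
            // Text between two fields is a literal delimiter.
            parts.push(Part::Literal(rendered[cursor..start].to_string()));
        }
        parts.push(Part::Variable(name.to_string()));
        cursor = end;
    }
    if cursor < rendered.len() {
        parts.push(Part::Literal(rendered[cursor..].to_string()));
    }
    parts
}
```

Given an entry rendered as an author, a parenthesized year, and a title, this yields the sequence author, `" ("`, year, `"). "`, title, `"."`. The sketch only matches the first verbatim occurrence of each value, so it says nothing about transformed values (initialized names, changed case) or about conditional logic; a real inferrer would need to compare many entries across item types to recover those.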

The next step is to hook up this script to other scripts in order to test the full rendering impact.

Latest updates:

  1. Per above, I ditched the approach of trying to parse 1.0 XML macros and templates and map them to the very different new model; instead the focus will be deriving styles from the output (using citeproc-js). It’s much easier to reason about and debug parsing common input data than it is the insanely complex 1.0 styles.
  2. I also found that, as an extension of this idea, an LLM can not only do a good job of creating a style but also iteratively improve the code to match the expected output. This is reflected in a new styleauthor agent and skill. More.
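
The feedback signal for that kind of iteration can be very simple. The following is a rough sketch, assuming the expected strings come from citeproc-js and the actual strings from the new processor; `normalize` and `match_rate` are hypothetical helpers, not functions from the repo.

```rust
/// Collapse runs of whitespace so that trivial spacing differences
/// do not count as mismatches.
fn normalize(s: &str) -> String {
    s.split_whitespace().collect::<Vec<_>>().join(" ")
}

/// Fraction of expected entries that the actual output reproduces exactly
/// (after whitespace normalization). Entries missing from `actual`
/// count as mismatches.
fn match_rate(expected: &[&str], actual: &[&str]) -> f64 {
    if expected.is_empty() {
        // Nothing to match, so nothing failed.
        return 1.0;
    }
    let matched = expected
        .iter()
        .zip(actual.iter())
        .filter(|(e, a)| normalize(e) == normalize(a))
        .count();
    matched as f64 / expected.len() as f64
}
```

A score like this, computed separately for citations and bibliography entries, gives an agent (or a human) a number to push up on each iteration, and the mismatching pairs tell it where to look.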

And this wrinkle from APA, not supported in CSL 1.0, now works (along with integral/narrative citations generally)!

=== apa-7th.yaml ===

CITATIONS (Non-Integral):
  [pew_social_media] (Auxier & Anderson, 2021)
  [berger_luckmann] (Berger & Luckmann, 1966)
  [vaswani_attention] (Vaswani et al., 2017)
  [aad_atlas_higgs] (Aad et al., 2012)

CITATIONS (Integral):
  [pew_social_media] Auxier and Anderson (2021)
  [berger_luckmann] Berger and Luckmann (1966)
  [vaswani_attention] Vaswani et al. (2017)
  [aad_atlas_higgs] Aad et al. (2012)
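
For readers who want to see what distinguishes the two modes, here is a minimal Rust sketch that reproduces the output above. It is an illustration only: `CitationMode`, `Reference`, `short_names`, and `render` are hypothetical names, not the project's API, and the name-shortening rule (one or two names listed, three or more collapsed to "et al.") is simply what the sample output shows.

```rust
/// How a citation is woven into the sentence.
#[derive(Clone, Copy, Debug, PartialEq)]
enum CitationMode {
    /// Parenthetical, e.g. "(Auxier & Anderson, 2021)".
    NonIntegral,
    /// Narrative, e.g. "Auxier and Anderson (2021)".
    Integral,
}

/// Hypothetical minimal reference: author family names plus a year.
struct Reference<'a> {
    families: &'a [&'a str],
    year: u16,
}

/// Join family names: one or two names are listed in full, three or more
/// collapse to "First et al.". `and_word` is "&" or "and".
fn short_names(families: &[&str], and_word: &str) -> String {
    match families {
        // No authors: this sketch just returns an empty string.
        [] => String::new(),
        [one] => one.to_string(),
        [a, b] => format!("{a} {and_word} {b}"),
        [first, ..] => format!("{first} et al."),
    }
}

/// Render one citation in the requested mode.
fn render(reference: &Reference, mode: CitationMode) -> String {
    match mode {
        CitationMode::NonIntegral => {
            format!("({}, {})", short_names(reference.families, "&"), reference.year)
        }
        CitationMode::Integral => {
            format!("{} ({})", short_names(reference.families, "and"), reference.year)
        }
    }
}
```

The point of the sketch is that the two modes differ in more than where the parentheses go: the conjunction changes too ("&" versus "and"), which is why this has to be part of the style model rather than something a word processor can patch up afterwards.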