Tool talk: style validation

Zotero currently uses a JS module written by Simon Kornblith in 2012 to validate styles. It’s built from rnv, an early RelaxNG validator written in C, compiled to JS via Emscripten. I recently took a shot at recompiling it, and came down like Icarus.

After burning time on the compile attempt, I noticed that a lab in France now offers a tool for RelaxNG validation written in TypeScript, available on npm. The repo contains a code sample that is close to a general-purpose RNG validator for use in Node. Having this bundled as an npm package that provides a schema-agnostic function otherwise similar to Simon’s module would be great, if anyone is game for it. Most immediately, it would allow the Java dependency to be removed from tools like Citeproc Test Runner; more generally, it would make it easier to deploy more flexible infrastructure when revisions to CSL 1.0.1 start arriving.

(I should note that “anyone” there refers to “anyone but your poor correspondent” :slight_smile:)

Psst… I think I can do better. While citeproc-rs can already validate styles, I believe I can turn it into a Language Server Protocol implementation in very short order, which means diagnostics and completion in your favourite text editor. I know RelaxNG already does those, but it’s a pain to set up, so a single binary would be better; I’m with you on Java dependencies. It would also be linked to a real implementation (maybe with CSL-specific parts that can’t be encoded in RNG, unimplemented-feature warnings, spec diffs, deprecation warnings, and fix suggestions). I think the one thing that RelaxNG cannot support is the feature flagging we discussed a few threads back, so it would be good to cover that too.

If we’re really game, we could hook it up to ACE or Monaco within Zotero. We can dream, at least.

An LSP for CSL will be really great when it comes on stream, and drawing on it in tools like Zotero will be a natural step. I took citeproc-rs for a spin: the install was straightforward and painless, and the CSL 1.0.1 parsing to intermediate format runs like a top. Great work so far! I think there is also something to be said for modularity, though, particularly in the short term. RelaxNG validation is a pain to set up mainly because of the Java dependency of jing. A CSL-capable npm package with a similar interface (schema + code -> result), whether built on RelaxNG tools or elements of citeproc-rs, would likely see immediate use.

Glad to hear it works!

Basically, just let me know what shape you want your results delivered in, and we can make it happen. It’s got line numbers, a fn (line, col) -> String to show you what you typed, and of course error messages, so the output can be turned into an LSP response on stdio, piped out as JSON, or bridged across wasm into whatever JS object interface you like. The last few Node.js releases will run WASM just fine, as will all the browsers we care about. The Rust wasm tools can even build npm packages and publish them.
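
To make the "LSP response on stdio" option a bit more concrete, here is a rough sketch of how a validation error could be mapped onto an LSP diagnostic. The input shape ({ line, col, message }, 1-based) and the "csl" source label are assumptions for illustration only, not what citeproc-rs actually emits. The Diagnostic fields, the textDocument/publishDiagnostics method and the Content-Length framing follow the LSP specification.

```javascript
// Hypothetical validator error: { line, col, message }, with 1-based
// line and column. This shape is an assumption, not citeproc-rs's format.
function toLspDiagnostic(err) {
  // LSP positions are zero-based, so shift both coordinates down by one.
  const line = Math.max(err.line - 1, 0);
  const character = Math.max(err.col - 1, 0);
  return {
    // Without span information, mark a single character at the error site.
    range: {
      start: { line, character },
      end: { line, character: character + 1 },
    },
    severity: 1, // DiagnosticSeverity.Error in the LSP spec
    source: "csl", // illustrative label
    message: err.message,
  };
}

// Wrap the diagnostics in a publishDiagnostics notification and add the
// Content-Length header that LSP uses to frame messages on stdio.
function frameDiagnostics(uri, errors) {
  const body = JSON.stringify({
    jsonrpc: "2.0",
    method: "textDocument/publishDiagnostics",
    params: { uri, diagnostics: errors.map(toLspDiagnostic) },
  });
  // The header counts bytes, not characters.
  return `Content-Length: ${Buffer.byteLength(body, "utf8")}\r\n\r\n${body}`;
}
```

Piping out plain JSON would just be the JSON.stringify step without the header, so the same intermediate objects could probably serve both routes.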

(One sticking point, given your post the other day about encountering async JavaScript: wasm modules are imported asynchronously with await import("wasm-module-name"), which is slightly different from a normal import. It’s not super difficult, though, since the module only has to load once. There are people working on making that unnecessary.)
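
A minimal sketch of the "load once" pattern, in case it helps. The helper name loadOnce and the validate export in the usage comment are made up for illustration; "wasm-module-name" is the same placeholder as above.

```javascript
// Wrap any async importer so that the underlying import runs at most once.
// Every caller gets the same cached promise.
function loadOnce(importer) {
  let pending = null;
  return function load() {
    if (pending === null) {
      pending = importer();
    }
    return pending;
  };
}

// Usage sketch (hypothetical module name and export):
//
//   const getValidator = loadOnce(() => import("wasm-module-name"));
//
//   async function checkStyle(styleXml) {
//     const mod = await getValidator(); // first call loads, later calls reuse
//     return mod.validate(styleXml);
//   }
```

One caveat with this simple version: if the import fails, the rejected promise stays cached, so a retry would need the cache reset.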

(I should note these published npm packages don’t need a Rust installation. They are just wasm binaries with a wrapper to instantiate the module, roughly equivalent to publishing module.exports = eval('(function mywasmcode() { return 5; })');. So you just npm install.)
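
For the curious, here is roughly what "a wasm binary with a wrapper" amounts to, shrunk to a toy. The bytes below are a hand-assembled module exporting one function, five, that returns 5. The export name and the wrapper shape are illustrative only; real packages built by the Rust wasm tools generate considerably more glue than this.

```javascript
// A tiny hand-assembled WebAssembly module: one function, no parameters,
// returning the i32 constant 5, exported under the name "five".
const wasmBytes = new Uint8Array([
  0x00, 0x61, 0x73, 0x6d, // magic: "\0asm"
  0x01, 0x00, 0x00, 0x00, // binary format version 1
  0x01, 0x05, 0x01, 0x60, 0x00, 0x01, 0x7f, // type section: () -> i32
  0x03, 0x02, 0x01, 0x00, // function section: one function of type 0
  0x07, 0x08, 0x01, 0x04, 0x66, 0x69, 0x76, 0x65, 0x00, 0x00, // export "five"
  0x0a, 0x06, 0x01, 0x04, 0x00, 0x41, 0x05, 0x0b, // code: i32.const 5; end
]);

// The "wrapper": instantiate the binary and hand back a plain JS object.
// WebAssembly.instantiate is asynchronous, hence the await.
async function loadToyModule() {
  const { instance } = await WebAssembly.instantiate(wasmBytes);
  return { five: () => instance.exports.five() };
}

// Usage sketch:
//   const toy = await loadToyModule();
//   toy.five(); // 5
```

Nothing here needs a compiler at install time, which is the point: the consumer only ever sees the bytes and the JS glue.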

I’m okay with async/await. There’s a lot of that in Zotero code that I worked with to adapt Jurism at the 5.0 threshold. But I didn’t do my homework when the rest of the world was going through the throes of Promises and Eventing and Async. So when I hit tools that use the other models, I basically yearn to twist them back to exhibit async/await behavior, but because I don’t know what I’m doing, when I write original code of my own I stumble about a bit with trial and error and feel unhappy with myself. Async/await though is totally :+1:.
