Citation Style Language

CSL Funds & Projects

On the CI part: is there already infrastructure to run tests similar to this?

Otherwise, this seems doable. Samples of what’s desired and a pointer to what’s already available in terms of testing infra would be really helpful to get an idea of the scope of the project.

As a separate possible line of expense, it might be a good idea to start subscribing to a proper backup service for CSL’s main repositories (“styles”, “locales”, “documentation”, etc.). GitHub endorses BackHub, which costs $144/year for the cheapest tier (10 repos). Or we could hack something cheaper together ourselves. Seems like a good thing to have in case an account gets hacked or we dramatically mess up a repo ourselves.

https://help.github.com/en/articles/backing-up-a-repository
https://backhub.co/pricing/

I’m particularly curious to hear from the large downstream projects (Zotero, Mendeley, etc.) whether they think this would be valuable to them.

(Separately, we might also want to use Zenodo, e.g. to archive CSL schema releases for long-term preservation and to get citable DOIs; https://guides.github.com/activities/citable-code/)

Not that I want to be the one to have argued against backups if something bad happens, but I don’t see a lot of value in this. The whole point of Git is that a copy of the entire commit history exists on everyone’s computer (and would still be available in the reflog for a while even after a catastrophic force-pull). And while that doesn’t include things like issues and pull requests, restoring those from a backup would be pretty ugly. So while periodically running one of the open-source scripts that dumps everything available via the API seems like a decent idea, I don’t see this as something that’s particularly worth paying for.

I’d say the most important thing would be making sure that people with write access have 2FA enabled and limited API key settings.

On style tests and comparisons, I’m working on some changes to Citeproc Test Runner that may be useful. An outline of the plan is in the dev section of the Jurism docs. Basically, the plan is to set up style-specific collections in a public Zotero group, and dump tests from there into eponymous subdirectories of a GitHub project that houses style tests. The dumped fixtures will then either be approved as valid using the Test Runner, or manually edited in the RESULT field to reflect desired output.

An additional part of the plan, not reflected in the Jurism docs note, is to output each form of the item: initial reference; ibid; subsequent; and !near-note, each with and without locator, plus the bibliography entry if applicable. That should give a pretty good picture of what a style does; and by running tests for one style against another, you can get an at-a-glance view of the differences.
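To make that output matrix concrete, here is a minimal sketch of how the variants per item could be enumerated. The labels and the `fixture_variants` helper are hypothetical names chosen for illustration; they are not identifiers from Citeproc Test Runner or citeproc-js.

```python
from itertools import product

# Hypothetical labels for the citation forms listed in the post above.
# "not-near-note" stands in for the "!near-note" case.
POSITIONS = ["initial", "ibid", "subsequent", "not-near-note"]


def fixture_variants(include_bibliography=True):
    """Return one dict per rendering a test dump would need for a single item."""
    variants = [
        {"form": position, "locator": with_locator}
        for position, with_locator in product(POSITIONS, (False, True))
    ]
    if include_bibliography:
        # The bibliography entry has no locator variant.
        variants.append({"form": "bibliography", "locator": None})
    return variants


if __name__ == "__main__":
    for variant in fixture_variants():
        print(variant)
```

Under these assumptions, each item yields eight citation renderings plus one bibliography entry where the style has a bibliography.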

I need this for Jurism development anyway, so no funding is needed to push it along, but someone might be able to build something interesting on top of it for use in CI or so.

Okay, just checking. (for a free solution, we could also e.g. introduce quarterly releases of the “styles” and “locales” repos and have those automatically deposited into Zenodo)

We currently use Travis CI on various repos (https://travis-ci.org/citation-style-language), with e.g. RSpec tests for the https://github.com/citation-style-language/styles and https://github.com/citation-style-language/locales repos for quality control, and webhooks to alert https://github.com/citation-style-language/Sheldon, https://github.com/citation-style-language/distribution-updater, and Zotero of build results.

Sheldon posts GitHub comments to pull requests in our “styles” and “locales” repos to assist contributors. See the posts by csl-bot in https://github.com/citation-style-language/styles/pull/4003 for an example. distribution-updater updates https://github.com/citation-style-language/styles-distribution whenever a Travis build for the “master” branch of https://github.com/citation-style-language/styles completes successfully. (The styles-distribution repo is a bit redundant now that GitHub offers protected branches, https://help.github.com/en/articles/about-protected-branches, but we haven’t bothered making the change yet.)

That looks pretty comprehensive already. Does csl-bot not already do what’s requested above? I see it producing differences in the GH issue.

Sheldon/csl-bot just links to the Travis CI build reports of failing builds. The tests in these builds currently don’t include any CSL processor-based citation rendering. The differences reported in e.g. https://travis-ci.org/citation-style-language/styles/builds/509502463 are just the result of some string matching within the CSL XML code (see e.g. https://github.com/citation-style-language/styles/blob/5f60c0b0c26b463754661c587c95a0626f60e999/spec/styles_spec.rb#L184).

I’ve created a GH issue with a mock-up for the citation rendering part of the request. I agree with Rintze that easy access to the diffs would be great, but don’t currently have a good idea how that would even look, so will leave that to him to add either in the same mock-up or in a separate ticket.

Edit: One more idea in this issue: https://github.com/citation-style-language/Sheldon/issues/14

There are auto-generated diffs at https://aglc4.cormacrelf.net/csl that basically turn split green/red when there are differences in a normalised HTML string. There’s a lot of code in jest-csl to build on, similar to Frank’s test runner, but there are also React components for laying out test results (no real difference from diffing) that could be made into a static site with one page per file (with Gatsby.js) that the ci bot embeds in a comment or links to.
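As a rough illustration of diffing on a normalised HTML string, here is a minimal sketch using only Python's standard library. The normalisation rule (collapsing whitespace runs) and the function names are assumptions made for this example; the thread does not describe how the linked site or jest-csl actually normalise output.

```python
import difflib
import re


def normalise_html(html: str) -> str:
    """Collapse runs of whitespace so purely cosmetic differences are ignored."""
    return re.sub(r"\s+", " ", html).strip()


def diff_rendered(expected: str, actual: str) -> list:
    """Return unified-diff lines, or an empty list if the outputs match."""
    a = normalise_html(expected)
    b = normalise_html(actual)
    if a == b:
        return []
    return list(
        difflib.unified_diff(
            [a], [b], fromfile="expected", tofile="actual", lineterm=""
        )
    )


if __name__ == "__main__":
    for line in diff_rendered("<i>Title</i>, 2019.", "<b>Title</b>, 2019."):
        print(line)
```

An empty result would correspond to a "green" test and a non-empty one to a "red" test; a CI bot could paste the diff lines into a PR comment.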

@retorquere – no particular rush, but just wanted to see if the tickets & mock-ups make sense and if you think something along these lines might be doable?

Sorry – I have been slammed the past week, but that should clear up around next Monday.

The mockup looks OK, but that seems like a fairly simple thing to do? If I’m reading it right, it’s just to add a rendered citation/bibliography when the test passes?

… and ideally better (customizable) error reports, right on github, when it fails. Might well be simple – all the better. It’ll save us a ton of time and, more importantly, reduce poor quality styles.

Just so I have a clear picture – in terms of scope, this would then be a modification of Sheldon. Would it be OK to introduce Node there? That would be the easiest way to make sure it actually goes through citeproc-js.

Absolutely, yes. Keep an eye towards ease of maintenance, but we’re agnostic in terms of tooling.

Wait – Sheldon is just a separate bot that picks up on Travis results, correct? That’s trickier because it doesn’t have easy access to the PR context to generate the diffs. It seems to me it would be a lot easier to add this to the actual test runner – the test runner could either actively push out comments to the GH issue (that’s how the BBT builds do it), or actively ping the bot with build assets, which then takes care of announcing to the GH issue.

I haven’t worked with Travis bots before – I’d have to dig into that first.

We’re completely agnostic about methods. We don’t want people to have to click through to Travis etc., but absolutely this doesn’t have to be a Travis bot/app. Sheldon is some years old, it’s possible the testing framework simply wasn’t sufficiently advanced to do this or that Sylvester simply was more comfortable doing this with a bot, but in either case, in spite of having a name, we’re not terribly attached to Sheldon per se as long as we get similar output (and can customize the message text with reasonable effort).

WRT backing up the repos: pushing a copy of the repo to Backblaze in a nightly Travis job (or anything else where a clone can be fetched and copied to B2) would run about $5 for a rolling full year of daily full snapshots, if my calculations are correct. Locales + styles is currently 156MB, but let’s call that 200MB to factor in growth; times 365 days, that makes some 73GB for one year, which at $0.005/GB/month would be $4.38/year.
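The estimate can be checked with a few lines of arithmetic. This sketch uses only the figures quoted in the post (200MB per snapshot, 365 retained snapshots, $0.005/GB/month); the function and parameter names are made up for illustration.

```python
def yearly_storage_cost(snapshot_gb, snapshots_retained, usd_per_gb_month):
    """Steady-state yearly cost once all retained snapshots are stored."""
    stored_gb = snapshot_gb * snapshots_retained  # 0.2 GB * 365 = ~73 GB
    monthly_usd = stored_gb * usd_per_gb_month    # ~$0.365 per month
    return monthly_usd * 12


if __name__ == "__main__":
    cost = yearly_storage_cost(0.2, 365, 0.005)
    print(f"${cost:.2f} per year")  # prints "$4.38 per year"
```

Note that this is the cost of holding the full 73GB for twelve months; while the snapshots are still accumulating during the first year, the bill would be lower.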

As an update, @retorquere has done an amazing job updating Sheldon so that we now get previews of changes in PRs. He’s put a lot of work into this and still offered to do it at the low end of our suggested rate above, i.e., for US$1,000.

This is already making our work reviewing style PRs easier, so Rintze and I would be more than happy to pay this out. We’ll wait a week for any concerns raised here and then, absent objections, pay Emiliano.

We’re still looking for someone who wants to take on the csl-editor update. Please post here and/or to a separate thread.

These previews are pretty amazing!