Feature proposal: Custom Schemas (custom-schemas)

Just a new thread for an idea I had, didn’t want it to get buried in the weeds.


Addendum

I would add that it’s easy to determine which styles people care about seeing GUI fields for: there’s already a list of styles you care about in Zotero settings! I wonder if that would end up showing a lot by default. I think if done right, the idea generally strikes an OK balance between complicating the spec and flexibility. The actual work for implementors is minuscule; every implementation already tests variables and checks conditions. You just parse 2 extra nodes and schema:-prefixed variable names, and make the possibilities a little more dynamic. So I wouldn’t say there is undue burden or unnecessary complication. The complications saved and flexibility gained for authors and users far outweigh any implementation difficulty. I also think people who are bound by absolute compliance to style guides but don’t want to format all their references themselves deserve a bit of a break.

Some misc other benefits, which I will proceed to exaggerate a little:

  • It moves the decision of whether to add something to CSL proper away from the people who would be annoyed if you said no.
  • It lets people use free software in a way that works for them, without requiring their opinionated approaches to affect everyone else.
  • It enables collecting information about what kind of hacks the community finds necessary, whereas currently this is hidden in 2 ways:
    • Hacks by misusing fields (not sure how prevalent this is on the main repo, I suspect it’s policed a little)
    • People using local overriding versions of styles that misuse fields or use notes to set overrides (I know I do this constantly)
  • The rampant abuse of the Extra/notes field by every single implementation or downstream user of CSL, without exception, tells us that people really do need the functionality it brings, and that it’s worth formalising. The negatives of that approach shouldn’t colour this solution, which is structured, easier to use, and actually viable for permanent non-spec-addition hacks for various styles. I would have no trouble with forever using the schema for my idiotic style guide to do the annoying bits.
  • You can also avoid the four hundred million different formats there are for information embedded in Extra. I will ultimately have to, but really don’t want to, implement that stuff.
  • It can encode many of CSL-M’s extensions just fine (not, e.g., institutions, but the variables, yes).

It would be under a custom-schemas feature flag for a good while, and would probably require users to opt in before they can use the feature. Styles should generally still do best-effort rendering if their custom variables are not set, not render nothing and break for people who aren’t using the tweaks they offer.
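To make the best-effort point concrete, here is a minimal sketch in Python rather than CSL. Everything named in it is an assumption for illustration: the `prefix:name` shape of custom variable names, the hypothetical `cmos:introduction` variable, the `custom` dictionary on the item, and the `custom_enabled` flag standing in for the custom-schemas feature flag. It is not taken from any existing implementation.

```python
def resolve_variable(item, name, custom_enabled):
    """Look up a variable; custom (prefixed) ones only resolve when opted in."""
    if ":" in name:
        # Assumption: custom variables are written as "prefix:name".
        if not custom_enabled:
            return None  # behaves exactly like an unset variable
        return item.get("custom", {}).get(name)
    return item.get(name)


def render_title(item, custom_enabled=False):
    """Use the custom variable if it is set, otherwise render the plain title."""
    # "cmos:introduction" is a hypothetical custom boolean.
    if resolve_variable(item, "cmos:introduction", custom_enabled):
        return "introduction to " + item.get("container-title", "")
    # Best-effort fallback: the style still renders something sensible.
    return item.get("title", "")
```

With the flag off, or with the custom variable simply absent, the item renders through the ordinary title branch instead of producing nothing.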

I’ve already written that I really like the proposal. Just to add to this: I think this could actually lead to metadata portability since it would render hackish solutions obsolete. Consider the problem with introductions in Chicago I have outlined in the other thread: using @bwiernik’s idea to use genre or medium without a title here could be a nice solution for CMoS, but you won’t be able to cite such items as-is with other styles that treat introductions just like an ordinary chapter. With a special boolean for CMoS, your general item metadata would still be usable with other styles.

@Dan_Stillman any thoughts about this and about the idea of schema additions in general?

With respect to the introduction idea: it’s really not a hack so much as writing styles to robustly handle diverse metadata formats. It’s not an uncommon thing for style guides to specify that untitled items should display a description of the item. I’ve accommodated this sort of requirement in APA and other psychology styles I’ve written by writing the title macros to handle an item that does not have a title but instead has just genre or medium. This sort of thing can already be handled with the existing CSL spec; it just requires writing styles that are robust to nuanced selections of metadata. That’s not how most styles are currently written, but I would prefer to take that approach, especially with the big guides like APA, Chicago, MLA, Vancouver, etc., rather than bifurcating schemas in individual styles.
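The fallback logic described here is small. A rough Python sketch of what such a title macro does, not the actual macro from any style: the function name is made up, and the square brackets around the description are an assumption about how a style might mark it.

```python
def display_title(item):
    """Return the title, or a bracketed description for untitled items."""
    title = item.get("title", "").strip()
    if title:
        return title
    # Untitled item: fall back to a description from genre, then medium.
    for key in ("genre", "medium"):
        description = item.get(key, "").strip()
        if description:
            return "[" + description + "]"  # brackets are an assumption
    return ""
```

An item carrying only genre "Introduction" comes out as a description, while a titled item is untouched.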

Yes, I’m totally in favor of not bifurcating schemas without a good reason. In my opinion the problem here is that one style treats this as an untitled item while other styles take “Introduction” as a proper title. So, if we change styles we’d also have to move the title from title to genre or medium.

I’m afraid I don’t have the time necessary to fully do justice in comments to this well-thought-out proposal, but I don’t think it’s feasible.
It adds significant complexity upstream, including for end users, e.g.

He releases Chicago version 2.3.0 as above, and she puts in her Zotero settings (e.g.) that her library depends on Chicago version range “2.3”, i.e. ^2.3.0 in Node semver notation. Zotero then creates some field UI to edit this.

will be a disaster both to get users to understand and to provide support for.
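For anyone unfamiliar with the notation in the quoted passage, here is a minimal Python sketch of what a caret range such as ^2.3.0 accepts under Node semver rules. It only handles plain numeric versions (no pre-release or build tags), and the function names are my own, not from any library.

```python
def parse_version(text):
    """Turn '2.3' or '2.3.0' into a (major, minor, patch) tuple."""
    parts = [int(p) for p in text.split(".")]
    while len(parts) < 3:
        parts.append(0)
    return tuple(parts[:3])


def caret_upper_bound(base):
    """Exclusive upper bound: bump the leftmost non-zero component."""
    major, minor, patch = base
    if major > 0:
        return (major + 1, 0, 0)
    if minor > 0:
        return (0, minor + 1, 0)
    return (0, 0, patch + 1)


def satisfies_caret(version, base):
    """True if version lies in the range written as ^base."""
    v = parse_version(version)
    b = parse_version(base)
    return b <= v < caret_upper_bound(b)
```

So a library pinned to ^2.3.0 would accept style versions 2.3.0 and 2.9.1 but reject 2.2.9 and 3.0.0, which is the kind of rule users would have to understand.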

It also seems miserable to update all these differently versioned styles (e.g., when a style guide updates, do I have to update all existing versions of the style? Which ones are deprecated?). And having a unified suite of styles is, IMO, a core strength of CSL.

So while I understand this is attractive from the POV of style authors and potentially citeproc authors, I think it’s a bad idea for repository maintainers, end users, implementers and, because it creates a multiverse of CSL styles, ultimately the CSL project as a whole.

(FWIW, the excess of Extra/Note stuff is much more on the Zotero end because of the – soon to be removed – block on field updates. So while that is indeed a problem, I don’t agree it’s as big of a CSL problem as you suggest)

Yeah, I’m afraid I’m in complete agreement with @Sebastian_Karcher on this. This would just add massively too much complexity all around, particularly around support.

(And the Extra issue will indeed be resolved in Zotero soon, but the supported formats are explained in the citeproc-js docs. It’s really just a basic hack that’s pretty trivial to support.)

What does it mean that the Extra issue will be resolved soon? Are there details somewhere? Will this mean that entering additional variables via Extra/Note will stop working?

We’ll finally be able to start making changes to types and fields in Zotero again. That’s been on hold for many years for technical and user-experience reasons related to syncing, but in the current Zotero beta we finally have a plan for dealing with it. Once that rolls out, and we give it enough time (probably a month or two) that we can do a sync cut-off, we can start updating types and fields.

Known fields in Extra will be automatically migrated to new fields. (E.g., original-date: 1908 will move to an Original Date field.)
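As a rough illustration of that kind of migration (not Zotero’s actual code), here is a Python sketch that pulls `name: value` lines for known fields out of Extra and leaves everything else in place. The function name, the `known_fields` set, and the exact line pattern are assumptions for the example.

```python
import re

# Assumption: one "name: value" pair per line, names made of letters and hyphens.
LINE = re.compile(r"^\s*([A-Za-z][A-Za-z-]*)\s*:\s*(.+?)\s*$")


def split_extra(extra, known_fields):
    """Return (migrated fields, remaining Extra text)."""
    migrated = {}
    remaining = []
    for line in extra.splitlines():
        match = LINE.match(line)
        if match and match.group(1).lower() in known_fields:
            migrated[match.group(1).lower()] = match.group(2)
        else:
            # Unknown keys and free-form notes stay in Extra untouched.
            remaining.append(line)
    return migrated, "\n".join(remaining)
```

Calling `split_extra("original-date: 1908\nSome note", {"original-date"})` would separate the date from the free-form note.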

That’s just a feature of citeproc-js, so no.