CSL evolution process again

The development work done recently, and still under way, has put us in a better position than we were in a couple of weeks ago. We’ve reduced the massive backlog of issues in various repos, and it looks like we’ve finally managed to take the Zotero-bits repo out of the equation!

Nevertheless, I still have the feeling that CSL development is a bit rough, not as smooth as it could be. It’s too easy to feel lost sometimes. We’ve introduced the new project board to remedy that situation, but with only moderate success, I think.

I think there are a few open questions:

  1. How many repos do we need? Can we consolidate everything into one repository? If not, do we need evolution besides schema and documentation? It’s not really clear what the purpose of this repo is next to the others. I think there are basically two options: Either we get rid of it, or we give it a real purpose and start to use it for that. That could mean adding issues only to evolution, while PRs go to schema or documentation.

  2. How do we keep track of all issues and their status? We’ve tried to use a project board for this, but yeah… I’ve had another look at what GitHub offers in terms of project management tools, and there seem to be three useful features:

  • labels
  • milestones
  • project boards

I suggest we start looking into labels again. Perhaps we can adopt something like the “sane GitHub labels” approach. I suspect that would already give us some filtering possibilities and a bigger picture just by looking at the labels. Of course, we would have to label issues consistently. But that’s probably easier than managing the project board. I can make a proposal regarding this if that is of interest.

Concerning milestones and project boards: They seem to serve quite similar purposes. The advantage of project boards is that they can be used across repositories.

Thoughts?

I’ve been thinking along similar lines as we’ve been working, but wanted the experience of doing this to see what might make sense.

Thoughts:

  1. we should delete evolution, and consolidate those functions in schema; I’m pretty confident in this, since the repo where the majority of development work happens should ideally bring together change history, issues, and PRs
  2. we could rename schema along with that; not sure
  3. I’m not sure on documentation. There are pros and cons of status quo vs merge, and I ended up closing the issue to consider this. A middle ground would be to only move the spec document to schema. But that’s the most important document to retain the git history on, which we lose if we move it; so not really a solution.
  4. I started experimenting with project boards in part to address the problems of having issues spread across multiple repos. But they only make sense if we can automate them so they provide a useful summary view, without us having to manually intervene. I assumed initially this would require some additional coding. Perhaps our answer on project boards should depend on what we decide about the above?

PS - git-labelmaker (mentioned in the article you linked to) looks cool. If we went this direction, we could start with the templates they have, and maybe add another file specific to our needs, or modify this existing one?

Ok.

We can do that, but I’m not sure we have to. What would be a good name?

Could we just keep the old repo as an archive, sort of? And move current development to schema (or whatever we end up calling it)? In the end, I don’t think it matters much. But we will have to make sure that each closed schema PR results in an opened documentation issue. And having two repos means having two milestones (or project boards).

I think it really depends on the purpose of the boards. The way we used them initially required a lot of manual intervention. If we use them like milestones, it should be possible to set up automation so that issues/PRs get moved automatically, e.g. when reviewers approve, when an issue gets closed, etc.

On this: Should we try to come up with a simple system for labeling issues? I think we don’t need much, perhaps

Type: ???
Status: ???

I don’t think we need the priority labels though. Also, we should decide where we’ll add the version tags. Should this be done with labels? Or better with milestones or projects (whichever we end up using)? Using different tools for essentially the same information seems a bit confusing to me.

In general, we need to track type, status, and scope (whether applies to schemas vs tests, etc.).

Not really sure what to do with it, but I forked the template repo, and added the scope.json file.

The milestones are useful for tracking progress associated with a release.

The labels for releases are lighter-weight annotations. I sometimes add them to suggest a target, but only add the milestone when we agree. It’s possible we could stop using them.

Projects aren’t good for the same thing. Their only value currently, in my view, is to consolidate info across repos. Probably better to remove in the end.

Keep in mind, though, the labels aren’t incorporated into the git version history; they are apart from it.

So perhaps we should add a label for the target? (Perhaps not in the current form with fixed version numbers, but rather just indicating whether this should go in an x.x release or an x.x.x release.) And once we have a couple of issues together, we can add some of those to the next milestone?

In that case, it’s maybe more (or also) a matter of minor vs major change?

Yep. That’s what I was thinking of. But we will need to define what these terms are supposed to mean. (What is minor, what is major? What will be major enough to warrant a 2.0? And so on…)

Regarding this thread: https://github.com/citation-style-language/schema/issues/238

Everything minor could then always be merged into master, major into the next x.x branch, right? If yes, that sounds good.

I honestly don’t see much problem with the existing schema and documentation repo structure. The loss of git history with changing them is big, and I’m not sure the upsides outweigh that. GitHub could technically handle redirects if schema were renamed, but I don’t really see the value there. What’s unclear about schema?

@bwiernik Ok. So you vote for keeping schema and documentation as is. What about merging evolution into schema?

And the label question? You think that would be of value?

I think csl-evolution can just be archived after the existing issues are closed. It was created at an earlier point when a good development workflow was less clear. I’d like to keep the closed issues, so I think archiving the repo rather than deleting might be better; also would be cleaner than migrating the issues to another repo.

Reserve milestones for issues/PRs that have been agreed to; these can function as a checklist to prepare a release.

In terms of labels, major/minor is probably the most important. I think that the current 1.0.2 vs 1.1 labels are reasonable for that purpose, but major and minor could be fine too. No real preference.

It could be useful to be a bit more specific. A minor issue/PR might be new_variable (including variables, types, terms), bugfix (e.g., something like the locator="sub verbo" issue that requires a minor processor change), or documentation (clarification or reconciliation of the spec/tests). A major issue/PR might be style (a major change to style behavior or a new element, something requiring changes to either processors or styles) or data (a major change to the data structure).
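
As a rough illustration of how the specific labels above could roll up into major/minor, here is a small Python sketch. The label names come from the post; the names `MINOR_LABELS`, `MAJOR_LABELS`, and `change_level` are hypothetical, and the rule that any major label makes the whole issue major is an assumption, not something agreed in this thread.

```python
# Label names taken from the post above; the grouping constants are illustrative.
MINOR_LABELS = {"new_variable", "bugfix", "documentation"}
MAJOR_LABELS = {"style", "data"}


def change_level(labels):
    """Classify an issue/PR as "major" or "minor" from its labels.

    Assumption: a single major label is enough to make the change major.
    Returns None when none of the known labels is present.
    """
    label_set = set(labels)
    if label_set & MAJOR_LABELS:
        return "major"
    if label_set & MINOR_LABELS:
        return "minor"
    return None
```

With this, a separate major/minor label would be derivable rather than something to maintain by hand, which might be one argument for the more specific labels.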

Archiving is better than deleting, yes. Couldn’t we just migrate the remaining open issues already and archive the repo now? I imagine it will take some time to close some of them.

Agreed.

I guess in the end, both solutions will be fine. I was just thinking that major/minor has the advantage of being independent of specific releases, which should perhaps be tracked with milestones, anyway. But you’re right. It won’t matter much.

These are all useful, of course. I thought that could be covered with the type prefix mentioned above, e.g. type: bugfix, etc.

1-3. Sure.
4. Adding type: seems unnecessarily verbose.

Do you have the rights to migrate the open issues? Or someone else @Bruce_D_Arcus1 ??? Would be one more step towards a leaner process.

There is a labeler GH Action that we can set up, which will automatically apply labels based on the path of the file being modified.

So you could imagine a config something like this:

input: schema/input/*.json
csl-main: schema/styles/csl.rnc
csl-terms: schema/styles/csl-terms.rnc
csl-variables: schema/styles/csl-variables.rnc
test: test/schema/*
tools: tools/*

If you think this is a good idea, let me know whether the above works, or whether you’d suggest any changes.
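
To make the path-to-label idea concrete, here is a minimal Python sketch of the matching step, using the mapping from the config above. It is not the Action's implementation: `LABEL_PATTERNS` and `labels_for_paths` are hypothetical names, and Python's `fnmatch` only approximates whatever glob rules the Action applies (notably, `*` in `fnmatch` also matches `/`).

```python
from fnmatch import fnmatch

# Label -> glob pattern, copied from the config sketched above.
LABEL_PATTERNS = {
    "input": "schema/input/*.json",
    "csl-main": "schema/styles/csl.rnc",
    "csl-terms": "schema/styles/csl-terms.rnc",
    "csl-variables": "schema/styles/csl-variables.rnc",
    "test": "test/schema/*",
    "tools": "tools/*",
}


def labels_for_paths(changed_paths):
    """Return the sorted labels whose pattern matches at least one changed path."""
    return sorted(
        label
        for label, pattern in LABEL_PATTERNS.items()
        if any(fnmatch(path, pattern) for path in changed_paths)
    )
```

For example, a PR touching `schema/styles/csl.rnc` and a file under `tools/` would pick up both `csl-main` and `tools`, while a PR touching only a top-level file would get no label at all.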

Are you asking about moving issues from evolution to schema?

Yes. That’s it. (Regarding evolution to schema)

Would be good to know though if there are objections… @Rintze_Zelle @Sebastian_Karcher

How exactly is that going to help us? (Regarding auto-label)
That works on PRs but not on issues, right?

When someone submits a PR, it would auto-assign the tags; so yes, only PRs.

We could use something like this for issues, in conjunction with our templates (to ensure the keywords we’re looking for are there)? It can also do PRs, though at first glance not based on paths; it only looks at the commit messages.

Here’s the key config parameter content:

[
   {
      "keywords":[
         "bug",
         "error"
      ],
      "labels":[
         "BUG"
      ],
      "assignees":[
         "username"
      ]
   },
   {
      "keywords":[
         "help",
         "guidance"
      ],
      "labels":[
         "help-wanted"
      ],
      "assignees":[
         "username"
      ]
   }
]
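
For a feel of what such a config would do to an incoming issue, here is a small Python sketch that applies rules of the same shape to an issue's text. This is only an illustration: `RULES` and `match_rules` are hypothetical names, and case-insensitive substring matching is an assumption about how keywords are compared, not the tool's documented behavior.

```python
import json

# Same shape as the config above, with "username" kept as a placeholder.
RULES = json.loads("""
[
  {"keywords": ["bug", "error"], "labels": ["BUG"], "assignees": ["username"]},
  {"keywords": ["help", "guidance"], "labels": ["help-wanted"], "assignees": ["username"]}
]
""")


def match_rules(text, rules):
    """Return (labels, assignees) for every rule with a keyword found in text.

    Assumption: matching is a case-insensitive substring search.
    Duplicates are dropped while keeping first-seen order.
    """
    lowered = text.lower()
    labels, assignees = [], []
    for rule in rules:
        if any(keyword.lower() in lowered for keyword in rule["keywords"]):
            for label in rule["labels"]:
                if label not in labels:
                    labels.append(label)
            for assignee in rule["assignees"]:
                if assignee not in assignees:
                    assignees.append(assignee)
    return labels, assignees
```

Under the substring assumption, a word like "debug" would also trigger the `BUG` rule, which is one reason to pair this with issue templates that use fixed keywords.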

The hardest work is just determining the labels we want in the end!