example outline

Not checked it in yet, but here’s the results of the (incomplete)
alternate draft schema.

Three main pieces.

Defaults:

Citation:

Bibliography:

It is indeed a rather nicely consistent approach, simultaneously fixing
a number of things in consistent ways. I think it fits the Perl vision
statement, though I’m still thinking about the locator and contributor
elements. My hunch is it makes sense to split off the locators into
their own elements, but keep contributors unified.

Instead of this, then:

cs-fields =
cs-contributor
> cs-titles
> cs-date
> cs-publisher
> cs-locator
> cs-access
> cs-medium
> cs-genre

It’d be:

cs-fields =
cs-contributor
> cs-titles
> cs-date
> cs-publisher
> cs-volume
> cs-issue
> cs-pages
> cs-access
> cs-medium
> cs-genre

And while there’s good logic in having the generic contributor element,
there’s a counter logic to having author as its own element, namely
that it’s easier to control validation associated with substitution.

I’m also still not sure about the cs:substitute element, though, as
overriding the substitution means basically overriding the entire
field. An attribute-based approach would make this easier I think.

Another option is to allow the substitute attribute on
contributor/author, such that it could be inherited by the substituted
field (like title).

Bruce

Not checked it in yet, but here’s the results of the (incomplete)
alternate draft schema.

[…]

It is indeed a rather nicely consistent approach, simultaneously fixing
a number of things in consistent ways. I think it fits the Perl vision
statement, though I’m still thinking about the locator and contributor
elements. My hunch is it makes sense to split off the locators into
their own elements, but keep contributors unified.

Looks pretty good to me, although shouldn’t:

Be:

[…]

And while there’s good logic in having the generic contributor element,
there’s a counter logic to having author as its own element, namely
that it’s easier to control validation associated with substitution.

I’m also still not sure about the cs:substitute element, though, as
overriding the substitution means basically overriding the entire
field. An attribute-based approach would make this easier I think.

I’m not entirely sure what to do, either. The main benefit to the substitute
attribute is that we can place attributes on the children elements. For
example, to correctly model the in-text citation, we actually need something
similar to:

This does look kind of icky, but even if we don’t specify how to generate
the short title in the CSL, it still makes sense to me to explicitly
differentiate the short title from the long title, and to do that, we need
to be able to supply an attribute to the titles element.

If we used a substitute attribute, we could make substitutions
context-sensitive, substituting with a short title in the citation or a long
title in the bibliography, but that would seem to subtract from the
consistency of this approach (the same elements would have different
meanings in different parts of the CSL file).

If we go the attribute route, what approach do you have in mind?

Another option is to allow the substitute attribute on
contributor/author, such that it could be inherited by the substituted
field (like title).

I’m not quite sure what you mean by this. Can you provide an example?

Simon

Looks pretty good to me, although shouldn’t:

Be:

Ahem … yes.

And while there’s good logic in having the generic contributor element,
there’s a counter logic to having author as its own element, namely
that it’s easier to control validation associated with substitution.

I’m also still not sure about the cs:substitute element, though, as
overriding the substitution means basically overriding the entire
field. An attribute-based approach would make this easier I think.

I’m not entirely sure what to do, either. The main benefit to the substitute
attribute is that we can place attributes on the children elements. For
example, to correctly model the in-text citation, we actually need something
similar to:

This does look kind of icky, but even if we don’t specify how to generate
the short title in the CSL, it still makes sense to me to explicitly
differentiate the short title from the long title, and to do that, we need
to be able to supply an attribute to the titles element.

If we used a substitute attribute, we could make substitutions
context-sensitive, substituting with a short title in the citation or a long
title in the bibliography, but that would seem to subtract from the
consistency of this approach (the same elements would have different
meanings in different parts of the CSL file).

If we go the attribute route, what approach do you have in mind?

The other option is:

… on the bibliography template likely, not the citation, and then
maybe allow this to cover the citation example above:

… where the short form can apply to the default name or the
substituted title.

That means substitution gets defined on the bibliography templates, and
applies to the citation, but one can alter the representation on the
citation.

So consistent logic (sorting and such), compact design, and formatting
flexibility.

Another option is to allow the substitute attribute on
contributor/author, such that it could be inherited by the substituted
field (like title).

I’m not quite sure what you mean by this. Can you provide an example?

See above.

Let me know what you think.

Bruce

I checked in the new schema as csl-alt.rnc (not complete yet), with the
beginnings of an example.

I’m wondering about the notion of having a sort of inheritance. So this:

… and this:

 <locator>
   <number/>
 </locator>

… but then within templates like:

   <reftype name="article">
     <author substitute="container-title"/>

… and:

 <pages>
   <label/>
   <number/>
 </pages>

So author above would inherit the basic layout from contributor, but
override the substitution. Likewise pages from locator.
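
The inheritance idea can be sketched in a few lines of Python. This is only an illustration of the merge rule described above, not how any CSL processor works: the `DEFAULTS` table, the `substitute="title"` default, and the `<name/>` child of contributor are hypothetical stand-ins for the examples missing from this message.

```python
import copy
import xml.etree.ElementTree as ET

# Hypothetical generic definitions that the specific elements inherit from.
DEFAULTS = {
    "author": ET.fromstring('<contributor substitute="title"><name/></contributor>'),
    "pages": ET.fromstring("<locator><number/></locator>"),
}

def resolve(specific: ET.Element) -> ET.Element:
    """Fill in a template element (author, pages) from its generic default."""
    base = DEFAULTS[specific.tag]
    merged = ET.Element(specific.tag, dict(base.attrib))
    merged.attrib.update(specific.attrib)       # attributes on the template win
    children = list(specific) or list(base)     # children on the template win if present
    merged.extend([copy.deepcopy(child) for child in children])
    return merged

author = resolve(ET.fromstring('<author substitute="container-title"/>'))
pages = resolve(ET.fromstring("<pages><label/><number/></pages>"))
print(ET.tostring(author, encoding="unicode"))
print(ET.tostring(pages, encoding="unicode"))
```

Here author keeps the layout of the generic contributor but overrides the substitution, while pages replaces the layout of the generic locator outright.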

The reason is fairly obvious: it’s useful to have the specific elements
for the templates, but also to have the more generic to avoid
duplication. The end result would leave the templates quite similar to
how they are now.

Part of the usefulness, BTW, is tighter validation. I can do things
like say only author elements that are first can have a substitution
rule, etc. It might be possible to do with attribute-based stuff, but
elements are generally more flexible for this sort of thing.

Bruce

I checked in the new schema as csl-alt.rnc (not complete yet), with the
beginnings of an example.

I’m wondering about the notion of having a sort of inheritance. So this:

[…]

The other option is:

… on the bibliography template likely, not the citation, and then
maybe allow this to cover the citation example above:

… where the short form can apply to the default name or the
substituted title.

That means substitution gets defined on the bibliography templates, and
applies to the citation, but one can alter the representation on the
citation.

So consistent logic (sorting and such), compact design, and formatting
flexibility.

One of my problems with this approach is that it’s not exactly clear where
the information is coming from at first glance. I suppose that’s not too big
an issue.

We will eventually need more than just form="short" if we attempt to handle
cases where there are two authors with the same last name, but we can worry
about that later.

Another option is to allow the substitute attribute on
contributor/author, such that it could be inherited by the substituted
field (like title).

I’m not quite sure what you mean by this. Can you provide an example?

See above.

Let me know what you think.

I must admit that I’m not a big fan of any of these approaches. It seems
like we either need to rely on hidden relationships between sort-order,
author in the citation, and author in the bibliography, or we’d need to add
an entirely new top-level tag to handle things right.

I guess that your approach is the simplest, and, while it may not be as
clear as I’d like, 99% of the time it will result in the desired behavior.

Simon

We will eventually need more than just form="short" if we attempt to handle
cases where there are two authors with the same last name, but we can worry
about that later.

Right.

Another option is to allow the substitute attribute on
contributor/author, such that it could be inherited by the substituted
field (like title).

I’m not quite sure what you mean by this. Can you provide an example?

See above.

Let me know what you think.

I must admit that I’m not a big fan of any of these approaches. It seems
like we either need to rely on hidden relationships between sort-order,
author in the citation, and author in the bibliography, or we’d need to add
an entirely new top-level tag to handle things right.

It would probably mean a sort and/or substitution element in one or
more places. E.g. maybe you’d have this on the bibliography element:

And then those would need to be overridden on each reftype template
probably, and you’d need to explicitly flag that a citation use the
same rules (in the XSLT version, I have noname-substitute attribute
that holds the substitution value, which both the bibliography and
citation formatting can use).

The reality is that there is a quite complicated set of relationships
between all this stuff, and we need to decide when explicitly modeling
it in the XML is necessary, and when it’s better to just document it
somewhere else (or in the schema of course).

FWIW, I think author-year styles (and note ones that include
bibliographies) are actually the hardest to implement correctly, and
that’s because among other things, the citation and bibliography
formatting are interdependent.

I guess that your approach is the simplest, and, while it may not be as
clear as I’d like, 99% of the time it will result in the desired
behavior.

Then I’d suggest we start working with the new schema (which right now
is trivially different from the previous in the citation and
bibliography elements).

One of the reasons I paid a lot of attention to Chicago and APA in
doing all this, BTW, is because a) they are well-documented, and b)
they are quite complex. The only way to really know if a particular
design detail works is to try to encode a style with it.

Bruce

Is the list dead again? Let’s see …

I must admit that I’m not a big fan of any of these approaches. It seems
like we either need to rely on hidden relationships between sort-order,
author in the citation, and author in the bibliography, or we’d need to add
an entirely new top-level tag to handle things right.

I’ve been thinking about this more, and I guess the option is something
like this:

That would completely remove the sort and substitution logic from the
templates.

We could even assume default algorithms, so that doing this …

<sort algorithm="author-date"/>

… would assume something like the above by default.

I’m really not sure (??) if this would be a better approach.

Bruce

Oh, there’s one other issue:

What about this notion I’ve had of style repositories?

Something to worry about later?

Bruce

So long as you guys keep giving me good feedback, it’s doable. We are
indeed almost done.

I think the main details we have to settle right now are, in order of
priority:

  1. sorting/substitution (the easy solution is to keep things as they
    are)
  2. metadata (I haven’t added that back; is it fine as is?)
  3. is the grouping support in the list structure sensible?

Not sure if there’s anything else?

The main things I wanted to figure out before 1.0 were grouping and
international support, and I think that’s almost done to my
satisfaction.

I do think we want to have time to create test styles to confirm
everything is fine.

So let’s go with that roadmap then. August 15, pre-1.0, and a few weeks
for testing?

Bruce

  2. metadata (I haven’t added that back; is it fine as is?)

If we intend to go for Atom feeds for style repositories, does it make sense
just to encapsulate our current CSL as Atom (like in the example at
<http://dev.extensibleforge.net/browser/trunk/X5.AspNet/Resources/module/instance.atom>)?

That’s what I’m wondering about. I guess I see two options:

  1. leave the metadata out (aside from maybe an attribute or two) and
    leave it up to an Atom wrapper

  2. keep it in, and risk duplication

  3. is the grouping support in the list structure sensible?

It seems fine to me, but I haven’t really thought much about it. Do you have
any specific concerns?

No, I think it’ll work fine. It’s just a PITA to implement (at least in
XSLT) so I haven’t really tested it.

Bruce

The CSL is kept in a separate file, but the separate files (e.g.,
header.atom at
<http://dev.extensibleforge.net/browser/trunk/X5.AspNet/Resources/module/header.atom>)
each contain both an Atom wrapper with metadata and the XML
content itself. Only the metadata necessary to determine if a given file
should be downloaded (title, date updated, etc.) is included in the main
feed.

The advantage to this approach is that the feed is kept as small as
possible, a minimal amount of metadata is repeated, and the metadata that is
repeated is in the Atom format in both the feed and the CSL file.
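
As a rough illustration of how a client could use such a minimal feed, here is a Python sketch that compares each entry's updated timestamp against the local copy to decide what to download. The feed contents, the `urn:example:` ids, and the `styles_to_download` helper are all made up for the example; only the Atom element names and namespace come from the Atom format itself.

```python
import xml.etree.ElementTree as ET
from datetime import datetime, timezone

ATOM = "{http://www.w3.org/2005/Atom}"

# Hypothetical repository feed carrying only id, title, and updated per style.
FEED = """<feed xmlns="http://www.w3.org/2005/Atom">
  <title>Example style repository</title>
  <entry>
    <id>urn:example:style:apa</id>
    <title>APA</title>
    <updated>2006-08-01T12:00:00+00:00</updated>
  </entry>
  <entry>
    <id>urn:example:style:chicago</id>
    <title>Chicago</title>
    <updated>2006-07-01T12:00:00+00:00</updated>
  </entry>
</feed>"""

def styles_to_download(feed_xml, local_updated):
    """Return ids of entries that are newer than, or missing from, the local copies."""
    stale = []
    for entry in ET.fromstring(feed_xml).iter(ATOM + "entry"):
        style_id = entry.findtext(ATOM + "id")
        updated = datetime.fromisoformat(entry.findtext(ATOM + "updated"))
        local = local_updated.get(style_id)
        if local is None or updated > local:
            stale.append(style_id)
    return stale

# Timestamps of the styles already on disk (timezone-aware, like the feed).
local_copies = {
    "urn:example:style:apa": datetime(2006, 7, 15, tzinfo=timezone.utc),
    "urn:example:style:chicago": datetime(2006, 7, 15, tzinfo=timezone.utc),
}
print(styles_to_download(FEED, local_copies))
```

The full metadata (author, source, field, description) would then only be read from the individual wrapped file once it has been fetched.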

If we were to include all the metadata in the feed, we could completely
avoid repetition of metadata, but we would substantially increase the feed
size (since some things, like author, source, field, description, etc.
aren’t necessarily important until the CSL file is downloaded, but would
still need to be put into the feed). If we keep the approach previous
schemas used, we have to convert the metadata that does need to be in the
feed (title, dates) into Atom format. Although there’s no serious problem
with any of these mechanisms, I prefer M. David Peterson’s approach.

The Scholar for Firefox scraper repository is even more complicated than
this in order to minimize the amount of content that’s downloaded when an
update is available (we send the timestamp of the last check and only get
metadata for what’s been added or updated since then), but if we’re looking
for a general, standardized approach, Atom looks solid to me.

Simon

  2. metadata (I haven’t added that back; is it fine as is?)

If we intend to go for Atom feeds for style repositories, does it make sense
just to encapsulate our current CSL as Atom (like in the example at
<http://dev.extensibleforge.net/browser/trunk/X5.AspNet/Resources/module/instance.atom>)?

That’s what I’m wondering about. I guess I see two options:

  1. leave the metadata out (aside from maybe an attribute or two) and
    leave it up to an Atom wrapper

  2. keep it in, and risk duplication

I thought that we could wrap the actual CSL files in Atom, then reference
them in the repository through a file like
<http://dev.extensibleforge.net/browser/trunk/X5.AspNet/Resources/default.omx#L40>.
By wrapping the CSL, we can embed the metadata in the file but avoid
duplication with the repository.

  3. is the grouping support in the list structure sensible?

It seems fine to me, but I haven’t really thought much about it. Do you have
any specific concerns?

No, I think it’ll work fine. It’s just a PITA to implement (at least in
XSLT) so I haven’t really tested it.

I haven’t fully implemented it either. It doesn’t seem too complex to
implement in JavaScript based on what’s in the schema, though.

Simon

Hmm. I think I slightly prefer this. I could easily live with the other
approach, but I think the function of this markup looks a bit clearer,
especially if we use instead of .

You mean on the templates?

We still need to clear up the issue with (Doe JK, 1998) vs. (J. Doe, 1998)
for authors with the same last name somewhere, though. Any ideas on a
suitable approach?

In your “Doe JK” example, what would the name formatting be like for
the bibliographic entry? The same, or different?

In any case, this is something specific to citations, and it’s a
disambiguation rule. I need to think more about exactly how it would
work, since CSL doesn’t deal with name formatting at a very fine level.
One possibility is:

… or some such. But that doesn’t solve the formatting.

Bruce

The CSL is kept in a separate file, but the separate files (e.g.,
header.atom at
<http://dev.extensibleforge.net/browser/trunk/X5.AspNet/Resources/module/header.atom>)
each contain both an Atom wrapper with metadata and the XML
content itself. Only the metadata necessary to determine if a given file
should be downloaded (title, date updated, etc.) is included in the main
feed.

OK, I see.

The advantage to this approach is that the feed is kept as small as
possible, a minimal amount of metadata is repeated, and the metadata that is
repeated is in the Atom format in both the feed and the CSL file.

The one thing that feels wrong to me, though, is that the file then is
an Atom file with embedded CSL content, rather than a CSL file that
happens to get distributed via Atom.

If we keep the approach previous schemas used, we have to convert the
metadata that does need to be in the feed (title, dates) into Atom format.

Correct. In fact, it would probably mean simply copying over the
contents of the info element and converting the namespace from CSL to
Atom.
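
A minimal sketch of that copy-and-rename step in Python, assuming the info element only has flat children whose local names already match Atom's (title, updated, and so on). The CSL namespace URI used here and the `info_to_atom_entry` helper are assumptions for the sake of the example.

```python
import xml.etree.ElementTree as ET

CSL_NS = "http://purl.org/net/xbiblio/csl"   # assumed CSL namespace URI
ATOM_NS = "http://www.w3.org/2005/Atom"

def info_to_atom_entry(info: ET.Element) -> ET.Element:
    """Copy the direct children of a CSL info element into a new atom:entry."""
    entry = ET.Element(f"{{{ATOM_NS}}}entry")
    for child in info:
        local_name = child.tag.split("}", 1)[-1]     # drop the "{namespace}" prefix
        copied = ET.SubElement(entry, f"{{{ATOM_NS}}}{local_name}", dict(child.attrib))
        copied.text = child.text                     # nested children are not handled
    return entry

info = ET.fromstring(
    f'<info xmlns="{CSL_NS}">'
    "<title>APA</title><updated>2006-08-01T12:00:00+00:00</updated>"
    "</info>"
)
entry = info_to_atom_entry(info)
print(ET.tostring(entry, encoding="unicode"))
```

This keeps the primary file a CSL file; the Atom entry is generated from it only when the feed is built.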

I guess the question we get down to is whether my concern about using
atom:entry as the root for the primary file is a problem or not.

Bruce

OK, here’s another option.

Substitution happens in defaults. Add a new choice element that is
understood as an ordered if-else sort of structure.

 <author>
   <substitute>
     <choice>
       <editor/>
       <title/>
       <text idref="anonymous"/>
     </choice>
   </substitute>
 </author>

Sorting gets configured like:

The bibliography and citation templates, then, do not deal with
substitution or sorting directly.
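
To make the ordered if-else reading concrete, here is a small Python sketch that walks the choice element from the author example and returns the first option with a value. The `TERMS` table, the dictionary shape of a reference, and the author-date sort key are hypothetical; the thread does not define them.

```python
import xml.etree.ElementTree as ET

AUTHOR_DEFAULT = ET.fromstring("""
<author>
  <substitute>
    <choice>
      <editor/>
      <title/>
      <text idref="anonymous"/>
    </choice>
  </substitute>
</author>
""")

# Hypothetical lookup table for <text idref="..."/> terms.
TERMS = {"anonymous": "Anonymous"}

def author_or_substitute(item: dict) -> str:
    """Return the author, else the first option in the choice that has a value."""
    if item.get("author"):
        return item["author"]
    for option in AUTHOR_DEFAULT.find("substitute/choice"):
        if option.tag == "text":
            return TERMS[option.get("idref")]    # fixed text always matches
        if item.get(option.tag):
            return item[option.tag]
    return ""

def author_date_key(item: dict):
    """Hypothetical author-date sort key that reuses the same substitution."""
    return (author_or_substitute(item).lower(), item.get("year", ""))

refs = [
    {"title": "Zoning Report", "year": "1999"},
    {"author": "Doe", "year": "1998"},
    {"editor": "Roe", "title": "Essays", "year": "2001"},
]
for ref in sorted(refs, key=author_date_key):
    print(author_or_substitute(ref), ref["year"])
```

Because both the displayed name and the sort key go through the same function, the templates themselves never need to mention substitution.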

On the name disambiguation in citations, Chicago also does this in
inconsistent ways.

I think the most sensible option is just:

Note, Chicago also has a rule for reference lists. If for some reason
you have:

Doe, A. A.
Doe, A. A.

… you should expand one (or both?) given names to disambiguate!

To leave room for that, I suggest:

cs-disambiguate = attribute disambiguate { "sort" | "display" | "true" }
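
For the citation case, the general idea can be sketched in Python as follows. This is not tied to any of the three attribute values, whose exact semantics are still open here; the `cite_names` helper and the rule of adding initials only when family names collide are assumptions for illustration.

```python
from collections import defaultdict

def cite_names(authors):
    """Map each (family, given) pair to the name shown in an in-text citation.

    The family name alone is used unless two different authors share it,
    in which case the initials of the given names are prepended.
    """
    by_family = defaultdict(set)
    for family, given in authors:
        by_family[family].add(given)

    shown = {}
    for family, given in authors:
        if len(by_family[family]) > 1:
            initials = " ".join(part[0] + "." for part in given.split())
            shown[(family, given)] = f"{initials} {family}"
        else:
            shown[(family, given)] = family
    return shown

names = cite_names([("Doe", "John"), ("Doe", "Jane Kay"), ("Smith", "Ann")])
for key, value in names.items():
    print(key, "->", value)
```

The reference-list case (two entries that both read "Doe, A. A.") would need a further step that expands the given names in full, which this sketch does not attempt.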

Bruce

Sounds good. Btw, there is quite a lot to document about the
CSL schema, if only to keep people who are doing an
implementation from misunderstanding what it means. I would never
have thought that bibliography could be so complicated, but I keep
finding out more and more how “inventive” humankind has been in
thinking of new ones! :-)

Johan

Btw, there is quite a lot to document about the
CSL schema, if only to keep people who are doing an
implementation from misunderstanding what it means.

I agree. Probably most structures ought to have an annotation (the “##
…” stuff).

I would never have thought that bibliography could be so complicated

Yeah, tell me about it!

Bruce

So long as you guys keep giving me good feedback, it’s doable. We are
indeed almost done.

I think the main details we have to settle right now are, in order of
priority:

  1. sorting/substitution (the easy solution is to keep things as they
    are)

See my other message.

  2. metadata (I haven’t added that back; is it fine as is?)

If we intend to go for Atom feeds for style repositories, does it make sense
just to encapsulate our current CSL as Atom (like in the example at
<http://dev.extensibleforge.net/browser/trunk/X5.AspNet/Resources/module/instance.atom>)?

  3. is the grouping support in the list structure sensible?

It seems fine to me, but I haven’t really thought much about it. Do you have
any specific concerns?

Simon

OK, checked in the newest changes.

Bruce