Types

John’s remarks on type mapping have prompted me to point out that CSL
is missing types for blog entries (which are apparently in the
Chicago Manual of Style now), dictionary entries, and encyclopedia
entries. See https://www.zotero.org/trac/ticket/699. I think Bruce
previously disputed whether dictionary and encyclopedia entries
were truly necessary, but it seems like in APA they’re different from
the fallback types (although not from each other). I can simply add
these to the schema if there are no objections.

Simon

Simon Kornblith wrote:

John’s remarks on type mapping have prompted me to point out that CSL
is missing types for blog entries (which are apparently in the
Chicago Manual of Style now), dictionary entries, and encyclopedia
entries. See https://www.zotero.org/trac/ticket/699. I think Bruce
previously disputed whether dictionary and encyclopedia entries
were truly necessary, but it seems like in APA they’re different from
the fallback types (although not from each other). I can simply add
these to the schema if there are no objections.

Actually, the link on that ticket isn’t very helpful. The section on
"weblog entry" doesn’t in fact include a weblog entry proper. It
includes the weblog per se, and a comment on an entry.

I would presume a weblog entry would be formatted like an article, and
not need any type-specific logic.

The problem I have with tying formatting (or even data model)
fundamentally to types is that they’re really fragile and chaotic
concepts. Moreover, they tend to give the style author a false sense of
security; as if the only good style is one where you can clearly
identify patterns for any type you might use.

I’d say quite the contrary: a well-written style will properly format
sources for which it does not have a specific template for the type.
That is the measure of a good style, and it will ensure it works more
reliably for different users.

But I guess we need to make a decision, and establish a policy. What the
hell is a type, and when do we add it? What’s the convention for minting
them? Is it just the old Zotero approach of adding them ad hoc as
requested? Or can we find a better way?

Bruce

Followup thought:

To ease the burden on types further, maybe we need to be able to do stuff
like or ?

Bruce

I know the question here is primarily about the schema, but I would just
say that Zotero’s rationale for adding a type is probably different from
CSL’s, and probably will be even once a new ontology is in place. We
have to support (and attract) end users, and there’s a degree to which
even types that are identical to existing types are necessary in order
to meet user expectations and avoid repeated support requests. That
doesn’t mean we should happily comply with every type add request from
users–most of the time, we will be able to say, “Just use ‘document’ or
‘journalArticle’ or ‘bookSection’”–but there are some, like blog
entries, that really need to be in there to comfort users, even if
they’re no more than pointers to fallback types in CSL.

Dan Stillman wrote:

I know the question here is primarily about the schema, but I would just
say that Zotero’s rationale for adding a type is probably different from
CSL’s, and probably will be even once a new ontology is in place. We
have to support (and attract) end users, and there’s a degree to which
even types that are identical to existing types are necessary in order
to meet user expectations and avoid repeated support requests. That
doesn’t mean we should happily comply with every type add request from
users–most of the time, we will be able to say, “Just use ‘document’ or
‘journalArticle’ or ‘bookSection’”–but there are some, like blog
entries, that really need to be in there to comfort users, even if
they’re no more than pointers to fallback types in CSL.

I recognize – and it’s good to keep in mind – that the function of a
type in an internal data model or in a formatting system might be
different than in a GUI.

But I still go back to the larger point: what is a type for Zotero, or
CSL? What is the logic by which you add them? Why is one type (blog
entry) necessary, but another (memo, press release, lab notes, binary
code, postcard) not?

If it’s just user-comfort, then what happens when you have three users,
each of whom is comfortable with a different way of “typing” their
records? Or even worse, what happens when you have three styles,
each of which uses a different logic to define types?

I think the example of the “book” type is pretty instructive. It’s quite
common for users to ask on the Zotero forums "where’s the edited book
type?" or some variant of that question. Once you explain to them how
contributors work, invariably you witness an “ah ha” moment in which it
all becomes crystal clear. Now a single type can cover what in some
models would require several of them: Book, EditedBook, OnlineBook, etc.
And the logic that distinguishes them is in fact the attributes of the
source, rather than the type.
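
The point about attributes doing the work of extra types can be sketched in a few lines of Python. This is only an illustration: the record shape, the "role" key, and the describe_book helper are hypothetical and are not Zotero's or CSL's actual data model.

```python
# Hypothetical record shape: one "book" type, with contributors carrying roles.
def describe_book(record):
    """Derive the variant of a book from its attributes, not from extra types."""
    roles = {c["role"] for c in record.get("contributors", [])}
    traits = []
    # A book with editors but no authors is what some models call "EditedBook".
    if "editor" in roles and "author" not in roles:
        traits.append("edited")
    # A book with a URL is what some models call "OnlineBook".
    if record.get("url"):
        traits.append("online")
    return traits


edited = {
    "type": "book",
    "title": "Essays",
    "contributors": [{"name": "A. Editor", "role": "editor"}],
}
online = {
    "type": "book",
    "title": "Primer",
    "contributors": [{"name": "B. Author", "role": "author"}],
    "url": "http://example.org/primer",
}

print(describe_book(edited))
print(describe_book(online))
```

Both records keep the single type "book"; the distinction a formatter needs comes from the contributors and the URL.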

I say this while having to admit we’re likely adding an EditedBook class
to the RDF ontology (though the reasons are different) :-)

I’m not saying any of this is easy, BTW; it’s a PITA!

Bruce

Scenario:

A community of medieval historians (or maybe just a lone scholar writing a
dissertation) develops a consistent way to cite medieval chants, using
various cataloguing paradigms developed over the years. It doesn’t use
people or dates at all, but four identifying fields: “Collection”, “Series”,
“Incipit”, “Line”.

They map Collection to container-title, Series to collection-title, Incipit
to title, and Line to locator (or comparable fields in the bib tool they are
using). They add a type to the Chicago style file and define the type’s
bibliographic and footnote formats.

With EndNote, they post and pass around the style file, and tell people what
to enter for the mappings in the Reference Types window. (Actually, I think
the mappings can be in a distributable file in EndNote now). The style file
gets dropped into a directory and without waiting for an EndNote software
release, they are (or he is) up and running.

What is the scenario with Zotero and CSL?

John

John P. McCaskey wrote:

Scenario:

A community of medieval historians (or maybe just a lone scholar writing a
dissertation) develops a consistent way to cite medieval chants, using
various cataloguing paradigms developed over the years. It doesn’t use
people or dates at all, but four identifying fields: “Collection”, “Series”,
“Incipit”, “Line”.

They map Collection to container-title, Series to collection-title, Incipit
to title, and Line to locator (or comparable fields in the bib tool they are
using). They add a type to the Chicago style file and define the type’s
bibliographic and footnote formats.

With EndNote, they post and pass around the style file, and tell people what
to enter for the mappings in the Reference Types window. (Actually, I think
the mappings can be in a distributable file in EndNote now). The style file
gets dropped into a directory and without waiting for an EndNote software
release, they are (or he is) up and running.

What is the scenario with Zotero and CSL?

But this is somewhat apples and oranges. EndNote style files are
application-specific. Not so with CSL.

I think the trick in this circumstance is how Zotero handles the data.
It needs to be able to know how to map the properties, both to its own
internal model, and in turn to CSL’s. I.e., I’m not really sure this is our
problem; is it?

As I was saying earlier, I expect a good style to work regardless of
types and specific properties.

Bruce

But this is somewhat apples and oranges. EndNote style files are
application-specific. Not so with CSL.

I thought the idea was that CSL would be application non-specific, that Zotero and Open Office and a new add-in for Word and some new word processor could all use CSL and that users could write new CSLs they would load-and-go across multiple applications.

No?

John

John P. McCaskey wrote:

But this is somewhat apples and oranges. EndNote style files are
application-specific. Not so with CSL.

I thought the idea was that CSL would be application non-specific,
that Zotero and Open Office and a new add-in for Word and some new
word processor could all use CSL and that users could write new CSLs
they would load-and-go across multiple applications.

No?

Yes; that’s my point. CSL basically provides a stable (I hope!) target
model that different applications can map to. It’s up to the
implementations to understand how to do that mapping (well, and up to
us to make that model clear).

Admittedly, this depends on somewhat smarter data formats than ones
where users are forced to use completely opaque “user” tags.

Bruce

I thought the idea was that CSL would be application non-specific,
that Zotero and Open Office and a new add-in for Word and some new
word processor could all use CSL and that users could write new CSLs
they would load-and-go across multiple applications.

No?

Yes; that’s my point. CSL basically provides a stable (I hope!) target
model that different applications can map to. It’s up to the
implementations to understand how to do that mapping (well, and up to
us to make that model clear).

But how can the application know? The only person who knows that, in the new CSL with its new medieval music type, container-title maps to what historians of medieval music call “Collection”, collection-title maps to “Series”, and title maps to “Incipit” is the CSL writer.

The new CSL is unusable without that mapping. How will it get from the brain of the CSL writer into the user’s computer?

Will the user read the CSL and enter the map by hand? Load a file? Wait for a new software release? Petition a committee? Hack some JavaScript?

John

In Zotero, our plan is to allow users to define their own types with
some kind of semi-controlled vocabulary. If you create a “Medieval
Chant” mapping, you’ll be able to upload it to zotero.org, and other
users will be able to search for a “Medieval Chant” type and download
it (preferably all through the Zotero interface).

In this specific case where the CSL has type-specific formatting for
a user-defined type, it’s not clear what we ought to do. As long as
users can export their mappings, they can do things the EndNote way,
and distribute the CSL and type mapping together. Whether we’d want
to do this from a centralized repository is another question.

Simon

John P. McCaskey wrote:

But how can the application know? The only person who knows that, in
the new CSL with its new medieval music type, container-title maps
to what historians of medieval music call “Collection”,
collection-title maps to “Series”, and title maps to “Incipit” is
the CSL writer.

The new CSL is unusable without that mapping. How will it get from
the brain of the CSL writer into the user’s computer?

Will the user read the CSL and enter the map by hand? Load a file?
Wait for a new software release? Petition a committee? Hack some
JavaScript?

I’m really not understanding the problem, John. If the user wants to have
their data intelligible to other tools, they have to use standard
properties. So they don’t encode stuff like “incipit.” As you said, they
use “title”. Likewise with the style.

The user signals their intention by the choices they make in
representing their data. If Zotero wants to put some GUI sugar on top to
make this easy, then fine. But I don’t see how it’s any help to be
putting some mapping to a particular data representation in the CSL style.

Bruce

Simon Kornblith wrote:

Will the user read the CSL and enter the map by hand? Load a file?
Wait for a new software release? Petition a committee? Hack some
JavaScript?

In Zotero, our plan is to allow users to define their own types with
some kind of semi-controlled vocabulary. If you create a “Medieval
Chant” mapping, you’ll be able to upload it to zotero.org, and other
users will be able to search for a “Medieval Chant” type and download
it (preferably all through the Zotero interface).

All I can say is you guys better figure out answers to my questions
BEFORE you do this, or you will have chaos.

In this specific case where the CSL has type-specific formatting for
a user-defined type, it’s not clear what we ought to do. As long as
users can export their mappings, they can do things the EndNote way,
and distribute the CSL and type mapping together.

Ugh, this sounds really ugly.

As I said in the response to John, the data and the style need to be
self-contained. If you’re going to do a mapping, it should be between
GUI and internal model. E.g., in John’s example, you do:

  • type: "Medieval Chant"
    title: "Incipit"
    collection-title: "Collection"

… and so forth. This is on the Zotero end of things. We need a
mechanism to create a URI for that type so that you can import and
export it reliably in the RDF.

I don’t see a problem on the CSL (as I mentioned in reply to John). We
just need to make sure we have a rich enough list of generic variables
that can be mapped to.

Whether we’d want to do this from a centralized repository is another question.

I really think it’s important to not assume centralized solutions.

Bruce

In this specific case where the CSL has type-specific formatting for
a user-defined type, it’s not clear what we ought to do. As long as
users can export their mappings, they can do things the EndNote way,
and distribute the CSL and type mapping together.

Ugh, this sounds really ugly.

As I said in the response to John, the data and the style need to be
self-contained. If you’re going to do a mapping, it should be between
GUI and internal model. E.g., in John’s example, you do:

  • type: "Medieval Chant"
    title: "Incipit"
    collection-title: "Collection"

… and so forth. This is on the Zotero end of things. We need a
mechanism to create a URI for that type so that you can import and
export it reliably in the RDF.

Yes, we do this already in the database for mapping, e.g., director
to author, and it is our plan to use these kinds of mappings with
custom types.

I don’t see a problem on the CSL (as I mentioned in reply to John). We
just need to make sure we have a rich enough list of generic variables
that can be mapped to.

The problem is that sometimes mapping to existing CSL types will
not be sufficient. This is John’s point, I believe. People will want
to be able to define a custom type with its own citation style.
EndNote lets them do this. I think it’s important that we find some
way to provide for this behavior.

Whether we’d want to do this from a centralized repository is
another question.

I really think it’s important to not assume centralized solutions.

For Zotero, we can assume there will be a centralized solution,
because we intend to build it. Besides, in this case, the
decentralized solution (medieval historians sending zip files to each
other) is far easier, but far more chaotic, than the centralized
solution.

Simon

Simon Kornblith wrote:

I don’t see a problem on the CSL (as I mentioned in reply to John). We
just need to make sure we have a rich enough list of generic variables
that can be mapped to.

The problem is that sometimes mapping to existing CSL types will
not be sufficient. This is John’s point, I believe. People will want
to be able to define a custom type with its own citation style.

Which goes back to my point: TYPES ARE NOT RELIABLE!!! A system that
relies on types is a broken system.

EndNote lets them do this. I think it’s important that we find some
way to provide for this behavior.

EndNote requires them to do this because it has a shitty – totally
flat – data model. And for those users, their data in essence then
becomes their own personal data, rather than freely exchangeable. I
know; I was once an EndNote user, and these limitations frustrated me
to no end.

It would be a major mistake for Zotero or CSL to go down the same path
without really careful thought.

In any case, I think we can assume we can find a solution on extending
the type lists. I just think it’s seriously misguided to consider this
the most important priority ATM.

Whether we’d want to do this from a centralized repository is
another question.

I really think it’s important to not assume centralized solutions.

For Zotero, we can assume there will be a centralized solution,
because we intend to build it. Besides, in this case, the
decentralized solution (medieval historians sending zip files to each
other) is far easier, but far more chaotic, than the centralized
solution.

But CSL != Zotero, and I’ve said this before, but the place I want to
get to in a few years is where a journal site might host their own
styles and a user might only have to click a link to activate it, in
whatever application they happen to be using.

I believe the current design approach of CSL will work in this kind of
distributed context. But it won’t work if we end up with 200 different
types floating around and being regularly invented, and all styles are
dependent on those types. There has to be a more general foundation.

Bruce

Oops. Other messages flying while I was typing. I’ll send this anyway.

Its main question for Bruce stands: How would you use CSL to write a book on medieval music?

I’m really not understanding the problem, John. If the user wants to have
their data intelligible to other tools, they have to use standard
properties. So they don’t encode stuff like “incipit.” As you said, they
use “title”. Likewise with the style.

The user signals their intention by the choices they make in
representing their data. If Zotero wants to put some GUI sugar on top to
make this easy, then fine. But I don’t see how it’s any help to be
putting some mapping to a particular data representation in the CSL style.

Hmm. Let me try a different approach. Forget for now sharing CSLs.

How would you use CSL to code a style for medieval music for a book on the topic you are writing?

You’ve convinced me that you would not create three new variables, “collection”, “series”, and “incipit” (even if that’s the way you and everyone else catalogs medieval music). You would look at the canonical CSL variables and pick ones that seem to have similar functions, let’s say, container-title, collection-title, and title. OK.

Now you need a type. If you can’t make a new one without waiting for the CSL committee to bless it, you reuse “song”, since you are already using “musical score” for the non-medieval music you are citing. You code up the style, leaving off authors and dates, since medieval music identifies neither.

Now you want to enter the medieval “songs”. You cannot do it unless you remember how you mapped Collection, Series, and Incipit to container-title, collection-title and title. If you misremember, the citations will come out wrong.

The “GUI sugar” isn’t optional. It must get into the computer somehow, or you’ll need a yellow Post-It note on the screen to remind you – and be sure to get the right Post-It note for the right CSL file.

The sugar is specific to the CSL file you just created (and useless outside it). Why not put the mapping in that CSL file instead of another file (or Post-It note) that must always move around with the CSL file?

(The above scenario is normal day-in-the-life stuff for EndNote users. And EndNote’s early approach left room for all sorts of problems, even before you tried to share. I think they’ve redone this now.)

John

John P. McCaskey wrote:

How would you use CSL to code a style for medieval music for a book
on the topic you are writing?

You’ve convinced me that you would not create three new variables,
“collection”, “series”, and “incipit” (even if that’s the way you and
everyone else catalogs medieval music). You would look at the
canonical CSL variables and pick ones that seem to have similar
functions, let’s say, container-title, collection-title, and title.
OK.

Now you need a type.

Why do you need a type? Lay out the generic formatting, add some
conditional choose statements for the generic book, article and chapter,
and be done with it.

I’m not being flippant; I am totally serious. One of the key principles of
CSL is a generic model and fallback logic. Book, article and chapter are
in fact proxies for three different abstract classes of resources: what
I earlier called monograph, part-in-serial, and part-in-monograph. This
has increasingly become less apparent, but it’s still central to the design.

When you define those patterns, a CSL engine is supposed to know that
those are the generic fallback definitions. The whole point of this
design is to decrease reliance on types, and so to simplify styles, and
make them more reliable.
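
A rough Python sketch of the fallback idea described here, under stated assumptions: the template contents and the FALLBACK_CLASS table are invented for illustration, and the thread does not say how a CSL engine actually decides which generic pattern an unknown type falls back to.

```python
# The only patterns the style author writes. Per the message above, these act
# as proxies for monograph, part-in-serial, and part-in-monograph.
GENERIC_TEMPLATES = {
    "book": ["author", "title", "publisher", "issued"],
    "article": ["author", "title", "container-title", "issued"],
    "chapter": ["author", "title", "container-title", "publisher", "issued"],
}

# Hypothetical table assigning more specific types to a generic pattern.
FALLBACK_CLASS = {
    "report": "book",
    "blog-entry": "article",
    "song": "chapter",
}


def pick_template(item_type, default="book"):
    """Return the template for a type, falling back to a generic pattern."""
    if item_type in GENERIC_TEMPLATES:
        return GENERIC_TEMPLATES[item_type]
    # Unknown or unlisted types still get formatted, via the default pattern.
    return GENERIC_TEMPLATES[FALLBACK_CLASS.get(item_type, default)]


print(pick_template("song"))
print(pick_template("medieval-chant"))
```

The style carries three patterns however many types the data contains; a type the style has never heard of still resolves to one of them.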

I formatted a complete published book using a style with probably five
types in it, but whose data probably had 20. I got virtually no
complaints from the copy editors about my citations and bibliography. So
I know this works.

Type is NOT central to CSL formatting. You’re thinking like an EndNote
user :-)

If you can’t make a new one without waiting for
the CSL committee to bless it, you reuse “song”, since you are
already using “musical score” for the non-medieval music you are
citing. You code up the style, leaving off authors and dates, since
medieval music identifies neither.

Another principle of CSL is that if data isn’t present, it doesn’t get
printed. So you don’t per se need type-specific templates to get around
this. Just define generic logic for dealing with no dates and authors.
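
The "if data isn’t present, it doesn’t get printed" principle can be sketched as follows. The render function, the separator, and the sample field values are hypothetical placeholders, not output of any real CSL processor.

```python
def render(item, variables, sep=". "):
    """Join the values of the listed variables, skipping any that are absent."""
    parts = [str(item[name]) for name in variables if item.get(name)]
    return sep.join(parts) + "." if parts else ""


# A chant record with no author and no date; the values are made up.
chant = {
    "title": "Puer natus est",
    "collection-title": "Example Series",
    "container-title": "Example Collection",
}

# One generic pattern that mentions author and date anyway.
pattern = ["author", "title", "collection-title", "container-title", "issued"]
print(render(chant, pattern))
# prints: Puer natus est. Example Series. Example Collection.
```

The same pattern serves records with and without authors or dates, so no chant-specific template is needed just to suppress the empty fields.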

Now you want to enter the medieval “songs”. You cannot do it unless
you remember how you mapped Collection, Series, and Incipit to
container-title, collection-title and title. If you misremember, the
citations will come out wrong.

The “GUI sugar” isn’t optional. It must get into the computer
somehow, or you’ll need a yellow Post-It note on the screen to remind
you – and be sure to get the right Post-It note for the right CSL
file.

But what are you mapping to?

The sugar is specific to the CSL file you just created (and useless
outside it). Why not put the mapping in that CSL file instead of
another file (or Post-It note) that must always move around with the
CSL file?

(The above scenario is normal day-in-the-life stuff for EndNote
users. And EndNote’s early approach left room for all sorts of
problems, even before you tried to share. I think they’ve redone this
now.)

I’m starting to see now: you want user comments or annotations? If yes,
I’m still not convinced that’s particularly helpful, but it’s easy to
support technically and fairly non-intrusive. So it’s certainly
something we can discuss more.

But I think we need to first agree on some of the stuff up top before we
get down to this level of detail.

Bruce

You aren’t answering my question. How does a medievalist get the word “Incipit” on the computer screen next to an empty text box, enter a word in the text box, and know the word he entered will get treated in a CSL-defined citation as a title would?

It sounds like you are saying a medievalist shouldn’t want to do that. He should know that “Incipit” is like “Title” and just accept “Title”. And if he shares a CSL file that correctly treats the two similarly, he should just tell his friends he wrote the CSL so “Incipit” is like “Title”. Get a Post-It Note if you can’t remember.

Book, article and chapter are in fact proxies for three different abstract
classes of resources: what I earlier called monograph, part-in-serial, and
part-in-monograph.

(Actually, I always thought this kind of proxy thinking was one of EndNote’s big weaknesses.)

John

John P. McCaskey wrote:

You aren’t answering my question.

And you’re not answering mine: why do you insist that you need type?

How does a medievalist get the word
"Incipit" on the computer screen next to an empty text box, enter a
word in the text box, and know the word he entered will get treated
in a CSL-defined citation as a title would?

I’m saying what happens in some application UI is not my problem.

It sounds like you are saying a medievalist shouldn’t want to do
that. He should know that “Incipit” is like “Title” and just accept
"Title". And if he shares a CSL file that correctly treats the two
similarly, he should just tell his friends he wrote the CSL so
"Incipit" is like “Title”. Get a Post-It Note if you can’t remember.

No, I’m saying it’s up to the application to handle that mapping and the
associated UI (if there is one; there’s not currently).

I’m saying that Zotero or any other application damn well better not be
treating “incipit” internally as some unique property, any more than
it ought to treat “newspaper title” as a unique property. That if they
do, they better be tracking the mappings between them. Finally, that if
they want a mapping to a GUI, they should do it as I suggested:

title: "Incipit"

… or if you prefer internationalized XML:

Incipit

I.e., map the label to the internal property name (“title”), which in
turn has a pre-defined mapping to CSL (“title”). If you need to share
the mapping, host the config file at a URI and allow people to load it.
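
As a minimal sketch of that two-step mapping, assuming the label assignments from John’s scenario earlier in the thread (Collection to container-title, Series to collection-title, Incipit to title, Line to locator): the dictionary layout and the form_to_csl helper are hypothetical, not an existing Zotero or CSL format.

```python
# GUI label -> internal property name, for a user-defined "Medieval Chant" type.
CHANT_LABELS = {
    "Incipit": "title",
    "Collection": "container-title",
    "Series": "collection-title",
    "Line": "locator",
}

# Internal property -> CSL variable. Identity here, which is an assumption;
# an application with its own property names would list them on the left.
INTERNAL_TO_CSL = {
    "title": "title",
    "container-title": "container-title",
    "collection-title": "collection-title",
    "locator": "locator",
}


def form_to_csl(form_values, labels, internal_to_csl):
    """Translate values keyed by GUI label into values keyed by CSL variable."""
    item = {}
    for label, value in form_values.items():
        internal = labels[label]  # raises KeyError for a label with no mapping
        item[internal_to_csl[internal]] = value
    return item


entered = {"Incipit": "Puer natus est", "Line": "3"}
print(form_to_csl(entered, CHANT_LABELS, INTERNAL_TO_CSL))
```

The style only ever sees standard variable names; the word “Incipit” lives in the label table, which is the part that would be shared as a config file.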

Note: this is exactly how MS configures their forms BTW. Their binding
is just an XPath expression to write out the XML.

Book, article and chapter are in fact proxies for three different
abstract classes of resources: what I earlier called monograph,
part-in-serial, and part-in-monograph.

(Actually, I always thought this kind of proxy thinking was one of
EndNote’s big weaknesses.)

It’s a little frustrating to hear you say this, John. I’ve spent a few
years of my life working on this, all the while working with real data
and real manuscripts. You might give me the benefit of the doubt (as you
often do) that I might know what I’m doing here. If we just rely on type
everywhere, I’m telling you: things become more complicated, not less.

Bruce

You might give me the benefit of the doubt (as you often do) that I
might know what I’m doing here.

No insult intended. You’ve definitely earned the right to be given the benefit of the doubt on this, especially by a latecomer like me. My apologies.

You’ve answered my question: Have two files. One is the CSL. The other associates a label (or multiple labels for multiple languages) with each variable-type pair in the CSL file. Got it.

I think if I were writing an application to format using CSL (which I’m toying with), I’d figure out how to get Zotero to export its mappings into that mapping file and just use whatever labels they picked. That would make for a consistent UI cross-platform.

why do you insist that you need type?

I don’t really. I’m just trying to figure out how, as an end-user working with a small group of others, we’d create a specialized citation style we need, enter the data with minimum error even if we are on different platforms, and have the bibliography come out right. I’m praying this is possible so that my chums and I don’t have to use that god-awful EndNote.

John