Some questions I encountered trying to write citeproc-py

Hello all,

Questions/remarks:

a) RefType usage

apa.csl:

chicago.csl:



etc. etc.

This makes parsing more difficult because I have to check whether or
not refType is present. May I suggest always using the latter form
for apa.csl too:

b) Fallback refType

How to handle a reference for which no reftype is defined in the CSL?
E.g. a user quotes a review but the apa.csl doesn’t say how to format
it.

c) Name object

In Python I want to use an object to represent names.

    class Name:
        def __init__(self, givenName, surName, naturalOrder="given-sur",
                     vander="", prefix="", suffix=""):
            self.prefix = prefix
            self.givenName = givenName        # E.g. "John" or "J.W." or "Johan Willem"
            self.vander = vander              # E.g. "vön" or "de" or "d'"
            self.surName = surName            # E.g. "Arcus"
            self.suffix = suffix              # E.g. "junior"
            self.naturalOrder = naturalOrder  # "given-sur" or "sur-given"

What was again the right nomenclature for those words like “vön” or
"van der" as in “Klaus vön Nierop” or “Jan van der Berg”?

Would this be sufficient to store/represent names properly?

d) How names are to be written

Perhaps we should in CSL define how to write names?

E.g.









It seems more flexible that way to me than to use “form=“short”” plus
it has the advantage that it is more obvious what is meant, as someone
just looking at a CSL won’t know what is meant with the short form.

e) Sorting names

Does “van der Berg” go with the “V” or the “B”? Dutch will sort it
with the “B”, the Flemish with the “V”.

Greetings,

Johan
http://www.johankool.nl/

Hello all,

Questions/remarks:

a) RefType usage

apa.csl:

This makes parsing more difficult because I have to check whether or
not refType is present.

I was wondering when someone would complain about this :)

Can I suggest to always use the latter form
for apa.csl too:

In the alt schema I was playing with:

So the substantive question is why do you have itemLayout as a child of
refType?

One of the reasons I hadn’t done this before was that I thought it
somewhat awkward. A refType element seems to say “formatting depends on
type” while adding the generic attribute seems to say “well no, not
really.”

A secondary question is if we want to revisit naming conventions and
make sure they’re consistent.

b) Fallback refType

How to handle a reference for which no reftype is defined in the CSL?
E.g. a user quotes a review but the apa.csl doesn’t say how to format
it.

Ah, this is critical to understand, and I need to include it in the
schema.

Article, chapter, and book are simultaneously fallbacks. This is why I
have required them.

In citeproc-xsl, every reference has two properties: class (monograph,
part-inMonograph, part-InSerial), and type.

When formatting, cp first looks for a type (say “article-journal”). If
it does not find one, it uses the fallback for its class.

What this means is that almost all references use the fallback. It
makes formatting more robust, and styles more compact.
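
The lookup described above could be sketched in Python roughly as follows. This is only an illustration, not how citeproc-xsl is actually written: the function name, the dictionary of layouts, and the pairing of each class with its fallback (monograph with book, part-inMonograph with chapter, part-InSerial with article) are assumptions.

```python
# Assumed pairing of reference class -> required fallback type.
CLASS_FALLBACK = {
    "monograph": "book",
    "part-inMonograph": "chapter",
    "part-InSerial": "article",
}


def select_layout(style_layouts, ref_type, ref_class):
    """Return the layout for ref_type, or the fallback layout for ref_class.

    style_layouts is a hypothetical dict mapping type names defined in a
    style to whatever object represents their formatting instructions.
    """
    if ref_type in style_layouts:
        return style_layouts[ref_type]
    # The type is not defined in the style: use the class fallback,
    # which the style is required to define.
    return style_layouts[CLASS_FALLBACK[ref_class]]


# A style that defines only the three required types plus one specific type.
layouts = {
    "article": "generic article layout",
    "chapter": "generic chapter layout",
    "book": "generic book layout",
    "article-journal": "journal article layout",
}
```

With this sketch, a "review" in class part-InSerial has no layout of its own and is formatted with the generic article layout.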

c) Name object

In Python I want to use an object to represent names.

    class Name:
        def __init__(self, givenName, surName, naturalOrder="given-sur",
                     vander="", prefix="", suffix=""):
            self.prefix = prefix
            self.givenName = givenName        # E.g. "John" or "J.W." or "Johan Willem"
            self.vander = vander              # E.g. "vön" or "de" or "d'"
            self.surName = surName            # E.g. "Arcus"
            self.suffix = suffix              # E.g. "junior"
            self.naturalOrder = naturalOrder  # "given-sur" or "sur-given"

What was again the right nomenclature for those words like “vön” or
"van der" as in “Klaus vön Nierop” or “Jan van der Berg”?

It’s called an articular.

Since you’re in a place where they are important, it’s up to you to
decide if they should be there. I suspect in many cases they would be
folded into the family name, but then you need to account for that in
code.

Would this be sufficient to store/represent names properly?

I’d suggest either forename/surname or givenname/familyname. The first
is somewhat broader, though the second is how vcard handles it.

I’d change naturalOrder to sort_order, and maybe have a default of
family-given (or sur-fore).

Also, how would you handle organizational authors?

Finally, shouldn’t attributes use the underscore convention in Python?
So “given_name” rather than “givenName”?

d) How names are to be written

Perhaps we should in CSL define how to write names?

E.g.









It seems more flexible that way to me than to use “form=“short”” plus
it has the advantage that it is more obvious what is meant, as someone
just looking at a CSL won’t know what is meant with the short form.

This is one of those decisions I made to leave it up to implementors,
and I still feel that was a good decision.

I’m open to revisiting it though.

e) Sorting names

Does “van der Berg” go with the “V” or the “B”? Dutch will sort it
with the “B”, the Flemish with the “V”.

So Dutch sorting rules ignore the articular for sorting, but the
Flemish do not. This is among the reasons I don’t define naming and
name sorting in CSL ;)

But in your software, you probably need some logic based on the locale
for sorting. It might be as simple as a flag to use the articular
for sorting.
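
A minimal sketch of such a flag in Python, assuming the articular is stored separately from the family name. The function name, the parameter names, and the sample names other than "van der Berg" are made up for illustration, and a real implementation would also need proper locale-aware collation.

```python
def sort_key(family_name, articular="", sort_with_articular=False):
    """Build a case-insensitive sort key for a family name.

    When sort_with_articular is True the articular ("van der", "de", ...)
    is placed in front of the family name, so the name files under it.
    """
    if sort_with_articular and articular:
        return f"{articular} {family_name}".casefold()
    return family_name.casefold()


# (family name, articular) pairs; only "van der Berg" comes from the thread.
names = [("Zwart", ""), ("Berg", "van der"), ("Cools", "")]

# Articular ignored: "van der Berg" files under B.
ignoring = sorted(names, key=lambda n: sort_key(n[0], n[1]))

# Articular used: "van der Berg" files under V.
using = sorted(names, key=lambda n: sort_key(n[0], n[1], sort_with_articular=True))
```

The flag would presumably be set from the locale rather than per call, but that wiring is left out here.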

It just occurred to me that while I’ve studied how Western authors deal
with formatting Asian names, I’ve no clue how it would work in the
reverse!

Bruce

Questions/remarks:

a) RefType usage

This makes parsing more difficult because I have to check whether or
not refType is present.

I was wondering when someone would complain about this :)

So the substantive question is why do you have itemLayout as a child of
refType?

Because you did it that way in nar.csl! :)

One of the reasons I hadn’t done this before was that I thought it
somewhat awkward. A refType element seems to say “formatting depends on
type” while adding the generic attribute seems to say “well no, not
really.”

A secondary question is if we want to revisit naming conventions and
make sure they’re consistent.

I suppose. Haven’t really looked at it in detail yet.

b) Fallback refType

How to handle a reference for which no reftype is defined in the CSL?
E.g. a user quotes a review but the apa.csl doesn’t say how to format
it.

Ah, this is critical to understand, and I need to include it in the
schema.

Article, chapter, and book are simultaneously fallbacks. This is why I
have required them.

Thanks!

c) Name object

Would this be sufficient to store/represent names properly?

I’d suggest either forename/surname or givenname/familyname. The first
is somewhat broader, though the second is how vcard handles it.

Will use the given/family instead then.

I’d change naturalOrder to sort_order, and maybe have a default of
family-given (or sur-fore).

I was more thinking to use naturalOrder to store which way it ought to
be written, then let it be specified in CSL how they have to be
written using the choices: normal|reverse|family-given|given-family

This then would show up as:

normal:
Kim John (Asian guy named John)
John Doe

reverse:
John, Kim
Doe, John

family-given:
Kim, John
Doe, John

given-family:
John Kim
John Doe
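
To make the four choices concrete, here is a rough Python sketch that reproduces the examples above. The function and parameter names are only illustrative, and natural_order is assumed to hold either "given-family" or "family-given" for each name.

```python
def format_name(given, family, natural_order, style):
    """Render a name according to one of the four proposed choices."""
    if style == "normal":
        # Write the name the way its owner would, without a comma.
        if natural_order == "given-family":
            return f"{given} {family}"
        return f"{family} {given}"
    if style == "reverse":
        # Swap the natural order and separate the parts with a comma.
        if natural_order == "given-family":
            return f"{family}, {given}"
        return f"{given}, {family}"
    if style == "family-given":
        return f"{family}, {given}"
    if style == "given-family":
        return f"{given} {family}"
    raise ValueError(f"unknown style: {style}")


# The two names from the examples: Kim (family) John, and John Doe.
kim = ("John", "Kim", "family-given")
doe = ("John", "Doe", "given-family")
```

For example, format_name(*kim, "normal") gives "Kim John" and format_name(*kim, "reverse") gives "John, Kim".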

Also, how would you handle organizational authors?

Not sure yet. I hadn’t yet thought of them.

Finally, shouldn’t attributes use the underscore convention in Python?
So “given_name” rather than “givenName”?

Can you tell I used to program in Objective-C? ;)

d) How names are to be written

Perhaps we should in CSL define how to write names?

E.g.









It seems more flexible that way to me than to use “form=“short”” plus
it has the advantage that it is more obvious what is meant, as someone
just looking at a CSL won’t know what is meant with the short form.

This is one of those decisions I made to leave it up to implementors,
and I still feel that was a good decision.

I’m open to revisiting it though.

Does it make sense to do something like
normal>reverse>family-given|given-family discussed above?

e) Sorting names

Does “van der Berg” go with the “V” or the “B”? Dutch will sort it
with the “B”, the Flemish with the “V”.

So Dutch sorting rules ignore the articular for sorting, but the
Flemish do not. This is among the reasons I don’t define naming and
name sorting in CSL ;)

But in your software, you probably need some logic based on the locale
for sorting. It might be as simple as a flag to use the articular
for sorting.

Could such a flag be added to the sorting instructions of CSL?

It just occurred to me that while I’ve studied how Western authors deal
with formatting Asian names, I’ve no clue how it would work in the
reverse!

I wouldn’t know either.

Thanks for the help!

Johan

I’d change naturalOrder to sort_order, and maybe have a default of
family-given (or sur-fore).

I was more thinking to use naturalOrder to store which way it ought to
be written,

Then call it “display_order”.

vcard, BTW, has a separate attribute called FN (formatted name), so
assumes the software doesn’t have to worry about this. I guess if you
were representing in objects, you’d have:

vcard.fn
vcard.n.given_name
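
In Python that object layout might look something like the sketch below. The class names are invented for illustration and only two of the components of the vCard N property are shown; this is not an existing vCard library.

```python
from dataclasses import dataclass


@dataclass
class StructuredName:
    """Illustrative subset of the vCard N property."""
    family_name: str
    given_name: str


@dataclass
class VCard:
    fn: str            # formatted name, stored ready for display
    n: StructuredName  # structured parts, usable for sorting


vcard = VCard(
    fn="Jan van der Berg",
    n=StructuredName(family_name="van der Berg", given_name="Jan"),
)
```

Here vcard.fn is used as-is for display, while vcard.n.given_name and vcard.n.family_name remain available to the software.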

then let it be specified in CSL how they have to be
written using the choices: normal|reverse|family-given|given-family

The problem is sort and display get so wrapped up in locale and
language that it’s really difficult to burden CSL with these details.
This is why I have a generic attribute like “author-as-sort-order”.
This is saying to the software, “give me the sorted representation of
the name,” while otherwise it would be asking for the display version.

I think this is as it should be. This stuff is just too complicated
otherwise, because within one reference list, you can have different
sorting rules depending on the name.

Does it make sense to do something like
normal>reverse>family-given|given-family discussed above?

I think the “author-as-sort-order” attribute should work?

But in your software, you probably need some logic based on the locale
for sorting. It might be as simple as a flag to use the articular
for sorting.

Could such a flag be added to the sorting instructions of CSL?

But it’s locale-specific, isn’t it? Or is it also a property of a style?

Bruce

How about this, where a template without a name element is understood
as generic?

If that’s fine, then what to do about the number and label styles?

… or just:

The styles are a little different in how they work, so am not sure if
they should have the same XML representation (?).

Bruce

I like the structure best for the most
consistent makeup. Why did you, btw, change from to
? The former seems more descriptive to me.

One small problem btw: is also to be understood as part of a
citation, so we need to use another tag.

Cheers,

Johan

I like the structure best for the most
consistent makeup. Why did you, btw, change from to
? The former seems more descriptive to me.

I don’t know; just don’t like the compound names when they’re not
needed. I guess more aesthetics than anything.

It could also be:

... ...

… which is probably more elegant from an XML perspective.

I’ve no strong opinion on this.

One small problem btw: is also to be understood as part of a
citation, so we need to use another tag.

Actually, cs:number is a child of cs:locator (and pages, etc.). It says
"print the number for this locator." I guess I was thinking that number
in this context does similarly: “print the number for the citation”,
but if you guys disagree, then I’m obviously wrong.

Am open to alternatives.

Basically, we need to configure [1-3] and [jones99] type citations, so
that they feel consistent with the other classes (note and
author-date).

Bruce