update on particle, date patches

So I’m still lost on these two patches.

particle: not sure where we ended up with the discussion on the
parameters, nor why the “invert” thing got thrown in. I’ll wait on
Rintze’s proposed spec language and revised patch before doing
anything.

date: as I said, I don’t understand the “other” problem, and Frank
seemed not to think this is that critical an issue, so my druthers is
to leave it out.

Bruce

By leaving it out, do you mean to leave out the “other” date-part?
That would be our choice, too.

(We’ll also need a day-month form for localized dates, but I don’t
think that’s been proposed yet.)

No. The use case for the other part is absolutely clear (effectively,
sub-year components which are “other” than months and days), and I
don’t see any practical problem with retaining it.

Bruce

Text that is not captured in the date elements by the parse can and
should be rendered. Dropping “other” from the schema wouldn’t change
that. It’s just that because the position of the “other” element is
fixed, and because its content is a dumb string (and therefore not
safe for formatting), there is no need to include a date-part for it
in the schema.

OK, I understand that. But that would mean we’d disallow applying
formatting (bold and such) to these components.

Bruce

Yes. If there were a demand for seasons to be set in italics (for
example), we could refine the date parse, and then introduce an element
that specifically applies to season names.

So I’m still lost on these two patches.

particle: not sure where we ended up with the discussion on the
parameters, nor why the “invert” thing got thrown in. I’ll wait on
Rintze’s proposed spec language and revised patch before doing
anything.

I’ll leave the talking to Rintze. But on the “invert” item, this is
just a renaming of name-as-sort-order. We reasoned that keeping it as
name-as-sort-order would be misleading, once the name-part sequence
for display was decoupled from that used for sorting.

I don’t think so. It would make sense to apply any formatting set on cs:date
to these unparsed affixes. For the ‘normal’ date-parts (year, month, day),
the formatting set to cs:date can be overridden by applying formatting
attributes on the cs:date-part elements.
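
A minimal sketch of that inheritance rule, in Python. The function name and the dictionaries of formatting attributes are hypothetical and only illustrate the idea; they are not taken from any CSL processor. Formatting set on cs:date is the default, a cs:date-part may override it, and an unparsed part simply keeps the cs:date formatting:

```python
def effective_formatting(date_formatting, part_formatting=None):
    """Merge formatting attributes; date-part values win over cs:date values.

    Both arguments are plain dicts of attribute name -> value
    (a hypothetical representation chosen for this sketch).
    """
    merged = dict(date_formatting)        # start from the cs:date defaults
    merged.update(part_formatting or {})  # date-part attributes override them
    return merged


date_fmt = {"font-style": "italic"}                          # set on cs:date
year_fmt = {"font-style": "normal", "font-weight": "bold"}   # set on a date-part

year_effective = effective_formatting(date_fmt, year_fmt)
other_effective = effective_formatting(date_fmt)  # unparsed part: no override
```

Here the year ends up non-italic and bold, while the unparsed text stays italic because nothing overrides the cs:date setting.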

Rintze

OK, I understand that. But that would mean we’d disallow applying
formatting (bold and such) to these components.

I don’t think so. It would make sense to apply any formatting set on cs:date
to these unparsed affixes.

Just a clarification (for purposes of the spec and such): “other”
parts are not necessarily exactly “unparsed” nor are they “affixes”.
In my implementation, dates have to conform to a strict data-type, and
the “other” part is defined (borrowing from RIS, BTW).

So I think we need to be explicit about the “other” content.

Bruce

It would be helpful to know what an other element can and cannot contain.

How would that work in real life? IMHO, the benefits of having an “unparsed”
prefix and suffix instead of a single “other” date-part seem to be:

  • Stuff like “circa 2000 (?)” with two “other” parts would be supported
  • The position of the “other” date-part doesn’t have to be set in the styles
    (i.e. before or after the main date, “circa 2000” or “2000 (?)”)
  • Current styles won’t have to be changed
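
To make the prefix/suffix idea concrete, here is a rough Python sketch. It assumes the "main date" is just the first four-digit year in the string, which is far simpler than a real date parser; the function name is made up for illustration:

```python
import re


def split_affixes(value):
    """Split a raw date string into (prefix, year, suffix).

    Assumption for this sketch: the main date is the first run of
    exactly four digits; everything around it is unparsed text.
    """
    match = re.search(r"(?<!\d)\d{4}(?!\d)", value)
    if match is None:
        # Nothing recognisable: the whole string is unparsed.
        return value, None, ""
    return value[:match.start()], match.group(), value[match.end():]


prefix, year, suffix = split_affixes("circa 2000 (?)")
```

With this split, the style never has to say where the extra text goes: the prefix is rendered before the date and the suffix after it.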

Rintze

A string.

“2000///Fall”

… where “Fall” is the other string.
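
A small Python sketch of how such a value could be read, assuming four slash-separated fields (year, month, day, other), which is what the “2000///Fall” example implies. The class and function names are hypothetical:

```python
from typing import NamedTuple, Optional


class RisDate(NamedTuple):
    year: Optional[int]
    month: Optional[int]
    day: Optional[int]
    other: str


def _to_int(text: str) -> Optional[int]:
    """Empty fields become None; anything else must be an integer."""
    text = text.strip()
    return int(text) if text else None


def parse_ris_date(value: str) -> RisDate:
    # Split into at most four fields so that slashes inside the
    # "other" string are kept as part of that string.
    fields = value.split("/", 3)
    fields += [""] * (4 - len(fields))  # pad missing trailing fields
    year, month, day, other = fields
    return RisDate(_to_int(year), _to_int(month), _to_int(day), other.strip())


fall = parse_ris_date("2000///Fall")
```

The point is only that “other” is a defined slot holding a plain string, not leftover text that failed to parse.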

Note: I’m not saying this has to be enforced in CSL; just saying we
don’t want to introduce language that suggests otherwise (no pun
intended).

Bruce

In my implementation at least: “2000~”. So “circa 2000” won’t be valid.

Bruce

Are you hoping that Zotero will convert “circa 2000” to “2000~” for you? Or
should the content of the date-field in Zotero be constrained to strings
that can be parsed (like “2000~”)? Or will "circa " just be discarded when
the “circa 2000” string is sent to the CSL processor (yielding “2000”)?

Are you hoping that Zotero will convert “circa 2000” to “2000~” for you?

Yes and no.

Yes, I think Zotero should be a little more strict on date entry,
storage, export.

Yes, I think part of that means it should explicitly support approximate dates.

No, I don’t care that much personally (and part of the reason I’m
writing the Python version is because I don’t always want to use
Zotero).

Or should the content of the date-field in Zotero be constrained to strings
that can be parsed (like “2000~”)? Or will "circa " just be discarded when
the “circa 2000” string is sent to the CSL processor (yielding “2000”)?

That’s up to Dan and Simon, but I don’t think it’s too much to ask
them to send “2000~” (nor to store and export it that way).
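
For illustration, a minimal Python sketch of that conversion. The set of recognised markers (“circa”, “ca.”, “c.”) and the restriction to four-digit years are assumptions made for this example, not anything Zotero is known to do:

```python
import re

# Hypothetical list of "approximately" markers followed by a 4-digit year.
_CIRCA = re.compile(r"^\s*(?:circa|ca\.?|c\.)\s*(\d{4})\s*$", re.IGNORECASE)


def normalize_approximate(value: str) -> str:
    """Rewrite e.g. "circa 2000" as "2000~"; leave other input alone."""
    match = _CIRCA.match(value)
    if match:
        return match.group(1) + "~"
    return value.strip()


normalized = normalize_approximate("circa 2000")
```

Anything the pattern does not recognise is passed through unchanged, so the caller can still decide whether to reject it or store it as free text.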

Yes, I think Zotero should be a little more
strict on date entry, storage, export.
Yes, I think part of that means it should explicitly support approximate dates.

What do you think should be done on import though? Unfortunately (or
fortunately depending on your point of view) most
reference management software treats its data just as a bunch of
labeled fields which may contain arbitrary strings. If they
pipe that data through CSL and it gets lost in the process then they
will probably blame the CSL tool.

I imagine if you try to capture or explicitly deal with all the little
details people might want to include in these fields then
the spec could get very large.

Regards,
Robert.

2009/9/7 Bruce D’Arcus <@Bruce_D_Arcus1>:

Yes, I think Zotero should be a little more
strict on date entry, storage, export.
Yes, I think part of that means it should explicitly support approximate dates.

What do you think should be done on import though?

There’s a chicken-and-egg problem here, and a reasonable question of
where (or whether) the normalization happens.

“Import” from where? If it’s from MARC data, you can parse c. and
original dates.

Unfortunately (or
fortunately depending on your point of view) most
reference management software treats its data just as a bunch of
labeled fields which may contain arbitrary strings.

I’m not so sure.

RIS dates are structured: YYYY/MM/DD/other.

If they pipe that data through CSL and it gets lost in the process then they
will probably blame the CSL tool.

And if it can’t sort their bibliography correctly, they’ll probably
also blame the CSL tool.

I imagine if you try to capture or explicitly deal with all the little
details people might want to include in these fields then
the spec could get very large.

I’m not willing to support every conceivable little eccentricity, but
I am willing to support:

approximate dates
ranges
BC

All of these are covered in the new EDTF effort from the LOC:

http://www.loc.gov/standards/datetime/

I’m also happy to support original dates (which we already have).

The only other thing is the “other” dates in RIS and CSL (the primary
use case being “Fall” and such).

Doesn’t that cover 99% of use cases?

Bruce

“Import” from where? If it’s from MARC data, you can parse c. and
original dates.

I was referring to your comments about whether Zotero (or similar
software) should store and export things like ‘circa’
in dates.

Doesn’t that cover 99% of use cases?

It covers everything that I can think of. As always I expect that
there will be a researcher somewhere who has some particular
requirement that is very important to them that we haven’t thought of.
I think that the list of ‘date features to support’ you mentioned is
a sensible
place to draw the line though.

Regards,
Robert.

2009/9/7 Bruce D’Arcus <@Bruce_D_Arcus1>:

OK, let’s talk data and data exchange.

Zotero’s primary export format is RDF, which they’re now upgrading to
support BIBO. In either case, really, Dublin Core is the date
representation.

Let’s take a simple Turtle representation of this RDF. We could do:

<http://example.org/1> a bibo:Article ;
    dcterms:issued "200 BC" .

…or:

<http://example.org/1> a bibo:Article ;
    dcterms:issued "circa 1345" .

… which are worthless (for sorting, at least) without additional
processing, which may not be entirely reliable.

Or we could do:

<http://example.org/1> a bibo:Article ;
    dcterms:issued "-0200"^^xsd:gYear .

…or:

<http://example.org/1> a bibo:Article ;
    dcterms:issued "1345~"^^loc:edtf .

These dates will sort correctly without modification.
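
A quick Python sketch of the difference. The key function is an assumption about how a consumer might read the structured values (it only strips the trailing “~” and reads the rest as a signed year); it is not part of any of the formats discussed:

```python
def sort_key(value: str) -> int:
    """Read "-0200", "1345~" or "1999" as a signed integer year."""
    return int(value.rstrip("~"))


free_text = ["circa 1345", "200 BC", "1999"]
structured = ["1345~", "-0200", "1999"]

# Plain string comparison: digits sort before letters, so 1999 comes first.
sorted_free = sorted(free_text)

# Structured values: chronological order falls out of the data itself.
sorted_structured = sorted(structured, key=sort_key)
```

The free-text list comes out as 1999, 200 BC, circa 1345, which is the wrong order; the structured list comes out as -0200, 1345~, 1999.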

We can also do this in HTML with RDFa …

c.1345

… where we start to allow full round-tripping of data, without funky
and unreliable text parsing.

Bruce