Datetime spec, please review

I’ll be away the next two weeks (and much of the week following). It would
be good if those of you who have time could review the specification:

http://www.loc.gov/standards/datetime/spec.html

and the BNF

http://www.loc.gov/standards/datetime/bnf.html

during the next three weeks.

I’d like to think that the spec is stable, but I suspect that there are some
errors/discrepancies in the BNF. BNFs are always better when more people
look at them.

–Ray

---------- Forwarded message ----------
From: "Denenberg, Ray" <rden@loc.gov>
Date: Aug 12, 2011 3:32 PM
Subject: Datetime spec, please review
To: DATETIME@listserv.loc.gov

I’m glad to see that season made the final specification. I think that
CSL would be well-served by Level 1 with some elements of Level 2 (a
complete implementation of Level 2 might be too much to ask in the
near future). Specifically, the multiple dates (203, 204) and calendar
(207) elements of Level 2 have demonstrated demand in the citation
world; both have come up multiple times on the Zotero forums.

We talked about making EDTF a part of CSL; it looks like we’ll be able
to take Level 1, or Level 1 plus some bits of Level 2, and make that happen.

We’ll probably need to agree on sorting for seasons and a calendar
vocabulary, both unspecified by the current document, but this looks
like a nice improvement, and pretty much exactly what we needed.

Avram
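
One way the season-sorting question could be settled is sketched below. The draft writes seasons as codes 21 to 24 in the month position (e.g., 2001-21 for spring 2001) but does not say how they sort. The sketch assumes each season sorts as its northern-hemisphere opening month. That mapping and the sort_key helper are illustrative assumptions, not taken from the spec or from CSL. Only year and year-month/season strings are handled.

```ruby
# Assumption (not from the spec): a season sorts as the month in which it
# begins in the northern hemisphere.
SEASON_START_MONTH = { 21 => 3, 22 => 6, 23 => 9, 24 => 12 }.freeze

# Hypothetical helper: builds a [year, month] sort key from "YYYY" or
# "YYYY-MM" strings, where MM may be a season code (21-24).
def sort_key(edtf)
  year, part = edtf.split('-').map(&:to_i)
  # A bare year gets month 0 so it sorts before anything within that year.
  month = SEASON_START_MONTH.fetch(part, part || 0)
  [year, month]
end

dates = ['2001-22', '2001-02', '2001', '2001-21', '2000-24']
puts dates.sort_by { |d| sort_key(d) }
# 2000-24, 2001, 2001-02, 2001-21, 2001-22 (one per line)
```

A different convention (for example, treating winter as the start of the year) would only require changing the mapping table.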

I’ve updated the edtf Ruby gem to reflect the latest draft. The parser is based on the BNF with minimal changes to resolve a few conflicts and it works for all extensions except 201 and 205. The multiple dates (203, 204) right now return a simple array, which is to say that whilst features such as earlier/later are parsed correctly, that information is not exposed by the API. Eventually, we will need a new class to wrap these sets of dates.

A few minor issues will have to be resolved (e.g., how to deal with open/unknown intervals or how to sort intervals and seasons), but overall you should be able to experiment with the format if you are so inclined. Simply run gem install edtf and require ‘edtf’; you can then parse EDTF strings with the Date.edtf class method.

Sylvester

> I’ve updated the edtf Ruby gem to reflect the latest draft.

Awesome!

> The parser is based on the BNF with minimal changes to resolve a few conflicts and it works for all extensions except 201 and 205. The multiple dates (203, 204) right now return a simple array, which is to say that whilst features such as earlier/later are parsed correctly, that information is not exposed by the API. Eventually, we will need a new class to wrap these sets of dates.
>
> A few minor issues will have to be resolved (e.g., how to deal with open/unknown intervals or how to sort intervals and seasons), but overall you should be able to experiment with the format if you are so inclined. Simply, gem install and require ‘edtf’; you then parse EDTF with the Date.edtf class method.

So does this suggest, then, that you’re happy with the spec? I tried
very much to be a voice for keeping it more-or-less easy-to-implement
(and therefore cutting features of questionable benefit that might not
be).

Now’s the time to get in feedback.

Bruce

Is the following correct according to the spec (note end “to” date for
the interval)?

irb(main):014:0> d = Date.edtf('1984/1985')
=> #<EDTF::Interval:0x45932c @from=#<Date: 1984-01-01
(4891401/2,0,2299161)>, @to=#<Date: 1985-01-01 (4892133/2,0,2299161)>>
irb(main):015:0> d.from.to_s
=> "1984-01-01"
irb(main):016:0> d.to.to_s
=> "1985-01-01"

Bruce

Well spotted! Reading the spec again, it’s probably more like 1984-01-01T00:00:00 to 1985-12-31T23:59:59. The Interval class will have to be fleshed out in the future. Right now it is a very thin wrapper around a Range.

Sylvester
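
The reading discussed above, where a year-precision interval runs from the first day of the start year through the last day of the end year, can be sketched with Ruby's standard Date class. The year_interval_bounds helper is hypothetical and is not part of the edtf gem. It works at day granularity and only accepts the plain "YYYY/YYYY" form.

```ruby
require 'date'

# Hypothetical helper, not part of the edtf gem: expands a year-precision
# interval such as "1984/1985" to the first and last day it covers.
def year_interval_bounds(edtf)
  from_year, to_year = edtf.split('/').map { |year| Integer(year, 10) }
  first_day = Date.new(from_year, 1, 1)   # start of the first year
  last_day  = Date.new(to_year, 12, 31)   # end of the last year
  [first_day, last_day]
end

from, to = year_interval_bounds('1984/1985')
puts from  # 1984-01-01
puts to    # 1985-12-31
```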

It just occurred to me that I may have misunderstood your question slightly. I implemented Interval so as to stay very close to regular Ranges in order to support checks such as whether or not a given date falls into the interval. For that reason, we need the ‘from’ and ‘to’; however, the Interval class will remember that the start and end date were specified using ‘year precision’; this is important if we want to generate an EDTF string from the Interval and may be important for other checks. So, in other words, the from/to are not intended to be a direct mapping of the EDTF string, rather the Interval object is the mapping and will eventually need to expose a better API to all the features provided by EDTF intervals.

Sylvester
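
The design described above (Range-like bounds plus a remembered precision) might look roughly like the following. This is a minimal sketch: SketchInterval, its constructor arguments, and to_edtf are hypothetical names chosen for illustration, and the gem's actual EDTF::Interval may be organized differently.

```ruby
require 'date'

# Hypothetical class for illustration only; not the edtf gem's implementation.
class SketchInterval
  # strftime patterns used to re-serialize at the remembered precision.
  FORMATS = { year: '%Y', month: '%Y-%m', day: '%Y-%m-%d' }.freeze

  attr_reader :from, :to, :precision

  def initialize(from, to, precision = :day)
    @from = from
    @to = to
    @precision = precision
  end

  # Membership checks are delegated to a plain Range of dates.
  def cover?(date)
    (from..to).cover?(date)
  end

  # Regenerates the EDTF string using the precision the interval was given in.
  def to_edtf
    format = FORMATS.fetch(precision)
    "#{from.strftime(format)}/#{to.strftime(format)}"
  end
end

interval = SketchInterval.new(Date.new(1984, 1, 1), Date.new(1985, 12, 31), :year)
puts interval.cover?(Date.new(1985, 6, 15))  # true
puts interval.to_edtf                        # 1984/1985
```

Keeping the precision separate from the bounds lets the from/to dates stay ordinary Date objects for range checks while the original string form remains recoverable.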

I’m happy with the spec so far. I’ve used the BNF for now, but it might be quicker to implement the parser with regular expressions (and then rule out some border cases programmatically) – or, if you know a parser generator for JavaScript, we might be able to quickly port the parser, too.

Sylvester
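
The regular-expression approach mentioned above could be sketched as follows: a permissive pattern matches the overall shape, and the border cases are then ruled out in code. The sketch assumes only plain YYYY, YYYY-MM and YYYY-MM-DD dates. Season codes, qualifiers and intervals are deliberately out of scope, and parse_simple_date is a hypothetical name.

```ruby
require 'date'

# Assumption: only plain dates of the form YYYY, YYYY-MM or YYYY-MM-DD.
DATE_PATTERN = /\A(\d{4})(?:-(\d{2})(?:-(\d{2}))?)?\z/

# Returns [year, month, day] (month and day may be nil), or nil if the
# string is not a valid date of one of the supported forms.
def parse_simple_date(string)
  match = DATE_PATTERN.match(string)
  return nil unless match

  year  = match[1].to_i
  month = match[2]&.to_i
  day   = match[3]&.to_i

  # Border cases the pattern lets through, e.g. month 13 or February 30.
  return nil if month && !(1..12).cover?(month)
  return nil if day && !Date.valid_date?(year, month, day)

  [year, month, day]
end

p parse_simple_date('2011-08-12')  # [2011, 8, 12]
p parse_simple_date('2011-02-30')  # nil
```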

I don’t, but a quick Google search turned up this:

https://github.com/jessesielaff/Riff

??

Bruce

This may just be ideal; I am using racc myself, so my grammar file should work right away.

Reviewing the EDTF draft, I just noticed the distinction between uncertain
and approximate dates. Unfortunately, this got mixed up a bit in CSL 1.0 (we
have an is-uncertain-date conditional, but the spec gives an example of its
use for an approximate date). We could introduce an additional
is-approximate-date conditional, if desired.

Rintze

To be honest, I don’t really understand the practical difference
between the two, or how we would represent them, beyond “c.2000”.

Thoughts?

Bruce

http://www.loc.gov/standards/datetime/spec.html includes definitions of the
two. Anyway, it’s not a big problem, but ideally we should have used
is-approximate-date instead of is-uncertain-date.

Rintze

As far as I understand it, the practical distinction is that approximate (~) dates express a certain degree of confidence (e.g., 2001~ means c. 2001, it might have been 2000 or 2002 but in any case it is close to 2001), whereas uncertain (?) dates could be farther off (e.g., 1838? could mean 1838, or was it 1383?).

Where bibliographic references are concerned I think only approximate dates are actually used. It wouldn’t hurt to support both, though. In any case, I agree with Rintze that it would probably be a good idea to introduce the is-approximate-date conditional.

Best,
Sylvester
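
To make the distinction concrete, here is a minimal sketch of how the two trailing markers could be separated into flags that separate is-uncertain-date and is-approximate-date conditionals might test. The split_qualifier helper and its return shape are hypothetical. Only the two single-character suffixes discussed in this thread are handled.

```ruby
# Hypothetical helper: splits a trailing "?" (uncertain) or "~" (approximate)
# off an EDTF date string and reports which flag was present.
def split_qualifier(edtf)
  case edtf
  when /\A(.+)\?\z/
    { date: $1, uncertain: true, approximate: false }
  when /\A(.+)~\z/
    { date: $1, uncertain: false, approximate: true }
  else
    { date: edtf, uncertain: false, approximate: false }
  end
end

p split_qualifier('2001~')[:approximate]  # true
p split_qualifier('1838?')[:uncertain]    # true
p split_qualifier('1984')[:date]          # "1984"
```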