I’m not sure about adding an additional category as well as “raw”. As I understand it, CSL is agnostic as to how dates are presented, simply requiring a processor to render day, month (with season as a substitute if not rendering day) and year.
As things stand the input format is not really specified. It happens (if I understand it right) that citeproc-js envisages an array form (date-parts), a raw form (raw), a literal form (literal) and also a circa flag. But none of that is required by CSL, which is agnostic as to data input format. (In fact I’m not sure that the treatment of literal dates and the fallback behaviour with raw dates, sensible as it is, is even sanctioned …)
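For concreteness, here is a rough sketch of those input shapes as I understand them, written as Python dicts. The key names (date-parts, raw, literal, circa) are the ones mentioned above; the sample values and the input_form helper are purely illustrative and are not part of CSL or citeproc-js.

```python
# Illustrative date variables in the shapes citeproc-js is described as accepting.
# The values are made up for the example.
structured = {"date-parts": [[2019, 4, 19]]}
raw = {"raw": "19 April 2019"}
literal = {"literal": "Spring, in the third year of the war"}
approximate = {"date-parts": [[2019]], "circa": True}


def input_form(date):
    """Return which input form a date variable uses (hypothetical helper)."""
    for key in ("date-parts", "raw", "literal"):
        if key in date:
            return key
    raise ValueError("unrecognised date form")
```

Note that circa is a flag sitting alongside one of the other forms rather than a form of its own.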
Can one not say that raw effectively means “This thing here is a date which I understand: see if you can make sense of it. If you can, treat it as a date”? At that point it seems to me that it’s open to any processor to decide whether it will handle raw, and if so how generously or not. It could have a fantastically indulgent parser which would understand, say, AD XIII Kal Mai MMDCCLXXI AUC if it wanted, and produce [ [2019, 4, 19] ] … or not.
In which case, there’s no particular reason to have a separate field for EDTF, any more than one needs a separate field for, say, “11 August 2018” or “August 11, 2018”. All one needs is an offer to recognize certain formats, and an order of priority for parsing ambiguous formats like “4/4/2019”. I’m not sure one even needs to modify an EDTF parser, does one? Can’t one just say “If this parses as valid EDTF, then it will be treated as EDTF”? If it doesn’t, it may (or may not) parse in some other way. After all, parsing a date is not usually an expensive thing, so one can fail and fail repeatedly. The only thing one needs is a clear set of rules about which reading will be preferred for ambiguous forms.
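To make the “try EDTF first, then fall back” idea concrete, here is a minimal sketch in Python. It is not how citeproc-js or any real processor works: parse_edtf_subset only recognises plain YYYY, YYYY-MM and YYYY-MM-DD (real EDTF covers far more), the function names are hypothetical, and the day-first preference for slash dates is just one possible priority rule chosen for illustration. The month-name formats rely on strptime running in an English locale.

```python
import re
from datetime import datetime


def parse_edtf_subset(text):
    """Accept only YYYY, YYYY-MM or YYYY-MM-DD; a tiny stand-in for EDTF."""
    m = re.fullmatch(r"(\d{4})(?:-(\d{2})(?:-(\d{2}))?)?", text)
    if m is None:
        return None
    parts = [int(g) for g in m.groups() if g is not None]
    try:
        # Pad with 1s so datetime can validate the month and day ranges.
        datetime(*(parts + [1, 1])[:3])
    except ValueError:
        return None
    return parts


def strptime_parser(formats):
    """Build a parser that tries each strptime format in the order given."""
    def parser(text):
        for fmt in formats:
            try:
                d = datetime.strptime(text, fmt)
            except ValueError:
                continue
            return [d.year, d.month, d.day]
        return None
    return parser


# Order is the priority rule: earlier parsers win for ambiguous input.
PARSERS = [
    parse_edtf_subset,
    strptime_parser(["%d %B %Y", "%B %d, %Y"]),
    # Assumed preference: day/month/year before month/day/year.
    strptime_parser(["%d/%m/%Y", "%m/%d/%Y"]),
]


def parse_raw(text):
    """Return a date-parts dict if any parser understands text, else None."""
    for parser in PARSERS:
        parts = parser(text.strip())
        if parts is not None:
            return {"date-parts": [parts]}
    return None  # What to do with an unparsed raw date is the processor's call.
```

Under these assumptions “3/4/2019” is read as 3 April, while “4/13/2019” fails the day-first reading and falls through to month-first. Swapping the order of the last two formats is all it takes to change the preference, which is the sense in which only the priority rules need to be agreed.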