Sort collation question

A user on the Zotero forums has reported a sort result that apparently
differs from an example given in the MLA Handbook, 7th ed. The cause is
that the space is included in the sort key, so that “De Quincey” (treating
the De as part of the last name) sorts as:

De Quincey
Deniehy

If the space were ignored, the order would be:

Deniehy
De Quincey

This behaviour appears to be a feature of the English Unicode collation.
I’ve scratched around a little for information on preferred sort methods,
but haven’t found anything to add to the MLA example cited by user iselim
in the discussion linked above.

Stripping spaces from the processor sort keys would align behaviour with
the MLA example, but I don’t know what general expectations are for sorting
of space characters (and hyphens…).
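As a rough illustration of the two orderings above, here is a minimal Python
sketch. It uses plain case-folded string comparison rather than a real Unicode
collator, and the function names are hypothetical, not part of any processor.
It only shows the effect of dropping whitespace from the key.

```python
def key_with_spaces(name):
    # Case-fold only; the space stays in the key and sorts before letters.
    return name.casefold()


def key_without_spaces(name):
    # Remove all whitespace before case-folding (assumed MLA-style behaviour).
    return "".join(name.split()).casefold()


names = ["Deniehy", "De Quincey"]

print(sorted(names, key=key_with_spaces))     # ['De Quincey', 'Deniehy']
print(sorted(names, key=key_without_spaces))  # ['Deniehy', 'De Quincey']
```

Hyphens and apostrophes are left untouched here; whether they should also be
stripped is exactly the open question.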

Can anyone offer clues on the specifics of bibliographic sort requirements,
and whether they vary across styles and locales?

Frank - your impulse to look into broader (Unicode) sorting collations
seems right to me. I don’t want to get into rolling our own solutions for
this kind of thing.

Bruce

In French, collation ignores apostrophes, hyphens and whitespace. Diacritics (acute, grave and circumflex accents, diaeresis/trema, cedilla) are ignored as well, except to order homographs (e.g. cote/côte/coté/côté; likewise, whitespace sorts before the hyphen: vice versa/vice-versa).
See: http://www.cuy.be/orthotypo/classem_alphab.htm#regles (in French but there’s an example which makes the “rules” easier to understand)
> Date: Sat, 10 Nov 2012 15:31:19 +0900
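The French rules described above could be sketched as a two-level sort key.
This is only an approximation under stated assumptions: the set of ignored
punctuation and the tie-break (comparing the decomposed original strings) are
my own choices, not taken from the linked page, so the relative order of
accented homographs may differ from the French convention.

```python
import unicodedata

# Assumed set of ignorable punctuation: ASCII and typographic apostrophes
# and hyphens. Whitespace is handled separately via str.isspace().
IGNORED = {"'", "\u2019", "-", "\u2010", "\u2011"}


def primary_key(text):
    # Decompose so that accents become separate combining characters,
    # then drop accents, whitespace and the ignorable punctuation.
    decomposed = unicodedata.normalize("NFD", text)
    kept = [
        ch
        for ch in decomposed
        if not unicodedata.combining(ch)
        and not ch.isspace()
        and ch not in IGNORED
    ]
    return "".join(kept).casefold()


def sort_key(text):
    # The second element only matters when primary keys are equal
    # (homographs); this tie-break is an assumption.
    return (primary_key(text), unicodedata.normalize("NFD", text))


print(primary_key("côté"))  # cote
print(sorted(["vice-versa", "vice versa"], key=sort_key))
```

With this key, "Deniehy" sorts before "De Quincey", as in the MLA example,
because the space no longer takes part in the primary comparison.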