Field and Entry mapping for CiteProc implementations

Hello all,

I just sent a mail to Erik Wilde to ask whether he is okay with us
taking two XML files from his ShaRef project
(http://dret.net/projects/sharef/). One contains TeX-to-Unicode
mappings, which I needed for BibTeX import. The other is an XML file
that maps BibTeX entries to internal representations. Since all
implementations need such mappings, I thought it would be best to
compile them into a single XML file that contains all mappings from
external formats to the ones that CSL uses/expects. The file contains
mappings both for entry types and for fields.
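
To make the idea concrete, here is a minimal sketch of how an implementation might read such a file and apply it. The XML layout (the `bibmap`, `format`, `type` and `field` elements and their `from`/`to` attributes) is an assumption made up for illustration, not the actual schema of bibmap.xml, and the target names have not been checked against CSL.rnc. The sample entry is made up as well.

```python
import xml.etree.ElementTree as ET

# Hypothetical layout for a shared mapping file. Element names, attribute
# names and target values are assumptions, not the real bibmap.xml schema.
BIBMAP_XML = """\
<bibmap>
  <format name="bibtex">
    <type from="inbook" to="chapter"/>
    <type from="phdthesis" to="thesis"/>
    <field from="journal" to="container-title"/>
    <field from="booktitle" to="container-title"/>
    <field from="address" to="publisher-place"/>
  </format>
</bibmap>
"""


def load_mappings(xml_text, format_name):
    """Return (type_map, field_map) dicts for one external format."""
    root = ET.fromstring(xml_text)
    for fmt in root.findall("format"):
        if fmt.get("name") == format_name:
            types = {t.get("from"): t.get("to") for t in fmt.findall("type")}
            fields = {f.get("from"): f.get("to") for f in fmt.findall("field")}
            return types, fields
    raise KeyError(format_name)


def map_entry(entry_type, fields, type_map, field_map):
    """Rename the entry type and field names; unmapped names pass through."""
    mapped_type = type_map.get(entry_type, entry_type)
    mapped_fields = {field_map.get(k, k): v for k, v in fields.items()}
    return mapped_type, mapped_fields


type_map, field_map = load_mappings(BIBMAP_XML, "bibtex")
mapped = map_entry(
    "inbook",
    {"title": "A Chapter", "booktitle": "Collected Essays", "address": "Delft"},
    type_map,
    field_map,
)
print(mapped)
```

Because every implementation would read the same file, a change to a mapping only has to be made in one place.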

I’ll add them to the repository when I get the go-ahead from Erik. His
version already contains the mappings needed for BibTeX and for two
versions (7 and 8) of EndNote XML.

By putting this mapping in a central XML file we can ensure that
different implementations of CiteProc all map the fields the same way,
which makes it much more likely that they produce similar output.

While adapting the bibmap.xml file to the entry types and fields in
CSL.rnc I have already run into quite a few open questions, so I’ll
surely need some help in deciding on the correct mappings.

I’ll ping the list when I’ve added the files.

Cheers,

Johan
http://www.johankool.nl/

Johan Kool wrote:

I just sent a mail to Erik Wilde to ask whether he is okay with us
taking two XML files from his ShaRef project
(http://dret.net/projects/sharef/). One contains TeX-to-Unicode
mappings, which I needed for BibTeX import. The other is an XML file
that maps BibTeX entries to internal representations. Since all
implementations need such mappings, I thought it would be best to
compile them into a single XML file that contains all mappings from
external formats to the ones that CSL uses/expects. The file contains
mappings both for entry types and for fields.

This is a good idea. You might also look at the code for bibutils, which
has similar sorts of mappings in (C) hashes.

Could be an excellent source for documentation, too.

There may be some problems in mapping more complex data representations
to what are essentially flat key-value pairs, though.
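
As a rough illustration of that concern, the sketch below flattens a nested record into a fixed set of flat keys. The record layout, the `part_of` key and the choice of `container-title` as a flat key are all assumptions for the example, not taken from any of the formats discussed here.

```python
# A made-up nested record: a chapter, inside a book, inside a series.
record = {
    "title": "A Chapter",
    "part_of": {
        "title": "Collected Essays",
        "part_of": {"title": "A Series"},
    },
}


def flatten(rec, keys=("title", "container-title")):
    """Assign each nesting level's title to the next flat key.

    Levels deeper than the available keys cannot be represented and are
    returned separately so the loss is visible.
    """
    flat, dropped = {}, []
    node, depth = rec, 0
    while node is not None:
        if depth < len(keys):
            flat[keys[depth]] = node["title"]
        else:
            dropped.append(node["title"])
        node = node.get("part_of")
        depth += 1
    return flat, dropped


flat, dropped = flatten(record)
print(flat)
print(dropped)
```

With only two flat keys available, the third level has nowhere to go unless the mapping file defines an extra key for it.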

Bruce

The latest version of the Zotero BibTeX translator also has similar
mappings. We removed a few problematic TeX<->Unicode mappings that were
causing, for example, all ‘F’ characters to be imported as the ‘degree
Fahrenheit’ symbol and all potential ligatures (e.g. ‘ff’) as single
characters, so you might want to check your tables for similar issues.

https://www.zotero.org/svn/extension/branches/1.0/scrapers.sql (800KB
file, but search for ‘BibTeX’)
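
A quick way to check a table for the kind of entries described above might look like the sketch below. The table excerpt is made up to reproduce the two failure modes, and `risky_keys` is a hypothetical helper, not part of the Zotero translator or ShaRef.

```python
# Made-up excerpt of a TeX-to-Unicode table, for illustration only.
TEX_TO_UNICODE = {
    r"\'e": "\u00e9",  # e with acute accent
    r"\ss": "\u00df",  # sharp s
    "F": "\u2109",     # degree Fahrenheit sign: would replace every plain "F"
    "ff": "\ufb00",    # ff ligature: would replace every "ff" in ordinary words
}


def risky_keys(table):
    """Return keys consisting only of ASCII letters or digits.

    Such keys contain no backslash or other TeX markup, so a naive
    search-and-replace would also match them inside ordinary text.
    """
    return sorted(k for k in table if k.isascii() and k.isalnum())


print(risky_keys(TEX_TO_UNICODE))
```

Anything this reports is worth reviewing by hand before the table is used for import.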