journal file and short titles

Re: the question of naming styles, I recall that there are sort of
standard abbreviations for journals. Those might be sensible to use
then.

But I’m not aware of any nice web service that gives you these.

So I went back to a large CSV file I downloaded from the OCLC way
back. It’s basically their list of periodicals. I first converted it
to simple XML, and have this sort of thing:










I then have an XSLT stylesheet that processes it a bit and yields:

<Periodical issn="0001-4273">
   <title>Academy of Management Journal</title>
   <title>ACADEMY OF MANAGEMENT JOURNAL</title>
   <title>Academy of Management journal.</title>
   <title>Acad Manage J</title>
</Periodical>

An interesting trick is that most of the entries from the MEDLINE
database are shortened titles. So I can just use the XSLT to derive
abbreviated titles. It generally works, but unabbreviated variants
still slip through, so right now I’m getting stuff like:

<Periodical issn="0001-320X">
   <shortTitle>Aberdeen University review.</shortTitle>
   <shortTitle>Aberdeen Univ Rev</shortTitle>
   <shortTitle>Aberdeen University Review</shortTitle>
</Periodical>

Anyone know a good regular expression to distinguish an abbreviated
title? Looking for "J " is easy, but not reliable.
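
One possible angle, sketched in Python rather than XSLT: instead of a
single regular expression, compare each candidate against the other
titles recorded for the same ISSN, and call it abbreviated when every
word is a shortened form (same first letter, letters in order) of the
matching word in a sibling title. The function names and the
STOP_WORDS list are made up for illustration, and this is only a
heuristic; it has not been run against the real file.

```python
import re

# Hypothetical list of words that abbreviations commonly drop.
STOP_WORDS = {"of", "the", "and", "for", "in", "on"}


def words(title):
    """Lowercased alphabetic words of a title, punctuation ignored."""
    return re.findall(r"[^\W\d_]+", title.lower())


def is_subsequence(short, full):
    """True if the letters of `short` appear in `full` in the same order."""
    remaining = iter(full)
    return all(ch in remaining for ch in short)


def abbreviates(short_word, full_word):
    """True if short_word could be a shortened (or equal) form of full_word."""
    return short_word[0] == full_word[0] and is_subsequence(short_word, full_word)


def is_abbreviation_of(candidate, full):
    """True if `candidate` looks like an abbreviated form of `full`."""
    cand, full_words = words(candidate), words(full)
    if not cand or cand == full_words:
        return False
    i, shortened = 0, False
    for w in cand:
        # Skip stop words in the full title that the candidate dropped.
        while i < len(full_words) and not abbreviates(w, full_words[i]):
            if full_words[i] not in STOP_WORDS:
                return False
            i += 1
        if i == len(full_words):
            return False
        shortened = shortened or len(w) < len(full_words[i])
        i += 1
    # Anything left over in the full title must be a stop word.
    if any(fw not in STOP_WORDS for fw in full_words[i:]):
        return False
    return shortened


def split_titles(titles):
    """Split the titles for one ISSN into (abbreviated, full) lists."""
    short, full = [], []
    for t in titles:
        if any(is_abbreviation_of(t, other) for other in titles if other != t):
            short.append(t)
        else:
            full.append(t)
    return short, full
```

Calling split_titles() on the three Aberdeen titles above should put
only "Aberdeen Univ Rev" in the abbreviated list. A journal whose only
recorded title is already abbreviated has nothing to be compared
against, so it would still need some other check.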

With a clean file, it could be a good lookup table, since I’d rather
not have to think about the business of abbreviation myself!
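
For what it’s worth, here is a rough sketch of how such a lookup table
might be loaded, again in Python. It assumes a cleaned file in which
each Periodical holds its full forms in title elements and exactly
one abbreviation in shortTitle; the Periodicals wrapper element, the
sample string, and the function names are all assumptions for
illustration, not the actual layout of the file.

```python
import xml.etree.ElementTree as ET

# Hypothetical cleaned layout, built from the two records shown above.
SAMPLE = """<Periodicals>
  <Periodical issn="0001-4273">
    <title>Academy of Management Journal</title>
    <title>ACADEMY OF MANAGEMENT JOURNAL</title>
    <shortTitle>Acad Manage J</shortTitle>
  </Periodical>
  <Periodical issn="0001-320X">
    <title>Aberdeen University review.</title>
    <shortTitle>Aberdeen Univ Rev</shortTitle>
  </Periodical>
</Periodicals>"""


def normalize(title):
    """Fold case, drop a trailing period, and collapse whitespace."""
    return " ".join(title.casefold().rstrip(".").split())


def build_lookup(xml_text):
    """Return (by_issn, by_title) dicts mapping to the abbreviated title."""
    by_issn, by_title = {}, {}
    root = ET.fromstring(xml_text)
    for periodical in root.iter("Periodical"):
        short = periodical.findtext("shortTitle")
        if not short:
            continue  # no abbreviation recorded for this journal
        by_issn[periodical.get("issn")] = short
        for title in periodical.findall("title"):
            if title.text:
                by_title[normalize(title.text)] = short
    return by_issn, by_title


by_issn, by_title = build_lookup(SAMPLE)
```

Keying on a normalized title means the all-caps and trailing-period
variants resolve to the same entry, so a lookup on whatever form turns
up in a citation has a fair chance of hitting.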

BTW, I was contemplating breaking up the file (which is about 7 MB)
by source database, since that’s a sort of proxy for broad subject
areas, but I’m not so sure.

Bruce