MultiMarkDown & CiteProc

Hello,

I have a question about using CiteProc in combination with
MultiMarkDown. I was thinking that using CiteProc to format citations
would be quite easy, since MultiMarkDown produces easily identifiable
output in html.

(citekey)

or

(#)

or

and

where the list should go.

So I would like to keep the html-file created by MultiMarkDown as is,
and only apply CiteProc to the above. I understand I need to add
these elements to in-driver.xsl by adding their XPath equivalents:

//span[@class='markdowncitation']/@id,
//span[@class='externalcitation']/@id,
//span[@class='notcited']/@id

And adding the XHTML namespace:

xmlns:xhtml="http://www.w3.org/1999/xhtml"

Am I correct so far?

But I am a bit lost as to what the document/… .xsl should
look like. It should be quite straightforward I’d think, but I’m not
sure yet how. I would think I need to create a new xsl file here;
I named it mmd-xhtml.xsl. How would I instruct CiteProc to process
the citations and leave the rest of the html alone?

I am btw not trying to start from a MultiMarkDown text document, but
from its output: the xhtml file, because I’d guess that it’s much
easier to do since at that point we have a valid xhtml file, and
CiteProc also already generates xhtml output.

Thanks!

Johan
http://www.geo.vu.nl/~jkool/

> Am I correct so far?

Yes.

> But I am a bit lost as to what the document/… .xsl should
> look like. It should be quite straightforward I’d think, but I’m not
> sure yet how. I would think I need to create a new xsl file here; I
> named it mmd-xhtml.xsl. How would I instruct CiteProc to process the
> citations and leave the rest of the html alone?

Well, it’s not so straightforward AFAIK. You need to create a series of
templates, where you do stuff like:

<xsl:template match="xhtml:p">
  <p>
    <xsl:apply-templates/>
  </p>
</xsl:template>

<xsl:template match="xhtml:q | xhtml:span">
  <xsl:copy-of select="."/>
</xsl:template>

The second just copies the content of the nodes it matches.

Bruce

Good morning!

I made some progress on getting CiteProc to work with MultiMarkDown
generated xhtml. I am a little confused about one thing though. I
would think that I’d call <xsl:call-template name="cp:format-citation">
at the place where I want the citation to appear. But how
does CiteProc know which reference it should put there? Should I not
somehow pass along an id for that?

I was thinking this was done in the in-driver.xsl but adding

//xhtml:span[@class='markdowncitation']/a/@href,
//xhtml:span[@class='externalcitation']/a/@id

had no effect. The citations in MMD XHTML look like this btw:

(1)
(Tilly2000a)

(I still need to figure out how to delete the # in the
markdowncitation, but it should have worked for the externalcitation
already.)

mmd-xhtml.xsl:

<?xml version="1.0" encoding="UTF-8"?>

<xsl:stylesheet
xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="2.0"
xmlns:xdoc="http://www.pnp-software.com/XSLTdoc"
xmlns:xs="http://www.w3.org/2001/XMLSchema"
xmlns:mods="http://www.loc.gov/mods/v3"
xmlns="http://www.w3.org/1999/xhtml"
xmlns:xhtml="http://www.w3.org/1999/xhtml"
xmlns:xi="http://www.w3.org/2001/XInclude"
xmlns:cs="http://purl.org/net/xbiblio/csl"
xmlns:cp="http://purl.org/net/xbiblio/citeproc"
xmlns:exist="http://exist.sourceforge.net/NS/exist"
exclude-result-prefixes="xdoc xhtml mods xs cp cs exist xi">
<xsl:import href="…/citeproc.xsl"/>
<xsl:output method="xhtml" omit-xml-declaration="yes" encoding="utf-8"
indent="yes"/>
<xdoc:doc type="stylesheet">
<xdoc:short>Stylesheet to transform XHTML-MMD (MultiMarkDown) to
XHTML.</xdoc:short>
<xdoc:author>Johan Kool</xdoc:author>
<xdoc:copyright>2006, Johan Kool</xdoc:copyright>
<xdoc:detail>
MultiMarkDown (MMD) formatted XHTML comes in, formatted XHTML
goes out.
</xdoc:detail>
</xdoc:doc>
<xsl:param name="include-bib">yes</xsl:param>
<xsl:param name="bibdb">flatfile</xsl:param>
<xsl:param name="bibinfile">rdfdata.xml</xsl:param>

<!-- Make a complete (almost) copy of the xhtml file -->
<!-- FIXME: make the copy also include DOCTYPE etc. -->
<xsl:template match="@*|node()">
   <xsl:copy>
	  <xsl:apply-templates select="@*|node()"/>
   </xsl:copy>
</xsl:template>

<!-- Format and insert citations -->
<xsl:template match="//xhtml:span[@class='markdowncitation']">
	<span class="citeproccitation">
		<xsl:call-template name="cp:format-citation">
			<xsl:with-param name="output-format" select="'xhtml'"/>
		</xsl:call-template>
	</span>
</xsl:template>
<xsl:template match="//xhtml:span[@class='externalcitation']">
	<span class="citeproccitation">
		<xsl:call-template name="cp:format-citation">
			<xsl:with-param name="output-format" select="'xhtml'"/>
		</xsl:call-template>
	</span>
</xsl:template>

<!-- Replace MMD bibliography section with CiteProc's equivalent -->
<xsl:template match="//xhtml:div[@class='bibliography']">
	<div class="bibliography">
		<h2>References</h2>
		<xsl:call-template name="cp:format-bibliography">
			<xsl:with-param name="output-format" select="'xhtml'"/>
		</xsl:call-template>
	</div>
</xsl:template>

</xsl:stylesheet>

Cheers,

Johan
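
For the "#" in the markdowncitation mentioned above, a minimal sketch of one way to strip it, using the standard XPath function substring-after(). It assumes the citation key sits in the href of the first xhtml:a child in the form "#key", as the XPath expressions in the message suggest. The output span is only a placeholder for illustration, not CiteProc's actual output, and no CiteProc template is called here. Note that the child element is addressed as xhtml:a, since unprefixed names in XPath do not pick up the XHTML namespace unless xpath-default-namespace is set.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xhtml="http://www.w3.org/1999/xhtml"
    exclude-result-prefixes="xhtml">

  <!-- Identity template: copy everything that is not handled below -->
  <xsl:template match="@*|node()">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>

  <!-- Assumption: the key is in xhtml:a/@href behind a leading '#' -->
  <xsl:template match="xhtml:span[@class='markdowncitation']">
    <xsl:variable name="refid"
        select="substring-after(xhtml:a[1]/@href, '#')"/>
    <!-- Placeholder output only; a real stylesheet would hand $refid
         (or an element built from it) to CiteProc instead -->
    <span xmlns="http://www.w3.org/1999/xhtml" class="citeproccitation">
      <xsl:value-of select="$refid"/>
    </span>
  </xsl:template>

</xsl:stylesheet>
```

With an href of "#Tilly2000a", $refid would hold Tilly2000a.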

I suspect the problem is in the later steps. After the template is
called, there are other places in the code where it looks for the
citation id. Just do a grep for db:biblioref.

I’ll have to think if there’s an elegant workaround for this. It might
involve first transforming the xhtml citation to an internal docbook
one. Not sure how to do that though.

But you could always just add the xhtml stuff to the relevant places
for now.

Bruce

> I’ll have to think if there’s an elegant workaround for this. It
> might involve first transforming the xhtml citation to an internal
> docbook one. Not sure how to do that though.

Wouldn’t it be better (and easier) to change it so that the call to
the format-citation template requires the id to be passed along too?
Just as right now it needs to know the kind of output?

So it would look something like this:

<xsl:template match="//xhtml:span[@class='externalcitation']">
	<xsl:call-template name="cp:format-citation">
		<xsl:with-param name="refid" select="./xhtml:a[1]/@id"/>
		<xsl:with-param name="output-format" select="'xhtml'"/>
	</xsl:call-template>
</xsl:template>

And then to use the refid param when it’s needed in the functions
called by the template for the citation instead of looking it up in
each place again? That seems more appropriate to me than to transform
to docbook internally first. Also, it would make CiteProc less
dependent on DocBook and so more open for use in other file formats too.

> But you could always just add the xhtml stuff to the relevant
> places for now.

Sounds messy. Wouldn’t it be better to try what I propose above?

Johan

On 15 Mar 2006, at 14:32, Bruce D’Arcus wrote:

> > I’ll have to think if there’s an elegant workaround for this. It
> > might involve first transforming the xhtml citation to an internal
> > docbook one. Not sure how to do that though.
>
> Wouldn’t it be better (and easier) to change it so that the call to
> the format-citation template requires the id to be passed along too?
> Just as right now it needs to know the kind of output?
>
> So it would look something like this:
>
>     <xsl:template match="//xhtml:span[@class='externalcitation']">
>         <xsl:call-template name="cp:format-citation">
>             <xsl:with-param name="refid" select="./xhtml:a[1]/@id"/>
>             <xsl:with-param name="output-format" select="'xhtml'"/>
>         </xsl:call-template>
>     </xsl:template>
>
> And then to use the refid param when it’s needed in the functions
> called by the template for the citation instead of looking it up in
> each place again? That seems more appropriate to me than to transform
> to docbook internally first. Also, it would make CiteProc less
> dependent on DocBook and so more open for use in other file formats
> too.

Yeah, good point. The problem is that you can’t just pass the id. In
some cases there are local modifications, so that you’d need to pass
full elements. That would be a possibility, where you’d basically
transform the xhtml citations into those docbook biblioref elements,
and pass those.

> > But you could always just add the xhtml stuff to the relevant places
> > for now.
>
> Sounds messy. Wouldn’t it be better to try what I propose above?

Yeah. Do you feel comfortable experimenting with what I suggest above?
I’m a little busy this week.

If not, I’ll take a look when I get a chance.

Bruce

> > And then to use the refid param when it’s needed in the functions
> > called by the template for the citation instead of looking it up
> > in each place again? That seems more appropriate to me than to
> > transform to docbook internally first. Also, it would make
> > CiteProc less dependent on DocBook and so more open for use in
> > other file formats too.
>
> Yeah, good point. The problem is that you can’t just pass the id.
> In some cases there are local modifications, so that you’d need to
> pass full elements.

What kind of modifications are you referring to? If you are talking
about various citations like author-only or such, then should it not
be possible to deal with those by introducing a third option?

E.g.

<xsl:template match="//xhtml:span[@class='externalcitation']">
	<xsl:call-template name="cp:format-citation">
		<xsl:with-param name="refid" select="./xhtml:a[1]/@id"/>
		<xsl:with-param name="type" select="./@type"/>
		<xsl:with-param name="output-format" select="'xhtml'"/>
	</xsl:call-template>
</xsl:template>

where type defaults to normal when not set?

> Yeah. Do you feel comfortable experimenting with what I suggest
> above? I’m a little busy this week.

My knowledge of XSLT is still quite limited, but experimenting with
this should be alright. With svn it’s always possible to go back a
version after all! :)

Johan

On 15 Mar 2006, at 15:59, Bruce D’Arcus wrote:

> What kind of modifications are you referring to? If you are talking
> about various citations like author-only or such,

In my case, year only. But yes.

> then should it not
> be possible to deal with those by introducing a third option?
>
> E.g.
>
>     <xsl:template match="//xhtml:span[@class='externalcitation']">
>         <xsl:call-template name="cp:format-citation">
>             <xsl:with-param name="refid" select="./xhtml:a[1]/@id"/>
>             <xsl:with-param name="type" select="./@type"/>
>             <xsl:with-param name="output-format" select="'xhtml'"/>
>         </xsl:call-template>
>     </xsl:template>
>
> where type defaults to normal when not set?

No, that won’t work.

Remember, a citation may contain more than one reference. Because of
that, it’s important to be able to sort them within the citation. One
can’t do that without processing all of them at once, and if one does
that, one has to pass full elements to properly track which one has
any modification.

Bruce
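
The point about sorting can be sketched roughly as follows. This is only an illustration under stated assumptions: the input is a hypothetical db:citation element holding several db:biblioref children, the role="year-only" attribute is a made-up marker for a local modification, and sorting by @linkend stands in for whatever sort key a real citation style would use. None of this is CiteProc's actual code.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:db="http://docbook.org/ns/docbook"
    exclude-result-prefixes="db">

  <xsl:output method="text"/>

  <!-- Hypothetical input: one citation with several references -->
  <xsl:template match="db:citation">
    <!-- All references are sorted together; because whole elements
         are kept, each one still carries its own modifier -->
    <xsl:variable name="sorted" as="element(db:biblioref)*">
      <xsl:perform-sort select="db:biblioref">
        <!-- Stand-in sort key; a real style would sort differently -->
        <xsl:sort select="@linkend"/>
      </xsl:perform-sort>
    </xsl:variable>
    <xsl:text>(</xsl:text>
    <xsl:for-each select="$sorted">
      <xsl:value-of select="@linkend"/>
      <!-- Made-up modifier attribute, for illustration only -->
      <xsl:if test="@role = 'year-only'"> [year only]</xsl:if>
      <xsl:if test="position() != last()">; </xsl:if>
    </xsl:for-each>
    <xsl:text>)</xsl:text>
  </xsl:template>

</xsl:stylesheet>
```

If only a list of ids were passed, the sort would still be possible, but the link between each id and its modifier would have to be tracked separately; passing the elements keeps the two together.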

In other words, you’d end up with something like:

<xsl:with-param name="bibliorefs" select="$bibliorefs" as="element()+"/>

… where $bibliorefs is a variable that:

a) in the case of docbook, just passes on the existing elements
b) otherwise, creates the docbook nodes

Bruce
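
A rough sketch of the two cases described above, under several assumptions: the bibliorefs parameter is the proposal from this thread rather than an existing CiteProc parameter, the DocBook 5 namespace and the linkend attribute are used for the generated nodes, and the cp:format-citation template defined here is only a stand-in so that the sketch runs on its own (in a real setup it would come from citeproc.xsl).

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xhtml="http://www.w3.org/1999/xhtml"
    xmlns:db="http://docbook.org/ns/docbook"
    xmlns:cp="http://purl.org/net/xbiblio/citeproc"
    exclude-result-prefixes="xhtml db cp">

  <!-- a) DocBook source: pass the existing elements straight on -->
  <xsl:template match="db:citation">
    <xsl:call-template name="cp:format-citation">
      <xsl:with-param name="bibliorefs" select="db:biblioref"/>
    </xsl:call-template>
  </xsl:template>

  <!-- b) XHTML source: create equivalent DocBook nodes first -->
  <xsl:template match="xhtml:span[@class='externalcitation']">
    <xsl:variable name="bibliorefs" as="element()+">
      <xsl:for-each select="xhtml:a[@id]">
        <db:biblioref linkend="{@id}"/>
      </xsl:for-each>
    </xsl:variable>
    <xsl:call-template name="cp:format-citation">
      <xsl:with-param name="bibliorefs" select="$bibliorefs"/>
    </xsl:call-template>
  </xsl:template>

  <!-- Stand-in for CiteProc's template (hypothetical signature):
       it only lists the ids it received -->
  <xsl:template name="cp:format-citation">
    <xsl:param name="bibliorefs" as="element()+"/>
    <xsl:value-of select="$bibliorefs/@linkend" separator="; "/>
  </xsl:template>

</xsl:stylesheet>
```

Either way the formatting code only ever sees db:biblioref elements, so the internal lookups would not need to know about XHTML.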

I think it would help me understand this if you could explain why the
docbook nodes solve these problems in a way that a MultiMarkDown
friendly representation of the citation doesn’t seem to be able to
do …

–J

It’s not so much that there’s something special about the docbook
representation (though it certainly is richer than what can be achieved
in XHTML; it’s purpose-designed after all), but rather that there has
to be some standardized code because internal processing depends on it.
This is why Johan can’t get it working now: citeproc is not
looking for XHTML source.

Bruce

BTW, I wonder if it’d be worth talking to Fletcher about changing the
"markdowncitation" class to something more generic?

Bruce