Promoting links in HTML output

As some of you know, in my own Python implementation I’ve been using an internal
model that is essentially an HTML subset + RDFa. The idea is to dump this as
is to get round-trippable output.

But this isn’t going to be for everyone. Still, it would help make output
smarter if at least processors output links where appropriate.

Is this something we can put in the spec as a recommendation?

Bruce

To be a little more specific, in order of importance:

  1. if there’s an HTTP URL in the output, wrap it in an a element
  2. if there’s a DOI in the output, wrap it in an a element, but use a
    link value with a base of ‘http://dx.doi.org/’ so it resolves
  3. on in-text citations, link to the respective anchor in the
    bibliography (might enable some fancy JavaScript to display the
    contents on hover?)
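
As a rough illustration of the three cases above, here is a minimal Python sketch. It is not taken from any existing processor: the function names, the `links` flag, and the `ref-doe1999` style of anchor id are all hypothetical, and the only things borrowed from the thread are the `http://dx.doi.org/` base and the idea of wrapping output in an a element.

```python
from html import escape
from urllib.parse import quote

# Resolver base named in the list above.
DOI_BASE = "http://dx.doi.org/"


def link_url(url: str) -> str:
    """Case 1: wrap an HTTP(S) URL in an a element; escape anything else."""
    if url.startswith(("http://", "https://")):
        return f'<a href="{escape(url)}">{escape(url)}</a>'
    return escape(url)


def link_doi(doi: str) -> str:
    """Case 2: display the bare DOI, but point the link at the resolver."""
    href = DOI_BASE + quote(doi, safe="/")
    return f'<a href="{escape(href)}">{escape(doi)}</a>'


def link_citation(text: str, ref_id: str) -> str:
    """Case 3: link an in-text citation to its bibliography anchor."""
    return f'<a href="#{escape(ref_id)}">{escape(text)}</a>'


def render_variable(name: str, value: str, links: bool = True) -> str:
    """Hypothetical hook: render a variable, adding links only if asked to."""
    if links and name == "DOI":
        return link_doi(value)
    if links and name == "URL":
        return link_url(value)
    return escape(value)
```

With `links=False` the value comes back escaped but unadorned, so an application that wants plain output is not forced to take the links.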

Bruce

I agree that processors should be able to output active links and
internal links in HTML output. As you say, it should probably be
optional to include the links, in case an application or user wants
the output unadorned, but it’s definitely a good idea, and I’d be in
favor of including it as a recommendation in the spec.

Since there are no CSL directives which would include URLs, does that mean you are referring to cases where URLs are specified in the input? That is to say, the issue is about whether or not a processor should strip tags, leave them, or even embellish them in some way?

Sylvester

I am not really following you. But I mean cases where a DOI or URL variable
prints output.

On Feb 28, 2011 4:23 AM, “Sylvester Keil” <@Sylvester_Keil> wrote:

On Feb 28, 2011, at 2:12 AM, Bruce D’Arcus wrote:

On Wed, Feb 23, 2011 at 10:00 AM, Bruce D’Arcus <@Bruce_D_Arcus1> wrote:

Sylvester


Frank Bennett <@Frank_Bennett> writes:

I agree that processors should be able to output active links and
internal links in HTML output. As you say, it should probably be
optional to include the links, in case an application or user wants
the output unadorned, but it’s definitely a good idea, and I’d be in
favor of including it as a recommendation in the spec.

I do not have any strong opinion on the subject. I happen to prefer
unadorned output, but implementing what Bruce is suggesting would be
easy in citeproc-hs. I too would prefer to make it an option, but that
is not trivial for the pandoc side, and so I need to see what the pandoc
users think about it. Maybe it could be the pandoc default behavior (it
would be trivial to write a script that post-processes the pandoc output
to remove the links).
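
A post-processing script of the kind mentioned above could be as small as the following Python sketch. It is only an illustration, not something pandoc or citeproc-hs ships: `strip_links` is a hypothetical name, and the regular expression assumes simple, well-formed a tags with no `>` inside attribute values. Real-world HTML would be safer with a proper parser.

```python
import re

# Matches opening and closing a tags only; \b keeps <abbr> and similar intact.
_A_TAG = re.compile(r"</?a\b[^>]*>", re.IGNORECASE)


def strip_links(html: str) -> str:
    """Remove a elements but keep the text they wrapped."""
    return _A_TAG.sub("", html)
```

For example, `strip_links('See <a href="http://dx.doi.org/10.1000/xyz123">10.1000/xyz123</a>.')` leaves just the sentence with the bare DOI.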

So this pertains only to URL and DOI variables that are rendered by the processor? Now I understand. To answer your question, I think this is certainly a useful feature, but perhaps it should be kept optional, or be recommended behaviour only, especially for output formats such as LaTeX where a proper link may not be what is intended?

Sylvester

Yes, that’s what I have in mind.

I just think it’s silly that we have HTML output for the web without links :)

Bruce

I agree, and given that the PHP version is pretty much only going to
be used for HTML output, I’m thinking that it might be useful to build
some directives into the CSL spec that could be used to wrap any field
with an <a …> and maybe be able to set a “variable” attribute
on the tag so that you could pull the URL from the incoming data
stream. For example, I’m putting links on author names and titles, so
maybe you could have something like…

Ron.

While I can see the attractiveness, I also see a number of problems:

  1. styles would need to get rewritten

  2. would be tailored for a particular output format

  3. even with that, yields pretty particular output (what if I want to do RDFa?)

To go back to your example, is there anything that prevents you from
adding links to author names now?

Bruce

Points well taken, and no, there is nothing preventing me from adding
links. I am doing it, but that bit of code is specific to the Drupal
CiteProc module rather than being generalized in the “bitbucket”
CiteProc.php code.

Ron.

Random thought on followup …

In many ways, citations work like footnotes. So just as pandoc
produces id anchors and bi-directional links for footnotes, it could
do the same for citations. So something lightweight, like:

(Doe, 1999)

[the bib reference]

You then also have enough structure for additional styling, or
JavaScript effects.
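
To make the footnote analogy concrete, here is a hedged Python sketch of bi-directional links between a citation and its bibliography entry. The markup that originally accompanied this message has not survived, so this is not a reconstruction of it: the function names, the class names (`citation`, `reference`, `backlink`) and the id scheme are all assumptions made for illustration.

```python
from html import escape


def render_citation(text: str, ref_id: str, cite_id: str) -> str:
    """In-text citation: has its own id and links forward to the entry."""
    return (
        f'<a class="citation" id="{escape(cite_id)}" '
        f'href="#{escape(ref_id)}">{escape(text)}</a>'
    )


def render_entry(text: str, ref_id: str, cite_ids: list[str]) -> str:
    """Bibliography entry: has its own id and links back to each citation."""
    backlinks = "".join(
        f' <a class="backlink" href="#{escape(c)}">back</a>' for c in cite_ids
    )
    return f'<p class="reference" id="{escape(ref_id)}">{escape(text)}{backlinks}</p>'
```

Because both ends carry an id and a class, a stylesheet or script can find them without parsing the citation text, which is the extra structure mentioned above.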

With RDFa, you can also think about:

[the bib reference]

… where the citations become nodes in a graph.

Bruce

I’m not sure if it’s wise to bump such an old thread, but it seems to be still at the top of the search results on the forums. Mostly I want to connect the issues on the schema tracker with the decade-long discussion about link syntax. Should we be having that discussion here, or on the tracker?