Why term-set?

Johan_Kool2 · August 23, 2006, 3:12pm

Hello,

Why are some of the terms in the locales files wrapped in term-set?
Doesn’t make it easier. I liked the previous style much better.

Johan—
http://www.johankool.nl/

Bruce_D_Arcus1 · August 23, 2006, 3:22pm

That was Simon’s design.

I think a term-set refers to a collection of role or locator terms.
E.g. in those cases, you aren’t doing a simple 1:1 mapping of a label
to a term (as the simple term elements), but generally conditional
lookup. The old approach didn’t account for that, and so would have
broken I think.

I’ve not looked at it yet closely, but it seemed fine to me. What’s the
problem, and what’s the (better) solution, given my comments above?

Bruce

Johan_Kool2 · August 23, 2006, 3:46pm

My problem is that I now have to figure out in which set a term
occurs. E.g. is it in locators or in roles or in months or none of
these? Before I would simply put the string “month-01-short” into a
simple function that did the look up. Very simple.

I’d say we either go the full way and do it like this:

page pages paragraph paragraph p pp ¶ ¶¶

so, do it the whole way, or go back to flat. This halfway in between
is just annoying. Perhaps it should be <long|

<single|multiple> instead.

Johan

Bruce_D_Arcus1 · August 23, 2006, 3:57pm

I’m going to let Simon respond here, and see if he can address your
concerns.

Bruce

Simon_Kornblith · August 23, 2006, 4:46pm

I chose the approach because it’s extensible. “Long” and “short”
won’t necessarily cover everything. We also need “verb” for contributors
(e.g., “Edited By”), which poses a possible schema issue because we
shouldn’t have “verb” for locators. It’s also conceivable that we’ll
discover we need other term-sets in the future, and defining a new term-set
is much simpler than altering the schema to accommodate a new type of term.
and are data-dependent, but term-sets are
style-dependent, so the paradigm appears consistent to me.

To implement term-set in my code, I simply added an additional “term-set”
argument to the function that retrieves locale terms. In this fashion, it’s
exceedingly simple to map something like
into a term, easier than if there were separate / tags. Rather
than doing:

var month = getTerm("month-01-short");

I do:

var month = getTerm("month-01", "months-short");

I suppose the term-set isn’t absolutely necessary for months, since no part
of the style directly references them, but I figured I might as well put
them there for consistency. If you disagree, feel free to take them out
(although there shouldn’t be any difference in complexity). We could also
change “month-01” to “01”; there’s no reason “month” has to be in the term
name once it’s in a term-set.
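
A minimal sketch of the two-argument lookup described above, assuming the locale file has already been parsed into a plain object keyed by term-set name. The `locale` object, the "months-long" set name, and the body of `getTerm` are illustrative assumptions, not the actual implementation:

```javascript
// Hypothetical in-memory form of a locale file:
// term-set name -> term name -> string.
const locale = {
  "months-long": { "month-01": "January", "month-02": "February" },
  "months-short": { "month-01": "Jan", "month-02": "Feb" },
};

// Look up a term inside one named term-set.
// Returns undefined if the set or the term is missing.
function getTerm(name, termSet) {
  const set = locale[termSet];
  return set ? set[name] : undefined;
}

const month = getTerm("month-01", "months-short");
console.log(month); // prints "Jan"
```

Because the caller always names the set, the lookup never has to search through every set to find a term.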

Hope this clears things up.

Simon

Bruce_D_Arcus1 · August 23, 2006, 5:03pm

I chose the approach because it’s extensible. “Long” and “short”
won’t necessarily cover everything. We also need “verb” for contributors
(e.g., “Edited By”), which poses a possible schema issue because we
shouldn’t have “verb” for locators.

Actually, that’s no problem in RELAX NG. Just define separate patterns for each.

It’s also conceivable that we’ll
discover we need other term-sets in the future, and defining a new term-set
is much simpler than altering the schema to accommodate a new type of term.

This is important. The less we have to hard-code in the schema, the
easier things will be long-term.

and are data-dependent, but term-sets are
style-dependent, so the paradigm appears consistent to me.

To implement term-set in my code, I simply added an additional “term-set”
argument to the function that retrieves locale terms. In this fashion, it’s
exceedingly simple to map something like
into a term, easier than if there were separate / tags. Rather
than doing:

var month = getTerm("month-01-short");

I do:

var month = getTerm("month-01", "months-short");

Good of you to give the example code here. Getting rid of the “month”
makes it even easier, since you can just pull the value from the data
and do:

getTerm(month, "months-short")

I suppose the term-set isn’t absolutely necessary for months, since no part
of the style directly references them, but I figured I might as well put
them there for consistency. If you disagree, feel free to take them out
(although there shouldn’t be any difference in complexity). We could also
change “month-01” to “01”; there’s no reason “month” has to be in the term
name once it’s in a term-set.

+1

Bruce

Johan_Kool2 · August 24, 2006, 2:16pm

I’ve updated locales_nl.xml to reflect the proposed changes. Would
this be an acceptable format?

Johan

<?xml version="1.0" encoding="UTF-8"?> in ibid benaderd in voorbereiding Referenties en uit pagina pagina's p pp redacteur redacteurs red reds vertaler vertalers vert verts januari jan februari feb maart maa april apr mei mei june jun juli jul augustus aug september sep oktober okt november nov december dec

Bruce_D_Arcus1 · August 26, 2006, 4:35pm

Simon; thoughts?

Bruce

Simon_Kornblith · August 26, 2006, 4:44pm

There’s no purpose to a <term-set> tag if there are no separate sets of
terms. If we do go this route, we should get rid of it.
In its current form, this specification would need an extra tag to handle
verb roles (e.g., Edited By). While this is not too big of a deal, it
underscores the lack of extensibility inherent in the approach.
For me, at least, this approach is no simpler and probably harder to code
than the term-set approach.

An alternative that would address these concerns is:

in
ibid

pagina
pagina’s

p
pp

And then:

I’m not completely sure of the benefits this approach has over the current
one, but I’m happy to go along with it if it would make things easier for
others.

Simon

Bruce_D_Arcus1 · August 26, 2006, 4:54pm

There’s no purpose to a <term-set> tag if there are no separate sets of
terms. If we do go this route, we should get rid of it.

In its current form, this specification would need an extra tag to
handle
verb roles (e.g., Edited By). While this is not too big of a deal, it
underscores the lack of extensibility inherent in the approach.

Let me see if I understand the differences. Johan’s version is:

	<term-set name="roles">
		<term name="editor">
			<long>
				<single>redacteur</single>
				<multiple>redacteurs</multiple>
			</long>
			<short>
				<single>red</single>
				<multiple>reds</multiple>
			</short>
		</term>

Yours does not have separate elements for the different forms, but
instead has compound set names? E.g.

	<term-set name="roles">
	   <term name="editor">
	     <single>redacteur</single>
	     <multiple>redacteurs</multiple>
	   </term>
	   <term name="editor-short">
	     <single>red</single>
	     <multiple>reds</multiple>
	   </term>

Is that the difference?

An alternative that would address these concerns is:

in
ibid

pagina
pagina’s

p
pp

And then:

I agree this is better for the more structured approach.

I’m not completely sure of the benefits this approach has over the
current
one, but I’m happy to go along with it if it would make things easier
for
others.

Johan, care to elaborate on why we need a more structured approach?
E.g. what are the practical benefits?

Bruce

Simon_Kornblith · August 26, 2006, 5:03pm

Actually, mine defines separate term-sets which are then referenced from the
label:

editor editors translator translators ed eds tran trans

And then:

But the basic principle is the same.

The benefit I can see to the more structured approach is that if a style
tries to get a form of a term that a given locales.xml file doesn’t support,
it can use a different form of the same term, rather than rolling over to
the English version. I’m not sure how much help this is, because either way
there’s a pretty big discrepancy between the desired output and the real
output, but it’s worth some thought.
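
The two fallback behaviours being compared can be sketched as follows. The object layout and both function names are hypothetical, and the Dutch data deliberately omits the verb form so that the fallback is triggered:

```javascript
// Hypothetical parsed locales: term name -> form -> string.
const english = { editor: { long: "editor", short: "ed", verb: "edited by" } };
const dutch = { editor: { long: "redacteur", short: "red" } }; // no "verb" form

// Strategy A: if the requested form is missing, use another form of the
// same term in the same locale before falling back to English.
function getTermSameLocaleFirst(name, form) {
  const forms = dutch[name] || {};
  if (forms[form] !== undefined) return forms[form];
  const otherForm = Object.keys(forms)[0];
  if (otherForm !== undefined) return forms[otherForm];
  return (english[name] || {})[form];
}

// Strategy B: if the requested form is missing, roll over to English.
function getTermEnglishFallback(name, form) {
  const forms = dutch[name] || {};
  if (forms[form] !== undefined) return forms[form];
  return (english[name] || {})[form];
}

console.log(getTermSameLocaleFirst("editor", "verb")); // prints "redacteur"
console.log(getTermEnglishFallback("editor", "verb")); // prints "edited by"
```

Neither result is the form the style asked for, which is the discrepancy mentioned above.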

Simon

Johan_Kool2 · August 27, 2006, 8:15pm

Hello,

To answer a few remarks/questions.

if a style tries to get a form of a term that a given locales.xml
file doesn’t support, it can use a different form of the same term,
rather than rolling over to the English version.

That would be very odd and confusing behaviour. If I see English text
appearing in my output I know what’s wrong: localization is missing.
If I see text appearing I didn’t expect, it could be either a missing
translation or a misconfigured style. That sounds very confusing to
work with.

There’s no purpose to a <term-set> tag if there are no separate sets of
terms. If we do go this route, we should get rid of it.

In its current form, this specification would need an extra tag to
handle
verb roles (e.g., Edited By). While this is not too big of a deal, it
underscores the lack of extensibility inherent in the approach.

My problem is, among other things, that it is awkward to have to construct
a special string to look up a text. Perhaps we could keep using the term-set
as before, but split it up into 2 attributes. Or perhaps on
term-set instead of term.

	<term-set name="roles">
	   <term name="editor" form="long">
	     <single>redacteur</single>
	     <multiple>redacteurs</multiple>
	   </term>
	   <term name="editor" form="short">
	     <single>red</single>
	     <multiple>reds</multiple>
	   </term>

This is also much more flexible, but allows simpler ways to find
something in the file. Is that an idea?

Johan

Bruce_D_Arcus1 · August 28, 2006, 12:58am

Yes, it’s an idea. I’m pretty much agnostic about which is better
though.

Any other opinions?

Bruce

Simon_Kornblith · August 28, 2006, 3:42pm

if a style tries to get a form of a term that a given locales.xml
file doesn’t support, it can use a different form of the same term,
rather than rolling over to the English version.

That would be very odd and confusing behaviour. If I see English text
appearing in my output I know what’s wrong: localization is missing.
If I see text appearing I didn’t expect, it could be either a missing
translation or a misconfigured style. That sounds very confusing to
work with.

Then it seems like, from a feature standpoint, there’s no advantage to a
more structured approach over the current one.

There’s no purpose to a <term-set> tag if there are no separate sets of
terms. If we do go this route, we should get rid of it.

In its current form, this specification would need an extra tag to
handle
verb roles (e.g., Edited By). While this is not too big of a deal, it
underscores the lack of extensibility inherent in the approach.

My problem is, among other things, that it is awkward to have to construct
a special string to look up a text. Perhaps we could keep using the term-set
as before, but split it up into 2 attributes. Or perhaps on
term-set instead of term.
redacteur redacteurs red reds
This is also much more flexible, but allows simpler ways to find
something in the file. Is that an idea?

Again, the <term-set> is useless if you’re going to put the form on each
term.

I still don’t exactly see where the difficulty exists in implementing term
sets, and I wonder if you’ve misunderstood the concept. You should know
exactly what term-set to look in at all times from the “term-set” attribute
on . (I think I only updated Chicago and APA, but it’s trivial to
add the attribute to the other styles.) You should never have to loop
through all of the sets. What makes the approach you gave above any easier
to implement? I am trying to think of an XML API that would make this
difficult, but I can’t.

Simon

Bruce_D_Arcus1 · August 28, 2006, 4:01pm

Yeah, I think what he’s suggesting would mean in fact this:

… while your suggestion was just:

I agree the difference is trivial. I guess in the absence of a clear
reason why the latter is problematic, I’d stay with the second.

I suppose one advantage to the former is that the schema is then a
little simpler (because there are then fewer “name” options).

Bruce

Johan_Kool2 · August 28, 2006, 9:22pm

Ok. Keep it the way it is then. It seems odd to me to be
concatenating attributes into one attribute, but well… it’s not
such a big deal that I am going to care that much about it.

Cheers,

Johan

Bruce_D_Arcus1 · August 28, 2006, 9:47pm

I’m going to look at which of the two options is easier to implement
(and maintain) in the schema, and then decide based on that.

Bruce

Simon_Kornblith · August 30, 2006, 6:30am

I’ve changed the format to something that should satisfy Johan’s objections
while further simplifying the schema. Now, we simply have:

<term name="page">
  <single>page</single>
  <multiple>pages</multiple>
</term>
<term name="page" form="short">
  <single>p</single>
  <multiple>pp</multiple>
</term>

Labels are now:

All we need in the schema is:

attribute form { token }?

Or, if we want to formalize what’s currently in the XML file within the
schema:

attribute form { "short" | "verb" }?

As in other places in CSL (e.g., contributors and titles), form defaults to
long, then switches to short if specified. I figured that this makes more
sense if we need a term in the short form ( makes
more sense than ). If there are any complaints,
I’m willing to revert to the old version.
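
A minimal sketch of how a lookup against this format might work, with the form defaulting to long as described. The `terms` array is a hypothetical parsed version of the two <term> elements shown above, and the `getTerm` signature with an options object is an assumption for illustration:

```javascript
// Hypothetical parsed form of the <term> elements; a term without a
// form attribute is stored as "long".
const terms = [
  { name: "page", form: "long", single: "page", multiple: "pages" },
  { name: "page", form: "short", single: "p", multiple: "pp" },
];

// form defaults to "long"; plural selects <multiple> over <single>.
function getTerm(name, { form = "long", plural = false } = {}) {
  const term = terms.find((t) => t.name === name && t.form === form);
  if (!term) return undefined;
  return plural ? term.multiple : term.single;
}

console.log(getTerm("page"));                                  // prints "page"
console.log(getTerm("page", { form: "short", plural: true })); // prints "pp"
```

The name and the form stay separate arguments, so no compound string has to be built for the lookup.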

Simon

Johan_Kool2 · August 30, 2006, 9:17am

Thanks Simon,

I think it looks good this way. Esp. vs. .

Johan

Bruce_D_Arcus1 · August 30, 2006, 9:49am

Labels are now:

How do you indicate the short verb form of contributors?

All we need in the schema is:

attribute form { token }?

Or, if we want to formalize what’s currently in the XML file within the
schema:

attribute form { "short" | "verb" }?

All of this stuff should be formalized. If there’s a reason to allow
extension independent of explicit extension in the schema (not the case
here I believe), then we build in an extension point like so:

define a pattern for the options; say “label-forms”
add an extension pattern like so:

label-forms = "short" | "long" | label-forms.extension
label-forms.extension = notAllowed

A custom schema can then override the latter with custom terms.

There are few places in CSL where this is necessary though.

Bruce
