adding count-min, count-max; cleaning up condition pattern?

OK, per previous discussion, I’m about to add two new attributes to the
schema “condition” pattern:

attribute count-min { xsd:positiveInteger }
attribute count-max { xsd:positiveInteger }

Effectively, when we have just a variable on a condition, this would
be a shortcut for including count-min="1".
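
A minimal sketch of the proposed shortcut, written in Python rather than taken from any actual CSL processor: a bare variable test behaves like a count test with a lower bound of 1. The function name `count_in_range` and the representation of a variable's value as a list are assumptions made purely for illustration.

```python
from typing import Optional, Sequence


def count_in_range(values: Sequence[str],
                   count_min: int = 1,
                   count_max: Optional[int] = None) -> bool:
    """Return True when the number of values lies within the bounds.

    count_min defaults to 1, so calling this with no bounds is the same
    as asking "does the variable exist?".
    """
    n = len(values)
    if n < count_min:
        return False
    if count_max is not None and n > count_max:
        return False
    return True


authors = ["Doe", "Roe"]                      # hypothetical data
print(count_in_range(authors))                # plain variable test
print(count_in_range(authors, count_min=3))   # needs at least three
print(count_in_range([]))                     # missing variable
```

Leaving `count_max` as `None` stands in for "no upper bound"; whether the schema should spell that as an absent attribute or a special value is still an open question in this thread.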

Now, my question is, do we need to revisit the existing schema
definition for “condition” (see below)? Right now, the following would
be valid:

Do we intend this to be valid, and understand what it means? If yes,
then how should I add the new attributes? If no, what sorts of
constraints would we like to add (if any)?

Bruce

condition =

## If the entry is of a given type, this is true
attribute type {
  list { cs-types+ }
}?,

## If a given variable exists, this is true
attribute variable {
  list { all-variables+ }
}?,

## If a given variable contains numeric data, this is true
attribute is-numeric {
  list { all-variables+ }
}?,

## If a given variable contains a date, this is true
attribute is-date {
  list { all-variables+ }
}?,

## The position of a citation. Whenever position="ibid-with-locator"
## is true, position="ibid" is also true, and whenever position="ibid"
## is true, position="subsequent" is also true
attribute position {
  list { ("first" | "subsequent" | "ibid" | "ibid-with-locator")+ }
}?,

## If text inside an <if disambiguate="true"> block can be used to
## differentiate two otherwise identical citations, it will be added.
## If the citations remain identical after its addition, it will not
## be added.
attribute disambiguate { xsd:boolean }?,

## A conditional on the locator for this specific entry
attribute locator {
  list { cs-terms.locator+ }
}?,
attribute match { "all" | "any" | "none" }?
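
The position comment in the schema above describes an implication chain. A rough Python sketch of it follows; the `IMPLIES` table and the `position_is_true` helper are names invented here, and treating "first" as implying nothing else is an assumption.

```python
# Each actual position also satisfies the positions listed for it.
IMPLIES = {
    "first": {"first"},
    "subsequent": {"subsequent"},
    "ibid": {"ibid", "subsequent"},
    "ibid-with-locator": {"ibid-with-locator", "ibid", "subsequent"},
}


def position_is_true(actual: str, tested: str) -> bool:
    """True when a citation at `actual` satisfies a test for `tested`."""
    return tested in IMPLIES[actual]


print(position_is_true("ibid-with-locator", "subsequent"))
print(position_is_true("subsequent", "ibid"))
```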

My answers …

OK, per previous discussion, I’m about to add two new attributes to the
schema “condition” pattern:

attribute count-min { xsd:positiveInteger }
attribute count-max { xsd:positiveInteger }

Effectively, when we have just a variable on a condition, this would
be a shortcut for including count-min="1".

Now, my question is, do we need to revisit the existing schema
definition for “condition” (see below)? Right now, the following would
be valid:

Do we intend this to be valid, and understand what it means?

To be honest, I don’t know what the above means, and I think it’s too loose.

OTOH, I’m not exactly sure how we should change it, so in the absence
of specific feedback, I guess I’d suggest leaving it as is.

If yes, then how should I add the new attributes?

I think given the above conclusion, I should just add them as
additional choice options.

If no, what sorts of constraints would we like to add (if any)?

As I say above, am not sure; need feedback.

Bruce

My answers …

OK, per previous discussion, I’m about to add two new attributes to the
schema “condition” pattern:

attribute count-min { xsd:positiveInteger }
attribute count-max { xsd:positiveInteger }

Effectively, when we have just a variable on a condition, this would
be a shortcut for including count-min="1".

I didn’t follow the discussion that generated these two attributes -
which might be useful somehow, if you wanted the opinion of someone
jumping in and willing to implement the schema, without knowing the
reasons that contributed in shaping it, in a strongly typed language,
say Haskell. In which case I would ask “count-min and max what?”, if I
weren’t going to look a bit dumb with all my silly questions about all
these new attributes.

But here I’m just speculating… :slight_smile:

Now, my question is, do we need to revisit the existing schema
definition for “condition” (see below)? Right now, the following would
be valid:

Do we intend this to be valid, and understand what it means?

To be honest, I don’t know what the above means, and I think it’s too loose.

OTOH, I’m not exactly sure how we should change it, so in the absence
of specific feedback, I guess I’d suggest leaving it as is.

I don’t know what that means, but it seems to me a pretty valid
statement, which could actually mean something within some piece of
code: a set of conditions that must be false. All of them seem to have
a way of being false and true, I would add.

If yes, then how should I add the new attributes?

I think given the above conclusion, I should just add them as
additional choice options.

If no, what sorts of constraints would we like to add (if any)?

As I say above, am not sure; need feedback.

I think it should be clear what can be counted and how.

Andrea

Not silly; this is a good question.

The use case involved counting authors. See:

https://sourceforge.net/mailarchive/message.php?msg_id=1211795114.17660.49.camel%40piura

It might suggest that if we do add this, we constrain it to authors.

Bruce

Maybe it could be useful to count page ranges, volume numbers, and
similar. And we could also add an additional “count-eq” attribute to
match a specific number, even though I do not have an actual example
in mind where this could be useful. But from the point of view of the
implementation that wouldn’t be difficult.

Just thinking aloud.

Andrea

So the question, then, for everyone:

  1. do we add count-eq?

  2. do we constrain the count attributes to specific variables? If yes,
    which ones?

  3. is the pattern a list of variables, or a single variable? E.g. is
    the following valid?

…?

Bruce

The ambiguity that this would introduce kind of makes me nervous. You
might get different behaviour from different implementations if, say,
mixed parameters are specified. Can the variable to be counted be set
in a separate attribute, like counted-variable=?

Frank

How is that any better, particularly when we consider that I proposed
the existing if/variable pattern is simply a shorthand for
if/variable/count-min=“1”?

The real problem with ambiguity, it seems to me, is how you describe
counting authors where there are substitutions (of, say, editors).

Bruce

?

The real problem with ambiguity, it seems to me, is how you describe
counting authors where there are substitutions (of, say, editors).

That doesn’t help. The case is where sorting is based on the number of
authors. I guess this would work, though:

Bruce

shows the ambiguity in the syntax.
If an entry has a title and two authors, is this true (because the countable
variable satisfies its condition and the non-countable variable is present)
or false (because the non-countable variable is counted as only one item)?

Counting a non-countable variable doesn’t make sense, but can we catch
attempts to do so with the validator? If not, I’m not sure it’s a good idea to
blur the meaning of variable=. That’s why I thought using a separate attribute
for the countable vars might help avoid confusion.

shows the ambiguity in the syntax.
If an entry has a title and two authors, is this true (because the countable
variable satisfies its condition and the non-countable variable is present)
or false (because the non-countable variable is counted as only one item)?

How is the title “non-countable”? Granted, there will only ever be 1.

Counting a non-countable variable doesn’t make sense, but can we catch
attempts to do so with the validator? If not, I’m not sure it’s a good idea to
blur the meaning of variable=. That’s why I thought using a separate attribute
for the countable vars might help avoid confusion.

With RELAX NG, we have quite a bit of flexibility to constrain these
sorts of things.

Bruce

shows the ambiguity in the syntax.
If an entry has a title and two authors, is this true (because the countable
variable satisfies its condition and the non-countable variable is present)
or false (because the non-countable variable is counted as only one item)?

How is the title “non-countable”? Granted, there will only ever be 1.

Only that, that there will never be more than 1. So if I understand
correctly, the idea is that count-min and count-max are applied to
each variable in the list, so with two authors present, the example
would be false with match=“all”, true with match=“any”, and false with
match=“none”. OK, that’s clear. Sorry for being dense.
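
Frank's reading can be checked with a small Python sketch. It assumes the example under discussion tested the variables author and title with count-min="2"; that value, the `matches` helper, and the dict-of-lists entry format are illustrative assumptions, not anything defined by the schema.

```python
from typing import Dict, List, Optional


def matches(entry: Dict[str, List[str]],
            variables: List[str],
            match: str = "all",
            count_min: int = 1,
            count_max: Optional[int] = None) -> bool:
    """Apply the count bounds to each listed variable, then combine."""
    def in_range(name: str) -> bool:
        n = len(entry.get(name, []))
        return n >= count_min and (count_max is None or n <= count_max)

    results = [in_range(name) for name in variables]
    if match == "all":
        return all(results)
    if match == "any":
        return any(results)
    if match == "none":
        return not any(results)
    raise ValueError(f"unknown match value: {match!r}")


# Hypothetical entry: one title, two authors.
entry = {"title": ["A Title"], "author": ["Doe", "Roe"]}
for mode in ("all", "any", "none"):
    print(mode, matches(entry, ["author", "title"], mode, count_min=2))
```

Under these assumptions the three modes come out false, true, and false, which is the outcome described above: the bounds are applied to each variable separately and the per-variable results are then combined by match.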

For edge cases, should count-max=“0” be an infinite value? Also,
should count-min be constrained to a value of 1 or greater? If it’s
allowed to be 0, it would reverse the meaning of match=“none”, etc.
(again that would be a senseless parameter, but it might be good for
the validator to protect us from ourselves).

Counting a non-countable variable doesn’t make sense, but can we catch
attempts to do so with the validator? If not, I’m not sure it’s a good idea to
blur the meaning of variable=. That’s why I thought using a separate attribute
for the countable vars might help avoid confusion.

With RELAX NG, we have quite a bit of flexibility to constrain these
sorts of things.

I’ll be surprised if this one is possible. The constraints imposed on
one attribute can depend on whether or not a separate attribute is
present in the same element? How do you specify that in the rnc file?

Counting a non-countable variable doesn’t make sense, but can we catch
attempts to do so with the validator? If not, I’m not sure it’s a good idea to
blur the meaning of variable=. That’s why I thought using a separate attribute
for the countable vars might help avoid confusion.

With RELAX NG, we have quite a bit of flexibility to constrain these
sorts of things.

I’ll be surprised if this one is possible. The constraints imposed on
one attribute can depend on whether or not a separate attribute is
present in the same element?

That’s not a problem in RNG (or Schematron).

How do you specify that in the rnc file?

As an example, you do:

start = element test { (a, c) | (b, d) }

a = attribute x { "one" }
b = attribute x { "two" }
c = attribute y { "three" }
d = attribute y { "four" }

… which would mean that these are both valid:

… but this is not:

Granted, this can get a little tedious in a complex schema, and it’s
not supported in other schema languages (like, say, XSD).
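
To make the co-occurrence constraint concrete, here is a Python model of the pattern above, read as (a, c) | (b, d). It is not a RELAX NG validator; the `ALLOWED` table and the `is_valid` helper are hypothetical and simply enumerate which x/y pairs that grouping permits.

```python
from itertools import product
from typing import Dict

# Each alternative of the choice is one complete set of attribute values.
ALLOWED = (
    {"x": "one", "y": "three"},   # a, c
    {"x": "two", "y": "four"},    # b, d
)


def is_valid(attrs: Dict[str, str]) -> bool:
    """True when the attributes match one alternative exactly."""
    return any(attrs == alternative for alternative in ALLOWED)


# Try every combination of the x and y values named in the pattern.
for x, y in product(("one", "two"), ("three", "four")):
    print(x, y, is_valid({"x": x, "y": y}))
```

Only the pairs one/three and two/four pass; the crossed pairs fail, which is the dependency between attributes that Frank was asking about.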

It’s stuff like this, BTW, that shows how fantastically cool RNG is!
Not only really easy to write and maintain in the compact syntax, but
also really powerful.

Bruce

That’s impressive!