Modular iXML grammars
Modularity is an iXML feature that’s still very much in the experimental stage. My first attempt is available in NineML version 3.3.2.
I’ve been feeling like I need to get back to my iXML implementation. I’ve been (and will continue to be) distracted by other things, but I do have some ideas that I’m eager to try out.
On Saturday, I decided to take a stab at modularity, following along from the ideas that Steven presented in “Modular ixml” at MarkupUK this year.
In brief, the idea is that instead of having to copy-and-paste rules between grammars, you should be able to refer to them. In the interest of space, consider a toy example. Suppose you’ve defined a grammar for numbers:
number = decimal | hexadecimal .
decimal = digit+ .
hexadecimal = hexdigit+ .
-digit = ["0"-"9"] .
-hexdigit = digit | ["A"-"F" | "a"-"f" ] .
And now you’re working in another grammar where you need a part number:
partnumber = ???
It would probably be silly to reuse decimal from the numbers grammar for this purpose in practice, but if you needed a credit card number or an ISBN, it might make a lot of sense.
The idea of modularity is that in your part number grammar you can use productions from another grammar:
ixml version "1.1-nineml" .
+uses decimal from "numbers.ixml" .
partnumber = -decimal .
There are more details in the documentation, but in brief, the implementation:
- Manages the transitive closure of the nonterminals referenced from the nonterminals you import.
- Renames nonterminals when necessary to avoid collisions.
- Uses aliases to preserve the serialization for renamed nonterminals.
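To make those three points a little more concrete, here is a rough Python sketch of the bookkeeping involved. It is not how NineML does it. The grammar model (a dictionary from each nonterminal to the nonterminals it mentions), the function names closure and import_rules, the _1 renaming scheme, and the pretend collision on digit in the host grammar are all assumptions made up for illustration.

```python
from collections import deque

# Hypothetical, simplified model of the numbers grammar: each nonterminal
# maps to the set of nonterminals its right-hand side mentions.
# Terminals and marks are ignored.
NUMBERS = {
    "number": {"decimal", "hexadecimal"},
    "decimal": {"digit"},
    "hexadecimal": {"hexdigit"},
    "digit": set(),
    "hexdigit": {"digit"},
}


def closure(rules, roots):
    """Return every nonterminal reachable from roots, roots included."""
    seen = set()
    queue = deque(roots)
    while queue:
        name = queue.popleft()
        if name in seen:
            continue
        if name not in rules:
            raise KeyError(f"undefined nonterminal: {name}")
        seen.add(name)
        queue.extend(rules[name] - seen)
    return seen


def import_rules(host_names, rules, wanted):
    """Plan an import: which rules to copy and what to call them.

    Returns (renames, aliases). renames maps each imported name to the
    name it gets in the host grammar. aliases maps each *changed* name
    back to its original, so serialization can keep the original name.
    """
    taken = set(host_names)
    renames, aliases = {}, {}
    for name in sorted(closure(rules, wanted)):
        new_name, n = name, 1
        while new_name in taken:
            new_name = f"{name}_{n}"  # made-up renaming scheme
            n += 1
        taken.add(new_name)
        renames[name] = new_name
        if new_name != name:
            aliases[new_name] = name
    return renames, aliases


# Pretend the host grammar already defines its own "digit".
renames, aliases = import_rules({"partnumber", "digit"}, NUMBERS, ["decimal"])
print(renames)  # {'decimal': 'decimal', 'digit': 'digit_1'}
print(aliases)  # {'digit_1': 'digit'}
```

Importing decimal pulls in digit as well, because decimal refers to it. The imported digit is renamed so it doesn't clash with the host's rule, and the alias records the name to use on output. The sketch treats every name in the closure alike; a real implementation would presumably want different behaviour when an explicitly imported name collides.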
For the moment, you have to have an iXML version declaration that declares “1.1”. And you have to use the --modular flag on CoffeePot (or enable the modularity option in the API):
$ coffeepot -g:partno.ixml --modular 1234
<partnumber>1234</partnumber>
I’ve made slightly different choices from the design presented at MarkupUK: I’ve made the URIs into strings and I’ve put a full stop at the end of the declarations. (I’ve also allowed you to omit the +shares declaration.)
If the Community Group takes up the feature in earnest, we’ll have to come to consensus on these points (and all the others).
At the moment, you can combine productions from several grammars, but you can’t change them. I was asked about an override mechanism, like the one provided by grammar combination in RELAX NG, and I think it’s worth considering.
Why 3.3.2?
Funny you should ask. It’s not technically backwards incompatible with 3.2.9, but it’s got a major new, experimental feature, so I decided to go to 3.3.0. It was 3.3.0 yesterday, but there was some real weirdness in the build on the continuous integration server. It was 3.3.1 around lunch time, but the change in the Sonatype design for publishing to Maven Central is just the gift that keeps on taking. It finally published all the right things in all the right places at 3.3.2.