Invalid base URIs

Volume 3, Issue 40; 29 Nov 2019

When is an invalid base URI not an invalid base URI?

When is an invalid base URI not an invalid base URI? When it’s in a test suite. Ok, to be fair, it is still an invalid base URI, but, but, but… Test suites are always challenging.

The XProc 3.0 editors have been tearing through the remaining issues, pressing hard to get the specs finished. Achim has been turning out test cases as fast as we can make decisions. (Thank you, Achim!)

One of the decisions we made recently was that invalid base URIs are errors in some contexts. The goal here is interoperable behavior. Some implementations will raise errors for them anyway, because the foundations they are built on do, so declaring them errors ensures consistent behavior.

There are tests for this now. As a particular example, one of the tests is a pipeline which contains this beauty:

<p:document xml:base="/%gg/" href="file.xml" />

That’s just flatly an error. It fails and it should fail.
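
To make the failure concrete, here is a minimal sketch that uses only `java.net.URI` from the JDK. It is not how any XProc implementation checks base URIs; the class and method names are made up for illustration. The point is just that `%gg` is not a valid percent-escape (`g` is not a hex digit), so a strict URI parser rejects the whole string.

```java
import java.net.URI;
import java.net.URISyntaxException;

public class InvalidBaseUriDemo {
    // Hypothetical helper: true if the JDK's URI parser accepts the string.
    static boolean isValidUri(String value) {
        try {
            new URI(value);
            return true;
        } catch (URISyntaxException e) {
            // A malformed escape such as "%gg" ends up here.
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println("/%gg/ valid? " + isValidUri("/%gg/"));
        System.out.println("/%20/ valid? " + isValidUri("/%20/"));
    }
}
```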


Except that, to format the test suite, I need to process that document: the test suite report includes several pretty-printed forms of each test. Constructing the pretty-printed form involves making an element node:

addStartElement(node, newName, node.getBaseURI());

Guess what happens when you call node.getBaseURI() on that element? Go on, have a guess.

It’s probably possible to detect and work around these cases in the pipeline or the stylesheets that process the documents, but I didn’t relish the challenge. Instead, I cheated. Or, if you prefer, I reprogrammed the simulation. Starting with version 1.1.30, XML Calabash sports an extension with the uninspired name ignore-invalid-xml-base.

I don’t recommend using it. But if you do, it catches the exception thrown in this case and simply drops the base URI on the floor. This is unquestionably the wrong thing (in the sense that the resulting data model is not a faithful representation of the document), but it gets the test suite processor running again.
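
The general shape of that kind of workaround can be sketched as follows. This is not the XML Calabash source; `baseUriOrEmpty`, the `ignoreInvalid` flag, and the `Supplier` standing in for `node.getBaseURI()` are all hypothetical, and the exception types caught are an assumption about what a real node API would throw. `URI.create` is used as the stand-in lookup because it throws `IllegalArgumentException` on a malformed URI.

```java
import java.net.URI;
import java.util.Optional;
import java.util.function.Supplier;

public class LenientBaseUri {
    // Hypothetical helper: run a base-URI lookup and, if lenient mode is on,
    // swallow the failure and report "no base URI" instead.
    static Optional<URI> baseUriOrEmpty(Supplier<URI> lookup, boolean ignoreInvalid) {
        try {
            return Optional.ofNullable(lookup.get());
        } catch (IllegalArgumentException | IllegalStateException e) {
            // Assumption: the real lookup signals an invalid base URI with
            // one of these unchecked exceptions.
            if (ignoreInvalid) {
                return Optional.empty(); // drop the base URI on the floor
            }
            throw e; // strict mode: the error propagates as before
        }
    }

    public static void main(String[] args) {
        // Stand-in for node.getBaseURI() on the offending element.
        Supplier<URI> broken = () -> URI.create("/%gg/");
        System.out.println(baseUriOrEmpty(broken, true).isPresent());
    }
}
```

The lenient path deliberately loses information, which is why it sits behind a flag that is off unless asked for.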