
XML Calabash & Saxon 9.8

Volume 2, Issue 4; 25 Jan 2018

Making XML Calabash (1.1!) work with Saxon 9.8.

In the past few weeks, several folks have requested a version of XML Calabash 1.1 that works with Saxon 9.8. I confess I hadn’t given that upgrade a lot of thought. I’m using Saxon 9.8 in my nascent XProc 3.0 implementation, but that’s still pre-release quality.

It turns out, if you take the XML Calabash 1.1.16 jars and throw Saxon 9.8 at them, it ain’t pretty. It doesn’t run. It doesn’t even compile because of underlying API changes.

Thankfully, Saxonica haven’t changed that much. It only took half an hour or so and a little poking about in the docs to make it compile.

And then I ran the tests. So. Much. Red. About 400 of the tests that I run in JUnit for sanity checking purposes failed. On further inspection, I decided that two issues were responsible for almost all the failures: missing namespaces and a “system IDs are immutable” error.

It turned out that my tree builder included an explicit use of NamespaceReducer. The namespace reducer, as its name suggests, reduces namespaces. In fact, it discards all “unused” ones. At least, that’s what it does in Saxon 9.8. In earlier releases of Saxon, it appears not to do that. I don’t recall why I added it. XML Calabash cavalierly ignores the “no user serviceable parts inside” warnings on various bits of Saxon, unscrews all the tamperproof screws, and takes those covers right off. I’ve cut myself on sharp edges once or twice. The namespace reducer is, I expect, an artifact of some attempt to fix something nasty. Hypothesis: it’s unnecessary and harmless in earlier releases.
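To make "missing namespaces" concrete, here is a minimal sketch of one way that dropping an apparently unused declaration can hurt. It uses only the JDK's DOM API, not Saxon's NamespaceReducer, and it is an assumption that the failing tests looked anything like this. The class name, the sample document, and the `ex` prefix are all made up for illustration. The point is that a prefix can be needed by a QName in content (here, an attribute value) even though no element or attribute name uses it.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

public class UnusedNamespaceDemo {
    // Namespace that the DOM assigns to xmlns:* declaration attributes.
    static final String XMLNS_NS = "http://www.w3.org/2000/xmlns/";

    // Hypothetical input: "ex" appears only inside an attribute value.
    static final String XML =
        "<doc xmlns:ex='http://example.com/ns'><item select='ex:thing'/></doc>";

    static Document parse(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            return factory.newDocumentBuilder()
                          .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    // Look up the prefix in the scope of the <item> element.
    static String resolvePrefixOnItem(Document doc, String prefix) {
        Element item = (Element) doc.getDocumentElement().getFirstChild();
        return item.lookupNamespaceURI(prefix);
    }

    // Simulate a filter that throws away a declaration it considers unused.
    static void dropDeclaration(Document doc, String prefix) {
        doc.getDocumentElement().removeAttributeNS(XMLNS_NS, prefix);
    }

    public static void main(String[] args) {
        Document doc = parse(XML);
        System.out.println("before: " + resolvePrefixOnItem(doc, "ex"));
        dropDeclaration(doc, "ex");
        System.out.println("after:  " + resolvePrefixOnItem(doc, "ex"));
    }
}
```

After the declaration is dropped, anything that later tries to interpret `ex:thing` has no binding to work with. XProc pipelines put XPath expressions in attribute values all the time, so that is a plausible route to failures of this kind.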

I’m much less sanguine about the system IDs issue. XML Calabash attempts to preserve the base URI of documents that flow through XSLT. It does this by setting the system identifier on the resulting document:

xformed.getUnderlyingNode().setSystemId(sysId);

Except that’s an error in Saxon 9.8. Possibly not unreasonably so. I really don’t know how significant this is. In the short term, I simply removed it. That doesn’t cause any tests in the test suite to fail, but that may be more a reflection of the test suite than the feature.

It could very well be that some pipelines will have problems because documents have lost their base URI. (Short, concrete test cases that demonstrate this bug, if it is one, would be appreciated.)
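For anyone wondering why a lost base URI would matter, here is a small sketch using only `java.net.URI`. It doesn't touch Saxon or XML Calabash, and the class name, URIs, and file names are invented for the example. It shows that the same relative reference lands somewhere different once the base URI is gone. What the base URI of a transformed document actually becomes in the Saxon 9.8 build is not something this sketch establishes.

```java
import java.net.URI;

public class BaseUriDemo {
    // Resolve href against baseUri; with no base URI the reference stays as-is.
    static URI resolve(String baseUri, String href) {
        URI ref = URI.create(href);
        return baseUri == null ? ref : URI.create(baseUri).resolve(ref);
    }

    public static void main(String[] args) {
        String href = "chapters/ch1.xml";

        // Base URI preserved: the reference resolves next to the source document.
        System.out.println(resolve("http://example.com/docs/book.xml", href));

        // Base URI lost: the reference stays relative, so whatever consumes it
        // next will resolve it against something else (or fail).
        System.out.println(resolve(null, href));
    }
}
```

A pipeline that transforms a document and then follows relative references in the result (an include, a load, a link check) is the sort of place this would show up.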

Those two changes fixed all but two of the failing tests.

One of the two remaining tests is an explicit check for a “2.0” processor. Saxon 9.8 reports that it’s a 3.0 processor so, you know, fair play. I fixed that test to check for a “3.0” processor.
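A check like that boils down to asking the processor for `system-property('xsl:version')`. Here is a rough sketch using the standard JAXP API. It is not the actual test from the XML Calabash suite, and the class and method names are made up. It reports whatever version the XSLT processor found on the classpath claims to implement, so the answer depends entirely on which processor that is.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

public class XsltVersionProbe {
    // A stylesheet that outputs only the version the processor reports.
    static final String STYLESHEET =
        "<xsl:stylesheet xmlns:xsl='http://www.w3.org/1999/XSL/Transform' version='1.0'>"
      + "<xsl:output method='text'/>"
      + "<xsl:template match='/'>"
      + "<xsl:value-of select=\"system-property('xsl:version')\"/>"
      + "</xsl:template>"
      + "</xsl:stylesheet>";

    static String reportedVersion() {
        try {
            // Uses whichever TransformerFactory JAXP finds on the classpath.
            TransformerFactory factory = TransformerFactory.newInstance();
            Transformer transformer =
                factory.newTransformer(new StreamSource(new StringReader(STYLESHEET)));
            StringWriter out = new StringWriter();
            transformer.transform(new StreamSource(new StringReader("<doc/>")),
                                  new StreamResult(out));
            return out.toString().trim();
        } catch (TransformerException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("xsl:version = " + reportedVersion());
    }
}
```

A test that hard-codes the expected string will break every time the underlying processor moves to a new XSLT version, which is exactly what happened here.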

The other test was checking that attempting to match on a namespace node was an error. I was relying on Saxon to object, but it isn’t an error in XSLT 3.0, so it doesn’t. I just commented that test out.

That took care of all the obvious errors. Along the way, I also replaced some deprecated methods for setting whitespace stripping.

TL;DR: you can get an EXPERIMENTAL release of XML Calabash 1.1.16 for Saxon 9.8 from GitHub or Maven. It appears to pass the test suite reasonably well, but I will not be surprised if there are problems, especially in pipelines that care about the base URI of transformed documents.

If you try it, I would like to know if you have problems or not.

If you go sticking it into production, well…your gun, your bullet, your foot.

I will try to be responsive to error reports. I am committed to trying to make it work as well as XML Calabash 1.1.16 works with earlier Saxon releases.

Note: you should consider this the EOL announcement for Saxon 9.6 support. If that's a problem, y'all let me know.
