The Monitoring API
That sure is a dull title. If I cared about “engagement”, I’d find a way to stick “interactive XProc pipeline debugger” or “Schematron assertions in pipelines” into it. But I don’t, really. Care that is. ’s all true, though.
A few days ago, I added a tracing feature to XML Calabash. The idea is to capture a record of all the steps that execute, in the order they run, and how long they take. It can also capture the documents that flow between the steps. This offers both information about performance and the ability to do a sort of post mortem of what a pipeline did.
If you got the wrong results, you can examine all of the steps, their inputs and outputs, and hopefully figure out what’s wrong. I suppose, equally, if you got the right results, you can look at how that happened too!
The next day, I was wondering if I could build some sort of UI that would make it easier to review the trace output: let you step through, show you the documents, etc. Then it struck me: why not just do that in XML Calabash? The tracing tool is almost a debugger!
So I wrote an interactive debugger. You can fire up a pipeline in the debugger, step through the execution, set breakpoints, examine options, and even change options. You can set breakpoints not just on steps, but also on documents. And you can change them too! It could turn out that it’s just a party trick, but it feels like something potentially really useful.
I’ve been thinking a lot about how to make pipelines easier to use. (Penance for the abysmal error reporting in my 1.0 implementation? You might think that; I couldn’t possibly comment.) Being able to debug pipelines interactively could be really useful, but you have to know that there’s something wrong with the pipeline so you know it needs debugging. If you have a pipeline that produces a thousand web pages or a whole book, “is it right?” can be a hard question to answer with confidence. That got me thinking about assertions. What if you could say: after this step executes, the following conditions must be true about the document(s) it produced?
I spent five minutes thinking about an assertion language before my brain pointed out that I already have one of those: Schematron is integrated into the pipeline engine. There’s even a convenient API for just this sort of thing, because that’s how the test driver works.
So I added Schematron assertions. You stick the actual Schematron in p:pipeinfo elements and then refer to them in the pipeline. You can fire them on inputs or outputs. Assertion failures can be treated either as warnings or fatal errors.
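To make that concrete, here’s a sketch of what a pipeline with an embedded Schematron assertion might look like. The Schematron itself is standard ISO Schematron inside a p:pipeinfo element; note that the id and whatever mechanism actually associates the schema with a step’s output are my guesses for illustration, not the real alpha7 syntax.

```xml
<p:declare-step xmlns:p="http://www.w3.org/ns/xproc"
                xmlns:sch="http://purl.oclc.org/dsdl/schematron"
                version="3.0">
  <p:input port="source"/>
  <p:output port="result"/>

  <!-- The Schematron lives in p:pipeinfo. The id ("chapter-checks")
       and the way a step refers back to it are hypothetical. -->
  <p:pipeinfo>
    <sch:schema id="chapter-checks" queryBinding="xslt2">
      <sch:pattern>
        <sch:rule context="/chapter">
          <sch:assert test="exists(title)">
            Every chapter must have a title.
          </sch:assert>
        </sch:rule>
      </sch:pattern>
    </sch:schema>
  </p:pipeinfo>

  <!-- A step whose output the assertion would be checked against. -->
  <p:xinclude/>
</p:declare-step>
```

The idea is that after p:xinclude runs, the “chapter-checks” schema fires against its result document; if the assertion fails, the pipeline reports it as a warning or a fatal error, depending on how you’ve configured it.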
There’s still room for some obvious improvements:
- The existing API can’t test non-XML (or, more precisely, non-XDM-node) results. You can’t write a Schematron assertion against a map output or a binary document, for example. I have no idea what level of support Schematron has for that kind of thing.
- Today, you can only test documents that flow between steps. I don’t think it will be too difficult to extend it so that an assertion can be made on the options to a step or on the value of a variable.
- The ability to load a schema from a file would also be a useful feature, I think.
Share and enjoy. These features are all in alpha7, published today. (But as I said before, you want the most recent release, whatever that is when you read this.)
Fair warning: all of these features are on the bleeding edge of implementation. If you push them, you’ll probably find bugs. Please report them.
P.S. What does any of this have to do with the Monitoring API? Good question. By the time I’d implemented an API for tracing, an API for debugging, and an API for Schematron assertions, it was clear that there was a common API. That’s the Monitoring API. It means new ideas along these lines are relatively easy to implement and it means there’s the potential for users to add their own monitoring features. (Though the API for doing that doesn’t exist yet!)