so…
Norm's musings. Make of them what you will.
https://so.nwalsh.com/feed/fulltext.xml
Norman Walsh

Releases. Lots of releases.
https://so.nwalsh.com/2021/06/18-release
2021-06-18T17:10:37.276Z

Release the Krak…all the things! I’ve been pushing a bunch of related, if not exactly dependent, projects forward. I think I’ve pushed new releases of all of them now.

Volume 5, Issue 6; 18 Jun 2021

org.xmlresolver:xmlresolver:3.0.1beta3

The centerpiece is the work I’ve been doing on the XML Resolver. I summarized that work recently so I won’t repeat it here. The resolver has hundreds of unit tests, but writing a sample application functioned as a good integration test and flushed out a few more bugs. I’m pretty confident of the current beta release, but I’d love to hear from someone else who’s tried it.

XML Resolver SampleApp version 3.0.1beta3

I released the sample application that you can use to try out the new features. It will do well-formed or validating parsing, RELAX NG and/or XML Schema validation, and XSLT transformations. In any combination you’d like.

The distribution is a self-contained, complete application and the project README file includes a full set of sample recipes to try out. Suggestions for more most welcome.

https://xmlresolver.org/

I’ve updated the XML Resolver website with new documentation for the 3.x release. At the moment, this includes updated documentation about the features of the resolver and the JavaDoc for the source code.

I haven’t yet tried to rework the various sources of information about catalog resolvers in general, but it’s on my list.

org.docbook:schemas-docbook:5.2b10a4

A few years ago, I pulled together a Maven release of DocBook 5.1. I’m not actually sure how I did that; I’ve learned a lot about Maven and Gradle since then and that may have been a one-off. In response to a bug report and also because I wanted to provide a catalog file that the XML Resolver could find automatically, I reworked the build process to produce a Maven release of DocBook 5.2b10a4. (That’s the current test release for the latest beta release of what will be DocBook 5.2.)

Note that I’ve changed the artifact ID; this was the recommended approach because I changed the APIs.

Gradle RELAX NG validate and translate plugins

In order to build and publish a DocBook schema release, there are a bunch of transformations that have to be done, some XSLT, some Trang, and a bunch of validations with Jing. That used to be done with a nice XProc pipeline, but I haven’t got my XML Calabash 3.0 release far enough along yet so I punted back to doing them the hard way.

But the hard way (running each process as a JavaExec task) was slow and inconvenient.

I learned a lot about writing Gradle extensions from reviewing (and patching) Eero Helenius’ saxon-gradle plugin. I thought I could adapt the ideas in that plugin to do plugins for Jing and Trang. Along the way, I also worked out how to call the Jing APIs directly instead of the now moribund “Multi-Schema Validator” written back in the Sun Microsystems days by Kohsuke Kawaguchi of Jenkins fame.

Aside from some weirdness with a lot of tasks running in parallel, they work really well and have made the whole build much faster.

DocBook xslTNG 1.5.0

The XML Resolver isn’t just useful for validation; it’s useful for anything that reads URIs, including stylesheets. I wanted a stylesheet release that would include catalogs that could be found automatically, so I reworked the DocBook xslTNG release. Along the way, I fixed a few bugs and implemented a feature I saw in the online JATS documentation.

DocBook: The Definitive Guide, 15 June 2021 (I think I forgot to bump the version number.)

When Tommie spoke at MarkupUK about tag sets, I noticed that the marginal table of contents on the JATS documentation included a search feature. (This wasn’t relevant to Tommie’s excellent talk, it’s just something I happened to notice.) I didn’t see any reason why the xslTNG marginal table of contents (the “persistent ToC”) couldn’t have that too!

So I implemented it and rebuilt The Definitive Guide. The persistent ToC button doesn’t appear on the home page for the guide. I’m not sure if that’s a bug or not.

Gradle saxon-xslt plugin

I’ve also published my fork of Eero’s Saxon XSLT plugin. I’m still hoping this is a transient plugin and that my patches back to the upstream project will be accepted. But I wanted to be able to use my version directly in build scripts without special casing, so I pushed it to the plugin registry.

Towards XML Resolver 3.0.0!
https://so.nwalsh.com/2021/06/03-xml-resolver
2021-06-03T18:36:33.976Z

I’ve pushed a snapshot release of XML Resolver 3.0.0. No, really, I actually mean it.

Volume 5, Issue 5; 03 Jun 2021

Shortly after I did the 2.0.0 release, I was motivated to do a bunch more work on the XML Resolver. (This is partly in support of a couple of projects for my day job; more about those in the near future, I hope.)

I did a serious clean up of the way catalog files are actually managed. The “clever idea” I had way back when, when I forked from the Apache resolver: just load the catalogs as XML DOM instances and navigate around to find catalog entries, was not, actually, I think, very clever. So I’ve replaced that with a proper back end data structure.

I could almost make that work without changing the public API, but it was kind of lame. Instead, I’m just going to take the hit and admit that my 2.0.0 release was premature. There’s nothing wrong with it, but I’ve changed the API again. I thought about just sticking with 2.x and making the breaking change in 2.2.0 (who, if anyone, would notice?) but that doesn’t sit well philosophically and there are a lot of integers. What’s another one between friends?

Once I started writing tests for the new release, I decided I wanted to write a “getting started” repository to demonstrate the new features. And once I started doing that, I got all sorts of ideas for things that should be possible. Most of them were easily supported by the new data structures, so I feel pretty good about everything, really.

Here’s what’s new, in a nutshell. (These are all features; you can disable them if you wish.)

Loading catalogs from the classpath. The work I did on classpath: and jar: URIs meant it became easy to package up some schemas, like DocBook say, in a JAR file, stick them on the classpath, and point to a catalog in that jar file. No more unpacking schema distributions, just point to the resources in the JAR file!

But then, I thought, if you can point into the JAR file, what about if the resolver just automatically found the catalog? Stick the JAR file on your classpath (e.g., declare a dependency in your build tool) and you’re done. Can it all just work, fast and seamlessly?

Yes, I think it can. The XML Resolver now automatically adds any file with the name /org/xmlresolver/catalog.xml on your classpath to the end of your catalog list.
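To make the idea concrete, here is a minimal sketch of how catalogs with a well-known name can be discovered on the classpath using only the JDK. It is not the resolver's actual code; the class name `ClasspathCatalogs` and the method `find` are hypothetical, and the only detail taken from the text above is the resource name.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.URL;
import java.util.Collections;
import java.util.List;

// Hypothetical helper, not part of the XML Resolver API.
final class ClasspathCatalogs {
    // ClassLoader resource names are written without a leading slash.
    static final String CATALOG_RESOURCE = "org/xmlresolver/catalog.xml";

    /** Returns every classpath resource with the given name, one URL per JAR or directory. */
    static List<URL> find(String resourceName) {
        ClassLoader loader = Thread.currentThread().getContextClassLoader();
        if (loader == null) {
            loader = ClassLoader.getSystemClassLoader();
        }
        try {
            return Collections.list(loader.getResources(resourceName));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Each URL found here could be appended to the end of a catalog list.
        for (URL url : find(CATALOG_RESOURCE)) {
            System.out.println(url);
        }
    }
}
```

With a schema JAR on the classpath that ships such a file, the loop would print a jar: URL pointing into that JAR; with none, it prints nothing.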

Compare http: and https: catalog entries transparently. The web used to be http:, then the villains moved in and we all switched to https:. Trouble is, it’s easy to copy and paste the old http: URIs in catalogs and it’s easy to copy and paste the new https: URIs into documents.

For a few years, I’ve been conscientiously creating catalog entries for both:

<uri name="http://example.com/thing" uri="/path/to/thing"/>
<uri name="https://example.com/thing" uri="/path/to/thing"/>

That’s just annoying. And not a good general solution anyway since there are plenty of read-only catalogs around (on the web, published in standards, etc.) that only use the http: URIs.

So now I just ignore the distinction in uri comparisons in catalogs. Yes, it’s technically possible for those to be different documents (and if you’ve done that, you can turn this feature off!), but it’s overwhelmingly the case that they’re just aliases and that http: redirects to https: anyway.

I want to be clear: this has no impact on the actual retrieval of documents. This is just about how the system identifier or URI in your document is compared against the system identifier or URI entry in the catalog.
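One way to picture the comparison is to normalize both sides before comparing them. This is only a sketch of the idea, assuming a simple string-based match; the class and method names are hypothetical and the resolver's real matching logic may well differ.

```java
// Hypothetical illustration of scheme-insensitive matching; not the resolver's code.
final class SchemeInsensitiveCompare {
    /** Rewrites a leading http: or https: (any case) to https: and leaves other URIs alone. */
    static String normalize(String uri) {
        if (uri.regionMatches(true, 0, "https:", 0, 6)) {
            return "https:" + uri.substring(6);
        }
        if (uri.regionMatches(true, 0, "http:", 0, 5)) {
            return "https:" + uri.substring(5);
        }
        return uri;
    }

    /** True if the two URIs are equal once the http/https distinction is ignored. */
    static boolean sameForCatalog(String documentUri, String catalogUri) {
        return normalize(documentUri).equals(normalize(catalogUri));
    }

    public static void main(String[] args) {
        // Only the comparison is affected; nothing is fetched or rewritten on the wire.
        System.out.println(sameForCatalog("http://example.com/thing", "https://example.com/thing"));
    }
}
```

A single catalog entry for either scheme would then match a reference that uses the other one.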

Mask jar URIs. Most entity resolver APIs are defined to say that if resolution succeeds, the base URI of the resource returned is the base URI of the actual, local resource. This greatly simplifies things because subsequent relative URIs can be resolved against the local resource directly.

Resolve http://example.com/thing to /path/to/thing. Now if thing makes a relative URI reference to otherthing, that gets resolved to /path/to/otherthing automatically, no catalog entry required.

However, the Java URI class does not treat jar: or classpath: URI schemes as hierarchical, so any subsequent attempts to resolve relative URIs will fail. To fix this, the XML Resolver lies. If the URI is a jar: or classpath: URI, it returns the locally resolved resource, but leaves the base URI unchanged.

Marginal note: Even if java.net.URI did treat them as hierarchical, I’m not sure the relevant RFCs support resolution of jar: URIs in the way that you need for them to work as hierarchical URIs anyway. No hating on java.net.URI here.
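The behaviour of java.net.URI is easy to see in isolation. The sketch below uses only the JDK; the JAR path is made up for illustration, and `reportedBaseUri` is a hypothetical helper that mirrors the masking rule described above rather than the resolver's own implementation.

```java
import java.net.URI;

// Hypothetical illustration; the JAR path and helper name are assumptions.
final class JarUriMasking {
    /** Keeps the requested URI as the base when the local copy is a jar: or classpath: URI. */
    static URI reportedBaseUri(URI requested, URI local) {
        String scheme = local.getScheme();
        boolean masked = "jar".equalsIgnoreCase(scheme) || "classpath".equalsIgnoreCase(scheme);
        return masked ? requested : local;
    }

    public static void main(String[] args) {
        URI requested = URI.create("http://example.com/thing");
        URI onDisk = URI.create("file:/path/to/thing");
        URI inJar = URI.create("jar:file:/tmp/schemas.jar!/org/example/thing");

        // A hierarchical base resolves relative references as expected.
        System.out.println(onDisk.resolve("otherthing")); // file:/path/to/otherthing

        // A jar: URI is opaque to java.net.URI, so the relative reference comes back unchanged.
        System.out.println(inJar.isOpaque());            // true
        System.out.println(inJar.resolve("otherthing")); // otherthing

        // Masking: report the original URI as the base so relative references still resolve.
        System.out.println(reportedBaseUri(requested, inJar).resolve("otherthing"));
    }
}
```

The last line prints http://example.com/otherthing, which is why the catalog then needs to cover that URI as well.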

This does mean that you’ll need a more complete catalog. If you want the relative reference to otherthing to work, you’ll have to have a catalog entry for http://example.com/otherthing because that’s what the process will attempt to retrieve. (In most cases, I’ve found that the “rewrite” catalog entry makes this pretty easy.)

Support alternate catalog loaders. By design, the resolver doesn’t report errors or raise exceptions for invalid or missing catalogs. You don’t want your app crashing in production because someone made a typo in a catalog.

On the other hand, it’s really easy to make typos and telling folks they ought to validate the catalogs they publish only gets you so far. Now there’s a property you can set which tells the resolver to use a validating loader. It raises an exception if there’s an error. That’s the first thing to try if you think catalog resolution isn’t working!
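The difference between the two loading styles can be sketched with the JDK's own XML parser. This is an illustration of the design trade-off only: the class and method names are hypothetical, the check is for well-formedness rather than validity against a catalog schema, and none of this is the resolver's loader API or its property name.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

// Hypothetical sketch of "strict" versus "lenient" catalog loading.
final class LoaderSketch {
    /** Parses the catalog text and returns the number of uri entries; throws if it cannot be parsed. */
    static int loadStrict(String catalogXml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            DocumentBuilder builder = factory.newDocumentBuilder();
            Document doc = builder.parse(new InputSource(new StringReader(catalogXml)));
            return doc.getElementsByTagNameNS("*", "uri").getLength();
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalStateException("Catalog could not be loaded", e);
        }
    }

    /** Same as loadStrict, but a broken catalog is treated as an empty one. */
    static int loadLenient(String catalogXml) {
        try {
            return loadStrict(catalogXml);
        } catch (IllegalStateException e) {
            return 0; // production-friendly: a typo in a catalog does not crash the application
        }
    }
}
```

The lenient style suits production, while the strict style is the one to reach for when a catalog entry mysteriously fails to match.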

(I’ve also reworked the logging so that it’s easier to get logs out and the log messages are, I hope, a little clearer about what the resolver looked for in the catalog and what it decided to return.)

Actually get the RDDL document parsing right. I don’t know why I care. I don’t think RDDL ever got that widely deployed, but I think it’s a neat idea. I wrote more tests and fixed more bugs. I think it actually works now, for what it’s worth.

More tests. There are now seven hundred some odd tests instead of, I dunno, eleven or something. I’m a lot more confident that this release is doing the right thing. And the “getting started” repository functions as a nice set of integration tests.

The XML Resolver 3.0.0 release includes a “data” JAR file that provides a lot of common W3C resources. It’s a separate JAR file so that you don’t have to download it or put it on your class path, but if you do, it’ll just work automatically. This should make it really easy to avoid the ten second delay imposed by www.w3.org if you attempt to get popular DTDs or schemas from there. (This is, not coincidentally, the same, or a very small superset, of the W3C resources that have historically been included with Saxon.)

As I said, I’m really pleased with how this has come together. I’m working on a new DocBook schemas release to take advantage of these features and finishing up the “getting started” repository that lets you play with it in a “real” application.

Here’s hoping it makes things easier for you!