Towards XML Resolver 2.0.0
I’ve pushed a snapshot release of XML Resolver 2.0.0.
This post has been replaced.
I’m having one of those rat-hole days that occur in programming every so often. I wanted to upgrade a thing, but that meant I had to upgrade another thing, where I found a bug, which led to another thing that I wanted to upgrade but where I discovered I hadn’t implemented a feature, which led, which led, which led…
Ultimately (I hope!), I wound up looking at XML Resolver where I wanted to add a small feature. And where I found a bunch of unfinished API work started seven months ago. ☹
I finished the API work, added my small feature, looked at the open issues list, and decided to close a couple of those while I was in the neighborhood. (This led me to RFC 2397…where I found a bug because there’s always a deeper rat hole.)
On the plus side, closing one of those issues, supporting classpath: URIs,
will (I think!) greatly simplify the next project up the stack
and may come in handy in Saxon. (I’m hoping to get the XML Resolver
library into the next major release of Saxon in place of the
decades-old Apache libraries.)
The API work I started back in October changes some of the
public-facing APIs. I’ve added a generic ResolverFeature type to make the
API a little more type safe. In practice, I haven’t changed any APIs
that I think are widely used, but in deference to the principle of
marking API changes with major version numbers, I’ve decided to call
the new version 2.0.0.
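To illustrate what a generic feature type buys you, here is a minimal sketch of the general pattern. It is not the XML Resolver API: the names TypedFeature and FeatureConfig, and the feature used in the usage note, are hypothetical and exist only for this example.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration only; these are NOT the XML Resolver classes.
// A feature key carries the type of its value as a type parameter.
final class TypedFeature<T> {
    final String name;
    final T defaultValue;

    TypedFeature(String name, T defaultValue) {
        this.name = name;
        this.defaultValue = defaultValue;
    }
}

// A tiny configuration store keyed by typed features.
final class FeatureConfig {
    private final Map<TypedFeature<?>, Object> values = new HashMap<>();

    // The compiler rejects a value whose type does not match the feature.
    <T> void set(TypedFeature<T> feature, T value) {
        values.put(feature, value);
    }

    @SuppressWarnings("unchecked")
    <T> T get(TypedFeature<T> feature) {
        Object value = values.get(feature);
        // The cast is safe because set() only stores a T under a TypedFeature<T>.
        return value == null ? feature.defaultValue : (T) value;
    }
}
```

With a key declared as TypedFeature<Boolean>, a call such as set(key, "yes") fails at compile time, and get(key) needs no cast at the call site.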
Given that I was making one potentially breaking change, I decided this was a good time to make another. The resolver class can be configured with either system properties or a properties file. In version 1.x, if a property file is used, the values specified in that file always take precedence.
This means that you can’t selectively override settings for a single
application by specifying a system property. That seems wrong. In 2.x,
if a property is specified in both places, the system property wins.
(I added a property, prefer-property-file, to preserve the former
behavior, but the new behavior is the default.)
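The precedence rule can be sketched in a few lines. This is only a model of the behavior described above, not the resolver's implementation; the class PrecedenceSketch and its lookup method are hypothetical names.

```java
// Hypothetical model of the precedence rule; not the resolver's own code.
final class PrecedenceSketch {
    /**
     * Chooses between a value from a system property and a value from the
     * property file. Either argument may be null when the setting is absent.
     */
    static String lookup(String systemValue, String fileValue, boolean preferPropertyFile) {
        if (systemValue == null) {
            return fileValue;   // only the file (or neither) specifies it
        }
        if (fileValue == null) {
            return systemValue; // only the system property specifies it
        }
        // Both specify it: 2.x lets the system property win unless the
        // 1.x behavior has been requested.
        return preferPropertyFile ? fileValue : systemValue;
    }
}
```

With preferPropertyFile set to false (the 2.x default), lookup("a.xml", "b.xml", false) yields "a.xml"; with true it yields "b.xml", matching 1.x.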
Closely related is the question of how to add additional catalogs for a specific project. Just because I want to add the DocBook stylesheets to the catalog for this set of transformations doesn’t mean I want to completely replace all of the catalogs you already have configured!
To make this easier, the 2.0 release adds a new system property,
xml.catalog.additions, and a new property file key, catalog-additions.
Both properties take a list of catalog files.
Those catalog files will be added to the list defined by the normal
catalog files properties.
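As a rough sketch of the idea, the effective catalog list is the normal list plus the additions. Two things here are assumptions rather than facts from this post: that the lists are semicolon-delimited, and that additions are appended after the normal entries. The class name CatalogListSketch is hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch; the delimiter and ordering are assumptions.
final class CatalogListSketch {
    /** Combines the normal catalog list with the additions list. */
    static List<String> effectiveCatalogs(String catalogFiles, String additions) {
        List<String> result = new ArrayList<>();
        addAll(result, catalogFiles);
        addAll(result, additions); // assumed: additions come after the normal list
        return result;
    }

    // Splits a semicolon-delimited list (assumed format), skipping blanks.
    private static void addAll(List<String> target, String list) {
        if (list == null) {
            return;
        }
        for (String item : list.split(";")) {
            String trimmed = item.trim();
            if (!trimmed.isEmpty()) {
                target.add(trimmed);
            }
        }
    }
}
```

The point is that the additions extend the configured list instead of replacing it, so a project can contribute its own catalog without knowing what the user already has set up.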
Finally, the two issues I closed were support for data: URIs and
support for classpath: URIs.
Data URIs are defined by RFC 2397. They let you “inline” some content directly in the URI. For example, this catalog entry:
<uri name="http://example.com/example.xml"
uri="data:application/xml;base64,PGRvYz5JIHdhcyBhIGRhdGEgVVJJPC9kb2M+Cg=="/>
maps the URI http://example.com/example.xml to a short XML
document defined by that data URI. (It’s <doc>I was a data URI</doc>.)
I’m not sure this is going to be very useful, but it wasn’t hard to do.
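To see what is inside a data URI like the one above, here is a minimal decoding sketch using the standard java.util.Base64 class. It handles only the base64 form and assumes the payload is UTF-8 text; it is not the resolver's implementation, and DataUriSketch is a hypothetical name.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

// Hypothetical sketch: decodes only base64 data URIs, assuming UTF-8 text.
final class DataUriSketch {
    static String decodeBase64DataUri(String uri) {
        if (!uri.startsWith("data:")) {
            throw new IllegalArgumentException("Not a data URI: " + uri);
        }
        // Everything before the first comma is the media type and flags;
        // everything after it is the payload.
        int comma = uri.indexOf(',');
        if (comma < 0) {
            throw new IllegalArgumentException("Missing comma in data URI");
        }
        String header = uri.substring("data:".length(), comma);
        if (!header.endsWith(";base64")) {
            throw new IllegalArgumentException("This sketch handles only base64 data URIs");
        }
        byte[] bytes = Base64.getDecoder().decode(uri.substring(comma + 1));
        return new String(bytes, StandardCharsets.UTF_8);
    }
}
```

Passing the URI from the catalog entry above returns the document text followed by a newline.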
Classpath URIs are more interesting. As near as I can tell, they’re defined somewhat informally by the Spring framework. The classpath in Java is a list of directories and/or JAR files. Subject to some constraints imposed by the class loader (which I’m not going to discuss), you can search and load files from those directories or JAR files (a JAR file is just a ZIP file for our purposes here).
What this means in practice is that you can ship a resource
in the JAR file with your application or library and then retrieve it
with a classpath: URI.
For example, this catalog entry:
<uri name="http://example.com/example.xml"
uri="classpath:path/example-doc.xml"/>
maps the URI http://example.com/example.xml to a document with the
path path/example-doc.xml on the classpath. Searches always begin at
the root of the classpath segments, so path/example-doc.xml and
/path/example-doc.xml are equivalent.
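One plausible way to locate such a resource uses the standard ClassLoader.getResource method, which returns the first match or null. This is a sketch, not the resolver's code: the class ClasspathUriSketch is hypothetical, and the choice of the thread context class loader is an assumption.

```java
import java.net.URL;

// Hypothetical sketch; the class loader choice is an assumption.
final class ClasspathUriSketch {
    private static final String SCHEME = "classpath:";

    /** Strips the scheme and any leading slashes to get the resource name. */
    static String resourcePath(String uri) {
        if (!uri.startsWith(SCHEME)) {
            throw new IllegalArgumentException("Not a classpath URI: " + uri);
        }
        String path = uri.substring(SCHEME.length());
        // ClassLoader resource names are always relative to the classpath
        // roots, so a leading slash carries no extra meaning here.
        while (path.startsWith("/")) {
            path = path.substring(1);
        }
        return path;
    }

    /** Returns the first matching resource on the classpath, or null. */
    static URL locate(String uri) {
        ClassLoader loader = Thread.currentThread().getContextClassLoader();
        if (loader == null) {
            loader = ClassLoader.getSystemClassLoader();
        }
        return loader.getResource(resourcePath(uri));
    }
}
```

Both classpath:path/example-doc.xml and classpath:/path/example-doc.xml reduce to the same resource name, which is why the two forms behave identically.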
The classpath: URI returns the first matching file and you (the
application writer) can’t control what else is on the classpath or
what order it’s in, so you’d be better off with a longer, more likely
unique name such as org/docbook/xslTNG/resources/catalog.xml.
(The resolver also supports classpath*: but since it’s defined as
concatenating the resources identified, it’s of comparatively little
use in the XML case.)
It is now possible to use classpath: URIs in the catalog. It is also
possible to use classpath: URIs in the catalog list.
This will, I hope, allow me to put a catalog file in the DocBook
xslTNG distribution which maps URIs to other resources in the
distribution jar file. I will then use xml.catalog.additions to point
to that catalog file in projects that use DocBook and everything will
“just work”.
We shall see, as I climb my way back up out of this maze of rat holes.