Updated XInclude for Saxon
A recent, but only tangentially related, stylesheet issue persuaded me that my XInclude implementation was flawed. I’ve fixed that.
I’ve just shipped version 5.0.0 of my Saxon XInclude processor in the usual ways. At the level of the stylesheet extension function, this is a drop-in replacement for earlier versions. It does behave differently in one case (see below), and the internal Java APIs have changed in some backwards-incompatible ways, so I bumped the major version number.
What’s different is the way documents that don’t declare a language are handled. Consider:
<doc xmlns:xi="http://www.w3.org/2001/XInclude"
xml:lang="en">
<xi:include href="xx.xml" fragid="element(/1/1)"/>
</doc>
where xx.xml is:
<chap><p>Something.</p></chap>
With “language fixup” enabled, previous versions of the extension function would produce:
<doc xmlns:xi="http://www.w3.org/2001/XInclude"
xml:lang="en">
<p>Something.</p>
</doc>
But I now think that’s wrong. To see why, consider this example:
<doc xmlns:xi="http://www.w3.org/2001/XInclude"
xml:lang="en">
<xi:include href="cy.xml" fragid="element(/1/1)"/>
</doc>
where cy.xml is:
<chap xml:lang="cy"><p>Rhywbeth.</p></chap>
Here, I hope, it’s uncontroversial that the correct result is:
<doc xmlns:xi="http://www.w3.org/2001/XInclude"
xml:lang="en">
<p xml:lang="cy">Rhywbeth.</p>
</doc>
The fact that the included elements came from a Welsh-language document has to be preserved. I have come to the conclusion that the absence of a language declaration must also be preserved: without it, the included paragraph would silently inherit “en” from its new parent. So the correct result for my first example is:
<doc xmlns:xi="http://www.w3.org/2001/XInclude"
xml:lang="en">
<p xml:lang="">Something.</p>
</doc>
I think that’s justified by the specification and it’s what version 5.0.0 produces.
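To make the rule concrete, here is a minimal Python sketch of the fixup as described above. It is not the Java code in the processor. The function names (effective_lang, fixup_language, include_first_child) are invented for illustration, the host document is hard-coded, and taking the first child of the root is only a stand-in for fragid="element(/1/1)". The case-insensitive comparison is my assumption about how language tags should be compared.

```python
import xml.etree.ElementTree as ET

# Clark-notation name that ElementTree uses for the xml:lang attribute.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"


def effective_lang(path):
    """Return the language in scope at the end of `path`.

    `path` is a list of elements from the root down to the element of
    interest. Returns None when no language is declared; xml:lang=""
    also counts as "no language".
    """
    for el in reversed(path):
        if XML_LANG in el.attrib:
            return el.attrib[XML_LANG] or None
    return None


def fixup_language(included, source_ancestors, target_path):
    """Pin the included element's original language (or lack of one).

    If the language the element had in its source document differs from
    what it would inherit at the include location, record it explicitly.
    A missing language is recorded as xml:lang="".
    """
    src = effective_lang(source_ancestors + [included])
    dst = effective_lang(target_path)
    # Assumption: language tags compare case-insensitively.
    if (src or "").lower() != (dst or "").lower():
        included.set(XML_LANG, src or "")
    return included


def include_first_child(included_xml, host_xml='<doc xml:lang="en"/>'):
    """Hypothetical helper: include /1/1 of `included_xml` into the host."""
    host = ET.fromstring(host_xml)
    root = ET.fromstring(included_xml)
    child = root[0]  # stands in for fragid="element(/1/1)"
    fixup_language(child, [root], [host])
    host.append(child)
    return host


if __name__ == "__main__":
    for src in ('<chap><p>Something.</p></chap>',
                '<chap xml:lang="cy"><p>Rhywbeth.</p></chap>'):
        print(ET.tostring(include_first_child(src), encoding="unicode"))
```

In this sketch the Welsh paragraph comes out with xml:lang="cy" and the undeclared one with xml:lang="", while an included element whose language already matches the include location is left without an added attribute.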
I will make the same change to my XProc step(s) when I next publish them (it’s essentially the same code base, but packaged a bit differently).