XProc tip: extracting a document
A bonus MarkupMonday post.
Suppose you have a document with this structure:
<outer-wrapper>
  <wrapper>
    <!-- comment -->
    <doc/>
  </wrapper>
</outer-wrapper>
And suppose you want to extract the content of wrapper as an XML document.
If you flex your XSLT muscles, it looks easy:
<p:with-input select="/outer-wrapper/wrapper/node()"/>
But the semantics of select in XProc are that it constructs a sequence of
documents, one for each item selected. That’s usually what you want in XProc
because documents flow between steps.
In this case, that means you get an initial text document containing the newline
and spaces inside the wrapper, followed by a document containing only a comment
(not well-formed XML, but not forbidden by the data model), followed by another
text document, then the doc document, and finally one more whitespace-only text
document.
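One way to see the sequence for yourself is to count the documents that the select produces. The pipeline below is only a sketch: it assumes an XProc 3.0 processor, inlines the sample document so that it is self-contained, and uses the standard p:count step. If the processor preserves the whitespace inside p:inline, I would expect the count to be five (one document per child node of wrapper, whitespace text nodes included), but check it against your own processor.

```xml
<p:declare-step xmlns:p="http://www.w3.org/ns/xproc" version="3.0">
  <p:output port="result"/>

  <!-- Count the documents created by the select expression. -->
  <p:count>
    <p:with-input select="/outer-wrapper/wrapper/node()">
      <p:inline>
        <outer-wrapper>
          <wrapper>
            <!-- comment -->
            <doc/>
          </wrapper>
        </outer-wrapper>
      </p:inline>
    </p:with-input>
  </p:count>
</p:declare-step>
```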
I scratched my head over this one for a few minutes. Eventually, I wound up with this:
<p:unwrap match="/wrapper">
  <p:with-input select="/outer-wrapper/wrapper"/>
</p:unwrap>
The select constructs a document with the root element wrapper and
the p:unwrap step removes that element, leaving its contents in a single
document.
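For anyone who wants to try this, here is the same step wrapped in a complete pipeline. Treat it as a minimal sketch rather than a recommended pattern: it assumes an XProc 3.0 processor, and the inline copy of the sample document stands in for whatever really feeds the step.

```xml
<p:declare-step xmlns:p="http://www.w3.org/ns/xproc" version="3.0">
  <p:output port="result"/>

  <!-- The select narrows the input to a document rooted at wrapper;
       p:unwrap then removes that root element and keeps its children. -->
  <p:unwrap match="/wrapper">
    <p:with-input select="/outer-wrapper/wrapper">
      <p:inline>
        <outer-wrapper>
          <wrapper>
            <!-- comment -->
            <doc/>
          </wrapper>
        </outer-wrapper>
      </p:inline>
    </p:with-input>
  </p:unwrap>
</p:declare-step>
```

The result port should carry a single document holding the comment and the doc element, along with the surrounding whitespace.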
Is there a better way?