XProc tip: extracting a document

Volume 10, Issue 10; 02 Mar 2026

A bonus MarkupMonday post.

Suppose you have a document with this structure:

<outer-wrapper>
  <wrapper>
    <!-- comment -->
    <doc/>
  </wrapper>
</outer-wrapper>

And suppose you want to extract the content of wrapper as an XML document.

If you flex your XSLT muscles, it looks easy:

<p:with-input select="/outer-wrapper/wrapper/node()"/>

But the semantics of select in XProc are that it constructs a sequence of documents, one for each item selected. That’s usually what you want in XProc because documents flow between steps.

In this case, that means you get an initial text document containing the newline and spaces inside the wrapper, followed by a document containing only a comment (not well-formed XML, but not forbidden by the data model), followed by another text document, then the doc document, and finally a trailing text document.
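One way to see the splitting for yourself is to count what arrives on the port. This is only a sketch, assuming an XProc 3.0 processor and that the sample document above is supplied on the pipeline's source port; p:count is the standard step that reports how many documents it received.

<p:declare-step xmlns:p="http://www.w3.org/ns/xproc" version="3.0">
  <!-- Assumption: the sample document is passed in on this port -->
  <p:input port="source"/>
  <p:output port="result"/>

  <!-- Each node selected becomes its own document, so p:count
       receives a sequence rather than a single document -->
  <p:count>
    <p:with-input select="/outer-wrapper/wrapper/node()"/>
  </p:count>
</p:declare-step>

The result is a c:result document holding the number of documents the select produced.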

I scratched my head over this one for a few minutes. Eventually, I wound up with this:

<p:unwrap match="/wrapper">
  <p:with-input select="/outer-wrapper/wrapper"/>
</p:unwrap>

The select constructs a document with the root element wrapper and the p:unwrap step removes it, leaving the contents of the element in a single document.
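For completeness, here is that fragment dropped into a whole pipeline. Treat it as a sketch rather than the definitive form: it assumes XProc 3.0, and the inline default on the source port is only there so the example can be run without an external file.

<p:declare-step xmlns:p="http://www.w3.org/ns/xproc" version="3.0">
  <!-- Assumption: the sample document is inlined as the default input -->
  <p:input port="source">
    <p:inline><outer-wrapper>
  <wrapper>
    <!-- comment -->
    <doc/>
  </wrapper>
</outer-wrapper></p:inline>
  </p:input>
  <p:output port="result"/>

  <!-- The select yields one document rooted at wrapper;
       p:unwrap then removes that root and keeps its children -->
  <p:unwrap match="/wrapper">
    <p:with-input select="/outer-wrapper/wrapper"/>
  </p:unwrap>
</p:declare-step>

Because the select matches exactly one element, only one document reaches p:unwrap, and the comment, the doc element, and the surrounding whitespace all stay together in its output.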

Is there a better way?

#MarkupMonday #XProc