XProc tip: extracting a document

Volume 10, Issue 10; 02 Mar 2026

A bonus MarkupMonday post.

Suppose you have a document with this structure:

<outer-wrapper>
  <wrapper>
    <!-- comment -->
    <doc/>
  </wrapper>
</outer-wrapper>

And suppose you want to extract the content of wrapper as an XML document.

If you flex your XSLT muscles, it looks easy:

<p:with-input select="/outer-wrapper/wrapper/node()"/>

But the semantics of select in XProc are that it constructs a sequence of documents, one for each item selected. That’s usually what you want in XProc because documents flow between steps.

In this case, that means you get an initial text document containing the newline and spaces inside the wrapper, followed by a document containing only the comment (not well-formed XML, but not forbidden by the data model), followed by another text document, then the doc document, and finally a trailing text document for the whitespace before the closing tag.
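To see that sequence for yourself, a minimal sketch along these lines may help. It assumes an XProc 3.0 processor and that the sample document is supplied on the source port; the choice of p:identity and the port names are mine, purely for illustration.

```xml
<p:declare-step xmlns:p="http://www.w3.org/ns/xproc" version="3.0">
  <!-- Assumption: the outer-wrapper document arrives on this port. -->
  <p:input port="source"/>

  <!-- sequence="true" because the select below can yield several documents. -->
  <p:output port="result" sequence="true"/>

  <!-- p:identity just passes through whatever the select produces,
       so each selected node shows up as its own document. -->
  <p:identity>
    <p:with-input select="/outer-wrapper/wrapper/node()"/>
  </p:identity>
</p:declare-step>
```

How the individual text and comment documents are serialized on the result port depends on your processor.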

I scratched my head over this one for a few minutes. Eventually, I wound up with this:

<p:unwrap match="/wrapper">
  <p:with-input select="/outer-wrapper/wrapper"/>
</p:unwrap>

The select constructs a document whose root element is wrapper, and the p:unwrap step removes that element, leaving its contents in a single document.
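For context, here is roughly how that fragment might sit in a complete pipeline. This is a sketch, assuming XProc 3.0 and that the input document is supplied on the source port; the surrounding p:declare-step and its port names are illustrative, not part of the original snippet.

```xml
<p:declare-step xmlns:p="http://www.w3.org/ns/xproc" version="3.0">
  <!-- Assumption: the outer-wrapper document arrives on this port. -->
  <p:input port="source"/>
  <p:output port="result"/>

  <!-- The select yields one document rooted at wrapper;
       p:unwrap then drops that root element and keeps its children. -->
  <p:unwrap match="/wrapper">
    <p:with-input select="/outer-wrapper/wrapper"/>
  </p:unwrap>
</p:declare-step>
```

Because the select matches exactly one element here, the result port carries a single document and does not need sequence="true".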

Is there a better way?

#MarkupMonday #XProc
