Documents that format themselves

Volume 4, Issue 33; 05 Jul 2020

The xsl:evaluate instruction is interesting.

I won’t bury the lede:

<xsl:template match="processing-instruction('eval')">
  <xsl:evaluate xpath="string(.)"
                context-item="."
                namespace-context="."/>
</xsl:template>

What happened was, I was writing a template to support a processing instruction when I had a thought. My thought was the template above. Maybe everyone else has already had this thought and I’m just slow on the uptake.

Having written that template, I immediately wrote this paragraph into the document I was writing:

<para>This is paragraph number:
<?eval count(preceding::*:para intersect ../..//*:para)?>.</para>

It formatted as

This is paragraph number: 14.

I don’t know what to think about that.

Don’t get me wrong, I’m a fan of xsl:evaluate and I’ve used it elsewhere to great effect. Consider, for example, the task of breaking a large document up into a set of web pages. Before the evaluate instruction, that was controlled by four or five different parameters and if you wanted any exceptions, you had to roll up your sleeves and write some XSLT code. With the evaluate instruction, it’s a couple of short lists of XPath expressions: make anything that matches one of these expressions into a separate page, unless it also matches one of these other expressions, in which case don’t.
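
As a rough sketch of that idea (not the actual stylesheet code), the two lists could be string parameters whose members are evaluated against each candidate element. Everything named here is hypothetical: the `chunk-include` and `chunk-exclude` parameters, their default expressions, the `f:` namespace, and the `chunk` marker attribute are assumptions made for illustration. It also assumes an XSLT 3.0 processor with xsl:evaluate enabled.

```xml
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                xmlns:xs="http://www.w3.org/2001/XMLSchema"
                xmlns:f="https://example.com/ns/functions"
                exclude-result-prefixes="xs f"
                version="3.0">

  <!-- Hypothetical parameters: each string is an XPath expression that is
       evaluated with a candidate element as the context item. -->
  <xsl:param name="chunk-include" as="xs:string*"
             select="('self::*:chapter',
                      'self::*:appendix',
                      'self::*:section[parent::*:chapter]')"/>
  <xsl:param name="chunk-exclude" as="xs:string*"
             select="('self::*[@role = ''nochunk'']')"/>

  <!-- True if at least one expression, evaluated against $node, has an
       effective boolean value of true (for example, it selects a node). -->
  <xsl:function name="f:matches-any" as="xs:boolean">
    <xsl:param name="node" as="element()"/>
    <xsl:param name="expressions" as="xs:string*"/>
    <xsl:variable name="results" as="xs:boolean*">
      <xsl:for-each select="$expressions">
        <!-- Inside the loop, "." is the current expression string. -->
        <xsl:variable name="hit" as="item()*">
          <xsl:evaluate xpath="." context-item="$node"/>
        </xsl:variable>
        <xsl:sequence select="boolean($hit)"/>
      </xsl:for-each>
    </xsl:variable>
    <xsl:sequence select="$results = true()"/>
  </xsl:function>

  <!-- A page boundary: matches an include expression and no exclude one. -->
  <xsl:function name="f:is-chunk" as="xs:boolean">
    <xsl:param name="node" as="element()"/>
    <xsl:sequence select="f:matches-any($node, $chunk-include)
                          and not(f:matches-any($node, $chunk-exclude))"/>
  </xsl:function>

  <!-- Usage sketch: copy the document, marking elements that would
       become separate pages. -->
  <xsl:mode on-no-match="shallow-copy"/>

  <xsl:template match="*[f:is-chunk(.)]">
    <xsl:copy>
      <xsl:attribute name="chunk" select="'true'"/>
      <xsl:apply-templates select="@*, node()"/>
    </xsl:copy>
  </xsl:template>

</xsl:stylesheet>
```

With these defaults, a section directly inside a chapter would be marked as a page unless it carries role="nochunk". Changing the rules means changing the lists, not the templates. The expressions use the `*:` wildcard so that no namespace context has to be passed to xsl:evaluate.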

But just “run this arbitrary XPath expression” feels different.

This problem is hardly new or unique to XSLT. It occurs in most languages with an eval instruction, and most (dynamic) languages have, or eventually acquire, an eval function.

I guess it’s the way that the eval processing instruction turns ordinary documents into programs that I find particularly interesting. I don’t tend to think of my documents as programs. But they’re markup, of course, so they can be anything I want them to be.

Comments

If the document can contain arbitrary expressions, and the stylesheet evaluates them, does this expose the stylesheet to attack?

Posted by Simon Dew on 06 Jul 2020 @ 09:47 UTC

Perhaps. You can “only” evaluate XPath expressions, so you can't do completely arbitrary things. And you can't construct nodes. Well, except that you might be able to call extension functions, at which point all bets are off.
