Random musings on non-XML pipelines

Volume 1, Issue 16; 31 Oct 2017

More than 140 (or even 280) characters about pipelines in text or graphs.

Yesterday, I tweeted about the fact that I have (still!) been thinking about non-XML syntaxes for XProc. In particular, about a text-based non-XML syntax, but also about the possibility of building a pipeline from a more explicit description of the graph. (Pipelines are directed acyclic graphs.) Graph description is something that RDF (or linked data if you prefer) is good at.
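To make the graph framing a little more concrete, here is a minimal Python sketch (not XProc, and not anything from the tweets). It describes a pipeline as a mapping from each step to the steps it reads from, then asks the standard library's graphlib module (Python 3.9 or later) for an order in which the steps could run. The step names are hypothetical, picked only for illustration.

```python
from graphlib import TopologicalSorter

# Hypothetical steps: each key reads from the steps listed in its set.
pipeline = {
    "load": set(),
    "validate": {"load"},
    "transform": {"load"},
    "store": {"validate", "transform"},
}


def execution_order(graph):
    """Return the steps in an order where every step follows its inputs.

    Raises graphlib.CycleError if the description is not acyclic.
    """
    return list(TopologicalSorter(graph).static_order())


order = execution_order(pipeline)
```

Because the graph is acyclic, at least one such order always exists; a description with a cycle is rejected rather than silently accepted.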

I followed up with a 140-character tweet containing a (working!) example of my current text syntax.

pipeline {
    identity {
      source from: "pipe.xpl"
    }
}

There followed some brief discussion about the relative difficulty of doing it well. In particular, Phil suggested that the linked data version would be hard work. Phil knows a whole lot more about linked data than I do (he has chaired the linked data course at XML Summer School!) but I can’t resist a little doodling.

I think what follows is plausibly an RDF version of the pipeline above, expressed in Turtle:

@prefix : <> .
@prefix s: <> .
@prefix p: <> .

:pipeline a s:Pipeline .

:pipeline s:outputs [ a s:Port ;
        s:name "result" ] .

:identity_a a s:AtomicStep ;
    s:name "a" ;
    s:type p:identity ;
    s:with-input [ s:port "source" ;
        s:from [ a s:Document ;
            s:uri "pipe.xpl" ] ] .

:pipeline s:pipeline ( :identity_a ) .

(It does not, alas, fit in 140 or even 280 characters. Quite.)

It’s obviously going to get much more complicated as I’ve expressed only a tiny slice of the XProc semantics. But it doesn’t look intractable. I don’t think it’d be easy to author pipelines by hand by describing the graph, but it might be a useful intermediate form. Got some crazy idea for a pipeline syntax but don’t feel like writing an LL or LR parser for it and integrating that into the pipeline engine? [Insert unsolicited advertisement for REx Parser Generator here —ed] Write a transformation that expresses the graph behind your syntax and run that!
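As a rough illustration of that last idea, here is a small Python sketch that turns an in-memory description of one atomic step into Turtle. It assumes the doodled vocabulary used earlier in this post (s:AtomicStep, s:name, s:type, s:with-input, s:port, s:from, s:Document, s:uri), which is not a real, published ontology. The function names and the shape of the inputs mapping are hypothetical, and nothing checks that the step name or type is a legal Turtle local name.

```python
def turtle_string(text):
    """Quote a Python string as a Turtle string literal (basic escapes only)."""
    return '"' + text.replace("\\", "\\\\").replace('"', '\\"') + '"'


def step_to_turtle(name, step_type, inputs):
    """Serialize one atomic step; `inputs` maps port names to document URIs."""
    subject = f":{step_type}_{name}"
    properties = [
        "a s:AtomicStep",
        f"s:name {turtle_string(name)}",
        f"s:type p:{step_type}",
    ]
    for port, uri in inputs.items():
        # Each binding becomes a nested blank node, as in the hand-written version.
        properties.append(
            f"s:with-input [ s:port {turtle_string(port)} ;\n"
            f"        s:from [ a s:Document ;\n"
            f"            s:uri {turtle_string(uri)} ] ]"
        )
    return subject + " " + " ;\n    ".join(properties) + " ."


turtle = step_to_turtle("a", "identity", {"source": "pipe.xpl"})
print(turtle)
```

For the identity step in this post, the result should match the hand-written :identity_a description. A front end for some new syntax would only need to build the name, type, and inputs values and leave the rest to the graph consumer.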

Anyway. More than 280 characters on the topic. And archived on the open, public web.

Oh, by the way, here’s an XProc 3.0 version of the pipeline we’ve been examining:

<p:declare-step xmlns:p="">
  <p:output port="result"/>
  <p:identity name="a">
    <p:with-input port="source" href="pipe.xpl"/>
  </p:identity>
</p:declare-step>

If you haven’t been following along, you’ll see that we’ve separated input declaration from binding using input/with-input, and that href on the input is a shortcut for a nested p:document.

And if you haven’t been following along, please do!