Know when to give up

Volume 3, Issue 16; 01 Sep 2019

Published by Norman Walsh

Sometimes it’s quicker (and probably necessary) to start over.

Plan to throw one away; you will, anyhow.
— Fred Brooks

It’s interesting, though only tangential to the subject I have in mind, that Fred Brooks recanted the famous and oft-cited quotation that I’ve reproduced above. In the twentieth anniversary edition of The Mythical Man-Month (now almost 25 years old!), he says:

This I now perceived to be wrong, not because it is too radical, but because it is too simplistic. The biggest mistake in the “Build one to throw away” concept is that it implicitly assumes the classical sequential or waterfall model of software construction.

I think the Agile crowd would agree. A significant school of thought in modern software development favors incremental improvement and constant refactoring, often supported by test- or behavior-driven development. Go team!

That being said…

The impetus for this posting is some recent work on JAFPL, the core pipeline engine underneath my work on XML Calabash for XProc 3.0.

JAFPL is a (pipeline)-language-agnostic, data-format-agnostic data-flow-processing engine. It provides a bunch of primitives (inputs, outputs, loops, choices, exception handling, etc.). You construct a data flow graph with the API and you run that graph. Practically speaking, you have to implement some actual data processing steps too, but the goal is that it should be a useful engine for pipeline processing even if you don’t care about XProc (or even XML).
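To make "construct a graph, then run it" concrete, here is a toy sketch in Python. It is not JAFPL's API and it is not Scala: the `Graph` class, its `add_step`, `connect`, and `run` methods, and the three example steps are all hypothetical names invented for illustration. It runs each step once all of its inputs have arrived, in a single thread, with none of the loops, choices, or exception handling a real engine provides.

```python
from collections import defaultdict, deque


class Graph:
    """Toy data-flow graph: nodes are plain functions, edges carry results."""

    def __init__(self):
        self.steps = {}                   # step name -> callable
        self.edges = defaultdict(list)    # producer name -> consumer names
        self.indegree = defaultdict(int)  # step name -> number of incoming edges

    def add_step(self, name, func):
        self.steps[name] = func
        self.indegree[name] += 0          # register the node with no inputs yet

    def connect(self, source, target):
        if source not in self.steps or target not in self.steps:
            raise KeyError("connect() needs two steps that were already added")
        self.edges[source].append(target)
        self.indegree[target] += 1

    def run(self, **initial):
        """Run every step once all of its inputs are available."""
        inbox = defaultdict(list)
        for name, value in initial.items():
            inbox[name].append(value)     # external input for a source step
        remaining = dict(self.indegree)
        ready = deque(n for n, d in remaining.items() if d == 0)
        results = {}
        while ready:
            name = ready.popleft()
            results[name] = self.steps[name](*inbox[name])
            for target in self.edges[name]:
                inbox[target].append(results[name])
                remaining[target] -= 1
                if remaining[target] == 0:
                    ready.append(target)
        return results


# Hypothetical steps: split some text, then fan out to two consumers.
g = Graph()
g.add_step("source", lambda text: text.split())
g.add_step("upper", lambda words: [w.upper() for w in words])
g.add_step("count", lambda words: len(words))
g.connect("source", "upper")
g.connect("source", "count")
results = g.run(source="a b c")
```

The steps here know nothing about XML or about any pipeline language, which is the sense in which such an engine can be agnostic about both.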

XML Calabash interprets XProc 3.0 pipelines and constructs the appropriate graph. It also implements the XProc 3.0 data processing steps, so running the pipeline does what you expect.

My new implementation is written mostly in Scala. I’m still relying on Saxon, so I’m still on the JVM. I elected to use the Akka toolkit to implement the pipeline engine. Using actors and message passing seemed like a really solid foundation for building a multithreaded pipeline processor.

That initial JAFPL implementation was my first serious attempt to build a message passing system and my first exposure to the Akka framework.

In the couple of years since I started that work, I’ve fixed bugs, added features, and built an XProc implementation on top of it that passes about three quarters of the current test suite.

There were things I didn’t like about it. I’d implemented a kind of hack early on, in an attempt to work around some issues I was having with the sequencing of messages. Rewriting it was a task on the back burner; I wanted to get to parity with MorganaXProc in the test suite first!

And then, roughly a month ago, I turned my attention to this failing test: ab-for-each-003. In brief, the test loops over three paragraphs. In the loop, it processes the first and last paragraphs differently from the other paragraphs.
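The shape of that loop can be sketched in a few lines of Python. This is only an illustration of "treat the first and last items differently", not the test itself: the helper names and the upper-casing are made up. It does show one general wrinkle with position-aware loops: an item cannot be known to be the last one until the whole sequence has been seen, so the size has to be available before the loop body runs. Whether that was the cause of the failure described here is not something this sketch claims.

```python
def for_each_with_position(items):
    """Yield (position, size, item) for each item, with positions starting at 1."""
    items = list(items)  # the size is only known once the whole sequence has arrived
    size = len(items)
    for position, item in enumerate(items, start=1):
        yield position, size, item


def process(paragraphs):
    out = []
    for position, size, para in for_each_with_position(paragraphs):
        if position == 1 or position == size:
            out.append(para.upper())  # hypothetical treatment for first and last
        else:
            out.append(para)          # every other paragraph passes through
    return out


result = process(["one", "two", "three"])
```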

That test failed. And it failed badly. The gory details aren’t important: it was another message sequencing issue. After three utterly frustrating weeks of trying to make it work, I decided it couldn’t be done. I gave up—not on passing that test, but on the pipeline implementation, which I concluded was fatally flawed.

It took me about three days to rewrite it. Three. Days.

I had made bad design choices in my first attempt. I don’t say that pejoratively. Bad design choices are what happen when you try something new and you don’t really understand the consequences of the choices you’re making.

Making bad choices is how you learn to make better ones.

Experience, as the aphorism goes, is what you get when you were expecting something else. With a couple more years of experience under my belt, and three weeks of intense labor to hammer home the consequences of my prior bad choices, it was infinitely easier to start over than it ever would have been to do anything else.

A couple of things strike me, after the fact:

1. It was really difficult to see that I needed to give up.
2. At the point when I gave up, I had no idea how long it would take to rewrite. I was worried it would be weeks, at least.

To the first point, I’ve been reminded that it’s important to consider a problem at more than one scale: is the bug in this line? Is this function well implemented? Does this module make sense? Is the overall design fit for purpose?

To the second point: gather as much information as you can before you make estimates. A small amount of effort gathering information can really pay off.

In this case, once I’d given up, I sat down to sketch out a design that would work. I applied my better understanding of the system as a whole and my specific observations of some fundamentally insoluble problems in the current design, and made different design choices. Better ones, I’m sure. Good ones, I hope.

Gone is the hack I alluded to earlier. The new design distributes control across the graph (steps with subpipelines are directly responsible for their subpipelines) and introduces more messages to coordinate the sequencing.

Time will tell if these are also bad choices.

In the meantime, the core engine is a little under 3,000 lines of code and I’m still pleased and optimistic about this design. Oh, and it passes all of the unit tests, so I’m also confident that it’s at least as good as the previous implementation. (“Usually passes.” I’ve seen evidence of at least one more subtle timing bug.)

If I may be permitted to misquote Baz Luhrmann:

Write tests. If I could offer you only one tip for the future, more tests would be it. The long term benefits of tests have been proved by scientists whereas the rest of my advice has no basis more reliable than my own meandering experience.

I plan to write more about JAFPL and XProc 3.0 in the coming months.