NineML tools version 2.0.3
Hot on the heels of the first 2.x release of my Invisible XML tool suite, I’ve pushed a small update.
CoffeeFilter, CoffeePot, and CoffeeSacks versions 2.0.3 have been published. There are two small fixes in here.
First, a byte-order-mark (BOM) on UTF-8 inputs is ignored by default. The BOM is unnecessary on UTF-8 input and is not recommended by Unicode. Unfortunately, some Windows systems include it by default. Dealing with the BOM in a grammar is tedious and ugly, so NineML just ignores it unless you tell it not to.
Note that the BOM, U+FEFF, is a “zero width no-break space” if it occurs anywhere other than at the beginning of the file, so it can’t just be discarded wherever it occurs. If you concatenate several BOM-encumbered UTF-8 files with a program that doesn’t recognize the leading BOMs and remove them, you’ll be stuck with them in the middle of your file.
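To make the distinction concrete, here is a minimal Python sketch of the general idea. It is not NineML's code; the helper names `strip_leading_bom` and `naive_concat` are made up for illustration. It removes U+FEFF only when it is the very first character and shows what a naive concatenation leaves behind.

```python
BOM = "\ufeff"  # U+FEFF: a BOM at the start of input, a zero width no-break space elsewhere


def strip_leading_bom(text: str) -> str:
    """Drop one U+FEFF, but only if it is the first character of the input."""
    return text[1:] if text.startswith(BOM) else text


def naive_concat(files: list[bytes]) -> str:
    """Join raw UTF-8 byte strings without looking at their BOMs (hypothetical helper)."""
    # Decoding with "utf-8" (not "utf-8-sig") keeps every BOM as a U+FEFF character.
    return b"".join(files).decode("utf-8")


# Two UTF-8 "files", each saved with a leading BOM.
first = "\ufefffirst\n".encode("utf-8")
second = "\ufeffsecond\n".encode("utf-8")

joined = naive_concat([first, second])
cleaned = strip_leading_bom(joined)

# The leading BOM is gone, but the one that began the second file is now
# an ordinary character in the middle of the text.
print(repr(cleaned))
```

Only the first U+FEFF is safe to drop; the second one is now part of the content, and a grammar would have to account for it.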
Second, the serialization of Invisible XML documents has been changed to align with XSLT and XQuery Serialization 3.1. This means that control characters will be escaped if they occur in the output. This is not a universally good thing, but it appears to be necessary.
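As a rough illustration of what escaping control characters means, here is a small Python sketch. It is not the NineML serializer, and the exact set of characters is my assumption: C0 controls other than tab, line feed, and carriage return, plus U+007F through U+009F, written as hexadecimal character references. The function name `escape_controls` is hypothetical.

```python
def escape_controls(text: str) -> str:
    """Replace control characters with numeric character references.

    Assumed set (not taken from NineML): U+0001..U+001F except tab, LF,
    and CR, plus U+007F..U+009F. U+0000 is left alone here because it
    cannot be represented in XML at all.
    """
    out = []
    for ch in text:
        cp = ord(ch)
        is_c0 = 0x01 <= cp < 0x20 and ch not in "\t\n\r"
        is_del_or_c1 = 0x7F <= cp <= 0x9F
        if is_c0 or is_del_or_c1:
            out.append(f"&#x{cp:X};")  # e.g. U+0085 becomes &#x85;
        else:
            out.append(ch)
    return "".join(out)


sample = "bell\x07 and next-line\x85"
escaped = escape_controls(sample)
print(escaped)
```

The output is plain ASCII here, so the control characters survive a round trip through tools that would otherwise mangle or reject them. The cost is that the serialized text no longer contains the literal characters.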
Hat tip to Gunther Rademacher for pointing out this problem.
Share and enjoy.