Printing with CSS

Volume 6, Issue 7; 25 Jul 2022

I’ve tried to improve the CSS print output from the DocBook xslTNG stylesheets. Getting the final PDFs with open-source tools is still a challenge.

Yesterday, I released version 1.8.0 of the DocBook xslTNG Stylesheets. There are some bug fixes in the release, but I also spent time reworking how the CSS stylesheet links are managed and trying to improve the print output (hence 1.8.0 and not 1.7.2).

There are about 17K lines of XSLT in the stylesheets. Those transform DocBook into well-structured HTML 5 suitable for styling with CSS. In the past, I’ve also tried to support print rendering with XSL-FO. This provides a free path to PDF through FOP in addition to commercial solutions. (Full disclosure: AntennaHouse graciously extends a complimentary license to me for the purpose of experimentation and publishing open source standards documents. I am grateful for their support.)

In the XSLT 2.0 stylesheets for DocBook, I never managed to get comprehensive, reliable XSL-FO transformations working. In xslTNG, I’m reluctant to even start. There isn’t enough similarity in the transformations to expect a lot of code reuse, and I don’t really fancy trying to maintain another ~17K lines of XSLT to support XSL-FO.

In theory, and to a lesser extent in practice, it should be possible to do this with HTML+CSS. Paged media support in CSS has come a long way.
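As a rough illustration of what paged media support looks like, here is a minimal sketch of a print stylesheet. It is not taken from the xslTNG stylesheets; the page size, the margins, and the decision to start every h1 on a new page are assumptions made for the example.

```css
/* Illustrative only: not the CSS shipped with xslTNG. */

/* Page geometry; A4 and these margins are assumed values. */
@page {
  size: A4;
  margin: 25mm 20mm;

  /* Margin box: a running page number centered in the footer. */
  @bottom-center {
    content: counter(page);
  }
}

/* No page number on the first page. */
@page :first {
  @bottom-center {
    content: none;
  }
}

/* Start each top-level heading on a fresh page. */
h1 { break-before: page; }

/* Keep headings attached to the text that follows them. */
h1, h2, h3 { break-after: avoid; }

/* Avoid stranding single lines at the top or bottom of a page. */
p { orphans: 2; widows: 2; }
```

Margin boxes such as @bottom-center only take effect in a formatter that implements CSS Paged Media; a browser rendering the same HTML on screen simply ignores the @page rules.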

The HTML that comes out of docbook.xsl isn’t quite print-ready because the stylesheets assume that the result will be rendered in a browser. Some elements, like annotations, are supported with interactive JavaScript that isn’t appropriate for paged media, and footnotes are managed as endnotes with explicit links because browsers don’t have footnotes.

The print.xsl transformation extends docbook.xsl to post-process the HTML that’s produced so that it’s print-ready. Footnotes are moved inline, annotations are transformed into footnotes, and links are reformatted a bit. This HTML, transformed with a tool that understands CSS paged media styles, can produce PDF.
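Moving footnotes inline presumably matters because CSS formatters expect the note text to sit at the point of reference, from where it can be floated to the bottom of the page. A hedged sketch of that mechanism, using the footnote features from the CSS Generated Content for Paged Media draft. The span.footnote selector is a hypothetical name, not necessarily the markup that print.xsl emits, and the type sizes are arbitrary.

```css
/* Hypothetical selector: the markup print.xsl actually emits may differ. */
span.footnote {
  float: footnote;      /* move the note into the page's footnote area */
  font-size: 0.85em;
}

/* The reference mark left behind in the running text. */
span.footnote::footnote-call {
  content: counter(footnote);
  vertical-align: super;
  font-size: 0.75em;
}

/* The number shown beside the note at the foot of the page. */
span.footnote::footnote-marker {
  content: counter(footnote) ". ";
}

/* The footnote area itself is styled from the page context. */
@page {
  @footnote {
    border-top: 0.5pt solid black;
    padding-top: 0.5em;
  }
}
```

Support for these features varies between formatters, so it is worth checking each tool's documentation before relying on them.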

AntennaHouse (version 7.2) and Prince (14.3), commercial formatters, both do a very nice job. (My page design skills with CSS are perhaps a bit crude, but very nice results are clearly possible. In particular, as you’ll see in the PDF linked below, some of the borders and backgrounds on inline elements that work well online work less well in print, especially when they break across lines. I’m also not convinced that I’ve sorted out all the fonts correctly, though I did manage to get fallbacks working so that all of the Unicode symbols render.) WeasyPrint (3.9.0), which is free, seems to do a respectable job, but doesn’t understand a bunch of modern CSS constructions such as compound conditions in media queries, shadow properties, and calc(). You can even make Chrome (103.0.5060.134) run headless to produce PDF but it’s…less successful. There are some Node.js solutions, like paged.js, that might work as well, but I didn’t succeed in formatting with them in the short amount of time I devoted to this weblog posting.
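One common way to cope with a formatter that rejects newer constructions is to declare a plain fallback first and the modern value second. This is only a sketch, and it assumes the formatter follows the usual CSS rule that a declaration it cannot parse is dropped while earlier declarations stay in force. The selectors and values are hypothetical.

```css
/* Hypothetical selectors and values, for illustration only. */

.sidebar {
  margin-left: 2em;               /* fallback: used if the next line is rejected */
  margin-left: calc(1em + 5mm);   /* preferred where calc() is understood */
}

.note {
  border: 0.5pt solid #999;                  /* visible with or without shadows */
  box-shadow: 1pt 1pt 2pt rgba(0, 0, 0, 0.3); /* ignored where unsupported */
}
```

The same idea does not carry over directly to media queries: if a formatter cannot parse the query, the whole block is typically skipped, so print rules are safer behind a simple query.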

One of my motivations for working on the print stylesheets was to produce nice PDF versions of the XProc specifications, which are almost final. (The tinkering I did to make that work was part of the input to version 1.8.0, but I haven’t republished them with 1.8.0 yet.)

Until I get the XProc specification PDFs online, the closest example PDF I have to hand is the DocBook: xslTNG Reference version 1.8.0.

Hopefully, users who have relied on FOP in the past can find a suitable HTML+CSS replacement. Other suggestions are most welcome.