High speed PDF generation with boxes and glue

I have been playing with electronic invoices (especially the German and French ZUGFeRD/Factur-X format) for a while now. The speedata Publisher has supported ZUGFeRD since 2017.

The next step is to make boxes and glue (my other PDF generation software) work with electronic invoices. For that, I use the new command line interface for boxes and glue, which I will describe in a future blog post.

There are only a few things that need to be done to make this work:

from Patrick Gundlach |

Version 5 Released

Almost five years after releasing version 4.0 of the speedata Publisher, it is time for version 5.0. This is a short announcement, since most changes have been covered before.

The speedata Publisher has looked the same since its beginning, but there have been major changes under the hood since version 4:

  • Integration of HarfBuzz for left-to-right and right-to-left text layout, including mixed (bidirectional) typesetting. Support for all kinds of fonts and font features.
  • Full accessibility support.
  • A rewritten XML and XPath 2.0 parser.
  • Attachment of ZUGFeRD invoices.
  • Greatly enhanced subsystems such as HTML, MetaPost integration and the paragraph builder.
  • SaaS available with the speedata Pro plan.

These changes have been covered in a few earlier entries here in the blog (4.16, 4.18, 4.20).

from Patrick Gundlach |

speedata Publisher and AI

Don’t worry, the speedata Publisher will not gain any half-baked, annoying artificial intelligence features. This post is about chatting with some of the more famous AI chats (ChatGPT, Google Gemini, DeepSeek). I have tried to find out how well these AI chats know the speedata Publisher and whether they can help users generate a layout.

There is still a long road ahead, but read on. A big warning upfront: none of the following outputs work. Do not use them. I post abbreviated transcripts of the chats.

from Patrick Gundlach |

Electronic invoices

The following text is written from a German perspective. It should apply to all parts of the European Union, although the details will likely differ in other countries.

Since the beginning of 2025, all companies in Germany have had to send and receive electronic invoices (with transitional deadlines and a few exceptions). This affects me too, so out of curiosity I asked myself: what is an electronic invoice, and how can I process it?

from Patrick Gundlach |

Release stable version 4.20

After more than half a year, 150 commits and 40 minor releases, I have now published version 4.20, which is hopefully the last stable version before the “big five”.

from Patrick Gundlach |

New XPath variables semantics

Something that has been a bit inconsistent for a while is the assignment of element structures to variables.

Take this snippet for example:

<SetVariable variable="myvar">
	<Element name="Foo">
		<Attribute name="attr" select="'foo1'"/>
	</Element>
	<Element name="Bar">
		<Attribute name="attr" select="'bar1'"/>
	</Element>
</SetVariable>
from Patrick Gundlach |

Microtypography and font expansion

The speedata Publisher is built on top of LuaTeX, a typesetting system with a strong focus on typography, and therefore inherits all the possibilities offered by that layout engine. One of the older features is the ability to stretch or shrink glyph widths on demand to allow even better line breaking.

TeX already uses the famous Knuth-Plass line breaking algorithm, which optimizes the appearance of a paragraph as a whole instead of working line by line, an approach that can lead to large inter-word spacing.

The ability to stretch or shrink glyphs by a small amount can lead to many more possible line breaks, thus enhancing the overall visual outcome of a paragraph. The famous Gutenberg 42-line Bible (B42) uses a similar approach to line breaking.
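
To illustrate why a little glyph flexibility opens up more break points, here is a minimal Python sketch. It is a simplified model and not the LuaTeX implementation: the `Line` class, the `adjustment_ratio` function and all numbers are assumptions made up for this example. A candidate line counts as feasible when its inter-word spaces need to stretch or shrink by no more than their allowed amount (ratio between -1 and 1).

```python
from dataclasses import dataclass


@dataclass
class Line:
    """Hypothetical candidate line; all widths are in points."""
    glyph_width: float    # natural width of all glyphs in the line
    space_count: int      # number of inter-word spaces
    space_width: float    # natural width of one space
    space_stretch: float  # how much one space may grow
    space_shrink: float   # how much one space may shrink


def adjustment_ratio(line: Line, target: float, expansion: float = 0.0) -> float:
    """Return how far the spaces must stretch (>0) or shrink (<0).

    `expansion` is the maximum fraction by which glyphs may be widened
    or narrowed (0.02 means 2 %). Simplification: glyph expansion
    absorbs as much of the difference as it can before the spaces do.
    """
    natural = line.glyph_width + line.space_count * line.space_width
    delta = target - natural
    # Let the glyphs take up part of the difference first.
    max_glyph_change = line.glyph_width * expansion
    delta -= max(-max_glyph_change, min(max_glyph_change, delta))
    if delta == 0:
        return 0.0
    per_space = line.space_stretch if delta > 0 else line.space_shrink
    available = line.space_count * per_space
    if available == 0:
        return float("inf") if delta > 0 else float("-inf")
    return delta / available


def feasible(ratio: float) -> bool:
    """A line is acceptable if the spaces stay within their limits."""
    return -1.0 <= ratio <= 1.0


candidate = Line(glyph_width=300.0, space_count=8, space_width=3.0,
                 space_stretch=1.5, space_shrink=1.0)

without = adjustment_ratio(candidate, target=340.0)
with_expansion = adjustment_ratio(candidate, target=340.0, expansion=0.02)

print(feasible(without))         # spaces alone would have to overstretch
print(feasible(with_expansion))  # 2 % glyph expansion makes the break usable
```

In this toy example the break is rejected when only the spaces can stretch, and accepted once the glyphs may grow by up to 2 %. A paragraph-wide optimizer therefore has more candidate breaks to choose from.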

from Patrick Gundlach |

speedata Publisher and accessibility

Version 4.19.8 has a completely rewritten module for creating tagged PDF for accessibility.

First, you need to set the output format to PDF/UA with

<PDFOptions format="PDF/UA" />

Tagged PDF also needs an outline of the structure of the document. See the previous blog entry for more details on tagged PDF. To summarize: you create a description of what is visible in the PDF. Each visible part gets tagged with a structure type (“this is a label for a table of contents link”, “this is a heading” and so on).
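
As a conceptual illustration of such a description, the following Python sketch models a tiny structure tree. It is not the speedata Publisher syntax: the `StructElem` class, the `outline` helper and the sample texts are assumptions for this example. Only the tag names (`Document`, `H1`, `TOC`, `TOCI`, `P`) are standard PDF structure types.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class StructElem:
    """Hypothetical node in a tagged-PDF structure tree."""
    tag: str                      # PDF structure type, e.g. "H1" or "P"
    text: str = ""                # visible content this node describes
    children: list[StructElem] = field(default_factory=list)


def outline(elem: StructElem, depth: int = 0) -> list[str]:
    """Return the tree as indented lines, one line per structure element."""
    label = elem.tag + (f": {elem.text}" if elem.text else "")
    lines = ["  " * depth + label]
    for child in elem.children:
        lines.extend(outline(child, depth + 1))
    return lines


doc = StructElem("Document", children=[
    StructElem("H1", "Annual report"),
    StructElem("TOC", children=[
        StructElem("TOCI", "1 Introduction"),
    ]),
    StructElem("P", "First paragraph of body text."),
])

print("\n".join(outline(doc)))
```

Every visible piece of the page appears exactly once in this tree, which is what lets assistive software read the document in a meaningful order.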

from Patrick Gundlach |