speedata Publisher and AI

Don’t worry, the speedata Publisher will not gain any half-baked, annoying artificial intelligence features. This post is about chatting with some of the better-known AI chats (ChatGPT, Google Gemini, DeepSeek). I have tried to find out how well these AI chats know the speedata Publisher and whether they can help users generate a layout.

There is still a long road ahead, but read on. A big warning upfront: none of the following outputs work. Do not use them. I post abbreviated transcripts of the chats.

from Patrick Gundlach |

Electronic invoices

The following text is written from a German perspective. Most of it should apply to the rest of the European Union as well, although the details are likely to differ in other countries.

Since the beginning of 2025, all companies in Germany have had to send and receive electronic invoices (with transition periods and a few exceptions). This affects me too, so out of curiosity I asked myself: what is an electronic invoice, and how can I process it?

from Patrick Gundlach |

Release stable version 4.20

After more than half a year, 150 commits and 40 minor releases, I have now published version 4.20, which is hopefully the last stable version before the “big five”.

from Patrick Gundlach |

New XPath variables semantics

One thing that has been a bit inconsistent for a while is the assignment of element structures to variables.

Take this snippet for example:

<SetVariable variable="myvar">
	<Element name="Foo">
		<Attribute name="attr" select="'foo1'"/>
	</Element>
	<Element name="Bar">
		<Attribute name="attr" select="'bar1'"/>
	</Element>
</SetVariable>

from Patrick Gundlach |

Microtypography and font expansion

The speedata Publisher is built on top of LuaTeX, a typesetting system with a strong focus on typography, and therefore inherits all the possibilities offered by that layout engine. One of the older features is the ability to stretch or shrink glyph widths on demand to allow even better line breaking.

TeX already uses its famous line breaking algorithm (Knuth-Plass), which optimizes the appearance of a paragraph as a whole instead of breaking line by line, an approach that might lead to large inter-word spacing.
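To make the difference between the two approaches concrete, here is a minimal Python sketch. It is not TeX's actual algorithm: it assumes monospaced characters, exactly one space between words, words that always fit on a line, and a made-up cost (the squared leftover space of every line except the last). The function names are only for illustration.

```python
def greedy_breaks(words, width):
    """Line-by-line approach: put as many words on each line as fit."""
    lines, current = [], []
    for word in words:
        candidate = current + [word]
        if not current or len(" ".join(candidate)) <= width:
            current = candidate
        else:
            lines.append(current)
            current = [word]
    if current:
        lines.append(current)
    return lines


def optimal_breaks(words, width):
    """Whole-paragraph approach: minimize the total cost with dynamic programming."""
    n = len(words)
    best = [float("inf")] * (n + 1)  # best[i]: minimal cost of setting words[i:]
    nxt = [n] * (n + 1)              # nxt[i]: index of the first word of the next line
    best[n] = 0
    for i in range(n - 1, -1, -1):
        length = -1                  # cancels the space counted for the first word
        for j in range(i + 1, n + 1):
            length += 1 + len(words[j - 1])
            if length > width:
                break
            # The last line may be short without any cost.
            cost = 0 if j == n else (width - length) ** 2
            if cost + best[j] < best[i]:
                best[i] = cost + best[j]
                nxt[i] = j
    lines, i = [], 0
    while i < n:
        lines.append(words[i:nxt[i]])
        i = nxt[i]
    return lines


def badness(lines, width):
    """Sum of squared leftover space over all lines except the last."""
    return sum((width - len(" ".join(line))) ** 2 for line in lines[:-1])
```

For the words `aaa bb cc ddddd` and a width of 6, the greedy version fills the first line completely and then leaves `cc` alone on a line (cost 16), while the whole-paragraph version breaks after `aaa` and distributes the leftover space more evenly (cost 10).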

The ability to stretch or shrink glyphs by a small amount opens up many more possible line breaks and thus improves the overall visual outcome of a paragraph. The famous 42-line Gutenberg Bible (B42) takes a similar approach to line breaking.
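Why a small amount of glyph stretching creates additional break points can be shown with a little arithmetic. The following Python sketch is an illustration only: the widths, the space parameters and the function name are made up, and it does not reproduce what LuaTeX does internally. A break after the first k words counts as feasible when the target line width lies between the shortest and the longest width the line can take.

```python
def feasible_breaks(word_widths, line_width, space=4.0, space_stretch=3.0,
                    space_shrink=1.0, expansion=0.0):
    """Return every word count k for which the first k words can fill line_width.

    expansion is the fraction by which glyphs may be stretched or shrunk
    (0.02 means 2 %). All values are hypothetical and use the same unit.
    """
    result = []
    for k in range(1, len(word_widths) + 1):
        glyphs = sum(word_widths[:k])
        gaps = k - 1
        shortest = glyphs * (1 - expansion) + gaps * (space - space_shrink)
        longest = glyphs * (1 + expansion) + gaps * (space + space_stretch)
        if shortest <= line_width <= longest:
            result.append(k)
    return result


widths = [40, 29, 18, 5, 30]  # made-up word widths

print(feasible_breaks(widths, 100))                  # [3]
print(feasible_breaks(widths, 100, expansion=0.02))  # [3, 4]
```

Without expansion, the line can only be broken after the third word. With 2 % expansion, a break after the fourth word becomes possible as well, so a paragraph-wide optimization has more candidates to choose from.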

from Patrick Gundlach |

speedata Publisher and accessibility

Version 4.19.8 has a completely rewritten module for creating tagged PDF for accessibility.

First, you need to set the output format to PDF/UA with

<PDFOptions format="PDF/UA" />

Tagged PDF also needs an outline of the structure of the document. See the previous blog entry for more details on tagged PDF. To summarize: you create a description of what is visible in the PDF. Each visible part gets tagged with a structure type (“this is a label for a table of contents link”, “this is a heading” and so on).

from Patrick Gundlach |

What is PDF? Part 5 - Metadata

This part of the mini-series on PDF is about metadata. Metadata is not visible in the printed document; it is useful only for curious human beings and for software that needs a PDF in a special format (like electronic invoices in the ZUGFeRD format). Metadata is always “about this document”, so it changes from document to document and must be applied individually.

from Patrick Gundlach |

Release speedata Publisher 4.18

Last week I released the new stable version 4.18 with some big internal changes, which hopefully don’t affect your documents. The release has been tested intensively, and all test files (currently more than 220, plus some bigger production documents) run fine with the new version.

New XML/XPath parser

This internal change makes the new XML and XPath parser the default. It is a complete rewrite and more robust than the old one.

from Patrick Gundlach |