
What is PDF? Part 4 - interactive features


PDF supports many interactive features, such as web hyperlinks, jumps to a different location in the document, notes, video playback, JavaScript programs and many more. In this part I cover some of the basic ones (bookmarks and annotations).

from Patrick Gundlach |

What is PDF? Part 3 – Vector graphics

Vector graphics

In the third part I cover vector graphics. You can also include PNG and JPEG images in the PDF, which will be covered in a later part of the PDF introduction.

from Patrick Gundlach |

What is PDF? Part 2 – Fonts

This is part 2 of a mini-series on PDF.

Part 1 – PDF syntax and file structure
Part 2 – Fonts
Part 3 – Vector graphics
Part 4 – Interactive features

Please note that all of these examples were created by hand. If you want to experiment with them yourself, https://github.com/speedata/fixxref provides a small program that supports manual PDF editing.

In the previous article in this series, I introduced the basic structure of a PDF file and how to create a PDF file using a text editor.

My goal in this post is to add some text to the PDF (using the included fonts). I should mention that you can find all the details in the PDF specification. There is one for PDF 1.7 (recommended, very readable) and one for PDF 2.0 (registration is required to download it, and few PDF viewers support 2.0 at the time of writing).

Writing text

For this introduction, I don’t want to complicate things. So I will use one of the PDF viewer’s built-in fonts, a so-called “standard 14” font. These are Courier, Courier-Bold, Courier-BoldOblique, Courier-Oblique, Helvetica, Helvetica-Bold, Helvetica-BoldOblique, Helvetica-Oblique, Symbol, Times-Bold, Times-BoldItalic, Times-Italic, Times-Roman, ZapfDingbats.
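
To make this concrete, here is a minimal sketch in Python that assembles a one-page PDF which prints a line of text in a standard 14 font. It is not taken from the series, which builds its files by hand. The function name `build_pdf`, the A4 page size, the font size and the text position are assumptions chosen for illustration, and the text is expected to be plain Latin-1.

```python
def build_pdf(text, font="Helvetica"):
    """Return the bytes of a one-page PDF showing `text` in a standard 14 font."""
    # Backslashes and parentheses must be escaped inside a PDF string literal.
    escaped = text.replace("\\", "\\\\").replace("(", "\\(").replace(")", "\\)")
    # Content stream: begin text, select font /F1 at 24 pt, move, show, end text.
    content = f"BT /F1 24 Tf 100 700 Td ({escaped}) Tj ET".encode("latin-1")

    objects = [
        b"<< /Type /Catalog /Pages 2 0 R >>",
        b"<< /Type /Pages /Kids [3 0 R] /Count 1 >>",
        b"<< /Type /Page /Parent 2 0 R /MediaBox [0 0 595 842] "
        b"/Resources << /Font << /F1 4 0 R >> >> /Contents 5 0 R >>",
        # No font file is embedded: the viewer supplies the standard 14 font.
        f"<< /Type /Font /Subtype /Type1 /BaseFont /{font} >>".encode("ascii"),
        b"<< /Length %d >>\nstream\n" % len(content) + content + b"\nendstream",
    ]

    out = bytearray(b"%PDF-1.7\n")
    offsets = []
    for number, body in enumerate(objects, start=1):
        offsets.append(len(out))  # byte offset of this object for the xref table
        out += b"%d 0 obj\n" % number + body + b"\nendobj\n"

    xref_position = len(out)
    out += b"xref\n0 %d\n" % (len(objects) + 1)
    out += b"0000000000 65535 f \n"  # entry for object 0, always free
    for offset in offsets:
        out += b"%010d 00000 n \n" % offset  # each entry is exactly 20 bytes
    out += b"trailer\n<< /Size %d /Root 1 0 R >>\n" % (len(objects) + 1)
    out += b"startxref\n%d\n%%%%EOF\n" % xref_position
    return bytes(out)
```

Writing the result to disk, for example with `open("hello.pdf", "wb").write(build_pdf("Hello, world"))`, should give a file that a PDF viewer can open.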

from Patrick Gundlach |

Debugging PDF files

While developing the speedata Publisher, I have to create PDF instructions to draw shapes, create accessibility data structures and embed files, for example. For boxes and glue, I have to create a PDF file from scratch. But once in a while I make mistakes and the PDF file cannot be displayed in the viewer. Then I need to look into the PDF file and check manually where the problem is. For example, Adobe Acrobat shows a message:

from Patrick Gundlach |

Page shuffling

Yesterday I came across a Reddit question:

“I have a pdf where the pages somehow got warped into page 2 then 1 then 4 then 3 then 6 then 5…etc Everything is right except the even pages are a step ahead of the odds.”

This is very easy to do with the speedata Publisher:

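<Layout xmlns="urn:speedata.de:2009/publisher/en"
    xmlns:sd="urn:speedata:2009/publisher/functions/en">
    <Record element="data">
        <SetVariable variable="fn" select="'fivepages.pdf'" />
        <SetVariable variable="cp" select="sd:number-of-pages($fn)" />
        <!-- Swap each pair: even page first, then the odd page before it -->
        <Loop select="$cp div 2 " variable="i">
            <PlaceObject row="0mm" column="0mm">
                <Image file="{$fn}" page="{$i * 2}" />
            </PlaceObject>
            <ClearPage />
            <PlaceObject row="0mm" column="0mm">
                <Image file="{$fn}" page="{$i * 2 - 1}" />
            </PlaceObject>
            <ClearPage />
        </Loop>
        <!-- An odd page count leaves one last page without a partner -->
        <Switch>
            <Case test="sd:odd($cp)">
                <PlaceObject row="0mm" column="0mm">
                    <Image file="{$fn}" page="{$cp}" />
                </PlaceObject>
            </Case>
        </Switch>
    </Record>
</Layout>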
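
The reordering itself is plain index arithmetic: swap every odd/even pair and, if the page count is odd, keep the last page where it is. As a rough sketch outside the Publisher (Python, with the hypothetical function name `swapped_page_order`), the resulting page sequence can be computed like this:

```python
def swapped_page_order(page_count):
    """Return 1-based page numbers with every odd/even pair swapped."""
    order = []
    for i in range(1, page_count // 2 + 1):
        order.append(2 * i)      # the even page comes first
        order.append(2 * i - 1)  # followed by the odd page before it
    if page_count % 2 == 1:
        order.append(page_count)  # a trailing odd page has no partner
    return order
```

For a five-page file, `swapped_page_order(5)` gives `[2, 1, 4, 3, 5]`. Because the swap undoes itself, the same order repairs a file whose pages were shuffled this way.
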
from Patrick Gundlach |

Reduce PDF file size

The speedata Publisher has a new (pro) feature to reduce the file size of the resulting PDF. This works by setting a maximum DPI value for bitmap images (PNG and JPG).
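
The arithmetic behind such a limit can be sketched as follows. This is not the Publisher's implementation, only an illustration of the idea: the effective resolution of an image depends on its pixel size and on the width at which it is placed, and an image above the limit can be scaled down to the limit. The function name `downsampled_size` and its parameters are hypothetical.

```python
def downsampled_size(pixel_width, pixel_height, placed_width_mm, max_dpi):
    """Return the pixel size an image needs to stay within max_dpi.

    The effective DPI is the pixel width divided by the placed width in inches.
    Images at or below the limit are returned unchanged.
    """
    placed_width_inch = placed_width_mm / 25.4
    effective_dpi = pixel_width / placed_width_inch
    if effective_dpi <= max_dpi:
        return pixel_width, pixel_height
    scale = max_dpi / effective_dpi
    # Scale both dimensions by the same factor to keep the aspect ratio.
    return max(1, round(pixel_width * scale)), max(1, round(pixel_height * scale))
```

A 3000 × 2000 pixel image placed 127 mm (5 inches) wide has an effective resolution of 600 DPI. With a limit of 300 DPI it would be reduced to 1500 × 1000 pixels, a quarter of the original pixel count.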

from Patrick Gundlach |

Markdown and a layout-quine

The speedata Publisher now has basic support (as of version 4.17.11) for Markdown, an easy-to-use markup language.

As an example of markdown formatting, this snippet creates a level 1 heading and a simple bullet list:

# A title

* one
* anotherone
* three

Using Markdown with the speedata Publisher is very easy: there is a new layout function called sd:markdown().

from Patrick Gundlach |

Mixing Go and Lua

The speedata Publisher is built on top of LuaTeX, a TeX variant that has an integrated Lua (5.x) implementation. Lua is a language specifically designed for embedding in host software. Most of the code of the speedata Publisher is written in Lua, but there are some (non-typesetting) parts where Lua is not much fun to write:

  • Unicode handling
  • regular expressions
  • XML parsing
  • Resource (URL, file) access

All of these “modules” are written in Go and compiled into a library that is loaded at runtime (DLL or shared object). Now the big question is: how to access the library?

from Patrick Gundlach |

New logging backend

Version 4.17.1 introduces a completely new logging backend. The idea is to have a short summary of the speedata Publisher run (name of the PDF file, error and warning count and version number) and a more detailed protocol file.

When you run the speedata Publisher on a “hello, world” layout, you will get output like this:

Run speedata publisher 4.17.1 (Pro)
Finished with 0 errors and 0 warnings
Output written on publisher.pdf (1 pages, 3071 bytes)
Transcript written to publisher-protocol.xml
Total run time: 65.976541ms
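
The split between a short summary and a detailed protocol can be sketched with Python's standard `logging` module. This only illustrates the idea and says nothing about how the Publisher implements it. The class name `CountingHandler` is hypothetical. The handler counts warnings and errors for the summary line, while other handlers attached to the same logger could write the detailed records to a file.

```python
import logging


class CountingHandler(logging.Handler):
    """Counts warnings and errors so a one-line summary can be printed."""

    def __init__(self):
        # Records below WARNING never reach emit().
        super().__init__(level=logging.WARNING)
        self.errors = 0
        self.warnings = 0

    def emit(self, record):
        if record.levelno >= logging.ERROR:
            self.errors += 1
        else:
            self.warnings += 1

    def summary(self):
        return f"Finished with {self.errors} errors and {self.warnings} warnings"
```

Attached to a logger with `addHandler()`, the handler sees every record at level WARNING or above, and `summary()` can be printed at the end of the run.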

The publisher-protocol.xml file (which replaces the old publisher.protocol) contains more information, depending on the log level. Normally (without warnings or errors) this file is more or less empty:

from Patrick Gundlach |