By Patrick Gundlach

What is PDF? Part 2 – Fonts

This is part 2 of a mini-series on PDF.

Part 1 – PDF syntax and file structure
Part 2 – Fonts
Part 3 – Vector graphics
Part 4 – Interactive features

All of the examples in this series were written by hand, so you can experiment with them yourself. See https://github.com/speedata/fixxref for a small program that supports manual PDF editing.

In the previous article in this series, I introduced the basic structure of a PDF file and how to create a PDF file using a text editor.

My goal in this post is to add some text to the PDF, using the fonts that are built into the viewer. You can find all the details in the PDF specification. It is available for version 1.7 (recommended, very readable) and for version 2.0 (registration required to download; few PDF viewers support 2.0 at the time of writing).

Writing text

For this introduction, I don’t want to complicate things. So I will use one of the PDF viewer’s built-in fonts, a so-called “standard 14” font. These are Courier, Courier-Bold, Courier-BoldOblique, Courier-Oblique, Helvetica, Helvetica-Bold, Helvetica-BoldOblique, Helvetica-Oblique, Symbol, Times-Bold, Times-BoldItalic, Times-Italic, Times-Roman, ZapfDingbats.

Using a font requires several steps.

First, I create an indirect object that references the built-in font:

5 0 obj
<<
    /Type     /Font
    /Subtype  /Type1
    /BaseFont /Helvetica
>>
endobj

Then I create a resource entry for each page in the PDF that uses the font. This resource entry maps the font to a name:

/Resources <<
    /Font <<
        /F1 5 0 R
    >>
>>
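The two fragments above can also be produced by a script. The following Python is a minimal sketch and not part of the original examples: the helper names `font_object` and `font_resources` are hypothetical, and the check against the standard 14 list is an added convenience, not something the PDF format requires.

```python
# The "standard 14" font names listed earlier in this article.
STANDARD_14 = {
    "Courier", "Courier-Bold", "Courier-BoldOblique", "Courier-Oblique",
    "Helvetica", "Helvetica-Bold", "Helvetica-BoldOblique", "Helvetica-Oblique",
    "Symbol", "Times-Bold", "Times-BoldItalic", "Times-Italic", "Times-Roman",
    "ZapfDingbats",
}


def font_object(number: int, base_font: str) -> str:
    """Return an indirect object that references a built-in font."""
    if base_font not in STANDARD_14:
        raise ValueError(f"{base_font} is not a standard 14 font")
    return (
        f"{number} 0 obj\n"
        "<<\n"
        "    /Type     /Font\n"
        "    /Subtype  /Type1\n"
        f"    /BaseFont /{base_font}\n"
        ">>\n"
        "endobj"
    )


def font_resources(mapping: dict[str, int]) -> str:
    """Map resource names such as F1 to font object numbers."""
    entries = " ".join(f"/{name} {num} 0 R" for name, num in mapping.items())
    return f"/Resources << /Font << {entries} >> >>"


print(font_object(5, "Helvetica"))
print(font_resources({"F1": 5}))
```

The resource name (F1 here) is arbitrary; it only has to match the name used with Tf in the content stream of the same page.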

Now, in the content stream of the page, I can refer to this font with the name /F1, for example:

BT
  /F1 12 Tf
  100 100 Td
  (Hello, world) Tj
ET

Everything between BT (begin text) and ET (end text) is related to text display. Tf selects the font /F1 at size 12, Td moves the text position to (100, 100), and Tj is the operator that displays the string preceding it, here the greeting.
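The same operator sequence can be generated programmatically. This is a minimal Python sketch under a few assumptions: the helper names `pdf_string` and `text_stream` are hypothetical, lines end with a single line feed, and the text is plain ASCII. It escapes the characters that are special inside a PDF literal string and returns the stream bytes, whose length is the value that belongs in the /Length entry of the stream dictionary.

```python
def pdf_string(text: str) -> str:
    """Return text as a PDF literal string, escaping backslash, ( and )."""
    escaped = text.replace("\\", "\\\\").replace("(", "\\(").replace(")", "\\)")
    return f"({escaped})"


def text_stream(font: str, size: float, x: float, y: float, text: str) -> bytes:
    """Build a content stream that shows one line of text."""
    lines = [
        "BT",                       # begin text object
        f"/{font} {size:g} Tf",     # select the font resource and size
        f"{x:g} {y:g} Td",          # move to the start position
        f"{pdf_string(text)} Tj",   # show the string
        "ET",                       # end text object
    ]
    # Latin-1 is enough for this ASCII example.
    return ("\n".join(lines) + "\n").encode("latin-1")


stream = text_stream("F1", 12, 100, 100, "Hello, nice world")
print(stream.decode("latin-1"), end="")
print(len(stream))  # the value for /Length
```

Computing the length from the bytes instead of counting by hand avoids a mismatch when the text changes.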

The complete PDF now looks like this (source):

%PDF-1.6
%··

1 0 obj
<<
    /Type /Catalog
    /Pages 2 0 R
>>
endobj

2 0 obj
<<
    /Type /Pages
    /Kids [ 3 0 R ]
    /Count 1
>>
endobj

3 0 obj
<<
    /Type /Page
    /MediaBox [ 0 0 200 200 ]
    /Contents 4 0 R
    /Parent 2 0 R
    /Resources << /Font << /F1 5 0 R  >>  >>
>>
endobj

4 0 obj
<<
    /Length 50
>>
stream
BT
/F1 12 Tf
100 100 Td
(Hello, nice world) Tj
ET
endstream
endobj

5 0 obj
<<
    /Type     /Font
    /Subtype  /Type1
    /BaseFont /Helvetica
>>
endobj
xref
0 6
0000000000 65535 f
0000000016 00000 n
0000000074 00000 n
0000000146 00000 n
0000000297 00000 n
0000000401 00000 n
trailer <<
    /Size 6
    /Root 1 0 R
>>
startxref
488
%%EOF
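The byte offsets in the xref table and the startxref value are tedious to maintain by hand. The following Python sketch shows one way to compute them while writing the file. It is an illustration only and does not claim to reproduce any existing tool: `build_pdf` is a hypothetical helper, object 1 is assumed to be the catalog, the four bytes in the binary comment are arbitrary, and the dictionaries are written on one line, so the offsets differ from the hand-written listing above.

```python
def build_pdf(objects: list[bytes]) -> bytes:
    """Serialize objects numbered 1..n with an xref table and a trailer."""
    # Header plus a comment with four bytes >= 128 (arbitrary choice here).
    out = bytearray(b"%PDF-1.6\n%\xe2\xe3\xcf\xd3\n")
    offsets = []
    for number, body in enumerate(objects, start=1):
        offsets.append(len(out))  # byte offset where "N 0 obj" starts
        out += f"{number} 0 obj\n".encode("ascii") + body + b"\nendobj\n"
    xref_pos = len(out)
    out += f"xref\n0 {len(objects) + 1}\n".encode("ascii")
    out += b"0000000000 65535 f \n"  # object 0 is always a free entry
    for offset in offsets:
        # Each entry is exactly 20 bytes including the two-byte line ending.
        out += f"{offset:010d} 00000 n \n".encode("ascii")
    # Assumption: object 1 is the document catalog.
    out += f"trailer\n<< /Size {len(objects) + 1} /Root 1 0 R >>\n".encode("ascii")
    out += f"startxref\n{xref_pos}\n%%EOF\n".encode("ascii")
    return bytes(out)


content = b"BT\n/F1 12 Tf\n100 100 Td\n(Hello, nice world) Tj\nET\n"
objects = [
    b"<< /Type /Catalog /Pages 2 0 R >>",
    b"<< /Type /Pages /Kids [ 3 0 R ] /Count 1 >>",
    b"<< /Type /Page /MediaBox [ 0 0 200 200 ] /Contents 4 0 R /Parent 2 0 R"
    b" /Resources << /Font << /F1 5 0 R >> >> >>",
    b"<< /Length %d >>\nstream\n" % len(content) + content + b"endstream",
    b"<< /Type /Font /Subtype /Type1 /BaseFont /Helvetica >>",
]
pdf = build_pdf(objects)

with open("hello.pdf", "wb") as f:  # output file name is an example
    f.write(pdf)
```

Because every offset is taken from the length of the output written so far, the table stays correct when an object grows or shrinks.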

Text operators

There are many different text operators, for example, to set the font size and leading, the rendering mode, the word spacing, and so on.

To put the text on two lines, you must first set the leading (the distance from one baseline to the next) with TL, then use the T* or ' operator to move to the next line. string ' is shorthand for T* string Tj (full PDF):

BT
/F1 12 Tf
10 100 Td
12 TL
(Hello) Tj
(nice world)'
ET

This renders the text on two lines, flush left.
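For more than two lines the pattern repeats: the first line is shown with Tj and every following line with the ' operator. A minimal Python sketch, assuming hypothetical helper names `pdf_string` and `multiline_text`:

```python
def pdf_string(text: str) -> str:
    """Return text as a PDF literal string, escaping backslash, ( and )."""
    escaped = text.replace("\\", "\\\\").replace("(", "\\(").replace(")", "\\)")
    return f"({escaped})"


def multiline_text(font: str, size: float, x: float, y: float,
                   leading: float, lines: list[str]) -> str:
    """Return text operators that show the lines one below the other."""
    ops = [
        "BT",
        f"/{font} {size:g} Tf",
        f"{x:g} {y:g} Td",
        f"{leading:g} TL",  # baseline-to-baseline distance used by T* and '
    ]
    first, *rest = lines
    ops.append(f"{pdf_string(first)} Tj")                 # first line at the start position
    ops.extend(f"{pdf_string(line)} '" for line in rest)  # next line, then show
    ops.append("ET")
    return "\n".join(ops)


print(multiline_text("F1", 12, 10, 100, 12, ["Hello", "nice world"]))
```

With the arguments shown, the output is the same operator sequence as in the listing above, apart from a space before the ' operator, which PDF syntax permits.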

Rendering mode

You can get some rendering effects by setting the rendering mode with Tr. For example, text rendering mode 1 strokes the glyph outlines instead of filling them, so the text appears as an outline.

This is created with

BT
/F1 12 Tf
10 100 Td
12 TL
0.4 w
1 Tr
(Hello) Tj
(nice world)'
ET

which sets the line width to 0.4 pt and the rendering mode to 1 (full PDF).
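Mode 1 is one of eight text rendering modes (0 to 7) defined in the PDF specification. The following Python sketch lists them and emits the two operators used above; the enum member names and the helper `render_mode_ops` are hypothetical, only the numeric values come from the specification.

```python
from enum import IntEnum


class RenderMode(IntEnum):
    """Text rendering modes; the names are illustrative, the values are from the spec."""
    FILL = 0              # default: fill the glyphs
    STROKE = 1            # stroke the outlines only
    FILL_STROKE = 2       # fill, then stroke
    INVISIBLE = 3         # neither fill nor stroke
    FILL_CLIP = 4         # fill and add to the clipping path
    STROKE_CLIP = 5       # stroke and add to the clipping path
    FILL_STROKE_CLIP = 6  # fill, stroke and add to the clipping path
    CLIP = 7              # only add to the clipping path


def render_mode_ops(line_width: float, mode: RenderMode) -> str:
    """Return the operators that set the stroke width and the rendering mode."""
    return f"{line_width:g} w\n{int(mode)} Tr"


print(render_mode_ops(0.4, RenderMode.STROKE))
```

The line width only matters for the modes that stroke the outline.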

Kerning

Real-world applications usually apply kerning, which means that certain pairs of characters are moved a little closer together or a little further apart to even out their visual spacing. For example, the letter pairs T and e or V and a should be moved closer together. Words in all caps may benefit from a little extra space between the letters (source):

BT
/F1 12 Tf
10 100 Td
15 TL
(Va Te) Tj
T*
[ (V) 60 (a T) 120 (e) ] TJ
T*
[ (A) -120 (W) -120 (A) -95 (Y) ] TJ
ET

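In this example I use two different text display operators, Tj for regular display and TJ for display with kerning. TJ takes a preceding array of strings and numbers. Each number is given in thousandths of a unit of text space and is subtracted from the current horizontal position, so a positive value moves the following glyphs closer to the preceding ones and a negative value adds space.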
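To make the units concrete, here is a small Python sketch with hypothetical helpers `tj_array` and `shift_in_points`. It builds the operand array for TJ and converts an adjustment into a distance, assuming the default horizontal scaling of 100 % and an untransformed coordinate system, so that one unit equals one point.

```python
def pdf_string(text: str) -> str:
    """Return text as a PDF literal string, escaping backslash, ( and )."""
    escaped = text.replace("\\", "\\\\").replace("(", "\\(").replace(")", "\\)")
    return f"({escaped})"


def tj_array(items: list) -> str:
    """Build a TJ operation from alternating strings and numeric adjustments."""
    parts = []
    for item in items:
        if isinstance(item, str):
            parts.append(pdf_string(item))
        else:
            parts.append(f"{item:g}")  # thousandths of a unit of text space
    return "[ " + " ".join(parts) + " ] TJ"


def shift_in_points(adjustment: float, font_size: float) -> float:
    """Distance the following glyphs move left (positive) or right (negative)."""
    return adjustment * font_size / 1000


print(tj_array(["V", 60, "a T", 120, "e"]))
print(tj_array(["A", -120, "W", -120, "A", -95, "Y"]))
print(shift_in_points(60, 12))
```

At 12 pt, the adjustment of 60 between V and a therefore tightens the pair by about 0.72 pt, and the -120 in the AWAY example adds about 1.44 pt between the letters.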

There are many other ways to style text. The PDF specification describes all of the text operators and their operands in detail.