By Patrick Gundlach

Mixing Go and Lua

Categories: Development, speedata Publisher

The speedata Publisher is built on top of LuaTeX, a TeX variant with an integrated Lua (5.x) implementation. Lua is a language designed specifically for embedding in a host application. Most of the speedata Publisher code is written in Lua, but there are some (non-typesetting) parts where Lua is not much fun to write:

  • Unicode handling
  • regular expressions
  • XML parsing
  • Resource (URL, file) access

All of these “modules” are written in Go and compiled into a library (DLL, shared object) that is loaded at runtime. Now the big question is: how to access the library?

This is a bit lengthy, I apologize in advance. And very, very technical.

Step 1: compiling Go to a “C library”

Let me start with a simple example. Lua lacks regular expressions, for example in a replace() function; string.gsub() only handles the simpler Lua patterns. Therefore I create a small Go package/shared library that just calls the standard library function ReplaceAllString() from the regexp package. Run go mod init replacer to create a module and then add the following file. This is a Go source file with cgo directives, and thus looks a bit strange compared to normal Go code.

File: main.go.

package main

import (
	"regexp"
)

import "C"

//export sdReplace
func sdReplace(text *C.char, re *C.char, replace *C.char) *C.char {
	reC, err := regexp.Compile(C.GoString(re))
	if err != nil {
		return C.CString("")
	}
	goText := C.GoString(text)
	goReplace := C.GoString(replace)
	ret := reC.ReplaceAllString(goText, goReplace)
	return C.CString(ret)
}

func main() {}

Since this should interact with other C-like software, I use the cgo string type *C.char for the function arguments and the return type instead of regular Go strings. I could have used regular Go strings, but a Go string is a struct containing a pointer and a length, so every caller would first have to convert its arguments to and from that struct. Note that C.CString() allocates the returned string with malloc(), so the caller is responsible for freeing it.
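The pointer-plus-length layout can be made visible in plain Go. The following is only a sketch for illustration: the helper names stringHeaderWords and sharesMemory are made up for this example, and it assumes Go 1.20 or newer (for unsafe.StringData). It shows that a string value occupies two machine words and that a substring points into the middle of the original bytes, so there is no terminating NUL byte a C function could rely on.

```go
package main

import (
	"fmt"
	"unsafe"
)

// stringHeaderWords reports how many machine words a Go string value
// occupies (hypothetical helper, for illustration only).
func stringHeaderWords() int {
	var s string
	return int(unsafe.Sizeof(s) / unsafe.Sizeof(uintptr(0)))
}

// sharesMemory reports whether the non-empty substring s[from:to] points
// into the bytes of s instead of being a separate copy.
func sharesMemory(s string, from, to int) bool {
	sub := s[from:to]
	base := unsafe.Pointer(unsafe.StringData(s))
	return unsafe.Add(base, from) == unsafe.Pointer(unsafe.StringData(sub))
}

func main() {
	// Two words: the data pointer and the length.
	fmt.Println(stringHeaderWords())
	// "ana" lives inside "banana"; the byte after it is 'n', not NUL.
	fmt.Println(sharesMemory("banana", 1, 4))
}
```

This is why C.GoString() and C.CString() copy the bytes when crossing the boundary rather than handing over the Go pointer.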

To compile this code into a library, run this (on macOS):

go build -buildmode=c-shared  -o replacer.so main.go

The result is a file called replacer.so in the current folder, which contains a Go runtime and the exported function sdReplace (see the //export line directly above the function definition). The build also writes a C header file replacer.h with the declarations of the exported functions. To see if the symbol is in the library, you can use nm:

$ nm replacer.so | grep sdRe
0000000000083690 t __cgoexp_247cbbe884f7_sdReplace
00000000000835b0 t _main.sdReplace
0000000000083724 T _sdReplace

cgo is documented at https://pkg.go.dev/cmd/cgo.
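Before wrapping anything in cgo, the regular expression part can be tried in plain Go. This is a minimal sketch, not the code used in the library: replaceAll is a hypothetical helper name, and unlike sdReplace it reports a broken pattern as an error instead of returning an empty string.

```go
package main

import (
	"fmt"
	"regexp"
)

// replaceAll is a hypothetical pure-Go counterpart of sdReplace.
// It returns an error when the pattern does not compile.
func replaceAll(text, re, repl string) (string, error) {
	compiled, err := regexp.Compile(re)
	if err != nil {
		return "", err
	}
	return compiled.ReplaceAllString(text, repl), nil
}

func main() {
	out, err := replaceAll("banana", "a", "o")
	fmt.Println(out, err) // bonono <nil>

	// In the replacement, $1 and $2 refer to the capture groups.
	out, _ = replaceAll("2024-05", `(\d+)-(\d+)`, "$2/$1")
	fmt.Println(out) // 05/2024

	// An unbalanced parenthesis is not a valid pattern.
	_, err = replaceAll("banana", "(", "o")
	fmt.Println(err != nil) // true
}
```

Keeping the logic in a function with Go types makes it easy to test; the exported cgo function then only has to convert the strings.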

Step 2: access the Go functions

To access the functions in the library, you need some wrapper code. There are many different ways to create such a wrapper; I’d say the easiest is to use LuaFFI. Luckily, LuaTeX has FFI support built in.

LuaFFI

File: myscriptffi.lua

local ffi = require("ffi")

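ffi.cdef[[
    extern char *sdReplace(const char* text, const char* re, const char* replace);
    void free(void *ptr);
]]

local splib = ffi.load("./replacer.so")

local function replace(text, rexpr, repl)
    local ret = splib.sdReplace(text,rexpr,repl)
    -- ffi.string() copies the bytes into a Lua string, so the buffer
    -- allocated by C.CString() on the Go side can be released afterwards
    local str = ffi.string(ret)
    ffi.C.free(ret)
    return str
end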


tex.sprint(-2,replace("banana","a","o"))

and the .tex file in this case would be

\directlua{
    dofile("myscriptffi.lua")
}

\bye

which creates a PDF file with bonono inside. You need to run luajittex with shell escape enabled:

luajittex --shell-escape replace.tex

What happens here is that the Lua function replace() calls the sdReplace() library function and LuaFFI does all the conversions (from Lua strings to C strings and back) that are needed. This is really cool, so why does this post not end here? The LuaJIT extension has some hard limitations, so it is not usable for me.

Without LuaFFI

Working without LuaFFI is much more work, but if you read on to the end, you will see that it is worth it. A simple wrapper that replaces LuaFFI would be this C file:

File: glue.c

#include <lauxlib.h>
#include <lua.h>
#include <lualib.h>

#include <stdlib.h>
/* auto generated from Go */
#include "replacer.h"


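static int lua_replace(lua_State *L) {
	/* the three string arguments sit at stack positions 1, 2 and 3 */
	const char* text = luaL_checkstring(L,1);
	const char* re = luaL_checkstring(L,2);
	const char* replace = luaL_checkstring(L,3);

	/* the generated header declares the parameters as plain char*, hence the casts */
	char* ret = sdReplace((char*)text,(char*)re,(char*)replace);
	lua_pushstring(L,ret);

	/* lua_pushstring() copies the string, so the buffer from C.CString() can be freed */
	free(ret);
	return 1;
}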


static const struct luaL_Reg myfuncs[] = {
	{"replace", lua_replace},
	{NULL, NULL},
};

int luaopen_glue(lua_State *L) {
  lua_newtable(L);
  luaL_setfuncs(L, myfuncs, 0);
  return 1;
}

This uses the Lua C API and creates a new Lua library named “glue” with one function (replace()). This function gets three string arguments from the stack (positions 1, 2 and 3), calls the sdReplace() function from the library (which is in the global namespace) and pushes the return value (a string) back onto the stack. As you might have guessed from the last sentence, the Lua C API is stack based. To call functions or create tables within the Lua C API, you push values onto the stack and call API functions that take values from the stack.
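The calling convention described above can be modelled in a few lines of Go. This is a toy model and not the Lua API: the stack type and its methods are invented for this sketch, and it uses strings.ReplaceAll() instead of a regular expression to stay short. It only shows the shape of such a function: read the arguments by position, push the result, return the number of results.

```go
package main

import (
	"fmt"
	"strings"
)

// stack is a toy stand-in for the Lua stack (assumption: strings only).
type stack []string

// push puts a value on top of the stack.
func (s *stack) push(v string) { *s = append(*s, v) }

// at returns the value at a 1-based position counted from the bottom,
// the way the glue code reads its arguments.
func (s stack) at(pos int) string { return s[pos-1] }

// replace mimics the shape of lua_replace: it reads three arguments,
// pushes one result and returns the number of results.
func replace(s *stack) int {
	text, old, repl := s.at(1), s.at(2), s.at(3)
	s.push(strings.ReplaceAll(text, old, repl))
	return 1
}

func main() {
	var s stack
	// The caller pushes the arguments in order.
	s.push("banana")
	s.push("a")
	s.push("o")

	n := replace(&s)
	// The results are the n topmost entries.
	results := s[len(s)-n:]
	fmt.Println(results[0]) // bonono
}
```

In real Lua the interpreter removes the arguments and hands the results to the caller; the sketch leaves them on the stack.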

Compile the above code with (again macOS) clang:

clang -dynamiclib -fPIC -undefined dynamic_lookup \
   -o glue.so glue.c -I/opt/homebrew/opt/lua@5.4/include/lua

Now run a Lua script for example like this:

package.loadlib("replacer.so","*")
re = require("glue")

print(re.replace("banana","a","o"))

which prints bonono.

Excursion: using swig

Instead of writing the wrapper above, you could write a small swig module:

File: glue.i

%module glue
%{
/* Includes the header in the wrapper code */
extern char *sdReplace(char* text, char* re, char* replace);
%}

%rename (replace) sdReplace ;

extern char *sdReplace(char* text, char* re, char* replace);

Run swig -lua glue.i to create a file glue_wrap.c, which can be compiled as above:

clang -dynamiclib -fPIC -undefined dynamic_lookup  \
   -o glue.so glue_wrap.c -I/opt/homebrew/opt/lua@5.4/include/lua

I don’t use swig; this section is for entertainment purposes only.

Avoiding the extra wrapper library

This gets ugly, but I promise: this adds much flexibility!

So far I have created two libraries: one with the Go code and one with the Lua glue. But it is possible to put all the necessary code into one library and gain flexibility. The idea is to write the Lua functions (static int lua_replace(lua_State *L) above) in Go.

Please start with a clean directory, run go mod init replacer and create two files:

File: main.go

package main

/*

#include <lauxlib.h>
#include <lualib.h>
#include <stdlib.h>


#cgo CFLAGS: -I/opt/homebrew/opt/lua@5.4/include/lua
*/
import "C"
import (
	"regexp"
	"unsafe"
)

//export sdReplace
func sdReplace(L *C.lua_State) C.int {
	// the three arguments are at stack positions 1, 2 and 3
	textC := C.lua_tolstring(L, C.int(1), nil)
	text := C.GoString(textC)

	reC := C.lua_tolstring(L, C.int(2), nil)
	re := C.GoString(reC)

	replC := C.lua_tolstring(L, C.int(3), nil)
	repl := C.GoString(replC)

	compiledRe, err := regexp.Compile(re)
	if err != nil {
		// invalid pattern: return no values to Lua
		return 0
	}
	ret := compiledRe.ReplaceAllString(text, repl)
	// lua_pushstring copies the string, so the C copy is freed afterwards
	retC := C.CString(ret)
	C.lua_pushstring(L, retC)
	C.free(unsafe.Pointer(retC))
	return 1
}

func main() {}

and main.c:

#include <lauxlib.h>
#include <lualib.h>

extern int sdReplace(lua_State *L);

static const struct luaL_Reg myfuncs[] = {
	{"replace", sdReplace},
	{NULL, NULL},
};


int luaopen_replacer(lua_State *L) {
  lua_newtable(L);
  luaL_setfuncs(L, myfuncs, 0);
  return 1;
}

The main.c contains the code that Lua is looking for to initialize the library. To compile the two files you have to tell the linker to ignore the missing symbols (on Windows you have to link the result with a Lua library):

export CGO_LDFLAGS="-undefined dynamic_lookup"
go build -buildmode=c-shared  -o replacer.so .

This compiles the main.go file and all the .c files in the current directory.

The Go function func sdReplace(L *C.lua_State) C.int has the signature the Lua API requires. The Go function is responsible for accessing the Lua stack. With cgo, this is very easy: just add a C. in front of every C function. The only catch is that cgo cannot call C macros, which is why the code uses lua_tolstring() instead of the macro lua_tostring().

Interleaving Go and Lua code

The code above already interleaves Go and Lua code, which makes the integration very powerful. Now I’d like to extend the library to read a CSV file and create a set of nested Lua tables on the fly.

This simple CSV file (data.csv)

"green","a color"
"three","The third number"
"Miller","Some name"

should be turned into this table hierarchy:

{
    1: { 1: "green", 2: "a color" },
    2: { 1: "three", 2: "The third number"},
    3: { 1: "Miller", 2: "Some name"}
}

The first step is to extend the main.c file

#include <lauxlib.h>
#include <lualib.h>


extern int sdBuildCSVTable(lua_State *L);
extern int sdReplace(lua_State *L);

static const struct luaL_Reg myfuncs[] = {
	{"replace", sdReplace},
	{"build_csv_table", sdBuildCSVTable},
	{NULL, NULL},
};


int luaopen_replacer(lua_State *L) {
  lua_newtable(L);
  luaL_setfuncs(L, myfuncs, 0);
  return 1;
}

to register the build_csv_table Lua function and connect it to the Go function sdBuildCSVTable().

The second step is to extend the main.go file:

// Needs "encoding/csv", "fmt", "os" and "unsafe" in the import list of main.go
// and #include <stdlib.h> in the cgo preamble (for C.free).

//export sdBuildCSVTable
func sdBuildCSVTable(L *C.lua_State) C.int {
	filenameC := C.lua_tolstring(L, C.int(1), nil)
	r, err := os.Open(C.GoString(filenameC))
	if err != nil {
		fmt.Println(err)
		return 0
	}
	defer r.Close()
	csvReader := csv.NewReader(r)
	records, err := csvReader.ReadAll()
	if err != nil {
		fmt.Println(err)
		return 0
	}

	// outer table, one entry per CSV line
	C.lua_createtable(L, C.int(len(records)), 0)
	for i, row := range records {
		C.lua_pushinteger(L, C.lua_Integer(i+1))

		// inner table, one entry per cell
		C.lua_createtable(L, C.int(len(row)), 0)
		for j, cell := range row {
			C.lua_pushinteger(L, C.lua_Integer(j+1))
			cellC := C.CString(cell)
			C.lua_pushstring(L, cellC) // Lua copies the string
			C.free(unsafe.Pointer(cellC))
			C.lua_rawset(L, -3)
		}
		C.lua_rawset(L, -3)
	}
	return 1
}

A table in (C-) Lua is filled by pushing the table onto the stack, then the key (a string or an integer, for example) and the value. After these three items have been pushed onto the stack, a call to lua_rawset() takes the two topmost values and stores them as key and value in the table at the stack index given by the second argument of lua_rawset(), in our case the third entry from the top (-3). The key and the value are popped off the stack, and the table stays. The return 1 at the end tells Lua that the function returns one value (the CSV table).

re = require("replacer")

a = re.build_csv_table("data.csv")
for i, row in ipairs(a) do
    for j, cell in ipairs(row) do
        print(i,j,cell)
    end
end

prints, as expected:

1       1       green
1       2       a color
2       1       three
2       2       The third number
3       1       Miller
3       2       Some name
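The parsing half of this example can be checked without Lua. This is a minimal sketch using only the Go standard library; the function name parse and the constant sample are made up for the example, and the data is read from a string instead of data.csv.

```go
package main

import (
	"encoding/csv"
	"fmt"
	"strings"
)

// sample holds the contents of data.csv from above.
const sample = `"green","a color"
"three","The third number"
"Miller","Some name"
`

// parse reads CSV text and returns one []string per line.
func parse(data string) ([][]string, error) {
	return csv.NewReader(strings.NewReader(data)).ReadAll()
}

func main() {
	records, err := parse(sample)
	if err != nil {
		fmt.Println(err)
		return
	}
	for i, row := range records {
		for j, cell := range row {
			// Go slices are 0-based, Lua tables conventionally 1-based,
			// hence the +1 (as in sdBuildCSVTable).
			fmt.Println(i+1, j+1, cell)
		}
	}
}
```

The program prints the same row, column and cell triples as the Lua loop above, separated by single spaces instead of tabs.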

Conclusion

It gets messy and ugly, but powerful. I use wrapper functions to hide the C.lua_... functions. This makes the code much more readable.

type LuaState struct {
	l *C.lua_State
}

func newLuaState(L *C.lua_State) LuaState {
	return LuaState{L}
}

func (l *LuaState) pushString(str string) {
	cStr := C.CString(str)
	C.lua_pushstring(l.l, cStr)
	C.free(unsafe.Pointer(cStr))
}

allows me to write something like this:

l := newLuaState(L)
l.pushString("Hello, world")

which is still not idiomatic Go code, but much easier on the eye. And it is still very close to the Lua API, so anyone who knows the Lua API can understand the code without deeper Go knowledge.