
Go Build It

A Go gopher on a Docker container whale. Generated with ChatGPT.

If you know me a little, you know I'm a really big fan of Go and have been for a while. My first Go code on GitHub goes back 11 years 🀯. I like its focus on (to quote Dave Cheney) "... simplicity, readability, clarity, productivity, but ultimately they are all synonyms for one word – maintainability." Maintainability implies that people other than the original author can take care of and work with the code. And the people who have to work with the code are not necessarily full-time Go developers who keep up to date with the latest and greatest changes in the world of Go. They may even be frontenders whose web applications need to use your service, but who want to test it locally.

That's why I usually default to having the full build pipeline of my Go binaries in a Dockerfile, together with a docker-compose.yaml file (a minimal sketch of the compose file follows the list below). Yes, running a docker build causes some overhead compared to running the plain Go commands, but it brings a number of benefits:

Simplicity
Running the whole service on a local machine is just a simple
docker compose up --build
away. The only thing you need to have installed on your system is docker. Non-Go team members don't even need to have the Go compiler installed on their machine.
Portability
Whether you develop on macOS, Linux or Windows, the docker container and services at a certain commit will behave exactly the same, since Linux-based containers run identically on any of the three host operating systems.
Reproducibility
The Dockerfile defines the full environment that constitutes the build: the Go version through the base docker image, the files that are included in the container, and so on. The image built on one system will be exactly the same on any other system. You can be quite confident the build will succeed and the container will run properly in production if you built and tested it locally. (Given that you check in all new or modified files, of course. It happens to the best of us.)
Rememberability
While the Go commands are quite simple, many projects do end up with a Makefile of sorts to bundle and standardize those commands. No need to remember all the pesky -ldflags parameters or the syntax to enable race detection and test coverage; it's all nicely packaged under
docker build --target=test .
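
To make the Simplicity point concrete, here is a minimal docker-compose.yaml sketch of the kind I pair with these Dockerfiles. The service name app and the port mapping are illustrative assumptions, not taken from a real project:

services:
  app:
    build:
      context: .
      # target: production  # once the multistage version further down exists
    ports:
      - "8080:8080"

With that in place, docker compose up --build really is the only command a teammate needs to remember.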

Yeah, yeah, ok, it has benefits, but what does that Dockerfile look like?

The basics

The initial version is very simple. Set up the Golang base image (Debian bookworm with Go 1.23 at the time of writing), copy all the Go files into the image, run go build and set the binary as the command to run when the container starts. Nothing much to it.


FROM golang:1.23-bookworm

WORKDIR /work

COPY go.mod go.sum main.go ./
COPY pkg/ ./pkg

RUN go build -v -o app .

EXPOSE 8080

CMD ["/work/app"]

This version has a big drawback though. At 40s build time for my little benchmark project, it's slow as hell. Every time a Go file changes, the whole Go build will start again from scratch. No Docker layer build caching, and no Go module caching. All of the compilation speed Go is known for is lost, because the docker build invalidates all the caches on every run. Let's fix that.
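
For the record, the timings quoted in this post are just wall-clock measurements of the build command, along these lines (the varivoor tag matches the image listings further down; your numbers will obviously differ per machine):

β–Ά time docker build -t varivoor:v1 .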

Docker build caching

With a simple three-line change, the situation can already improve quite a bit. Instead of copying all files at once and relying on go build to download the dependencies, they can be downloaded in an earlier step with go mod download. Under the assumption that dependencies change less often than regular code, docker will cache the dependency step as long as there are no changes to the go module files.


FROM golang:1.23-bookworm

WORKDIR /work

COPY go.mod go.sum ./
RUN go mod download

COPY main.go ./
COPY pkg/ ./pkg

RUN go build -v -o app .

EXPOSE 8080

CMD ["/work/app"]

With this setup, the first build still takes 39s. A second build with no dependency changes and only a simple code change clocks in at 35s. Not the big speedup we were hoping for. The majority of the time is actually spent compiling all of the packages from scratch on every build.

Go build and module caches

Outside of docker, the Go toolchain uses caching directories to store downloaded modules and the compiled parts of your application before they get assembled into the final binary. Inside the container, these directories get populated in the go build layer, which means they get invalidated along with the rest of that layer on any code change. Luckily, docker provides a way out in the form of cache mounts.
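
You can see where these two caches live by asking the toolchain itself. Inside the golang base image the defaults should resolve to the two paths that show up in the cache mounts below, but it's worth double-checking in your own image:

β–Ά go env GOMODCACHE GOCACHE
/go/pkg/mod
/root/.cache/go-build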

Cache mounts are a way to specify a persistent cache location to be used during builds. The cache is cumulative across builds, so you can read and write to the cache multiple times. This persistent caching means that even if you need to rebuild a layer, you only download new or changed packages.

That definitely sounds promising. The example on the docker website only mentions the location of the Go module cache, but this mechanism can be used for the Go build cache as well. This is what the Dockerfile looks like with those two directories cached:


FROM golang:1.23-bookworm

WORKDIR /work

COPY go.mod go.sum ./
RUN --mount=type=cache,target=/go/pkg/mod/ \
    --mount=type=cache,target=/root/.cache/go-build/ \
    go mod download

COPY main.go ./
COPY pkg/ ./pkg

RUN --mount=type=cache,target=/go/pkg/mod/ \
    --mount=type=cache,target=/root/.cache/go-build/ \
    go build -v -o app .

EXPOSE 8080

CMD ["/work/app"]

The first run still clocks in at 39s, but for a new run with only a one-line change in a Go file, the build time suddenly drops to a mere 2s! Now we're getting somewhere!
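
One thing to be aware of: these cache mounts live in the BuildKit builder, not in any image or container, so they stick around across builds. If the cache grows too large or you want a guaranteed cold build, it can be wiped explicitly (this clears all build caches, so the next build is back on the slow path):

β–Ά docker builder prune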

Multistage Docker build

The build is at a point where it's really fast, but the image is also HUGE. It includes the whole compiler toolchain and all the source files, plus the dependencies and build caches in the first two versions. And a very big base layer! Just look at the sizes for my little demo project.


β–Ά docker image ls
REPOSITORY                 TAG            IMAGE ID       CREATED              SIZE
varivoor                   v3             5f6aebf85f9c   28 seconds ago       869MB
varivoor                   v2             99fa5699c4ab   About a minute ago   1.3GB
varivoor                   v1             7e9cc5f09231   2 minutes ago        1.3GB

The first two images are 1.3GB large; the third one is over 400MB smaller, but still weighs in at 869MB. Downloading an image of almost 900MB can take a while, especially when the customers are waiting for the service to come back online. And again Docker provides a way out, this time in the form of multistage builds:

With multi-stage builds, you use multiple FROM statements in your Dockerfile. Each FROM instruction can use a different base, and each of them begins a new stage of the build. You can selectively copy artifacts from one stage to another, leaving behind everything you don't want in the final image.

That sounds great. In the example project, only the Go binary needs to be copied over. After the go build command, there's a new stage called production and a COPY statement to copy the binary from the build stage to the production stage. There are no template files or static assets that need to be included here, but these could easily be copied into the final base image too, as sketched below.
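
For a service that does ship such files, the production stage could pick them up the same way as the binary. A hypothetical sketch (the templates/ and static/ paths are made up for illustration):

COPY --from=build /work/templates /templates
COPY --from=build /work/static /static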


FROM golang:1.23-bookworm AS build

WORKDIR /work

COPY go.mod go.sum ./
RUN --mount=type=cache,target=/go/pkg/mod/ \
    --mount=type=cache,target=/root/.cache/go-build/ \
    go mod download

COPY main.go ./
COPY pkg/ ./pkg

RUN --mount=type=cache,target=/go/pkg/mod/ \
    --mount=type=cache,target=/root/.cache/go-build/ \
    go build -v -o app .

FROM gcr.io/distroless/static-debian12 AS production

COPY --from=build /work/app /

EXPOSE 8080

CMD ["/app"]

By default, docker build will build all the stages that are required to build the last stage in the Dockerfile, in the order that satisfies the dependencies between the stages. In this case, that means the final target of docker build is the production stage, which will also receive any tags if they're specified.
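
So a plain docker build already produces the production image, but being explicit about the target doesn't hurt and reads better in CI scripts. Both of these should yield the same image (the tag is just the one from my listings):

β–Ά docker build -t varivoor:v4 .
β–Ά docker build --target=production -t varivoor:v4 .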


β–Ά docker image ls
REPOSITORY                 TAG            IMAGE ID       CREATED          SIZE
varivoor                   v4             59860bd237ca   32 seconds ago   26.2MB
varivoor                   v3             5f6aebf85f9c   25 minutes ago   869MB
varivoor                   v2             99fa5699c4ab   26 minutes ago   1.3GB
varivoor                   v1             7e9cc5f09231   27 minutes ago   1.3GB

Well, well, would you look at that. By copying only the final binary into a minimal base image, the final image size shrunk by 97%!

A note on distroless: Distroless is a project by Google to create the most minimalistic images possible. The smallest one, static, which was used in the example above, is just 2MB large and contains little more than timezone and certificate data. If you want to be able to run a shell for debugging purposes, add the :debug tag to the image name. Because these images are so minimal, they may also not be suited for your use case. If your binary must be built with CGO_ENABLED=1 (because you use sqlite for instance), then you will have to use distroless/base instead of distroless/static.
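
Concretely, switching to the debug variant is a one-line change to the final stage. The debug images bundle a busybox shell, so remember to switch back to the plain tag before shipping:

FROM gcr.io/distroless/static-debian12:debug AS production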

Test Layer

As mentioned in the intro of this post, I also like to bundle other Go commands into the docker image to make the CI build pipelines easier. The Dockerfile below splits the actual go build into its own stage that builds on a base stage. This base stage can then be reused by a new test stage as well. The test stage can be run by executing docker build --target=test . and the Docker build process will then only run the test target and its base dependency, ignoring the build and production stages and avoiding unnecessary work during the test cycle.


FROM golang:1.23-bookworm AS base

WORKDIR /work

COPY go.mod go.sum ./
RUN --mount=type=cache,target=/go/pkg/mod/ \
    --mount=type=cache,target=/root/.cache/go-build/ \
    go mod download

COPY main.go ./
COPY pkg/ ./pkg

FROM base AS test

RUN --mount=type=cache,target=/go/pkg/mod/ \
    --mount=type=cache,target=/root/.cache/go-build/ \
    go vet -v ./...

RUN --mount=type=cache,target=/go/pkg/mod/ \
    --mount=type=cache,target=/root/.cache/go-build/ \
    go test -race -v ./...

FROM base AS build

RUN --mount=type=cache,target=/go/pkg/mod/ \
    --mount=type=cache,target=/root/.cache/go-build/ \
    go build -v -o app .

FROM gcr.io/distroless/static-debian12 AS production

COPY --from=build /work/app /

EXPOSE 8080

CMD ["/app"]

Extracting test coverage

Again, this method has a bit of a usability drawback when you try to get test coverage data out, and the solution is non-trivial. Some of the obvious solutions don't work. The COPY command only copies from the build context into the image, not the other way around. A --mount=type=bind mount similar to the cache mount does not work either, because anything written to it during the build process gets discarded. And that's totally understandable from a security perspective: a Dockerfile downloaded from the internet could otherwise mount just any directory and start writing harmful content to the system. The only way I've found so far is to build a minimal image containing just the test coverage files and then export that image to a local directory instead of loading it into the image store. With the Dockerfile below, instead of targeting the test stage, we target the coverage stage and export it to a test folder: docker build --target=coverage --output=test .
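
Once that coverage stage exists (see the Dockerfile below), the exported files land directly in the local test/ folder and the HTML report can be opened straight away. A typical session might look like this (open is macOS; substitute your platform's equivalent):

β–Ά docker build --target=coverage --output=test .
β–Ά ls test
cover.html  cover.out
β–Ά open test/cover.html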


FROM golang:1.23-bookworm AS base

WORKDIR /work

COPY go.mod go.sum ./
RUN --mount=type=cache,target=/go/pkg/mod/ \
    --mount=type=cache,target=/root/.cache/go-build/ \
    go mod download

COPY main.go ./
COPY pkg/ ./pkg

FROM base AS test

RUN --mount=type=cache,target=/go/pkg/mod/ \
    --mount=type=cache,target=/root/.cache/go-build/ \
    go vet -v ./...

RUN --mount=type=cache,target=/go/pkg/mod/ \
    --mount=type=cache,target=/root/.cache/go-build/ \
    go test -race -v -coverprofile=./cover.out -covermode=atomic ./...

RUN --mount=type=cache,target=/go/pkg/mod/ \
    --mount=type=cache,target=/root/.cache/go-build/ \
    go tool cover -html ./cover.out -o ./cover.html

FROM scratch AS coverage

COPY --from=test /work/cover.out /cover.out
COPY --from=test /work/cover.html /cover.html

FROM base AS build

RUN --mount=type=cache,target=/go/pkg/mod/ \
    --mount=type=cache,target=/root/.cache/go-build/ \
    go build -v -o app .

FROM gcr.io/distroless/static-debian12 AS production

COPY --from=build /work/app /

EXPOSE 8080

CMD ["/app"]

Well, that's it for this article. A step-by-step breakdown of how I make sure my Docker builds for Go applications are as fast, small and ergonomic as I want to get them at the moment¹. And they're mightily reusable: just copy them into your own project, adjust the Go files that get copied in the base stage and you're good to go!

What does your optimal Dockerfile look like? What else would you suggest I look into to improve this Dockerfile even more? Would you like a detailed explanation of how I use this Dockerfile in my GitHub actions?

Make sure you get in touch with me or just follow me on 𝕏 or LinkedIn to catch the next article.

Want to read more? Check out my previous post F*ck it, Ship it. on how to ship new code fast without disrupting your customers by using feature flags.

Footnotes

  1. This post only covers the Docker side of making a small image. Obviously the majority of that image is now made up of the application binary. There are plenty of ways to make the binary smaller as well, reducing the overall image size even more. Making the binary smaller comes with a lot of tradeoffs to consider, which is out of scope for this post. Let me know if you want me to cover this in a future post!