Use cache for building Go and JS components

Description

Problem

We know that the components share a good number of dependencies. Currently, we build every client and server component in a separate docker container, which means each subsequent container reuses neither the build cache nor the dependency cache of the previous one. If we leveraged those caches, the build time could be reduced considerably.

Solution

Try to reuse the same docker container to build all components and their dependencies, so that the caches are shared:

  • Go - leverage the module cache and the build cache provided by the Go toolchain out of the box (see the sketch below)

  • NodeJS - persist the yarn and/or npm cache between builds
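
A minimal sketch of what this could look like, assuming the caches live in named docker volumes; the volume names, the image, and the build command below are illustrative, not the actual ones:

    # Named volumes survive between runs, so consecutive builds reuse the
    # Go module cache, the Go build cache, and the yarn cache. The volume
    # names, the image, and the build command are illustrative.
    docker run --rm \
        --volume "$PWD:/src" --workdir /src \
        --volume go-mod-cache:/root/go/pkg/mod \
        --volume go-build-cache:/root/.cache/go-build \
        --volume yarn-cache:/root/.cache/yarn \
        builder-image make build

The paths above are the defaults for the root user; go env GOMODCACHE GOCACHE and yarn cache dir print the actual locations inside a given container.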

Optionally

  • fix the absence of GNU-compatible build IDs when running go build

  • use a public AWS S3 bucket for persisting the dependency cache, or

  • use a docker image serving as a “cache container” for rpmbuild:3-based containers

While the absence of build IDs is not a severe problem, it can easily be fixed (solution provided here) so that builds produce fewer warnings.
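
For reference, the commonly suggested fix is to pass the Go linker’s -B flag, which writes an ELF NT_GNU_BUILD_ID note into the binary; a sketch, using random bytes as the note payload:

    # -B expects 0x followed by an even number of hex digits; here we
    # generate a random 20-byte note, mimicking GNU ld's default size.
    go build -ldflags "-B 0x$(head -c20 /dev/urandom | xxd -p -c20)" ./...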

For feature builds, we may want to consider saving the module cache and the build cache to an AWS S3 bucket, so that subsequent builds can reuse them and run faster.
Also, consider saving dev-latest’s module cache to AWS S3 so that the first run of a feature build is faster.
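
A rough sketch of what this could look like with the AWS CLI; the bucket name and prefixes are hypothetical:

    # Restore the caches before the build; a no-op when the bucket is empty.
    aws s3 sync s3://pmm-build-cache/go-mod /root/go/pkg/mod
    aws s3 sync s3://pmm-build-cache/go-build /root/.cache/go-build

    # ... run the build ...

    # Save the caches back so the next build can pick them up.
    aws s3 sync /root/go/pkg/mod s3://pmm-build-cache/go-mod
    aws s3 sync /root/.cache/go-build s3://pmm-build-cache/go-build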

Implementation details

There is one particular nuance related to using caches in our GitHub workflows. We used to leverage the very popular and battle-tested actions/cache action for caching Go build and module artifacts. However, as per its documentation, a single repository can only have up to 10 GB of caches, which means the size of all caches combined (i.e. across all branches) is limited to that amount. This is a real problem: as of June 2024, the cache produced by a single build (or branch) of PMM Server is 3.2 GB, which would lead to rapid cache eviction. Therefore, the cache mechanism provided by actions/cache is not suitable for our GitHub workflows.
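
For context, the per-repository cache usage that counts against the 10 GB quota can be inspected with the GitHub CLI (gh 2.32 or later); the repository placeholder is illustrative:

    # List the workflow caches of a repository along with their sizes.
    gh cache list --repo <owner>/<repo> --limit 100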

A much better solution is to persist the entire docker cache volume to remote storage and restore it before running a build. We considered the following options for cache storage:

  • AWS S3 bucket

    • cons:

      • additional cost

      • higher latency compared to ghcr.io

  • Docker image registry, e.g. docker.io

    • cons:

      • additional cost

      • higher latency compared to ghcr.io

  • GitHub container registry, i.e. ghcr.io

    • pros:

      • no additional cost

      • low latency (same network)

The first two have the obvious downside of incurring additional costs, so the last one will be implemented, given that we have some storage space on ghcr.io included in our current pricing plan with github.com.
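
As a sketch, the cache volume can be persisted to ghcr.io by packing it into a minimal image and pushing that image; the organization, image, and volume names below are illustrative:

    # Archive the contents of the cache volume.
    docker run --rm -v go-build-cache:/cache -v "$PWD:/backup" alpine \
        tar czf /backup/cache.tar.gz -C /cache .

    # Wrap the archive in a minimal image and push it to ghcr.io.
    printf 'FROM scratch\nCOPY cache.tar.gz /\n' > Dockerfile.cache
    docker build -f Dockerfile.cache -t ghcr.io/<org>/pmm-build-cache:latest .
    docker push ghcr.io/<org>/pmm-build-cache:latest

    # To restore: create (not run) a container from the image, copy the
    # archive out, and unpack it back into the volume.
    docker pull ghcr.io/<org>/pmm-build-cache:latest
    docker create --name cache-src ghcr.io/<org>/pmm-build-cache:latest placeholder
    docker cp cache-src:/cache.tar.gz .
    docker rm cache-src
    docker run --rm -v go-build-cache:/cache -v "$PWD:/backup" alpine \
        tar xzf /backup/cache.tar.gz -C /cache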

How to test

None

How to document

None

Activity

Alex Demidoff June 13, 2024 at 3:43 PM
Edited

This is what the “Rebuild the binaries” step looks like once it can use the cache:

[screenshot]

It is way faster since Go can fully reuse the module and build caches.

Details

Needs QA: Yes

Needs Doc: No

Created February 6, 2024 at 1:25 PM
Updated October 28, 2024 at 8:54 AM