Use cache for building Go and JS components
How to test: None
How to document: None
Activity
Alex Demidoff, June 13, 2024 at 3:43 PM (edited)
This is what the "Rebuild the binaries" step looks like once it can use the cache:
It is way faster since Go can fully reuse the modules and the build cache.
Details
Assignee:
Reporter:
Priority:
Needs QA: Yes
Needs Doc: No
Planned Version/s:
Story Points: 5
Created February 6, 2024 at 1:25 PM
Updated October 28, 2024 at 8:54 AM
Problem
We know that our components share a good number of dependencies. Currently, we build every client and server component in a separate docker container, which poses a problem: each subsequent container reuses neither the build cache nor the dependency cache of the previous one. Therefore, if we leveraged those caches, the build time could be reduced further.
Solution
Try to reuse the same docker container to build all the components, so the dependency and build caches are shared (see the sketch below this list):
- Go: leverage the go module cache and the go build cache provided by the Go tooling out of the box
- NodeJS: persist the yarn and/or npm cache between builds

Optionally:
- fix the absence of GNU-compatible build ids when running go build
- use a public AWS S3 bucket for persisting the dependency cache, or
- use a docker image that would serve as a "cache container" for rpmbuild:3-based containers

While the absence of build ids is not a severe problem, it can easily be fixed (solution provided here) so that builds produce fewer warnings.
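A minimal sketch of how the shared caches could be wired up with plain docker volumes; the volume name, image tags, and paths below are assumptions for illustration, not the final implementation:

# Named volume that survives between builds (name is hypothetical)
docker volume create pmm-build-cache

# Build every Go component in the same container, pointing the Go tooling at the shared caches
docker run --rm \
  -e GOMODCACHE=/cache/go-mod \
  -e GOCACHE=/cache/go-build \
  -v pmm-build-cache:/cache \
  -v "$PWD":/src -w /src \
  golang:1.22 go build ./...

# Build the JS components with the yarn cache kept in the same volume
docker run --rm \
  -v pmm-build-cache:/cache \
  -v "$PWD":/src -w /src \
  node:20 yarn install --frozen-lockfile --cache-folder /cache/yarn

# The missing GNU build id can be injected via the Go linker, e.g.:
#   go build -ldflags "-B 0x$(head -c 20 /dev/urandom | od -An -tx1 | tr -d ' \n')" ./...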
For feature builds, we may want to consider saving the module cache and the build cache to an AWS S3 bucket, so that subsequent builds can use it and run faster.
Also, consider saving dev-latest’s module cache to AWS S3 so that the first run of a feature build can run faster.
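A rough idea of how the S3 round trip could work, assuming the aws CLI is available on the build host and using a hypothetical bucket name:

# Upload the caches after a dev-latest build (bucket name is hypothetical)
aws s3 sync /cache/go-mod   s3://pmm-build-cache/dev-latest/go-mod
aws s3 sync /cache/go-build s3://pmm-build-cache/dev-latest/go-build

# Seed a feature build from the dev-latest caches before its first run
aws s3 sync s3://pmm-build-cache/dev-latest/go-mod   /cache/go-mod
aws s3 sync s3://pmm-build-cache/dev-latest/go-build /cache/go-build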
Implementation details
There is one particular nuance related to using cache in our github workflows. We used to leverage the very popular and battle-tested actions/cache action for caching Go build and module artifacts. However, as per its documentation, a single repository can only hold up to 10 GB of caches. This means that the size of all caches combined (i.e. across all branches) is limited to 10 GB. That is a real problem: the cache produced by a single build (or branch) of PMM Server is 3.2 GB as of June 2024, which would lead to fast cache eviction. Therefore, the cache mechanism provided by the github action mentioned above is not suitable for our github workflows.
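For reference, the size of the Go caches on a build machine can be measured with the paths reported by the Go tooling itself:

du -sh "$(go env GOCACHE)" "$(go env GOMODCACHE)"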
A much better solution would be to persist the entire docker cache volume to remote storage and then restore it from that storage before running a build. We considered the following options for cache storage:
AWS S3 bucket
  cons:
  - additional cost
  - higher latency compared to ghcr.io
Docker image registry, i.e. docker.io
  cons:
  - additional cost
  - higher latency compared to ghcr.io
GitHub container registry, i.e. ghcr.io
  pros:
  - no additional cost
  - low latency (same network)

The first two have the obvious downside of generating additional costs, so the last option will be implemented, given that we already have storage space on ghcr.io included in our current price plan with github.com.
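One possible way to persist and reuse the docker build cache via ghcr.io is buildx's registry cache exporter; whether we use that or push a dedicated "cache container" image is an open implementation detail, and the image and cache references below are assumptions for illustration only:

# Export the build cache to ghcr.io and reuse it in later builds (refs are hypothetical)
docker buildx build \
  --cache-from type=registry,ref=ghcr.io/percona/pmm-build-cache:dev-latest \
  --cache-to type=registry,ref=ghcr.io/percona/pmm-build-cache:dev-latest,mode=max \
  -t ghcr.io/percona/pmm-server:dev-latest \
  --push .

Either way, both the cache storage and the transfer stay within ghcr.io, so no additional cost or cross-network latency is incurred.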