fix: BuildKit cache + local base image mirroring for builds #85
base: main
Without image-manifest=true, BuildKit's registry cache stores layer references pointing to external registries (e.g., docker.io) rather than copying the actual layer blobs into the cache image. This causes cache misses in ephemeral BuildKit instances (like our builder VMs) because the layers aren't available locally.

With image-manifest=true, BuildKit creates a proper OCI image manifest with all layer blobs stored in the registry, enabling cache hits even in fresh BuildKit instances.

This fixes the issue where the global cache (populated by admin builds) wasn't providing cache hits for tenant builds: the first deployment for each tenant was re-downloading all base image layers from Docker Hub.

Co-authored-by: Cursor <cursoragent@cursor.com>
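As a rough illustration of the commit above, the cache export attributes could be assembled like this. It is a minimal sketch, not the code from lib/builds/cache.go: the function name `cacheExportArg` and the example cache reference are assumptions, and only the `image-manifest=true` and `oci-mediatypes=true` attributes come from this PR.

```go
package main

import "strings"

// cacheExportArg builds a value for BuildKit's --export-cache flag.
// The function name and the cacheRef parameter are hypothetical; the
// image-manifest and oci-mediatypes attributes are the ones this PR adds.
func cacheExportArg(cacheRef string) string {
	attrs := []string{
		"type=registry",
		"ref=" + cacheRef,
		// Store the cache as a real OCI image manifest so layer blobs
		// live in the cache registry instead of being external references.
		"image-manifest=true",
		"oci-mediatypes=true",
	}
	return strings.Join(attrs, ",")
}
```

The resulting string would be passed as `--export-cache <value>` to the build invocation; how the real code wires this in is not shown on this page.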
Adds a unit test that reproduces the production issue where hypeman fails to pre-pull BuildKit cache images. The test creates a mock OCI layout with BuildKit's cache config mediatype (application/vnd.buildkit.cacheconfig.v0) and verifies that unpackLayers fails with the expected error.

This test documents the root cause: umoci expects the standard OCI config mediatype, but BuildKit cache exports use a custom mediatype.

Co-authored-by: Cursor <cursoragent@cursor.com>
BuildKit exports cache with a custom mediatype (application/vnd.buildkit.cacheconfig.v0) that can't be unpacked by standard OCI tools like umoci. This caused errors when pushing cache images to the registry:

    config blob is not correct mediatype application/vnd.oci.image.config.v1+json: application/vnd.buildkit.cacheconfig.v0

The fix skips the ext4 conversion step for cache/* repos since:
1. Cache images are not runnable containers
2. BuildKit imports them directly from the registry
3. There's no need to unpack or convert them locally

Co-authored-by: Cursor <cursoragent@cursor.com>
The repo parameter passed to triggerConversion includes the Host header
prefix (e.g., "10.102.0.1:8083/cache/global/node"). The previous check
only used HasPrefix("cache/") which would never match.
Now checks for both patterns:
- HasPrefix("cache/") for edge case without host
- Contains("/cache/") for normal case with host prefix
Co-authored-by: Cursor <cursoragent@cursor.com>
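The two-pattern check described in this commit can be sketched as follows. The helper name `isCacheRepo` is an assumption; only the `HasPrefix("cache/")` and `Contains("/cache/")` conditions come from the commit message.

```go
package main

import "strings"

// isCacheRepo reports whether repo names a BuildKit cache repository.
// The name isCacheRepo is hypothetical; the two conditions mirror the
// commit message.
func isCacheRepo(repo string) bool {
	// Edge case: repo arrives without a host prefix, e.g. "cache/global/node".
	if strings.HasPrefix(repo, "cache/") {
		return true
	}
	// Normal case: repo includes the Host header prefix,
	// e.g. "10.102.0.1:8083/cache/global/node".
	return strings.Contains(repo, "/cache/")
}
```

Note that a substring match also fires for any repository with a `cache` path segment deeper in the name (for example `host/team/cache/app`), so this sketch assumes no non-cache repositories use that segment.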
Add auto-mirroring of base images for admin builds and Dockerfile FROM rewriting so builder VMs pull base image layers from the local registry instead of Docker Hub.

Key changes:
- Add mirror infrastructure (lib/images/mirror.go, lib/builds/mirror.go) to push base images to the local registry during admin builds
- Add Dockerfile FROM parser (lib/builds/dockerfile.go) to extract base image references for both mirroring and token scope
- Add builder agent FROM rewriting (builder_agent/main.go) to detect mirrored images and rewrite FROM instructions to use local refs
- Fix auth on HEAD request: checkImageExistsInRegistry now sends Bearer auth header so the registry doesn't return 401
- Fix token scope: builder tokens now include pull access for base image repos so the agent can detect mirrored images
- Add admin /mirror-base-image endpoint for manual mirroring
- Add registry middleware support for base image pull auth

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
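To make the HEAD auth fix concrete, here is a minimal sketch of an authenticated manifest existence check against a registry's `/v2/<repo>/manifests/<reference>` endpoint. It is not the PR's `checkImageExistsInRegistry`: the function names, parameters, and the choice to treat only 200 and 404 as definitive answers are assumptions for illustration.

```go
package main

import (
	"fmt"
	"net/http"
)

// newManifestHeadRequest builds a HEAD request for an image manifest.
// baseURL (e.g. "http://10.102.0.1:8083"), repo, reference, and token
// are hypothetical parameters, not the PR's actual signature.
func newManifestHeadRequest(baseURL, repo, reference, token string) (*http.Request, error) {
	url := fmt.Sprintf("%s/v2/%s/manifests/%s", baseURL, repo, reference)
	req, err := http.NewRequest(http.MethodHead, url, nil)
	if err != nil {
		return nil, err
	}
	req.Header.Add("Accept", "application/vnd.oci.image.manifest.v1+json")
	req.Header.Add("Accept", "application/vnd.oci.image.index.v1+json")
	// Without this header a token-protected registry answers 401,
	// which is the failure the commit describes.
	if token != "" {
		req.Header.Set("Authorization", "Bearer "+token)
	}
	return req, nil
}

// imageExists sends the request and maps the status code to a result.
func imageExists(client *http.Client, req *http.Request) (bool, error) {
	resp, err := client.Do(req)
	if err != nil {
		return false, err
	}
	defer resp.Body.Close()
	switch resp.StatusCode {
	case http.StatusOK:
		return true, nil
	case http.StatusNotFound:
		return false, nil
	default:
		return false, fmt.Errorf("unexpected registry status %d", resp.StatusCode)
	}
}
```

Returning an error for anything other than 200 or 404 lets a caller fall back to the unmodified Dockerfile rather than guessing, which matches the "no rewrite/mirror on errors" behavior noted in the risk summary.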
Remove convenience-only files that aren't needed for the core fix:
- cmd/api/api/admin.go (manual mirror endpoint)
- cmd/api/main.go admin route registration
- lib/middleware/oapi_auth.go JWTAuthMiddleware
- lib/builds/cache_integration_test.go (testcontainers dep)
- lib/images/oci_test.go (pre-existing bug test)

This also removes the testcontainers-go dependency from go.mod.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
rewriteDockerfileFROMs was missing stage name tracking, which could
cause it to rewrite inter-stage FROM references (e.g. FROM builder)
to point at the local registry instead of the prior build stage.
Also skips ARG variable references (${...}) that can't be resolved.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
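The stage-name tracking described above might look roughly like the sketch below. It is not the PR's `rewriteDockerfileFROMs`: the function name `rewriteFROMs`, the `mirrored` lookup map, and the `mirror/` path in the example refs are assumptions. Rewritten lines are re-joined with single spaces, which normalizes their whitespace.

```go
package main

import "strings"

// rewriteFROMs replaces base images in FROM lines with local refs.
// mirrored is a hypothetical map from original ref to local registry ref.
func rewriteFROMs(dockerfile string, mirrored map[string]string) string {
	stages := map[string]bool{} // stage names declared so far via "AS <name>"
	lines := strings.Split(dockerfile, "\n")
	for i, line := range lines {
		fields := strings.Fields(line)
		if len(fields) < 2 || !strings.EqualFold(fields[0], "FROM") {
			continue
		}
		// Skip optional flags such as --platform=linux/amd64.
		idx := 1
		for idx < len(fields) && strings.HasPrefix(fields[idx], "--") {
			idx++
		}
		if idx >= len(fields) {
			continue
		}
		image := fields[idx]
		isStage := stages[strings.ToLower(image)] // e.g. FROM builder
		hasVar := strings.Contains(image, "$")   // e.g. FROM ${BASE}
		if local, ok := mirrored[image]; ok && !isStage && !hasVar {
			fields[idx] = local
			lines[i] = strings.Join(fields, " ")
		}
		// Record this stage's alias so later FROM lines can refer to it.
		if idx+2 < len(fields) && strings.EqualFold(fields[idx+1], "AS") {
			stages[strings.ToLower(fields[idx+2])] = true
		}
	}
	return strings.Join(lines, "\n")
}
```

The alias is recorded after the rewrite decision, so a stage name only shadows image names in FROM lines that come after its declaration.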
Cursor Bugbot has reviewed your changes and found 1 potential issue.
    for _, ref := range refs {
        repo := ref
        if idx := strings.LastIndex(repo, ":"); idx > 0 {
            repo = repo[:idx]
Repo extraction mishandles digest references in token scope
Low Severity
The repo-name extraction using strings.LastIndex(repo, ":") doesn't handle digest references (image@sha256:abc...). For such refs, it splits on the : inside the digest, producing alpine@sha256 instead of alpine as the repo name. This generates incorrect token scopes, causing auth failures for mirroring and agent HEAD checks. The builder agent's checkImageExistsInRegistry correctly handles this by checking for @ before :, but this server-side code is missing the same check.
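One way to address this finding is to strip the digest before looking for a tag, as the review says the builder agent already does. The sketch below is an illustration rather than the project's fix: the helper name `repoFromRef` is hypothetical, and the extra guard for registry ports (a `:` that appears before the last `/`) is an assumption that goes beyond what the review comment asks for.

```go
package main

import "strings"

// repoFromRef returns the repository part of an image reference,
// dropping any digest ("@sha256:...") and tag (":3.19").
// The name repoFromRef is hypothetical.
func repoFromRef(ref string) string {
	// Handle '@' first so the ':' inside the digest is never considered.
	if at := strings.Index(ref, "@"); at >= 0 {
		ref = ref[:at]
	}
	// A ':' is a tag separator only if it comes after the last '/';
	// otherwise it belongs to a registry port such as localhost:5000.
	if colon := strings.LastIndex(ref, ":"); colon > strings.LastIndex(ref, "/") {
		ref = ref[:colon]
	}
	return ref
}
```

With this ordering, `alpine@sha256:abc...` yields `alpine` rather than `alpine@sha256`, so the generated token scope names the actual repository.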


Summary
- Add image-manifest=true so cache layer blobs are stored inline rather than as external references to Docker Hub
- Rewrite FROM instructions in builder VMs to pull from the local registry instead of Docker Hub
- checkImageExistsInRegistry now sends an Authorization: Bearer header (was getting 401)
- Builder tokens now include pull access for base image repos so the agent can detect mirrored images

Changes
- lib/builds/builder_agent/main.go
- lib/builds/manager.go
- lib/builds/dockerfile.go
- lib/builds/mirror.go
- lib/images/mirror.go
- lib/middleware/oapi_auth.go
- cmd/api/api/admin.go
- lib/builds/cache.go: add image-manifest=true,oci-mediatypes=true to cache export
- lib/registry/registry.go: skip conversion for cache/* repos

Test plan
- go test ./lib/builds/builder_agent/ -v: builder agent tests pass with updated signatures
- go test ./lib/builds/ -v: builds package tests pass (including cache integration)
- go test ./lib/images/ -v: mirror tests pass
- make build-builder: new builder image compiles

🤖 Generated with Claude Code
Note
Medium Risk
Touches build execution, registry token scoping, and registry conversion triggers; failures could impact cache hits or image pulls, though changes are guarded with fallbacks (no rewrite/mirror on errors).
Overview
Improves build reliability in ephemeral BuildKit VMs by exporting registry cache with image-manifest=true,oci-mediatypes=true, and updates tests accordingly.

Adds an admin-build flow to mirror Dockerfile base images into the local registry (new images.MirrorBaseImage + mirrorBaseImagesForBuild) and has the builder agent rewrite Dockerfile FROM lines to local registry references when the image exists, using authenticated manifest HEAD checks.

Updates registry behavior to skip conversion for cache/* repos (BuildKit cache images), and expands build-scoped registry tokens to include pull scope for base-image repos (Dockerfile parsed from inline content or extracted from the source tarball).

Written by Cursor Bugbot for commit 31afe9d. This will update automatically on new commits.