Your Build Succeeded, But Your Deploy Failed: The Google Cloud Build Docker Export Bug Nobody Talks About


I keep running into the same deployment failure across multiple projects, and it has nothing to do with my code.

The Problem Explained

What you're experiencing is a well-documented infrastructure-level failure in Google Cloud Build's Docker image export pipeline. Here's the full breakdown:

What's Actually Happening

Your build follows this pipeline when using Firebase App Hosting (or Cloud Build with buildpacks):

  1. Source Code → Build (your Next.js app compiles successfully)
  2. Build → Layer Assembly (Docker layers are created — node runtime, pnpm, lifecycle layers, etc.)
  3. Layer Assembly → Image Export (the finished image must be saved from the Docker daemon to Artifact Registry)

The failure occurs at step 3. The connection to the Docker socket drops with an EOF error during the image save phase, and exit code 62 is the buildpacks lifecycle exit code specifically for export failures. The buildpacks pack tool tries to communicate with the Docker daemon via /var/run/docker.sock to save the assembled image, and the daemon either crashes or drops the connection mid-transfer.
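To make the failing call concrete, here is a minimal sketch (Python, standard library only) that issues the same images/get request against the local Docker socket that the lifecycle makes during export. The image ID is a placeholder, and the /v1.41 API version is taken from the error message itself; on a healthy daemon you get an HTTP status back, while the failure mode in the logs is the connection closing with no bytes, which surfaces as EOF.

# probe_docker_export.py - a sketch of the request the buildpacks lifecycle
# makes when exporting an image through the Docker socket. The image ID is
# a placeholder, not a real digest.
import socket
from urllib.parse import quote

DOCKER_SOCK = "/var/run/docker.sock"
IMAGE_ID = "sha256:" + "0" * 64  # placeholder

with socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) as sock:
    sock.connect(DOCKER_SOCK)
    # The same endpoint that appears in the error message, with the colon
    # URL-encoded to %3A exactly as in the logs.
    path = f"/v1.41/images/get?names={quote(IMAGE_ID, safe='')}"
    request = f"GET {path} HTTP/1.1\r\nHost: docker\r\nConnection: close\r\n\r\n"
    sock.sendall(request.encode("ascii"))
    chunks = []
    while True:
        data = sock.recv(65536)
        if not data:  # daemon closed the connection
            break
        chunks.append(data)

if not chunks:
    print("EOF: daemon dropped the connection without responding")
else:
    print("daemon responded:", chunks[0].split(b"\r\n", 1)[0].decode())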

Root Cause: A Convergence of Two Issues

1. The containerd/Docker Storage Driver Incompatibility

Using the containerd backend in Docker together with an untrusted builder fails the build with the same "failed to fetch base layers" error. This is a known issue in the Cloud Native Buildpacks ecosystem. The official buildpacks troubleshooting documentation identifies this as a problem with the underlying image store in Docker, and recommends pruning existing images, potentially from multiple storage drivers if switching between overlay2 and containerd.
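The pruning step the troubleshooting docs describe can only be done where you control the daemon, which rules out the managed Cloud Build VM, but for local reproduction it looks roughly like this sketch using the docker Python SDK (pip install docker); the choice to prune dangling images before unused ones is mine, not part of the official guidance.

# prune_image_store.py - check the active storage driver and prune the image
# store, as the buildpacks troubleshooting docs suggest. A local-only sketch.
import docker

client = docker.from_env()
info = client.info()
print("storage driver:", info.get("Driver"))        # e.g. overlay2
print("driver status :", info.get("DriverStatus"))  # containerd store shows here

# If you have switched between overlay2 and the containerd store, repeat the
# prune after switching back: each store keeps its own copies of the layers.
for dangling in (True, False):
    report = client.images.prune(filters={"dangling": dangling})
    print("reclaimed", report.get("SpaceReclaimed", 0), "bytes",
          "(dangling)" if dangling else "(unused)")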

2. Docker Daemon Memory/Resource Exhaustion in Cloud Build VMs

When the final image is large (common with Next.js apps that include heavy dependencies), the Docker daemon inside the Cloud Build VM can run out of memory or hit resource limits during the export phase. A related issue in BuildKit shows the Docker daemon consuming excessive memory (4GB+) and 50% CPU on an effectively idle host during image operations, with builds appearing to succeed but never actually finishing. In the constrained Cloud Build environment, this manifests as the daemon crashing and dropping the socket connection — hence the EOF.
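You cannot see inside the Cloud Build VM, but on a local daemon you can gauge how much image data it is holding with the same data-usage call behind docker system df. A sketch with the docker SDK; the 1 GiB warning threshold is an arbitrary illustration, not a documented Cloud Build limit.

# daemon_pressure.py - list image sizes known to the daemon, flagging large
# ones whose export is most likely to stress memory. Threshold is arbitrary.
import docker

GIB = 1024 ** 3
client = docker.from_env()
usage = client.df()  # equivalent to `docker system df`
for image in usage.get("Images") or []:
    size = image.get("Size", 0)
    tag = (image.get("RepoTags") or ["<untagged>"])[0]
    flag = "  <-- large export, higher risk" if size > GIB else ""
    print(f"{size / GIB:6.2f} GiB  {tag}{flag}")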

Is This a Google Cloud Limitation?

Yes — but it's nuanced. It's the intersection of several limitations:

  1. Google Cloud Build doesn't give you control over the Docker daemon configuration inside the build VM. You can't tune containerd settings, increase Docker daemon memory, or switch storage drivers. Firebase App Hosting abstracts this entirely.
  2. It's region-specific and intermittent. One developer reported the issue persisting for 3+ days in the europe-west4 region, affecting multiple independent projects, with all remediation attempts (fresh commits, manual rollouts, deleting old images from Artifact Registry) proving unsuccessful.
  3. Another developer reported the same issue in the us-east4 region, where the build finished successfully in Cloud Build but deployment failed with the identical EOF error.
  4. Firebase App Hosting provides zero workarounds. You can't specify a larger build machine, configure the Docker daemon, or opt out of buildpacks. The entire pipeline is managed for you, which means you're stuck when it breaks.
  5. The buildpacks/pack project acknowledges this is a known containerd compatibility issue — but the fix has to come from the platform side (Google's Cloud Build infrastructure), not from the user.

Community Sentiment

The community response is frustrated. The Firebase Tools GitHub issue (#9675) was filed in late December 2025, reporting that the issue had persisted for 3+ days with no resolution, and it has received multiple thumbs-up reactions from affected developers across different projects. Meanwhile, a Google Developer forums thread from January 2026 shows multiple developers hitting the same wall, with one user pointing to a separate thread about App Hosting deployment errors as a potential workaround path.

The bottom line: your code is not the problem, and reducing bundle size is a workaround at best — not a fix. The real issue lies in the Docker daemon management within Google Cloud Build VMs, particularly how buildpacks interact with the containerd storage driver under memory pressure.


Introduction

You push a commit. Your CI/CD pipeline kicks off. The logs scroll by — dependencies install, pages compile, the bundle is created. Everything looks green. Then, in the final seconds, you see it:

ERROR: failed to export: saving image: failed to fetch base layers: 
saving image with ID "sha256:..." from the docker daemon: 
error during connect: Get "http://%2Fvar%2Frun%2Fdocker.sock/v1.41/images/get?names=sha256%3A...": EOF        

Your application built perfectly. But it will never see production. Welcome to one of the most frustrating bugs in the Google Cloud Build ecosystem — a failure that has nothing to do with your code.


The Anatomy of the Failure

When you deploy through Firebase App Hosting (or use Cloud Build with Cloud Native Buildpacks), your build goes through a multi-phase lifecycle. The final phase, "EXPORTING," is where the assembled Docker image is saved from the in-VM Docker daemon to your Artifact Registry. This is the phase that fails.

The error — an EOF on the Docker socket connection — means the Docker daemon inside your Cloud Build VM dropped the connection while transferring the image. Think of it like a file transfer that gets cut off at 99%. Your code was compiled, your image was assembled, but the handoff to storage was severed.

The critical detail: exit code 62 is the buildpacks lifecycle code specifically for export failures. It's not a compilation error, not a dependency error, not a configuration error. It's infrastructure.
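If you run the same lifecycle locally with the pack CLI, the exit code alone separates an export failure from everything else. A sketch: the image and builder names are placeholders, and it assumes pack surfaces the lifecycle's exit code, so treat the mapping as illustrative.

# classify_pack_failure.py - run a local pack build and classify the result.
# Assumes pack propagates the lifecycle exit code (62 = export failure).
import subprocess
import sys

IMAGE = "my-app:latest"                        # placeholder
BUILDER = "gcr.io/buildpacks/builder:latest"   # placeholder builder

result = subprocess.run(["pack", "build", IMAGE, "--builder", BUILDER],
                        capture_output=True, text=True)
if result.returncode == 0:
    print("build and export succeeded")
elif result.returncode == 62:
    # Image assembled but could not be saved through the Docker daemon:
    # the infrastructure failure described here, not a problem in your app.
    print("EXPORT failure (exit 62): look at the Docker daemon, not your code")
else:
    print(f"failed earlier in the lifecycle (exit {result.returncode})")
    print(result.stderr[-2000:])
sys.exit(result.returncode)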


Why This Happens: The Technical Deep Dive

Three factors converge to create this failure:

1. The containerd Storage Driver Problem

Docker Desktop and newer Docker versions have shifted to containerd as the default image storage backend. Cloud Native Buildpacks (the technology behind Firebase App Hosting's build process) have a documented incompatibility with the containerd storage driver when using "untrusted" builders. The buildpacks project explicitly acknowledges this: builds fail with the exact "failed to fetch base layers" error when containerd is the active storage driver.
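Before reaching for anything else locally, you can check whether your daemon is running the containerd image store at all. This sketch reads docker info through the SDK; the "io.containerd" marker reflects how current Docker versions report the snapshotter in DriverStatus and is an assumption rather than a stable API.

# detect_containerd_store.py - check whether the local daemon uses the
# containerd image store, the configuration buildpacks flags as incompatible
# with untrusted builders. The string match is a heuristic.
import docker

client = docker.from_env()
info = client.info()
driver = info.get("Driver", "")
status = info.get("DriverStatus") or []
uses_containerd = any("io.containerd" in str(field)
                      for row in status for field in row)
if uses_containerd:
    print(f"driver={driver}: containerd image store active; expect "
          "'failed to fetch base layers' with untrusted builders")
else:
    print(f"driver={driver}: classic image store (e.g. overlay2)")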

2. Docker Daemon Resource Exhaustion

Cloud Build VMs have fixed resource allocations. When your Next.js application (or any modern web framework) produces a large output — many server-rendered pages, large static assets, heavy node_modules — the Docker daemon needs to hold the entire image in memory during the export phase. In BuildKit's issue tracker, developers have documented the Docker daemon consuming 4+ GB of memory just for image operations on otherwise idle systems. Inside a constrained Cloud Build VM, this can push the daemon past its limits.
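The "cut off at 99%" behavior is observable locally: you can stream the same save the export phase performs and count how far it gets. A sketch with the docker SDK; the image tag is a placeholder, and the percentage is approximate because the daemon reports the unpacked size rather than the size of the tar stream.

# measure_export.py - stream an image save through the daemon and count the
# bytes transferred, mirroring the lifecycle's export. Placeholder tag below.
import docker

IMAGE = "my-app:latest"  # placeholder
client = docker.from_env()
image = client.images.get(IMAGE)
expected = image.attrs.get("Size", 0)  # unpacked size, so % is approximate
transferred = 0
try:
    for chunk in image.save():  # same transfer as `docker save`
        transferred += len(chunk)
except Exception as exc:  # a dropped socket surfaces as a read error here
    print(f"export aborted after {transferred} bytes "
          f"(~{transferred / max(expected, 1):.0%} of {expected}): {exc}")
else:
    print(f"export completed: {transferred} bytes streamed")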

3. The Managed Platform Trap

Firebase App Hosting is fully managed. You don't choose the build machine size. You don't configure the Docker daemon. You don't select the storage driver. This is great when everything works — and a dead end when it doesn't. There's no cloudbuild.yaml override, no machine type flag, no daemon configuration file you can tweak.


The Community Evidence

This isn't a one-off glitch. It's a pattern:

  • A developer filed GitHub issue #9675 against firebase-tools in late December 2025, reporting persistent failures in europe-west4 that lasted over three days and affected multiple independent projects.
  • In January 2026, another developer reported the identical error in us-east4 on Google's Developer Forums.
  • The Cloud Native Buildpacks project has documented this failure pattern since October 2024, specifically linking it to containerd compatibility.
  • Docker Community Forums and BuildKit's issue tracker both have parallel threads about the same Docker daemon behavior.

Every affected developer reports the same pattern: the build succeeds, the layers are assembled, and then the export fails with an EOF on the Docker socket.


Is This a Google Cloud Limitation?

Yes. Specifically, it's a limitation at the intersection of:

  • Google Cloud Build's fixed VM configurations (no user control over Docker daemon resources)
  • The buildpacks lifecycle's requirement to save images through the Docker daemon socket
  • The containerd storage driver's incompatibility with certain buildpacks operations
  • Firebase App Hosting's fully-managed nature, which removes all user levers for troubleshooting

It's not a theoretical limit like "Cloud Build can only run for 24 hours." It's an operational limitation where the platform's infrastructure choices create a failure mode that users cannot work around.


What Can You Actually Do?

While waiting for Google to fix the underlying infrastructure issue, here are practical (if imperfect) strategies:

  1. Reduce your image size. Fewer pages, smaller node_modules, aggressive tree-shaking. This isn't a fix — it's reducing the probability of hitting the daemon's memory wall.
  2. Try a different region. If you're in europe-west4 and the issue is region-specific, creating a new backend in us-central1 may help.
  3. Retry. Some developers report that the same commit succeeds on retry, suggesting the issue is partly load-dependent on Google's infrastructure. A minimal backoff sketch follows this list.
  4. File an issue. The more visibility this gets, the faster Google will address it. Reference GitHub issue #9675.
  5. Consider alternatives. If Firebase App Hosting's managed nature is blocking you, deploying directly to Cloud Run with your own Dockerfile gives you control over the build machine type and Docker configuration.
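For the retry route (item 3 above), a plain backoff loop around your deploy command is often enough. A sketch: the gcloud builds submit command and image path are placeholders that apply when you drive Cloud Build directly; App Hosting rollouts triggered by commits would need to be retriggered instead, and the attempt and wait values are arbitrary.

# retry_deploy.py - retry a deploy with exponential backoff, but only when
# the failure looks like the export bug. Command and limits are placeholders.
import subprocess
import time

DEPLOY_CMD = ["gcloud", "builds", "submit", "--tag",
              "us-central1-docker.pkg.dev/my-project/my-repo/my-app"]
MAX_ATTEMPTS = 4

for attempt in range(1, MAX_ATTEMPTS + 1):
    result = subprocess.run(DEPLOY_CMD, capture_output=True, text=True)
    if result.returncode == 0:
        print(f"deploy succeeded on attempt {attempt}")
        break
    # Retry only the infrastructure failure described in this article;
    # anything else is probably a real problem with the build.
    if "failed to export" not in result.stderr and "EOF" not in result.stderr:
        print("non-export failure, not retrying:")
        print(result.stderr[-2000:])
        break
    wait = 30 * 2 ** (attempt - 1)  # 30s, 60s, 120s, ...
    print(f"export failure on attempt {attempt}; retrying in {wait}s")
    time.sleep(wait)
else:
    print("every attempt hit the export failure; try another region or file a report")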


The Bigger Lesson

Managed platforms are a bargain — until they're not. Firebase App Hosting abstracts away Docker, buildpacks, Cloud Build, and Cloud Run into a single "push and deploy" experience. But when a failure occurs in the abstracted layers, you have no levers to pull. You can't docker prune, can't increase VM memory, can't switch to overlay2, can't even see the Docker daemon logs.

This is the tradeoff of managed infrastructure: you trade control for convenience, and sometimes that trade goes the wrong way.
