Step-by-Step Guide: Optimizing Docker Build Performance

Introduction

Problem Statement: Docker builds taking 8+ minutes per PR, causing CI bottlenecks and increased costs.

Solution Overview: Cut cached build time from 8+ minutes to about 40 seconds (and clean builds to about 2.5 minutes) through multi-stage builds, BuildKit caching, and careful layer ordering.

What You’ll Need:

  • Docker 20.10+ (for BuildKit support)
  • Basic understanding of Dockerfile syntax
  • A Next.js or Node.js application (examples are adaptable to other frameworks)

Expected Time Investment: 2-3 hours for implementation and testing


Step 1: Diagnose Your Current Build Performance

Before optimizing, establish baseline metrics.

1.1 Measure Current Build Time

# Clear all Docker cache
docker builder prune -a -f

# Time your build
time docker build -t myapp:test .

Record this time. Our baseline: 8 minutes 15 seconds
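
To make the baseline repeatable, the timing can be wrapped in a small helper and re-run after each step of this guide. The following is a minimal sketch, not part of the original setup: time_cmd and fmt_duration are hypothetical helper names introduced here for illustration.

```shell
#!/usr/bin/env bash
# Hypothetical helpers for recording build times (names are illustrative).

# time_cmd CMD [ARGS...]: run a command, discard its output,
# and print the elapsed wall-clock time in whole seconds.
time_cmd() {
  local start end
  start=$(date +%s)
  "$@" >/dev/null 2>&1 || true   # a failed command is still timed
  end=$(date +%s)
  echo $((end - start))
}

# fmt_duration SECONDS: print a number of seconds as "Xm Ys".
fmt_duration() {
  local total
  total=$1
  echo "$((total / 60))m $((total % 60))s"
}

# Example (requires Docker):
#   fmt_duration "$(time_cmd docker build -t myapp:test .)"
```

With the baseline above, fmt_duration 495 prints "8m 15s".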

1.2 Identify Cache Invalidation Points

# Build once
docker build -t myapp:test .

# Change a single source file
echo "// comment" >> src/index.js

# Build again and observe
docker build -t myapp:test .

Question to answer: Does changing source code trigger npm install to re-run?

If yes, your Dockerfile has cache invalidation issues.

1.3 Analyze Layer Sizes

docker history myapp:test --human --no-trunc

Look for:

  • Layers over 500MB
  • Multiple large node_modules layers
  • Unexpectedly large COPY layers, which usually mean unnecessary files (.git, local node_modules) were copied in
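
The 500MB check can be automated by filtering the history output. This is a sketch under two assumptions: the input is produced with docker history --format '{{.Size}} {{.CreatedBy}}', and sizes use Docker's decimal units (B, kB, MB, GB). The function name large_layers is hypothetical.

```shell
#!/usr/bin/env bash
# large_layers [THRESHOLD_MB]: read "SIZE COMMAND" lines on stdin and
# print only the layers larger than THRESHOLD_MB (default 500).
large_layers() {
  local threshold
  threshold=${1:-500}
  awk -v limit="$threshold" '
    # Convert a human-readable size such as "1.2GB" or "45.3kB" to megabytes.
    function to_mb(s,   n, u) {
      n = s + 0                    # leading numeric part
      u = s
      gsub(/[0-9.]/, "", u)        # what remains is the unit
      if (u == "B")  return n / 1000000
      if (u == "kB" || u == "KB") return n / 1000
      if (u == "MB") return n
      if (u == "GB") return n * 1000
      if (u == "TB") return n * 1000000
      return 0                     # unknown unit: treat as empty
    }
    { if (to_mb($1) > limit + 0) print }
  '
}

# Example (requires Docker):
#   docker history --format '{{.Size}} {{.CreatedBy}}' myapp:test | large_layers 500
```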

Step 2: Create a Proper .dockerignore File

Purpose: Prevent unnecessary files from invalidating Docker cache.

2.1 Create .dockerignore in Project Root

touch .dockerignore

2.2 Add Essential Exclusions

# Dependencies
node_modules
npm-debug.log*
yarn-debug.log*
yarn-error.log*

# Build output
.next
dist
build
out

# Development
.git
.gitignore
.env*.local
*.log

# IDE
.vscode
.idea
.DS_Store

# Testing
coverage
.nyc_output

# Documentation
README.md
CHANGELOG.md
docs
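
A quick way to confirm the most important exclusions are present is to check the file for them. The sketch below is illustrative only: missing_ignores is a hypothetical helper, and it matches whole lines literally, so it does not understand glob patterns or "!" negation rules.

```shell
#!/usr/bin/env bash
# missing_ignores FILE PATTERN...: print every PATTERN that does not
# appear as an exact line in FILE (literal match, no glob handling).
missing_ignores() {
  local file pattern
  file=$1
  shift
  for pattern in "$@"; do
    grep -qxF -- "$pattern" "$file" 2>/dev/null || echo "$pattern"
  done
}

# Example:
#   missing_ignores .dockerignore node_modules .git .next
```

Empty output means every listed entry was found.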

2.3 Test Impact

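# Build and check context size
# (the legacy builder prints "Sending build context"; BuildKit prints "transferring context")
docker build -t myapp:test . 2>&1 | grep -E "Sending build context|transferring context"

Expected: Context size should drop significantly (e.g., from 800MB to 50MB).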


Step 3: Enable Docker BuildKit

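Purpose: Access advanced caching features and improved build performance. BuildKit is the default builder from Docker Engine 23.0 onward; on Docker 20.10 it must be enabled explicitly as shown below.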

3.1 Enable BuildKit Globally

Linux/Mac (use ~/.zshrc instead of ~/.bashrc if your shell is zsh, the macOS default):

echo 'export DOCKER_BUILDKIT=1' >> ~/.bashrc
source ~/.bashrc

Windows PowerShell:

[System.Environment]::SetEnvironmentVariable('DOCKER_BUILDKIT', '1', 'User')

3.2 Enable in CI/CD

GitHub Actions:

- name: Build Docker image
  run: docker build -t myapp:latest .
  env:
    DOCKER_BUILDKIT: 1

GitLab CI:

build:
  script:
    - docker build -t myapp:latest .
  variables:
    DOCKER_BUILDKIT: 1

3.3 Verify BuildKit is Active

docker build -t myapp:test .

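You should see BuildKit-style output: a "[+] Building" progress header and steps such as "[internal] load build definition", with independent steps shown running in parallel.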
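
When the output is captured in a CI log rather than a terminal, the builder can be identified from the log text. This sketch assumes the two marker strings below, which are what the legacy builder and BuildKit print at the start of a build; builder_from_log is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# builder_from_log: read a docker build log on stdin and print
# "buildkit", "legacy", or "unknown" based on well-known marker lines.
builder_from_log() {
  local log
  log=$(cat)
  case $log in
    *"[internal] load build definition"*)       echo buildkit ;;
    *"Sending build context to Docker daemon"*) echo legacy ;;
    *)                                          echo unknown ;;
  esac
}

# Example (requires Docker):
#   docker build -t myapp:test . 2>&1 | builder_from_log
```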


Step 4: Optimize Dockerfile for Layer Caching

Purpose: Separate dependencies from source code to maximize cache reuse.

4.1 Current Dockerfile (Inefficient)

FROM node:18

WORKDIR /app
# ❌ Copies everything
COPY . .
# ❌ Re-runs on any file change
RUN npm install
RUN npm run build

CMD ["npm", "start"]

Problem: Because COPY . . comes before npm install, any code change invalidates the dependency layer and every layer after it.

4.2 Optimized Dockerfile (Basic)

FROM node:18

WORKDIR /app

# Copy only dependency files first
COPY package.json package-lock.json ./

# Install dependencies (cached unless package files change)
RUN npm ci

# Then copy source code
COPY . .

# Build application
RUN npm run build

CMD ["npm", "start"]

4.3 Test Improvement

# First build
time docker build -t myapp:test .

# Change source file
echo "// test" >> src/index.js

# Second build (should be faster)
time docker build -t myapp:test .

Expected Result: Second build skips npm ci layer.
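
Rather than reading the log by eye, the check can be scripted. The sketch below assumes BuildKit's plain progress format, in which a step header such as "#8 [4/6] RUN npm ci" is followed by "#8 CACHED" when the layer is reused; step_cached is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# step_cached PATTERN: read a plain-progress BuildKit log on stdin.
# Exit 0 if a step whose header contains PATTERN was reported CACHED,
# exit 1 otherwise.
step_cached() {
  awk -v pat="$1" '
    # Step header, e.g. "#8 [4/6] RUN npm ci": remember matching step ids.
    index($0, pat) && $2 ~ /^\[/ { wanted[$1] = 1 }
    # Cache report for a remembered step, e.g. "#8 CACHED".
    $2 == "CACHED" && ($1 in wanted) { found = 1 }
    END { exit (found ? 0 : 1) }
  '
}

# Example (requires Docker):
#   docker build --progress=plain -t myapp:test . 2>&1 | step_cached "RUN npm ci" \
#     && echo "npm ci was cached" || echo "npm ci re-ran"
```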


Step 5: Implement Multi-Stage Build

Purpose: Separate build dependencies from runtime, reducing final image size.

5.1 Create Multi-Stage Dockerfile

# syntax=docker/dockerfile:1.4

# ============================================
# Stage 1: Install production dependencies
# ============================================
FROM node:18-alpine AS deps

WORKDIR /app

COPY package.json package-lock.json ./

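# Use BuildKit cache mount for npm cache.
# --omit=dev replaces the deprecated --only=production flag.
# Note: the standalone output copied in Stage 3 bundles its own node_modules,
# so this stage is only needed if you copy /app/node_modules from it.
RUN --mount=type=cache,target=/root/.npm \
    npm ci --omit=dev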

# ============================================
# Stage 2: Build application
# ============================================
FROM node:18-alpine AS builder

WORKDIR /app

COPY package.json package-lock.json ./

# Install all dependencies (including devDependencies)
RUN --mount=type=cache,target=/root/.npm \
    npm ci

COPY . .

# Cache Next.js build cache
RUN --mount=type=cache,target=/app/.next/cache \
    npm run build

# ============================================
# Stage 3: Production runtime
# ============================================
FROM node:18-alpine AS runner

WORKDIR /app

ENV NODE_ENV=production

# Security: Run as non-root user
RUN addgroup --system --gid 1001 nodejs && \
    adduser --system --uid 1001 nextjs

# Copy only necessary files
COPY --from=builder /app/public ./public
COPY --from=builder --chown=nextjs:nodejs /app/.next/standalone ./
COPY --from=builder --chown=nextjs:nodejs /app/.next/static ./.next/static

USER nextjs

EXPOSE 3000

CMD ["node", "server.js"]

5.2 Configure Next.js for Standalone Output

Edit next.config.js:

/** @type {import('next').NextConfig} */
const nextConfig = {
  output: 'standalone',
  // ... other config
}

module.exports = nextConfig

5.3 Build and Compare

# Build new multi-stage image
docker build -t myapp:multi-stage .

# Compare sizes
docker images | grep myapp

Expected Results:

  • Old image: ~1200MB
  • New image: ~150MB (87% reduction)
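
The reduction figure is simple arithmetic: (old - new) / old * 100. A minimal sketch for recomputing it with your own numbers follows; pct_reduction is a hypothetical helper name, and both arguments must use the same unit.

```shell
#!/usr/bin/env bash
# pct_reduction BEFORE AFTER: print the percentage reduction from
# BEFORE to AFTER, to one decimal place.
pct_reduction() {
  awk -v before="$1" -v after="$2" \
    'BEGIN { printf "%.1f\n", (before - after) / before * 100 }'
}

# Example with the sizes above, in MB:
#   pct_reduction 1200 150
```

For the sizes above this prints 87.5, which the guide reports as 87%.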

Step 6: Configure BuildKit Cache Mounts

Purpose: Persist npm cache between builds, avoiding re-downloads.

6.1 Verify Cache Mount Syntax

The --mount=type=cache directive requires:

  • Docker BuildKit enabled
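  • A Dockerfile frontend that supports cache mounts. Pinning one with # syntax=docker/dockerfile:1.4 at the top of the Dockerfile guarantees this; recent Docker releases also support cache mounts in their built-in frontend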

6.2 Cache Mount Locations

# NPM cache (persists downloaded packages)
RUN --mount=type=cache,target=/root/.npm \
    npm ci

# Next.js build cache (reuses built pages)
RUN --mount=type=cache,target=/app/.next/cache \
    npm run build

6.3 Test Cache Effectiveness

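# First build (cold cache; prune also clears the cache mounts)
docker builder prune -a -f
time docker build -t myapp:test .

# Second build (no changes: every layer should be reported as CACHED)
time docker build -t myapp:test .

# Third build (source change: npm ci stays cached, the Next.js build reuses .next/cache)
echo "// test" >> src/index.js
time docker build -t myapp:test .
git restore src/index.js

Expected: Both rebuilds should be far faster than the cold build. In our project the unchanged rebuild took about 40 seconds and the source-change rebuild about 45 seconds, against an 8-minute baseline. When dependencies change, npm ci re-runs but takes packages from the mounted npm cache instead of downloading them again.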


Step 7: Implement Layer Caching in CI

Purpose: Persist Docker layers between CI runs.

7.1 GitHub Actions Configuration

Create .github/workflows/docker-build.yml:

name: Docker Build

on:
  pull_request:
  push:
    branches: [main]

jobs:
  build:
    runs-on: ubuntu-latest
    
    steps:
      - name: Checkout code
        uses: actions/checkout@v4

      - name: Set up Docker Buildx
        uses: docker/setup-buildx-action@v3

      - name: Cache Docker layers
        uses: actions/cache@v4
        with:
          path: /tmp/.buildx-cache
          key: ${{ runner.os }}-buildx-${{ github.sha }}
          restore-keys: |
            ${{ runner.os }}-buildx-

      - name: Build Docker image
        uses: docker/build-push-action@v5
        with:
          context: .
          push: false
          tags: myapp:${{ github.sha }}
          cache-from: type=local,src=/tmp/.buildx-cache
          cache-to: type=local,dest=/tmp/.buildx-cache-new,mode=max

      - name: Move cache
        run: |
          rm -rf /tmp/.buildx-cache
          mv /tmp/.buildx-cache-new /tmp/.buildx-cache

7.2 GitLab CI Configuration

Create .gitlab-ci.yml:

build:
  image: docker:24
  services:
    - docker:24-dind
  variables:
    DOCKER_BUILDKIT: 1
    DOCKER_DRIVER: overlay2
  cache:
    key: ${CI_COMMIT_REF_SLUG}
    paths:
      - .buildx-cache
  script:
    - docker buildx create --use
    - docker buildx build 
        --cache-from type=local,src=.buildx-cache
        --cache-to type=local,dest=.buildx-cache-new,mode=max
        -t myapp:${CI_COMMIT_SHA} .
    - rm -rf .buildx-cache
    - mv .buildx-cache-new .buildx-cache
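
Both CI configurations write the new cache to a separate directory and then swap it in, so that stale entries from earlier runs do not pile up in the cache directory. One weakness of a bare rm followed by mv is that a build which fails before exporting its cache leaves nothing to move, and the old cache is already gone. The sketch below guards against that; rotate_cache is a hypothetical helper, not part of either CI system.

```shell
#!/usr/bin/env bash
# rotate_cache OLD NEW: replace directory OLD with NEW, but only when
# NEW exists, so a failed build does not wipe the previous cache.
rotate_cache() {
  local old new
  old=$1
  new=$2
  if [ ! -d "$new" ]; then
    echo "no new cache at $new; keeping $old" >&2
    return 0
  fi
  rm -rf "$old"
  mv "$new" "$old"
}

# Example (GitLab paths from the configuration above):
#   rotate_cache .buildx-cache .buildx-cache-new
```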

Step 8: Optimize npm Install Process

8.1 Use npm ci Instead of npm install

# ❌ Don't use: npm install (slower, less predictable)
RUN npm install

# ✅ Use: npm ci (faster, uses lock file)
RUN npm ci

Why npm ci:

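  • Typically faster than npm install in automated environments (the exact speedup depends on the project)
  • Uses exact versions from the lock file, and fails if package.json and the lock file disagree
  • Removes node_modules first (clean install)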

8.2 Consider Package Manager Alternatives

Using pnpm (faster, better caching):

FROM node:18-alpine AS deps

# Install pnpm
RUN corepack enable && corepack prepare pnpm@latest --activate

WORKDIR /app

COPY package.json pnpm-lock.yaml ./

RUN --mount=type=cache,target=/root/.local/share/pnpm \
    pnpm install --frozen-lockfile --prod

Performance comparison (our project; your numbers will vary):

  • npm ci: ~120s
  • pnpm install: ~45s (about 62% less time)

Step 9: Implement Build Skipping

Purpose: Skip builds when image already exists for current git commit.

9.1 Tag by Git Commit

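#!/bin/bash
# build.sh
set -euo pipefail

GIT_COMMIT=$(git rev-parse --short HEAD)
# Use your full registry path here; a bare "myapp" resolves to Docker Hub
IMAGE_NAME="myapp:${GIT_COMMIT}"

# Check the registry for the tag without downloading the image
if docker manifest inspect "$IMAGE_NAME" >/dev/null 2>&1; then
    echo "Image ${IMAGE_NAME} already exists, skipping build"
    exit 0
fi

# Build and publish the new image
docker build -t "$IMAGE_NAME" .
docker push "$IMAGE_NAME"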

9.2 CI Integration

# GitHub Actions
- name: Check if image exists
  id: check
  run: |
    if docker manifest inspect myapp:${{ github.sha }} >/dev/null 2>&1; then
      echo "exists=true" >> "$GITHUB_OUTPUT"
    else
      echo "exists=false" >> "$GITHUB_OUTPUT"
    fi

# Push after building; otherwise the check above never finds the image
- name: Build if needed
  if: steps.check.outputs.exists == 'false'
  run: |
    docker build -t myapp:${{ github.sha }} .
    docker push myapp:${{ github.sha }}

Impact: In our pipeline this skipped roughly 30% of CI builds (re-runs, retries, and multiple jobs for the same commit).


Step 10: Monitor and Validate Results

10.1 Build Time Metrics

Track these metrics:

# Build time
time docker build -t myapp:test .

# Cached steps (each matching line is a step served from cache)
docker build -t myapp:test . 2>&1 | grep "CACHED"

# Final image size
docker images myapp:test --format "{{.Size}}"
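
To turn the CACHED lines into an actual hit rate, the cached steps can be counted against all build steps. The sketch below assumes BuildKit's plain progress format ("#N [stage x/y] ..." step headers, "#N CACHED" for reused layers) and skips BuildKit's own "[internal]" steps; cache_hit_rate is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# cache_hit_rate: read a plain-progress BuildKit log on stdin and print
# how many build steps were served from cache.
cache_hit_rate() {
  awk '
    # Count each step once by its id, ignoring BuildKit-internal steps.
    $2 ~ /^\[/ && $2 != "[internal]" && !($1 in seen) { seen[$1] = 1; total++ }
    # A counted step that was reused from cache.
    $2 == "CACHED" && ($1 in seen) { cached++ }
    END {
      if (total == 0) { print "0/0 steps cached"; exit }
      printf "%d/%d steps cached (%.0f%%)\n", cached, total, cached / total * 100
    }
  '
}

# Example (requires Docker):
#   docker build --progress=plain -t myapp:test . 2>&1 | cache_hit_rate
```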

10.2 Create Measurement Script

#!/bin/bash
# measure-build.sh

echo "=== Docker Build Performance Test ==="
echo "Date: $(date)"
echo ""

# Clean build
echo "Test 1: Clean build (no cache)"
docker builder prune -a -f
START=$(date +%s)
docker build -t myapp:clean .
END=$(date +%s)
CLEAN_TIME=$((END - START))
echo "Clean build time: ${CLEAN_TIME}s"
echo ""

# Cached build (no changes)
echo "Test 2: Cached build (no changes)"
START=$(date +%s)
docker build -t myapp:cached .
END=$(date +%s)
CACHED_TIME=$((END - START))
echo "Cached build time: ${CACHED_TIME}s"
echo ""

# Code change build
echo "Test 3: Build with code change"
echo "// test" >> src/index.js
START=$(date +%s)
docker build -t myapp:code-change .
END=$(date +%s)
CODE_CHANGE_TIME=$((END - START))
git restore src/index.js
echo "Code change build time: ${CODE_CHANGE_TIME}s"
echo ""

# Dependency change build
echo "Test 4: Build with dependency change"
npm install --save-dev lodash
START=$(date +%s)
docker build -t myapp:dep-change .
END=$(date +%s)
DEP_CHANGE_TIME=$((END - START))
git restore package.json package-lock.json
echo "Dependency change build time: ${DEP_CHANGE_TIME}s"
echo ""

# Image size
IMAGE_SIZE=$(docker images myapp:cached --format "{{.Size}}")
echo "Final image size: ${IMAGE_SIZE}"

# Summary
echo ""
echo "=== Summary ==="
echo "Clean build:         ${CLEAN_TIME}s"
echo "Cached build:        ${CACHED_TIME}s ($(echo "scale=1; 100 - ($CACHED_TIME * 100 / $CLEAN_TIME)" | bc)% faster)"
echo "Code change build:   ${CODE_CHANGE_TIME}s"
echo "Dep change build:    ${DEP_CHANGE_TIME}s"
echo "Image size:          ${IMAGE_SIZE}"

Run weekly to ensure performance doesn’t regress.


Results Summary

Before Optimization

Metric              Value
Clean build time    8min 15s
Cached build time   8min 10s (no effective caching)
Code change build   8min 5s
Image size          1.2GB
CI cost/month       $400

After Optimization

Metric              Value      Improvement
Clean build time    2min 30s   -70%
Cached build time   40s        -92%
Code change build   45s        -91%
Image size          150MB      -87%
CI cost/month       $150       -62%

Troubleshooting Guide

Issue 1: BuildKit not working

Symptom: --mount=type=cache causes errors

Solution:

# Verify Docker version (need 20.10+)
docker version

# Explicitly enable BuildKit
export DOCKER_BUILDKIT=1

# Use syntax directive in Dockerfile
# syntax=docker/dockerfile:1.4

Issue 2: Cache not persisting in CI

Symptom: Every CI build starts from scratch

Solution:

  • Verify cache action is configured correctly
  • Check cache size limits (GitHub Actions: 10GB limit)
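  • Ensure the cache key, or a restore-keys prefix, matches a key saved by an earlier run (the workflow in Step 7.1 keys on commit SHA and falls back to the OS-level buildx prefix); add the branch name to the key if branches should not share a cache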
  • Use cache-to: type=local,dest=...,mode=max (not mode=min)
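
For the size-limit check, the cache directory can be measured before it is saved. This is a rough sketch: cache_within_limit is a hypothetical helper, and the limit is whatever your CI provider enforces, passed in megabytes.

```shell
#!/usr/bin/env bash
# cache_within_limit DIR LIMIT_MB: print the size of DIR and exit 0 if
# it is at most LIMIT_MB megabytes, exit 1 otherwise.
cache_within_limit() {
  local dir limit_mb used_kb
  dir=$1
  limit_mb=$2
  used_kb=$(du -sk "$dir" | awk '{ print $1 }')   # size in kilobytes
  echo "$dir uses $((used_kb / 1024)) MB (limit ${limit_mb} MB)"
  [ "$used_kb" -le $((limit_mb * 1024)) ]
}

# Example (path from the GitHub Actions workflow, 10GB limit):
#   cache_within_limit /tmp/.buildx-cache 10240 || echo "cache too large to save"
```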

Issue 3: npm ci fails in Docker

Symptom: npm ci errors about missing package-lock.json

Solution:

# Ensure you're copying the lock file
COPY package.json package-lock.json ./

# Fail fast if the lock file did not make it into the image
# (for example, because .dockerignore excludes it)
RUN test -f package-lock.json || (echo "Lock file missing" && exit 1)
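
The same problem can be caught on the host before a build starts. The sketch below checks the two most common causes: the lock file is missing, or .dockerignore lists it. lockfile_preflight is a hypothetical helper, and its .dockerignore check only detects a literal package-lock.json line, not a glob that happens to match it.

```shell
#!/usr/bin/env bash
# lockfile_preflight [DIR]: exit 0 if DIR (default: current directory)
# contains package-lock.json and .dockerignore does not list it.
lockfile_preflight() {
  local dir
  dir=${1:-.}
  if [ ! -f "$dir/package-lock.json" ]; then
    echo "package-lock.json not found in $dir" >&2
    return 1
  fi
  if [ -f "$dir/.dockerignore" ] &&
     grep -qxF 'package-lock.json' "$dir/.dockerignore"; then
    echo "package-lock.json is excluded by .dockerignore" >&2
    return 1
  fi
  return 0
}

# Example:
#   lockfile_preflight . && docker build -t myapp:test .
```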

Issue 4: Large image size persists

Symptom: Multi-stage build doesn’t reduce size

Solution:

# Verify you're copying FROM correct stage
COPY --from=builder /app/.next/standalone ./

# Check Next.js config has standalone output
# next.config.js: output: 'standalone'

# Build and inspect
docker build -t myapp:test .
docker history myapp:test

Next Steps and Advanced Topics

  1. Registry proxy: Cache layers closer to CI runners
  2. pnpm: Switch from npm for better caching
  3. Turborepo: If monorepo, use Turborepo for intelligent caching
  4. Docker layer caching services: Consider services like Depot.dev

Monitoring in Production

# Grafana dashboard metrics
- docker_build_duration_seconds
- docker_build_cache_hits_total
- docker_build_cache_misses_total
- docker_image_size_bytes
- ci_cost_usd


Last Updated: 2026-01-06
Tested With: Docker 24.0, Next.js 14, Node 18
Maintenance: Re-test after major Docker or framework updates