Overview and What You Will Learn
A 2GB Docker image is not just an inconvenience; it is a tax you pay on every deployment. Every time a new EC2 instance spins up, it pulls 2GB over the network before running a single line of your code. At Hotstar during an IPL match, when 50 new instances spin up in 90 seconds, that is 100GB of data transfer, almost all of it avoidable. A well-optimised 80MB image cuts the same 50 pulls to about 4GB in total, so the instances can be ready in roughly 20 seconds instead of 10 minutes.
In this guide you will learn exactly what is inside your images, which techniques make the biggest difference, and how to measure the results of every change.
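The arithmetic behind those transfer figures can be sketched in a few lines of shell. The function name `fleet_transfer_mb` is made up for illustration, and the model assumes every new instance pulls the whole image with no layers already cached on the node, which is the worst case for a fresh instance.

```shell
# Hypothetical helper: total registry transfer for one scale-out event.
# Assumes each new instance pulls the full image (no cached layers).
# Usage: fleet_transfer_mb INSTANCES IMAGE_SIZE_MB
fleet_transfer_mb() {
  echo $(( $1 * $2 ))
}

before_mb="$(fleet_transfer_mb 50 2000)"   # 2GB image, counted as 2000MB
after_mb="$(fleet_transfer_mb 50 80)"      # 80MB image

echo "before: ${before_mb}MB, after: ${after_mb}MB"
echo "avoided: $(( before_mb - after_mb ))MB"
```

Swapping in your own instance count and image sizes gives a quick estimate of how much registry traffic a given optimisation would remove.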
Why This Matters in Production
Image size affects four things directly: deployment speed (pull time), storage cost (registry storage), security exposure (more packages = more CVEs), and disk I/O on the host (larger layers take longer to download, decompress, and unpack). Every MB you remove from your image is a permanent improvement to all four.
Core Principles
Factors that increase image size:

* Wrong base image (ubuntu vs alpine)
* Dev dependencies in production image
* Build tools left in final image
* Package manager caches not cleaned
* .git directory copied in
* node_modules copied instead of built
* Unnecessary files via COPY . .

Techniques that reduce image size:

* Minimal base image (alpine/distroless)
* Multi-stage builds (biggest impact)
* .dockerignore (exclude garbage)
* Clean caches in same RUN instruction
* Combine RUN commands to merge layers
* BuildKit cache mounts for pkg managers

Detailed Step-by-Step Practical Lab
Milestone 1: Measuring What Is Inside Your Image
You cannot optimise what you cannot measure. Always start by understanding where the bytes are.
    # See image sizes
    docker images
    # REPOSITORY    TAG         SIZE
    # payment-api   latest      1.82GB   <- too big
    # node          20          1.1GB
    # node          20-slim     220MB
    # node          20-alpine   55MB

    # See exactly what each layer added
    docker history payment-api:latest
    # IMAGE          CREATED BY                      SIZE
    # a84f9c2b1d3e   CMD ["node" "dist/server.js"]   0B
    # b72c8a9f4e1d   COPY . .                        890MB   <- source + node_modules!
    # c91d8b3f2a5e   RUN npm install                 650MB   <- dev dependencies!
    # ...            FROM node:20                    1.1GB   <- full Node base

    # Install dive, the best image layer analyser
    # https://github.com/wagoodman/dive
    brew install dive         # macOS
    sudo snap install dive    # Linux

    # Analyse image interactively
    dive payment-api:latest
    # Shows each layer, what files it added, and efficiency score
    # Look for: large files that should not be there, duplicate files, temp files

    # Quick size check per layer without dive
    docker history payment-api:latest \
      --format "{{.Size}}\t{{.CreatedBy}}" \
      --no-trunc | sort -rh | head -10
    # Shows top 10 largest layers sorted by size

Milestone 2: The Biggest Win - Multi-Stage Builds
The single biggest size reduction comes from multi-stage builds. Before/after comparison for a Node.js application:
    # BEFORE: single stage (everything in one image)
    FROM node:20
    WORKDIR /app
    COPY package.json package-lock.json ./
    # Installs ALL deps including dev tools
    RUN npm install
    COPY . .
    # Build tools now permanently in image
    RUN npm run build
    EXPOSE 8080
    CMD ["node", "dist/server.js"]
    # Result: ~1.8GB image
    # Contains: Node.js + ALL npm packages (dev + prod) + build tools + source

    # AFTER: multi-stage build
    FROM node:20-alpine AS builder
    WORKDIR /app
    COPY package.json package-lock.json ./
    # Install everything for building
    RUN npm ci
    COPY . .
    # Build output goes to dist/
    RUN npm run build

    FROM node:20-alpine AS production
    WORKDIR /app
    COPY package.json package-lock.json ./
    # Production deps only
    RUN npm ci --omit=dev
    # Just the compiled output (Dockerfile comments must sit on their own line;
    # a trailing comment on COPY would be parsed as extra arguments)
    COPY --from=builder /app/dist ./dist
    EXPOSE 8080
    CMD ["node", "dist/server.js"]
    # Result: ~180MB image
    # Contains: Node.js + production npm packages + compiled output only
    # 90% size reduction: build tools and dev deps not included

    # Measure the before/after
    docker build -t payment-api:before -f Dockerfile.before .
    docker build -t payment-api:after -f Dockerfile.after .
    docker images | grep payment-api
    # payment-api   before   1.82GB
    # payment-api   after    178MB

Milestone 3: Choosing the Right Base Image
    # Size comparison for Node.js base images
    docker pull node:20                               # 1.1GB - full Debian with build tools
    docker pull node:20-slim                          # 220MB - Debian without build tools
    docker pull node:20-alpine                        # 55MB - Alpine Linux
    docker pull gcr.io/distroless/nodejs20-debian12   # 120MB - no shell at all

    # CVE comparison (run trivy to count vulnerabilities)
    trivy image --severity CRITICAL,HIGH node:20 2>/dev/null | tail -5
    # Total: 48 (CRITICAL: 8, HIGH: 40)

    trivy image --severity CRITICAL,HIGH node:20-alpine 2>/dev/null | tail -5
    # Total: 0 (CRITICAL: 0, HIGH: 0)

    # Alpine has dramatically fewer CVEs
    # But: Alpine uses musl libc, not glibc
    # Some npm packages with native modules may not work on Alpine
    # Always test before switching to Alpine in production

    # Decision guide:
    # node:20-alpine = first choice for most Node.js apps
    # node:20-slim   = when Alpine breaks native modules (bcrypt, canvas, etc.)
    # node:20        = only in build stages, never in final production stage
    # distroless     = maximum security, no shell for debugging

Milestone 4: Cleaning Package Manager Caches
This is the most commonly missed optimisation. If you clean the cache in a separate RUN instruction, the cache bytes are still stored in the previous layer.
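To see why a later cleanup cannot shrink an earlier layer, here is a deliberately simplified model that runs without Docker. It treats each layer as a tar archive of the changes one instruction made and the image size as the sum of those archives. The directory names and the 1 MiB "cache" file are invented for the demonstration; the `.wh.` prefix mirrors how OCI layer archives record a deletion as a whiteout entry.

```shell
# Simplified model: one tar archive per layer, image size = sum of archives.
work="$(mktemp -d)"
size_of() { wc -c < "$1" | tr -d ' '; }

# Layer 1: stands in for "RUN apt-get install", leaving a 1 MiB cache behind.
mkdir -p "$work/l1/var/cache"
head -c 1048576 /dev/zero > "$work/l1/var/cache/pkg.bin"
tar -C "$work/l1" -cf "$work/layer1.tar" .

# Layer 2: a separate "RUN rm" only records a whiteout marker for the file.
mkdir -p "$work/l2/var/cache"
: > "$work/l2/var/cache/.wh.pkg.bin"
tar -C "$work/l2" -cf "$work/layer2.tar" .

# Single RUN: the cache is written and deleted before the layer is committed,
# so the committed layer never contains it.
mkdir -p "$work/l3/var/cache"
tar -C "$work/l3" -cf "$work/combined.tar" .

separate_total=$(( $(size_of "$work/layer1.tar") + $(size_of "$work/layer2.tar") ))
combined_total="$(size_of "$work/combined.tar")"

echo "install + cleanup in separate RUNs: ${separate_total} bytes"
echo "install + cleanup in one RUN:       ${combined_total} bytes"

rm -rf "$work"
```

The first total still carries the full megabyte because layer 1 is immutable once written; the second never stored it at all.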
    # WRONG: cache bytes still in the apt-get layer
    RUN apt-get update && apt-get install -y curl git
    # Too late: the previous layer already has the cache
    RUN rm -rf /var/lib/apt/lists/*

    # CORRECT: clean in the same RUN instruction
    RUN apt-get update && \
        apt-get install -y --no-install-recommends curl git && \
        rm -rf /var/lib/apt/lists/*
    # --no-install-recommends: skip optional packages (saves 50-200MB)
    # rm in same RUN: cache never stored in any layer

    # For Alpine (apk):
    RUN apk add --no-cache curl git
    # --no-cache: never writes cache to disk at all

    # For pip (Python):
    RUN pip install --no-cache-dir -r requirements.txt
    # --no-cache-dir: never writes pip cache

    # For npm:
    RUN npm ci --omit=dev && npm cache clean --force
    # Clean npm cache in the same instruction

Milestone 5: BuildKit Cache Mounts
BuildKit cache mounts keep the package manager cache between builds but exclude it from the final image — you get fast rebuilds without bloated images.
    # syntax=docker/dockerfile:1
    FROM node:20-alpine
    WORKDIR /app
    COPY package.json package-lock.json ./

    # Cache mount: /root/.npm is reused between builds but NOT in the image
    RUN --mount=type=cache,target=/root/.npm \
        npm ci --omit=dev

    COPY . .
    RUN --mount=type=cache,target=/root/.npm \
        npm run build

    # Build with BuildKit enabled (default in Docker 23+)
    DOCKER_BUILDKIT=1 docker build .

    # Or use docker buildx:
    docker buildx build .

    # First build: downloads packages (slow)
    # Second build with same packages: uses cache mount (fast, no network)
    # Image size: same as without cache mount (cache not in image)

For Python, the same pattern:
    # syntax=docker/dockerfile:1
    FROM python:3.11-slim
    WORKDIR /app
    COPY requirements.txt ./

    # Cache pip downloads between builds
    RUN --mount=type=cache,target=/root/.cache/pip \
        pip install -r requirements.txt

    COPY . .
    CMD ["python", "app.py"]

Milestone 6: Combining RUN Instructions
Each RUN creates a new layer. Combining multiple RUN instructions into one reduces total layer count and prevents intermediate files from adding to the image size.
    # WRONG: 3 layers, intermediate files persist
    RUN apt-get update
    RUN apt-get install -y wget
    RUN wget https://example.com/tool.tar.gz && \
        tar -xzf tool.tar.gz && \
        mv tool /usr/local/bin/

    # CORRECT: 1 layer, temp files cleaned up
    RUN apt-get update && \
        apt-get install -y --no-install-recommends wget && \
        wget https://example.com/tool.tar.gz && \
        tar -xzf tool.tar.gz && \
        mv tool /usr/local/bin/ && \
        rm tool.tar.gz && \
        apt-get remove -y wget && \
        apt-get autoremove -y && \
        rm -rf /var/lib/apt/lists/*
    # Downloads wget, uses it, removes it: wget not in final image

Milestone 7: Measuring Progress
    # Build and measure after each optimisation
    docker build -t myapp:v1 -f Dockerfile.v1 .
    docker build -t myapp:v2 -f Dockerfile.v2 .
    docker build -t myapp:v3 -f Dockerfile.v3 .

    docker images | grep myapp
    # myapp   v1   1.82GB   (single stage, wrong base)
    # myapp   v2   420MB    (multi-stage added)
    # myapp   v3   178MB    (alpine base + cache cleaning)

    # Track layer efficiency with dive
    dive myapp:v3
    # Image efficiency score: 97%   <- good
    # Wasted space: 5.2MB           <- mostly temp files

    # Check for files that should not be in the image
    docker run --rm myapp:v3 find / -name "*.log" 2>/dev/null
    docker run --rm myapp:v3 find / -name "*.test.js" 2>/dev/null
    # npm cache should not be here
    docker run --rm myapp:v3 ls -la /root/.npm 2>/dev/null

Common Mistakes
| Mistake | Size Impact | Fix |
|---|---|---|
| Using node:20 instead of node:20-alpine | +1GB | Switch base image; test Alpine compatibility first |
| Dev dependencies in production image | +200-500MB | Multi-stage build with npm ci --omit=dev in final stage |
| Cleaning apt cache in separate RUN | 0 reduction (cache still in prev layer) | Clean in same RUN instruction |
| COPY . . copies node_modules | +500MB | Add node_modules/ to .dockerignore |
| Build tools in final image | +100-300MB | Use multi-stage — build tools stay in builder stage only |
| Large assets not excluded | +varies | Add *.png, *.mp4, docs/ to .dockerignore if not needed at runtime |
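Several of the fixes in this table come down to a `.dockerignore` file in the build context. As a rough starting point, the sketch below writes one from shell. The helper name `write_dockerignore` and the pattern list are assumptions about a typical Node.js repository, not a definitive set; remove any pattern that matches files your app needs at runtime.

```shell
# Hypothetical helper: write a starter .dockerignore into a build context.
# Usage: write_dockerignore CONTEXT_DIR
write_dockerignore() {
  cat > "$1/.dockerignore" <<'EOF'
# Dependencies are installed inside the image by npm ci
node_modules/
# Version control metadata
.git/
.gitignore
# Logs, tests, and docs are not needed at runtime
**/*.log
**/*.test.js
docs/
# Large assets, only if the app does not serve them
**/*.png
**/*.mp4
EOF
}

# Demonstrate against a throwaway directory instead of a real project.
context_dir="$(mktemp -d)"
write_dockerignore "$context_dir"
cat "$context_dir/.dockerignore"
```

The `**/` prefix makes a pattern match at any depth; a bare `*.png` would only match files at the root of the build context.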
Troubleshooting Reference
| Problem | Diagnostic | Fix |
|---|---|---|
| Image still large after multi-stage | docker history image:tag | Find which layer is large; it is likely a COPY bringing in too much |
| Alpine build fails for native modules | Build error mentioning node-gyp or gyp | Switch to node:20-slim (glibc) or install build tools in the builder stage with apk add --no-cache python3 make g++ |
| Cache not working between builds | Every build re-downloads packages | Verify --mount=type=cache syntax and BuildKit is enabled |
| distroless image fails to start | exec format error | Verify entrypoint is a compiled binary, not a shell script (distroless has no shell) |
**Tip:** Run `docker history your-image --no-trunc --format "{{.Size}}\t{{.CreatedBy}}" | sort -rh | head -20` to see the 20 largest layers with the exact Dockerfile instruction that created each one. This tells you exactly where to focus your optimisation effort.
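Where `sort -h` is not available, the same ranking can be done by converting sizes to bytes first. The sketch below assumes tab-separated `SIZE<TAB>CREATED BY` input, as `docker history` emits with `--format "{{.Size}}\t{{.CreatedBy}}"`, and decimal units (B, kB, MB, GB). The function name `rank_layers` is hypothetical, and the sample rows are the illustrative figures from Milestone 1 rather than real output.

```shell
# Hypothetical helper: print the N largest layers (default 10).
# Reads "SIZE<TAB>CREATED BY" lines on stdin.
rank_layers() {
  awk -F '\t' '
    {
      size = $1
      num = size + 0                 # numeric prefix, e.g. 1.1 from "1.1GB"
      unit = size
      sub(/^[0-9.]+/, "", unit)      # what remains is the unit
      mult = 1
      if (unit == "kB") mult = 1000
      else if (unit == "MB") mult = 1000000
      else if (unit == "GB") mult = 1000000000
      printf "%.0f\t%s\t%s\n", num * mult, size, $2
    }
  ' | sort -t "$(printf '\t')" -k1,1nr | head -n "${1:-10}" | cut -f2-
}

# Sample input taken from the illustrative history listing.
printf '0B\tCMD ["node" "dist/server.js"]\n890MB\tCOPY . .\n650MB\tRUN npm install\n1.1GB\tFROM node:20\n' \
  | rank_layers 3
```

Against a real image, the input would come from `docker history your-image --no-trunc --format "{{.Size}}\t{{.CreatedBy}}" | rank_layers 20`.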
**Remember:** Image size optimisation has diminishing returns. Going from 2GB to 200MB is worth days of engineering effort. Going from 200MB to 180MB is not. Focus on the big wins first (wrong base image and dev dependencies in production images), then stop.
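One way to encode "then stop" is a size budget in CI: the build fails only when the image grows past an agreed ceiling, and nobody chases the last few megabytes. This is a minimal sketch; the function name `check_size_budget` and the 250MB budget are assumptions for illustration. The byte count would typically come from `docker image inspect --format '{{.Size}}' IMAGE`.

```shell
# Hypothetical CI guard: fail only when the image exceeds an agreed budget.
# Usage: check_size_budget SIZE_BYTES BUDGET_MB
check_size_budget() {
  size_mb=$(( $1 / 1000000 ))
  if [ "$size_mb" -gt "$2" ]; then
    echo "FAIL: image is ${size_mb}MB, budget is $2MB"
    return 1
  fi
  echo "OK: image is ${size_mb}MB, budget is $2MB"
}

# 178MB image against an assumed 250MB budget.
check_size_budget 178000000 250
```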
**Common Mistake:** Using `RUN apt-get clean` in a separate layer after installing packages. The cleaned files are NOT removed from the image; they are still in the previous layer. To keep them out of the image, run the cleanup in the exact same RUN instruction as the installation.
**Security:** Smaller images tend to have fewer CVEs. Alpine-based images often report few or no CRITICAL CVEs because they include far fewer OS packages than Debian or Ubuntu. Every package that is not in your image is a package that cannot have a vulnerability exploited. Use `trivy image your-image:tag` in your CI pipeline and fail builds on CRITICAL severity findings.
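A CI gate along those lines might look like the sketch below. The wrapper name `scan_or_fail` is hypothetical; `--severity` and `--exit-code` are trivy options that restrict the reported findings and set the exit status when any are found. The function is only defined here, not invoked, since it needs trivy and an image to scan.

```shell
# Hypothetical CI step: scan an image and fail the job on CRITICAL findings.
# --severity CRITICAL limits the report to CRITICAL findings.
# --exit-code 1 makes trivy exit non-zero when such findings exist
# (trivy also exits non-zero if the scan itself fails).
# Usage: scan_or_fail IMAGE
scan_or_fail() {
  if trivy image --severity CRITICAL --exit-code 1 "$1"; then
    echo "scan passed: $1"
  else
    echo "scan failed: $1 has CRITICAL findings or could not be scanned" >&2
    return 1
  fi
}
```

In a pipeline this would run after the build step, for example `scan_or_fail payment-api:latest`, with the job failing on a non-zero return.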