Running Docker Containers Securely — Non-Root Users and Capabilities | DevOps Network

Overview and What You Will Learn

By default, a Docker container runs its process as root unless the image or the docker run command specifies another user. If an attacker finds a vulnerability in your application and gains code execution inside the container, they hold root privileges inside the container and potentially a path to root on the host. For a platform such as Zerodha that processes trading orders, a compromised container running as root is not an application bug; it is a business-ending security incident.

In this guide you will learn how to run containers as non-root users, how to drop Linux capabilities to the minimum needed, and how to use read-only root filesystems to prevent malware from persisting inside containers.

Core Principles

◈ DIAGRAM

+------------------------------------------+
| Security Layers for Docker Containers    |
|                                          |
| Layer 1: Non-root USER instruction       |
|   Container process runs as UID 1001     |
|   Not root — limited damage if exploited |
|                                          |
| Layer 2: Capability dropping             |
|   --cap-drop ALL                         |
|   Add back only what you need            |
|   NET_BIND_SERVICE for port 80/443       |
|                                          |
| Layer 3: Read-only root filesystem       |
|   --read-only                            |
|   Malware cannot write to disk           |
|   Writable /tmp via tmpfs if needed      |
|                                          |
| Layer 4: seccomp profile                 |
|   Blocks 44 dangerous syscalls by default|
|   Custom profiles for high-security      |
+------------------------------------------+

Detailed Step-by-Step Practical Lab

Milestone 1: Running as Non-Root

Dockerfile

# WITHOUT USER instruction — runs as root (default)
FROM node:20-alpine
WORKDIR /app
COPY package.json package-lock.json ./
RUN npm ci --omit=dev
COPY dist/ ./dist/
CMD ["node", "dist/server.js"]
# Container process runs as UID 0 (root) — dangerous
 
# WITH USER instruction — runs as non-root
FROM node:20-alpine
WORKDIR /app
 
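# Create a non-root user and group with fixed IDs
# Without -u/-g, Alpine assigns system IDs from the 100+ range, not 1001
RUN addgroup -S -g 1001 appgroup && \
    adduser -S -u 1001 -G appgroup appuser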
 
COPY package.json package-lock.json ./
# Install as root (package installation often needs root)
RUN npm ci --omit=dev
 
COPY dist/ ./dist/
 
# Change ownership of app files to the non-root user
RUN chown -R appuser:appgroup /app
 
# Switch to non-root user before CMD
USER appuser
 
CMD ["node", "dist/server.js"]
# Container process runs as UID 1001 (appuser) — safe

Bash

# Verify the container is running as non-root
docker run --rm myapp whoami
# appuser   <- not root
 
docker run --rm myapp id
# uid=1001(appuser) gid=1001(appgroup)
 
# Override user at runtime if needed for debugging (Alpine images ship sh, not bash)
docker run --rm -it --user root myapp sh
 
# Check what user a running container is using
docker inspect payment-api \
  --format '{{.Config.User}}'
# appuser

The node:20-alpine image already includes a node user (UID 1000). You can use it directly:

Dockerfile

FROM node:20-alpine
WORKDIR /app
COPY --chown=node:node package.json package-lock.json ./
RUN npm ci --omit=dev
COPY --chown=node:node dist/ ./dist/
USER node
CMD ["node", "dist/server.js"]
# Uses the pre-existing node user — no need to create one
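
The application can also check its own identity at startup as a second line of defence, in case someone overrides the user at runtime. The sketch below is a hypothetical guard: the name `assertNotRoot` is illustrative and not part of Node.js or Docker. It throws when the process has UID 0.

```javascript
// Hypothetical startup guard: assertNotRoot is an illustrative name,
// not part of Node.js or Docker.
function assertNotRoot(getuid = process.getuid) {
  // process.getuid is not available on Windows; report that nothing was checked.
  if (typeof getuid !== "function") {
    return { checked: false, uid: null };
  }
  const uid = getuid.call(process);
  if (uid === 0) {
    throw new Error("Refusing to start: process is running as root (UID 0)");
  }
  return { checked: true, uid };
}

// Typical use at the top of dist/server.js:
//   assertNotRoot();
```

The `getuid` parameter exists only so the check can be exercised without actually running as root; in normal use it is called with no arguments.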

Milestone 2: Linux Capabilities

Linux capabilities split root privileges into individual, grantable permissions. By default Docker grants a container a restricted set of 14 capabilities (including CHOWN, SETUID, and NET_BIND_SERVICE) and drops the rest. You can go further with --cap-drop ALL:

Bash

# Run with all capabilities dropped
docker run -d \
  --cap-drop ALL \
  --name payment-api \
  registry.razorpay.in/payment-api:v3.1.0
 
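# Add back only what the application specifically needs
docker run -d \
  --cap-drop ALL \
  --cap-add NET_BIND_SERVICE \
  --cap-add CHOWN \
  --cap-add SETGID \
  --cap-add SETUID \
  --name nginx \
  nginx:1.25
# NET_BIND_SERVICE allows binding to ports below 1024 (80, 443)
# The official nginx image starts as root and typically also needs CHOWN, SETGID,
# and SETUID to prepare its directories and switch workers to the nginx user
# If the container exits at startup, check docker logs for the missing capability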
 
# Common capabilities and what they allow:
# NET_BIND_SERVICE  - bind to ports < 1024
# SYS_PTRACE        - debug processes (strace, gdb) — do NOT add to production
# SYS_ADMIN         - many admin operations — almost never needed, very dangerous
# CHOWN             - change file ownership — often needed for setup scripts
# DAC_OVERRIDE      - bypass file permission checks — avoid in production
 
# Check what capabilities a running container has
docker inspect payment-api \
  --format '{{.HostConfig.CapAdd}} / {{.HostConfig.CapDrop}}'
# [] / [ALL]   <- dropped all, added none
 
# View capabilities inside the container
docker exec payment-api cat /proc/1/status | grep CapEff
# CapEff: 0000000000000000   <- zero capabilities (all dropped)
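
The CapEff value is a hexadecimal bitmask in which each bit stands for one capability. The following sketch decodes such a mask into names, using the bit numbers from the Linux capability header. The function name `decodeCapabilities` is hypothetical, and the list covers capabilities up to CHECKPOINT_RESTORE (bit 40); newer kernels may define more.

```javascript
// Capability names indexed by bit number, as defined in linux/capability.h.
const CAPABILITY_NAMES = [
  "CHOWN", "DAC_OVERRIDE", "DAC_READ_SEARCH", "FOWNER", "FSETID", "KILL",
  "SETGID", "SETUID", "SETPCAP", "LINUX_IMMUTABLE", "NET_BIND_SERVICE",
  "NET_BROADCAST", "NET_ADMIN", "NET_RAW", "IPC_LOCK", "IPC_OWNER",
  "SYS_MODULE", "SYS_RAWIO", "SYS_CHROOT", "SYS_PTRACE", "SYS_PACCT",
  "SYS_ADMIN", "SYS_BOOT", "SYS_NICE", "SYS_RESOURCE", "SYS_TIME",
  "SYS_TTY_CONFIG", "MKNOD", "LEASE", "AUDIT_WRITE", "AUDIT_CONTROL",
  "SETFCAP", "MAC_OVERRIDE", "MAC_ADMIN", "SYSLOG", "WAKE_ALARM",
  "BLOCK_SUSPEND", "AUDIT_READ", "PERFMON", "BPF", "CHECKPOINT_RESTORE",
];

// Hypothetical helper: turns a CapEff/CapPrm/CapBnd hex string into capability names.
function decodeCapabilities(hexMask) {
  // BigInt is used because the mask is 64 bits wide.
  const mask = BigInt("0x" + hexMask.trim());
  return CAPABILITY_NAMES.filter(
    (_, bit) => ((mask >> BigInt(bit)) & 1n) === 1n
  );
}

console.log(decodeCapabilities("0000000000000400"));
```

A mask of all zeros decodes to an empty list, which is what a container started with --cap-drop ALL and no --cap-add should report.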

Milestone 3: Read-Only Root Filesystem

Bash

# Run with read-only root filesystem
docker run -d \
  --read-only \
  --name payment-api \
  registry.razorpay.in/payment-api:v3.1.0
 
# If the app tries to write to the root filesystem:
# Error: EROFS: read-only file system
 
# Add tmpfs for directories that need to be writable
docker run -d \
  --read-only \
  --tmpfs /tmp \
  --tmpfs /run \
  --name payment-api \
  registry.razorpay.in/payment-api:v3.1.0
# /tmp and /run are in-memory and writable
# Everything else is read-only
 
# In Docker Compose:
services:
  api:
    read_only: true
    tmpfs:
      - /tmp
      - /run
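
To confirm which paths are actually writable once --read-only and --tmpfs are in place, the application can probe them at startup. This is a minimal sketch, assuming a hypothetical helper named `probeWritable`; it tries to create and delete a small file in each directory and reports the error code when that fails.

```javascript
const fs = require("node:fs");
const path = require("node:path");

// Hypothetical helper: tries to create and delete a small file in each
// directory and reports whether the write succeeded.
function probeWritable(dirs) {
  return dirs.map((dir) => {
    const probe = path.join(dir, `.write-probe-${process.pid}`);
    try {
      fs.writeFileSync(probe, "");
      fs.unlinkSync(probe);
      return { dir, writable: true, code: null };
    } catch (err) {
      // EROFS means read-only filesystem; EACCES means permission denied.
      return { dir, writable: false, code: err.code };
    }
  });
}

// Example: expect /tmp to be writable and /app to be read-only.
//   console.log(probeWritable(["/tmp", "/app"]));
```

Logging the result once at startup makes a missing tmpfs mount visible immediately instead of at the first failed write.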

Milestone 4: The --no-new-privileges Flag

Bash

# Prevent privilege escalation inside the container
docker run -d \
  --security-opt no-new-privileges=true \
  payment-api
 
# This prevents:
# - setuid binaries from elevating privileges
# - sudo from working inside the container
# - any mechanism for gaining more privileges than the container started with
# Essential when running untrusted code inside containers

Complete Hardened docker run Command

Bash

# Flags, in order:
#   --user                  non-root user
#   --read-only             read-only root filesystem
#   --tmpfs                 writable /tmp, no exec
#   --cap-drop / --cap-add  drop all capabilities, add back only what's needed
#   --security-opt          no privilege escalation, syscall filter
#   --memory / --cpus       memory and CPU limits
# Comments sit above the command: a comment after a trailing backslash breaks the continuation
docker run -d \
  --name payment-api \
  --user 1001:1001 \
  --read-only \
  --tmpfs /tmp:rw,noexec,nosuid,size=64m \
  --cap-drop ALL \
  --cap-add NET_BIND_SERVICE \
  --security-opt no-new-privileges=true \
  --security-opt seccomp=/etc/docker/seccomp.json \
  --memory=512m \
  --cpus=1.5 \
  registry.razorpay.in/payment-api:v3.1.0
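
When the command is assembled by a deployment script rather than typed by hand, building the argument list in code keeps each flag next to its explanation and avoids shell quoting problems. The sketch below assumes a hypothetical helper named `buildHardenedRunArgs`; the flags themselves are the ones used in this guide.

```javascript
// Hypothetical helper: builds the argument list for `docker run` from an
// options object. Option names are illustrative.
function buildHardenedRunArgs({
  name,
  image,
  user = "1001:1001",
  capAdd = [],
  seccompProfile = null,
  memory = "512m",
  cpus = "1.5",
}) {
  const args = [
    "run", "-d",
    "--name", name,
    "--user", user,                              // non-root UID:GID
    "--read-only",                               // read-only root filesystem
    "--tmpfs", "/tmp:rw,noexec,nosuid,size=64m", // writable, non-executable /tmp
    "--cap-drop", "ALL",                         // drop every capability
  ];
  for (const cap of capAdd) {
    args.push("--cap-add", cap);                 // add back only what is needed
  }
  args.push("--security-opt", "no-new-privileges=true");
  if (seccompProfile) {
    args.push("--security-opt", `seccomp=${seccompProfile}`);
  }
  args.push(`--memory=${memory}`, `--cpus=${cpus}`, image);
  return args;
}

console.log(buildHardenedRunArgs({
  name: "payment-api",
  image: "registry.razorpay.in/payment-api:v3.1.0",
  capAdd: ["NET_BIND_SERVICE"],
}).join(" "));
```

The returned array can be passed to `child_process.execFileSync("docker", args)`, which does not go through a shell.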

Common Mistakes

Mistake	Risk	Fix
No USER instruction	Root compromise = host compromise risk	Always add `USER` before `CMD`
Using `--privileged`	Disables all container isolation	Never use — find the specific capability you need
Not setting `--cap-drop ALL`	Container keeps Docker's 14 default capabilities, several of them powerful	Start with drop all, add back specific needs
Read-write filesystem without need	Malware can persist files, install tools	Use `--read-only` + `--tmpfs /tmp`
Switching user inside CMD or an entrypoint script instead of with `USER`	Container starts as root and stays root if the switch is skipped or fails	Set `USER` in the Dockerfile before `CMD`

COMMON MISTAKE / WARNING
**Security:** Never use `--privileged` in production containers. A privileged container has access to all host devices, can load kernel modules, and can modify network interfaces. It is effectively the same as running a process directly on the host with root access. If you think you need `--privileged`, investigate whether adding specific `--cap-add` capabilities satisfies the requirement instead.

Managing Docker Secrets — BuildKit Secrets, Runtime Secrets, and Vault

Overview and What You Will Learn

The most common Docker security mistake is putting secrets — API keys, database passwords, private SSH keys — into Dockerfiles as ENV instructions or copying them into the image. These secrets end up permanently baked into image layers where anyone who pulls the image can read them with docker history.

This guide teaches you the correct way to handle secrets at both build time and runtime.

Core Principles

◈ DIAGRAM

+------------------------------------------+
| Where secrets MUST NOT go:               |
|                                          |
| ENV DB_PASSWORD=secret123   <- in layer  |
| COPY .env .                 <- in layer  |
| ARG API_KEY=secret123       <- in history|
|                                          |
| 'docker history image' shows ENV and ARG |
| values; 'docker inspect container' shows |
| the ENV values of a container            |
+------------------------------------------+
 
+------------------------------------------+
| Where secrets SHOULD go:                 |
|                                          |
| Build time: BuildKit secret mounts       |
|   Not stored in any layer                |
|                                          |
| Runtime: environment variables           |
|   Passed at docker run time from         |
|   a secure secret store                  |
|   (AWS Secrets Manager, Vault)           |
+------------------------------------------+

Detailed Step-by-Step Practical Lab

Milestone 1: Why ENV is Dangerous for Secrets

Bash

# This Dockerfile looks innocent
FROM node:20-alpine
ENV GITHUB_TOKEN=ghp_secret_token_here
RUN npm install --registry https://npm.pkg.github.com
 
# The token is now permanently in the image
docker history my-image
# IMAGE     CREATED BY                                  SIZE
# a84...    ENV GITHUB_TOKEN=ghp_secret_token_here      0B
# <- Token visible in plain text to anyone who runs this command
 
# Even if you unset it in a later layer:
ENV GITHUB_TOKEN=
# The previous layer STILL has the token
# Layers are immutable — you cannot overwrite history
 
# Anyone who pulls your image can read all secrets:
docker run --rm my-image env | grep TOKEN
# Or even simpler:
docker history my-image --no-trunc
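
A simple lint step can catch this mistake before an image is ever built. The sketch below is a hypothetical check, not an existing tool: it scans Dockerfile text for ENV and ARG instructions whose name looks sensitive and that assign a literal value. The name pattern is an assumption and will need tuning for a real codebase.

```javascript
// Hypothetical lint: flags ENV and ARG instructions whose name looks sensitive
// and that assign a literal value. The name pattern is an assumption.
const SENSITIVE_NAME = /(TOKEN|SECRET|PASSWORD|PASSWD|API_?KEY|PRIVATE_?KEY)/i;
const ENV_OR_ARG = /^\s*(ENV|ARG)\s+([A-Za-z_][A-Za-z0-9_]*)(?:=|\s+)(.*)$/i;

function findHardcodedSecrets(dockerfileText) {
  const findings = [];
  dockerfileText.split(/\r?\n/).forEach((line, index) => {
    const match = ENV_OR_ARG.exec(line);
    if (!match) return;
    const [, instruction, name, value] = match;
    // An empty value (ENV NAME=) is not a leak by itself, so it is skipped.
    if (SENSITIVE_NAME.test(name) && value.trim() !== "") {
      findings.push({
        line: index + 1,
        instruction: instruction.toUpperCase(),
        name,
      });
    }
  });
  return findings;
}

// Example input mirroring the Dockerfile in this milestone.
const exampleDockerfile = [
  "FROM node:20-alpine",
  "ENV GITHUB_TOKEN=ghp_secret_token_here",
].join("\n");
console.log(findHardcodedSecrets(exampleDockerfile));
```

The findings deliberately omit the value so that the lint output does not itself leak the secret into CI logs.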

Milestone 2: BuildKit Secret Mounts (Build-Time Secrets)

BuildKit secret mounts inject secrets during build without storing them in any layer:

Dockerfile

# syntax=docker/dockerfile:1
 
FROM node:20-alpine
WORKDIR /app
 
COPY package.json ./
 
# Secret is available during this RUN command only
# NOT stored in the resulting image layer
RUN --mount=type=secret,id=github_token \
    GITHUB_TOKEN=$(cat /run/secrets/github_token) \
    npm install --registry https://npm.pkg.github.com
 
# After this RUN completes, the secret is gone
# No trace in the image, no trace in docker history
 
COPY . .
CMD ["node", "server.js"]

Bash

# Pass the secret at build time
docker buildx build \
  --secret id=github_token,src=$HOME/.secrets/github-token \
  -t my-image:latest .
 
# Or use environment variable
export GITHUB_TOKEN=ghp_secret_token_here
docker buildx build \
  --secret id=github_token,env=GITHUB_TOKEN \
  -t my-image:latest .
 
# Verify secret is not in image
docker history my-image:latest
# No ENV GITHUB_TOKEN line anywhere — secret was never stored

Milestone 3: Runtime Secrets

For secrets the running application needs, pass them at docker run time from a secure source — never hardcode in docker-compose.yml:

Bash

# WRONG — secret in compose file (in version control)
services:
  api:
    environment:
      DB_PASSWORD: supersecret123  # Committed to git — disaster
 
# CORRECT — reference from environment
services:
  api:
    environment:
      DB_PASSWORD: ${DB_PASSWORD}  # From .env file or shell env
    # .env file is in .gitignore — never committed
 
# EVEN BETTER — from AWS Secrets Manager at runtime
# In your application startup code:
# const secret = await secretsManager.getSecretValue({SecretId: 'prod/db/password'})
# process.env.DB_PASSWORD = JSON.parse(secret.SecretString).password

Milestone 4: Docker Compose Secrets

Docker Compose has a native secrets mechanism:

YAML

version: "3.8"
 
services:
  api:
    image: myapp:latest
    secrets:
      - db_password
      - api_key
    # Secrets are mounted at /run/secrets/secret-name inside the container
 
secrets:
  db_password:
    file: ./secrets/db_password.txt   # Local file (dev only)
  api_key:
    external: true                    # Created outside Compose, e.g. with docker secret create (needs Swarm mode)
    name: prod_api_key

Bash

# In the container, read the secret from the file
cat /run/secrets/db_password
# supersecret123
 
# Application code reads from file, not environment variable:
const dbPassword = fs.readFileSync('/run/secrets/db_password', 'utf8').trim()
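
The one-line read above can be wrapped in a small helper that gives a clearer error when a secret is missing. This is a minimal sketch; `readSecret` and its `secretsDir` option are hypothetical names, and the option exists only so the function can be tried outside a container.

```javascript
const fs = require("node:fs");
const path = require("node:path");

// Hypothetical helper: reads a secret mounted as a file and trims the
// trailing newline that most editors and echo add.
function readSecret(name, { secretsDir = "/run/secrets" } = {}) {
  // Reject names such as "../etc/passwd" that would escape the directory.
  if (name !== path.basename(name)) {
    throw new Error(`Invalid secret name: ${name}`);
  }
  try {
    return fs.readFileSync(path.join(secretsDir, name), "utf8").trim();
  } catch (err) {
    if (err.code === "ENOENT") {
      throw new Error(`Secret "${name}" is not mounted in ${secretsDir}`);
    }
    throw err;
  }
}

// Typical use inside the container:
//   const dbPassword = readSecret("db_password");
```

Failing fast at startup with the secret's name in the message is usually easier to debug than a later authentication error from the database.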

Milestone 5: HashiCorp Vault Integration

For production workloads at the scale of Razorpay or PhonePe, secrets typically come from a dedicated secret manager such as Vault:

YAML

# docker-compose.yml with Vault agent sidecar
services:
  vault-agent:
    image: hashicorp/vault:1.15
    command: ["vault", "agent", "-config=/vault/config/agent.hcl"]
    volumes:
      - vault-config:/vault/config
      - shared-secrets:/run/secrets
 
  api:
    image: registry.razorpay.in/payment-api:v3.1.0
    volumes:
      - shared-secrets:/run/secrets:ro  # Read-only access to Vault-populated secrets
    depends_on:
      - vault-agent
 
volumes:
  shared-secrets:
  vault-config:

Common Mistakes

Mistake	Risk	Fix
`ENV SECRET=value` in Dockerfile	Permanent in image history	Use BuildKit `--mount=type=secret`
Copying `.env` file into image	All secrets in image	Add `.env` to `.dockerignore`
Secrets in docker-compose.yml	Committed to git	Use `${VAR}` from .env or external secret manager
Using ARG for secrets	Visible in `docker history`	Use BuildKit secrets instead of ARG
Logging environment variables	Secrets in log files	Never log `process.env` dump in production

COMMON MISTAKE / WARNING
**Security:** If you have accidentally put a secret in a Docker image that was pushed to a registry, rotate the secret immediately. Removing the image from the registry does not protect you — anyone who pulled the image before you removed it still has the secret in their local Docker cache. Rotate first, investigate second.

Docker Production Logging — Drivers, Aggregation, and Best Practices

Overview and What You Will Learn

Docker captures everything your container writes to stdout and stderr. By default it stores logs in JSON files on the host at /var/lib/docker/containers/*/. Without log rotation configured, these files grow indefinitely until the host disk fills up — which is one of the most common production Docker failures.

This guide covers how to configure log rotation, how to ship logs to centralised aggregation systems, and the logging patterns that work correctly in Docker.

Core Principles

TEXT

12-factor app logging rule:
  Application writes to stdout/stderr ONLY
  Docker captures stdout/stderr
  Log driver routes to destination
 
Do NOT:
  Write logs to files inside the container
  Manage log rotation inside the application
  Use a logging library that writes to disk
 
DO:
  console.log() in Node.js
  print() in Python
  fmt.Println() in Go
  Docker captures it, routes it via log driver

Docker Logging Drivers

Bash

# Check current default log driver
docker info | grep "Logging Driver"
# Logging Driver: json-file   <- default
 
# Available log drivers:
# json-file    Default. Writes to host filesystem. Must configure rotation.
# local        Efficient local storage with compression. Docker 18.09+
# syslog       Send to syslog daemon on host
# journald     Send to systemd journal on host
# fluentd      Send to Fluentd aggregator
# awslogs      Send to AWS CloudWatch Logs
# gelf         Send to Graylog Extended Log Format server
# splunk       Send to Splunk
# none         Disable all logging (docker logs will not work)

Configuring Log Rotation (Do This Immediately)

Add the following to /etc/docker/daemon.json to configure rotation for all containers. Each log file rotates when it reaches 100MB and at most 3 files are kept, so a container uses at most about 300MB, which is reasonable for most workloads. JSON does not allow comments, so the file must not contain any.

JSON

{
  "log-driver": "json-file",
  "log-opts": {
    "max-size": "100m",
    "max-file": "3"
  }
}

Bash

# Restart the daemon to apply the new defaults
# Only containers created after the restart use them; recreate existing containers
sudo systemctl restart docker
 
# Per-container override (overrides daemon defaults)
docker run -d \
  --name payment-api \
  --log-driver json-file \
  --log-opt max-size=200m \
  --log-opt max-file=5 \
  registry.razorpay.in/payment-api:v3.1.0
 
# In Docker Compose
services:
  api:
    logging:
      driver: json-file
      options:
        max-size: "100m"
        max-file: "3"

Shipping Logs to AWS CloudWatch

Bash

# The awslogs driver is built into Docker; the daemon needs AWS credentials with CloudWatch Logs permissions
docker run -d \
  --name payment-api \
  --log-driver awslogs \
  --log-opt awslogs-region=ap-south-1 \
  --log-opt awslogs-group=/production/payment-api \
  --log-opt awslogs-stream=payment-api-$(hostname) \
  registry.razorpay.in/payment-api:v3.1.0
 
# In Docker Compose
services:
  api:
    logging:
      driver: awslogs
      options:
        awslogs-region: ap-south-1
        awslogs-group: /production/payment-api
        awslogs-stream: payment-api-1

Shipping Logs to Fluentd / ELK Stack

YAML

# docker-compose.yml with Fluentd log shipping
version: "3.8"
 
services:
  payment-api:
    image: registry.razorpay.in/payment-api:v3.1.0
    logging:
      driver: fluentd
      options:
        fluentd-address: localhost:24224  # Resolved by the Docker daemon on the host, so use the published port
        tag: payment-api.{{.Name}}
 
  fluentd:
    image: fluent/fluentd:v1.16
    volumes:
      - ./fluentd/fluent.conf:/fluentd/etc/fluent.conf
    ports:
      - "24224:24224"

Structured Logging — JSON for Production

JAVASCRIPT

// BAD — unstructured text logs
console.log("Payment processed for user " + userId + " amount: " + amount)
// 2024-01-15T09:00:00Z Payment processed for user 12345 amount: 999
// Hard to search, hard to filter, hard to alert on
 
// GOOD — structured JSON logs
const logger = {
  info: (msg, data) => console.log(JSON.stringify({
    level: "info",
    message: msg,
    timestamp: new Date().toISOString(),
    service: "payment-api",
    ...data
  }))
}
 
logger.info("Payment processed", {
  userId: 12345,
  amount: 999,
  currency: "INR",
  transactionId: "txn_abc123",
  durationMs: 45
})
// {"level":"info","message":"Payment processed","timestamp":"...","userId":12345,...}
// Easy to filter: jq 'select(.userId == 12345)'
// Easy to alert: CloudWatch metric filter on level=error
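
The logger above can be extended with a minimum level and with redaction, so that a field such as a token never reaches the log stream. The sketch below is a hypothetical extension: `createLogger`, the level numbers, and the redaction pattern are all assumptions chosen for illustration.

```javascript
// Hypothetical extension of the structured logger: adds a minimum level and
// redacts fields whose names look sensitive.
const LEVELS = { debug: 10, info: 20, warn: 30, error: 40 };
const REDACTED_FIELD = /(password|secret|token|authorization)/i;

function createLogger({ service, minLevel = "info", write = console.log }) {
  const log = (level, message, data = {}) => {
    // Drop entries below the configured minimum level.
    if (LEVELS[level] < LEVELS[minLevel]) return;
    const safe = {};
    for (const [key, value] of Object.entries(data)) {
      safe[key] = REDACTED_FIELD.test(key) ? "[REDACTED]" : value;
    }
    write(JSON.stringify({
      level,
      message,
      timestamp: new Date().toISOString(),
      service,
      ...safe,
    }));
  };
  return {
    debug: (message, data) => log("debug", message, data),
    info: (message, data) => log("info", message, data),
    warn: (message, data) => log("warn", message, data),
    error: (message, data) => log("error", message, data),
  };
}

const logger = createLogger({ service: "payment-api" });
logger.info("Payment processed", { userId: 12345, cardToken: "tok_abc123" });
```

The `write` parameter defaults to `console.log`, which keeps output on stdout as the 12-factor rule requires, and can be swapped out when the logger needs to be exercised in isolation.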

Common Mistakes

Mistake	Consequence	Fix
No log rotation configured	Disk fills up in days/weeks on busy services	Set `max-size` and `max-file` in daemon.json immediately
Writing logs to files inside container	Logs lost when container removed, disk fills in container layer	Write only to stdout/stderr
Using `none` driver	`docker logs` never works	Only use `none` for batch jobs where logs are not needed
Unstructured text logs	Cannot search or alert on specific fields	Use JSON structured logging
Different log formats per service	Cannot correlate logs across services	Standardise log format across all services

PLACEMENT PRO TIP
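**Tip:** On Docker Engine versions before 20.10, setting `--log-driver awslogs` means `docker logs container-name` stops working, because logs go directly to CloudWatch and Docker keeps no local copy. From 20.10 onwards the daemon keeps a local cache for remote drivers (dual logging), so `docker logs` still works unless that cache is disabled. If you need a guaranteed local copy for debugging, use `local` or `json-file` as the primary driver and ship logs with a Fluentd sidecar instead.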

REMEMBER THIS
**Remember:** Configure log rotation in `/etc/docker/daemon.json` before running any production workloads. With default settings the json-file driver never rotates, so a single busy API container writing 100MB of logs per day accumulates roughly 3GB in 30 days and keeps growing until the host disk is full. This is one of the most preventable production outages.

Docker in CI/CD Pipelines — Build, Scan, Push, and Deploy

Overview and What You Will Learn

A complete Docker CI/CD pipeline does four things: builds the image, scans it for vulnerabilities, pushes it to a registry, and deploys it. Every step is a gate: if scanning finds CRITICAL or HIGH CVEs, nothing is pushed, and if pushing fails, nothing is deployed. In this guide you will build this complete pipeline in GitHub Actions.

Core Principles

◈ DIAGRAM

Code push to main branch
           |
           v
+------------------------------------------+
| Stage 1: Build                           |
| docker buildx build with cache           |
| Tag with git SHA                         |
| Load into local daemon for scanning      |
+------------------------------------------+
           |
           | (fails if build error)
           v
+------------------------------------------+
| Stage 2: Scan                            |
| trivy image with CRITICAL,HIGH           |
| Fail pipeline if vulnerabilities found   |
| Upload SARIF to GitHub Security tab      |
+------------------------------------------+
           |
           | (fails if CVEs found)
           v
+------------------------------------------+
| Stage 3: Push                            |
| Authenticate to ECR with OIDC            |
| Push with SHA tag and latest tag         |
+------------------------------------------+
           |
           | (only on main branch)
           v
+------------------------------------------+
| Stage 4: Deploy                          |
| Update ECS task definition               |
| Or kubectl set image for Kubernetes      |
+------------------------------------------+

Complete GitHub Actions Pipeline

YAML

# .github/workflows/docker-pipeline.yml
name: Docker CI/CD Pipeline
 
on:
  push:
    branches: [main]
  pull_request:
    branches: [main]
 
env:
  AWS_REGION: ap-south-1
  ECR_REPOSITORY: payment-api
  IMAGE_TAG: ${{ github.sha }}
 
jobs:
  build-scan-push:
    runs-on: ubuntu-latest
    permissions:
      contents: read
      id-token: write          # Required for OIDC AWS authentication
      security-events: write   # Required for GitHub Security tab
 
    outputs:
      image-uri: ${{ steps.push.outputs.image-uri }}
 
    steps:
      - name: Checkout
        uses: actions/checkout@v4
 
      - name: Set up Docker Buildx
        uses: docker/setup-buildx-action@v3
 
      # OIDC authentication — no long-lived AWS keys in GitHub Secrets
      - name: Configure AWS credentials
        uses: aws-actions/configure-aws-credentials@v4
        with:
          role-to-assume: arn:aws:iam::905418385260:role/github-actions-ecr
          aws-region: ${{ env.AWS_REGION }}
 
      - name: Login to Amazon ECR
        id: login-ecr
        uses: aws-actions/amazon-ecr-login@v2
 
      # Build with BuildKit cache
      - name: Build Docker image
        uses: docker/build-push-action@v5
        with:
          context: .
          push: false              # Do not push yet — scan first
          load: true               # Load into local daemon for scanning
          tags: ${{ env.ECR_REPOSITORY }}:${{ env.IMAGE_TAG }}
          cache-from: type=gha
          cache-to: type=gha,mode=max
          build-args: |
            BUILD_VERSION=${{ env.IMAGE_TAG }}
            BUILD_DATE=${{ github.event.head_commit.timestamp }}
      # Scan BEFORE pushing — never push vulnerable images
      - name: Scan image with Trivy
        uses: aquasecurity/trivy-action@master   # Pin to a release tag or commit SHA in real pipelines
        with:
          image-ref: ${{ env.ECR_REPOSITORY }}:${{ env.IMAGE_TAG }}
          format: sarif
          output: trivy-results.sarif
          severity: CRITICAL,HIGH
          exit-code: 1             # Fail pipeline on findings
          ignore-unfixed: true
 
      - name: Upload Trivy results to GitHub Security
        uses: github/codeql-action/upload-sarif@v3
        if: always()
        with:
          sarif_file: trivy-results.sarif
 
      # Push only if scan passed and we are on main
      - name: Push to ECR
        id: push
        if: github.ref == 'refs/heads/main'
        run: |
          ECR_REGISTRY=${{ steps.login-ecr.outputs.registry }}
          FULL_IMAGE=$ECR_REGISTRY/$ECR_REPOSITORY
          docker tag $ECR_REPOSITORY:$IMAGE_TAG $FULL_IMAGE:$IMAGE_TAG
          docker tag $ECR_REPOSITORY:$IMAGE_TAG $FULL_IMAGE:latest
 
          docker push $FULL_IMAGE:$IMAGE_TAG
          docker push $FULL_IMAGE:latest
 
          echo "image-uri=$FULL_IMAGE:$IMAGE_TAG" >> $GITHUB_OUTPUT
 
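  deploy:
    runs-on: ubuntu-latest
    needs: build-scan-push
    if: github.ref == 'refs/heads/main'
    environment: production         # Waits for approval if the environment has required reviewers
    permissions:
      id-token: write               # Required for OIDC AWS authentication in this job too
      contents: read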
 
    steps:
      - name: Configure AWS credentials
        uses: aws-actions/configure-aws-credentials@v4
        with:
          role-to-assume: arn:aws:iam::905418385260:role/github-actions-deploy
          aws-region: ap-south-1
 
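      # Option 1: Deploy to ECS
      # --force-new-deployment restarts tasks on the current task definition, so the new image
      # is picked up only if that definition references a moving tag such as latest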
      - name: Update ECS service
        run: |
          aws ecs update-service \
            --cluster production \
            --service payment-api \
            --force-new-deployment \
            --region ap-south-1
      # Option 2: Deploy to Kubernetes
      # - name: Deploy to Kubernetes
      #   run: |
      #     kubectl set image deployment/payment-api \
      #       payment-api=${{ needs.build-scan-push.outputs.image-uri }} \
      #       -n production
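
The scan gate in this pipeline comes down to one decision: count the findings at the blocking severities and fail if there are any. The sketch below shows that logic on a Trivy JSON report, which is an assumption here because the workflow above writes SARIF; a JSON report would come from `trivy image --format json`. The function name `evaluateScan` is hypothetical.

```javascript
// Hypothetical gate: counts findings per severity in a Trivy JSON report
// and decides whether the pipeline should stop.
function evaluateScan(report, blockOn = ["CRITICAL", "HIGH"]) {
  const counts = {};
  // Results and Vulnerabilities can be missing or null when nothing is found.
  for (const result of report.Results ?? []) {
    for (const vuln of result.Vulnerabilities ?? []) {
      counts[vuln.Severity] = (counts[vuln.Severity] ?? 0) + 1;
    }
  }
  const blocking = blockOn.reduce((sum, sev) => sum + (counts[sev] ?? 0), 0);
  return { counts, blocking, pass: blocking === 0 };
}

// Example with a made-up report; the IDs are placeholders, not real CVEs.
const exampleReport = {
  Results: [
    {
      Target: "payment-api (alpine)",
      Vulnerabilities: [
        { VulnerabilityID: "EXAMPLE-1", Severity: "CRITICAL" },
        { VulnerabilityID: "EXAMPLE-2", Severity: "LOW" },
      ],
    },
    { Target: "app/package-lock.json", Vulnerabilities: null },
  ],
};
console.log(evaluateScan(exampleReport));
```

In a CI script, a failing result would typically end with `process.exit(1)` so that later steps do not run.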

Multi-Platform Builds

YAML

# Build for both AMD64 (standard) and ARM64 (AWS Graviton, M1 Mac)
- name: Set up QEMU for multi-platform
  uses: docker/setup-qemu-action@v3
 
- name: Build multi-platform image
  uses: docker/build-push-action@v5
  with:
    platforms: linux/amd64,linux/arm64
    push: true
    tags: |
      ${{ steps.login-ecr.outputs.registry }}/${{ env.ECR_REPOSITORY }}:${{ env.IMAGE_TAG }}
      ${{ steps.login-ecr.outputs.registry }}/${{ env.ECR_REPOSITORY }}:latest

Rollback Strategy

Bash

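# If a deployment causes issues, roll back to the previous task definition revision
# Assumes each deployment registers a new revision and the previous one is still active
PREVIOUS_SHA=$(git rev-parse HEAD~1)   # Image tag the previous revision should reference
 
CURRENT_REVISION=$(aws ecs describe-services \
  --cluster production \
  --services payment-api \
  --query 'services[0].taskDefinition' --output text | awk -F: '{print $NF}')
PREVIOUS_REVISION=$((CURRENT_REVISION - 1))
 
# Update ECS to the previous revision
aws ecs update-service \
  --cluster production \
  --service payment-api \
  --task-definition payment-api:$PREVIOUS_REVISION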
 
# Or for Kubernetes:
kubectl rollout undo deployment/payment-api -n production
# Kubernetes keeps the previous ReplicaSet for exactly this purpose

Common Mistakes

Mistake	Risk	Fix
Using long-lived AWS access keys in GitHub Secrets	Key rotation nightmare, security risk	Use OIDC role assumption — no static keys needed
Pushing before scanning	Vulnerable images in registry	Always scan before push, never after
Using `latest` as the only tag	Cannot identify which build is deployed	Always tag with git SHA: `image:${{ github.sha }}`
No deployment approval gate	Direct push to production on every merge	Add GitHub environment with required reviewers
Building twice (PR + merge)	Wasted CI minutes and time	Reuse the BuildKit layer cache between runs (`cache-from` / `cache-to` with `type=gha`)

PLACEMENT PRO TIP
**Tip:** Use GitHub Actions environments with required reviewers for the deploy job. This creates a manual approval gate before production deployment — an engineer reviews the scan results and confirms the deployment. The approved/rejected record is preserved in GitHub's audit log automatically.

REMEMBER THIS
**Remember:** Tag every production image with the git commit SHA (`${{ github.sha }}`). This makes it possible to trace any running container back to the exact code that built it — essential for incident response. When an incident happens, `docker inspect container-name --format '{{.Config.Image}}'` gives you the image tag, which maps directly to a git commit.
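
Mapping an image reference back to a commit can be automated during incident response. The sketch below is a hypothetical helper named `commitFromImage`; it extracts the tag from an image reference and reports it as a commit only when it looks like a full 40-character git SHA.

```javascript
// Hypothetical helper: extracts the tag from an image reference and reports
// whether it looks like a full 40-character git commit SHA.
function commitFromImage(imageRef) {
  // Drop a digest suffix such as @sha256:... if present.
  const withoutDigest = imageRef.split("@")[0];
  const lastSlash = withoutDigest.lastIndexOf("/");
  const lastColon = withoutDigest.lastIndexOf(":");
  // A colon before the last slash belongs to a registry port, not a tag.
  const tag = lastColon > lastSlash ? withoutDigest.slice(lastColon + 1) : null;
  const isCommit = tag !== null && /^[0-9a-f]{40}$/.test(tag);
  return { tag, commit: isCommit ? tag : null };
}

console.log(commitFromImage("payment-api:latest"));
```

A result with `commit: null` is itself a useful signal: the container is running an image that cannot be traced to a specific build.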
