Docker Production Logging — Log Drivers, Rotation, and Centralised Collection | DevOps Network

Overview and What You Will Learn

By default, Docker writes all container stdout and stderr to JSON files on the host. There are no size limits. A single verbose service can fill an entire disk overnight, taking down every other container on the host. This is not a hypothetical — it is one of the most common production outages caused by Docker misconfiguration.
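To see how close a host is to this failure, it helps to list the biggest container log files. The function below is a minimal sketch, not an official tool: the name `largest_container_logs` is made up for this lab, and it assumes the `json-file` driver with Docker's default data root at `/var/lib/docker/containers`, which normally needs root to read.

```bash
#!/usr/bin/env bash
# Sketch: print "<bytes> <path>" for the largest *-json.log files, biggest first.
# Assumption: json-file driver and the default Docker data root.
largest_container_logs() {
  local root="${1:-/var/lib/docker/containers}"
  local count="${2:-5}"
  find "$root" -type f -name '*-json.log' 2>/dev/null \
    | while IFS= read -r f; do
        # wc -c gives the file size in bytes; $(( )) strips any padding
        printf '%s %s\n' "$(( $(wc -c < "$f") ))" "$f"
      done \
    | sort -rn \
    | head -n "$count"
}

# Usage (run as root on the Docker host):
#   largest_container_logs /var/lib/docker/containers 10
```

A file that is already several gigabytes is a strong hint that the container was started without rotation limits.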

Beyond disk safety, production logging requires centralised collection. When you have five services across three hosts, you cannot SSH into each machine and tail logs individually every time something goes wrong. You need all logs in one place, searchable, with timestamps aligned.

By the end of this lab you will:

Set log rotation limits on individual services and as a Docker daemon default
Understand when to use each major log driver: json-file, local, fluentd, awslogs
Configure the local driver for better performance and compression
Ship logs to Grafana Loki using the Docker Loki plugin
Ship logs to AWS CloudWatch for EC2-based deployments
Read and filter logs efficiently using docker logs

Why This Matters in Production

A Hotstar backend team ran a high-frequency event tracking service that logged every incoming request at DEBUG level. No log rotation was configured. During a major cricket match, traffic spiked 20x, the log volume followed, and the host disk hit 100% at 2 AM. Docker stopped being able to write log files, containers started crashing with write errors, and the entire stack went down — not because of application bugs, but because nobody had set a max-size on the log driver.

Log rotation is a five-line config change. There is no reason to skip it.

Core Principles

Log driver selection by deployment type:

+--------------------------------------------+
| Single host, no central logging needed     |
| Use: local driver with rotation limits     | <- best default for small setups
+--------------------------------------------+
                    |
                    v
+--------------------------------------------+
| Single/multi host, self-hosted stack       |
| Use: Loki log driver -> Grafana            | <- pairs with Prometheus monitoring
+--------------------------------------------+
                    |
                    v
+--------------------------------------------+
| AWS EC2 or ECS deployment                  |
| Use: awslogs driver -> CloudWatch Logs     | <- native AWS integration
+--------------------------------------------+
                    |
                    v
+--------------------------------------------+
| High-volume, structured log pipeline       |
| Use: fluentd driver -> Elasticsearch       | <- EFK stack for heavy workloads
+--------------------------------------------+

Detailed Step-by-Step Practical Lab

Milestone 1 — Set rotation limits per service in Compose

YAML

## docker-compose.yml
services:
  api:
    image: hotstar-api:latest
    logging:
      ## json-file is the default driver
      driver: "json-file"
      options:
        ## Each log file is capped at 50MB before rotation
        max-size: "50m"
        ## Docker keeps the last 5 rotated files = max 250MB total
        max-file: "5"
 
  nginx:
    image: nginx:stable
    logging:
      driver: "json-file"
      options:
        max-size: "20m"
        max-file: "3"
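The rotation settings translate directly into a disk budget: each container can hold roughly `max-size` multiplied by `max-file` of logs. The helper below is a small illustrative sketch (the function name `log_budget_mb` is an assumption, not a Docker command) that does this arithmetic for the two services in the Compose file above.

```bash
#!/usr/bin/env bash
# Approximate worst-case log disk usage for one container, in MB:
# max-size (MB) multiplied by max-file.
log_budget_mb() {
  local max_size_mb="$1" max_file="$2"
  echo $(( max_size_mb * max_file ))
}

api_mb=$(log_budget_mb 50 5)     # api:   50m x 5 files
nginx_mb=$(log_budget_mb 20 3)   # nginx: 20m x 3 files
echo "api: ${api_mb} MB, nginx: ${nginx_mb} MB, total: $(( api_mb + nginx_mb )) MB"
# prints "api: 250 MB, nginx: 60 MB, total: 310 MB"
```

Summing this figure across every container on a host gives a quick check that the log budget fits comfortably on the disk.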

Milestone 2 — Set rotation as a daemon default (applies to all containers)

JSON

{
  "log-driver": "local",
  "log-opts": {
    "max-size": "50m",
    "max-file": "5"
  }
}

Save this as /etc/docker/daemon.json. The file must be strict JSON, so it cannot contain comments. These settings apply to every container on this host that does not have its own explicit logging config.

Bash

## Apply daemon config changes
sudo systemctl restart docker
 
## Verify the daemon picked up the new config
docker info | grep -A5 "Logging Driver"
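A syntax error in `daemon.json` stops the Docker daemon from starting, so it is worth checking the file before the restart. The sketch below assumes either `jq` or `python3` is installed on the host; the function name `validate_json_file` is hypothetical. It only checks that the file is strict JSON (so a file containing `//` comments fails), not that the option names are valid.

```bash
#!/usr/bin/env bash
# Sketch: return 0 if the file parses as strict JSON, non-zero otherwise.
# Assumption: jq or python3 is available on the host.
validate_json_file() {
  local file="$1"
  if command -v jq >/dev/null 2>&1; then
    jq empty "$file" >/dev/null 2>&1
  elif command -v python3 >/dev/null 2>&1; then
    python3 -m json.tool "$file" >/dev/null 2>&1
  else
    echo "need jq or python3 to validate JSON" >&2
    return 2
  fi
}

# Usage:
#   validate_json_file /etc/docker/daemon.json && sudo systemctl restart docker
```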

REMEMBER THIS
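**Remember:** The `daemon.json` logging config only applies to containers created *after* the daemon restart. Existing containers keep the log driver they were created with. Because the Compose file itself has not changed, a plain `docker compose up -d` may leave them as they are; recreate them with `docker compose up -d --force-recreate` to pick up the new defaults.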
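One way to find containers that still run with the old settings is to compare each container's log driver with the one you expect. The filter below is a sketch under that assumption: `containers_needing_recreate` is a made-up helper that reads `<name> <driver>` pairs, which `docker inspect` can produce from the `.Name` and `.HostConfig.LogConfig.Type` fields.

```bash
#!/usr/bin/env bash
# Reads "<name> <driver>" lines on stdin and prints the names whose
# driver differs from the expected one.
containers_needing_recreate() {
  local expected="$1"
  awk -v want="$expected" '$2 != want { print $1 }'
}

# Usage on a Docker host (xargs -r skips the call when nothing is running):
#   docker ps -q \
#     | xargs -r docker inspect --format '{{.Name}} {{.HostConfig.LogConfig.Type}}' \
#     | containers_needing_recreate local
```

An empty result means every running container already uses the expected driver.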

Milestone 3 — Use the local driver for better performance

YAML

services:
  worker:
    image: hotstar-worker:latest
    logging:
      ## local driver compresses rotated log files automatically
      ## Uses ~70% less disk than json-file for the same log volume
      ## Slightly faster writes — uses protobuf internally instead of JSON
      driver: "local"
      options:
        max-size: "50m"
        max-file: "5"

COMMON MISTAKE / WARNING
**Common Mistake:** Using the `local` driver and then trying to read logs with external tools that parse Docker's `json-file` format. The `local` driver uses a different internal format — always use `docker logs` to read from it, not direct file parsing.

Milestone 4 — Ship logs to Grafana Loki

Bash

## Install the Loki Docker log driver plugin
docker plugin install grafana/loki-docker-driver:latest \
  --alias loki \
  --grant-all-permissions
 
## Verify installation
docker plugin ls

YAML

## docker-compose.yml with Loki driver
services:
  api:
    image: hotstar-api:latest
    logging:
      driver: loki
      options:
        ## Loki endpoint — replace with your Loki host
        loki-url: "http://10.0.1.55:3100/loki/api/v1/push"
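        ## Labels appear as searchable dimensions in Grafana
        ## ${HOSTNAME} is interpolated by Compose from the shell environment; export it first if your shell does not
        loki-external-labels: "job=hotstar-api,env=production,host=${HOSTNAME}"
        ## Maximum number of retries for a log batch when Loki is unreachable
        loki-retries: "5"
        ## Maximum time to wait before sending a batch, full or not
        loki-batch-wait: "1s"
        ## Pipeline stages parse each log line; here fields are extracted from JSON logs
        loki-pipeline-stages: |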
          - json:
              expressions:
                level: level
                message: msg
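Before recreating containers with the Loki driver, it can help to confirm that the Loki endpoint is reachable from the Docker host. The sketch below derives Loki's `/ready` health URL from the push URL used in `loki-url`; the helper name `loki_ready_url` is hypothetical, and the address is the example host from the Compose file above.

```bash
#!/usr/bin/env bash
# Sketch: turn a Loki push URL into the matching /ready health-check URL
# by stripping the push path suffix.
loki_ready_url() {
  local push_url="$1"
  echo "${push_url%/loki/api/v1/push}/ready"
}

# Usage (curl -f makes a non-2xx response count as failure):
#   curl -fsS "$(loki_ready_url http://10.0.1.55:3100/loki/api/v1/push)"
```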

Milestone 5 — Ship logs to AWS CloudWatch

YAML

## docker-compose.yml for EC2 deployment with CloudWatch
services:
  api:
    image: hotstar-api:latest
    logging:
      driver: awslogs
      options:
        ## CloudWatch log group that receives the logs
        awslogs-group: /hotstar/production/api
        ## Region where the EC2 instance is running
        awslogs-region: ap-south-1
        ## One stream per host; ${HOSTNAME} must be set in the shell that runs Compose
        awslogs-stream: api-${HOSTNAME}
        ## Create the log group automatically if it does not exist
        ## (needs logs:CreateLogGroup; log streams are created automatically)
        awslogs-create-group: "true"

Bash

## The EC2 instance needs these permissions attached to its instance role.
## The Docker daemon, not the container, sends logs to CloudWatch, so as a
## fallback AWS_ACCESS_KEY_ID / AWS_SECRET_ACCESS_KEY must be set for the daemon.
## Required permissions:
## logs:CreateLogGroup (only needed with awslogs-create-group)
## logs:CreateLogStream
## logs:PutLogEvents
## logs:DescribeLogStreams
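The permissions listed above can be written out as an IAM policy document. The helper below is a sketch that prints one; the function name `print_awslogs_policy` and the choice to default `Resource` to `*` are assumptions made for illustration, and in production you would normally pass the ARN of your own log group instead.

```bash
#!/usr/bin/env bash
# Sketch: print an IAM policy document granting the permissions the
# awslogs driver needs. The first argument is the Resource value
# (defaults to "*"; pass a log group ARN to tighten it).
print_awslogs_policy() {
  local resource="${1:-*}"
  cat <<EOF
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "logs:CreateLogGroup",
        "logs:CreateLogStream",
        "logs:PutLogEvents",
        "logs:DescribeLogStreams"
      ],
      "Resource": "${resource}"
    }
  ]
}
EOF
}

# Usage:
#   print_awslogs_policy > awslogs-policy.json
```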

Milestone 6 — Read and filter logs efficiently

Bash

## Follow logs in real time (equivalent to tail -f)
docker logs -f hotstar-api
 
## Show only the last 100 lines
docker logs --tail 100 hotstar-api
 
## Show logs since a specific time
docker logs --since "2024-01-15T14:00:00" hotstar-api
 
## Show logs between two timestamps
docker logs --since "2024-01-15T14:00:00" --until "2024-01-15T14:30:00" hotstar-api
 
## Include timestamps in output (useful when the app does not log its own)
docker logs -t hotstar-api
 
## Filter by keyword using grep
docker logs hotstar-api 2>&1 | grep -i "error"
 
## With Compose — logs across all services simultaneously
docker compose logs -f
docker compose logs -f api worker
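During an incident, a count is often more useful than a wall of matching lines. The function below is a small sketch (the name `count_matching_lines` is made up) that counts case-insensitive matches on stdin, so it can sit at the end of any of the `docker logs` commands above.

```bash
#!/usr/bin/env bash
# Count the lines on stdin that match a pattern, ignoring case.
# grep exits non-zero when nothing matches, so "|| true" keeps the
# function usable in scripts that run with "set -e".
count_matching_lines() {
  local pattern="$1"
  grep -ci -- "$pattern" || true
}

# Usage:
#   docker logs --since 1h hotstar-api 2>&1 | count_matching_lines "error"
```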

Production Best Practices and Common Pitfalls

Scenario	Wrong	Correct
Log rotation	No limits set (default)	`max-size: 50m`, `max-file: 5` per service
Daemon default	Each compose file manages rotation	Set rotation in `/etc/docker/daemon.json`
High volume services	`json-file` driver	`local` driver — compresses rotated files
Multi-host debugging	SSH into each host	Centralise with Loki, CloudWatch, or Fluentd
Log level in prod	DEBUG logging on by default	INFO or WARN in prod, DEBUG only when needed
Structured logs	Plain string messages	JSON-formatted logs for easier filtering in Loki/CloudWatch

Quick Reference and Troubleshooting Commands

Task	Command
View container logs	`docker logs <container>`
Follow logs live	`docker logs -f <container>`
Last N lines	`docker logs --tail 100 <container>`
Since a relative time (last hour)	`docker logs --since 1h <container>`
Check log driver	`docker inspect <container> --format '{{.HostConfig.LogConfig}}'`
Check log file size (json-file driver)	`du -sh /var/lib/docker/containers/<id>/<id>-json.log`
List installed plugins	`docker plugin ls`
View daemon log config	`docker info | grep -A5 "Logging Driver"`

PLACEMENT PRO TIP
**Tip:** Applications should write logs to stdout and stderr, never to files inside the container. Docker's log driver captures stdout/stderr automatically. Logs written to files inside the container are not rotated by Docker and disappear when the container is removed.

COMMON MISTAKE / WARNING
**Cost:** CloudWatch log groups should have a retention policy (for example, 30 or 90 days). By default a log group never expires its events, so logs accumulate indefinitely and CloudWatch costs keep growing. Set retention when you create the log group.
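Retention can be set from the AWS CLI with `aws logs put-retention-policy`. The sketch below only builds and prints that command so it can be reviewed before running; the helper name `retention_command` is hypothetical, and the group and region are the example values used in this lab.

```bash
#!/usr/bin/env bash
# Sketch: print the AWS CLI command that sets a retention policy on a
# CloudWatch log group. Nothing is executed against AWS here.
retention_command() {
  local group="$1" days="$2" region="$3"
  printf 'aws logs put-retention-policy --log-group-name %s --retention-in-days %s --region %s\n' \
    "$group" "$days" "$region"
}

retention_command /hotstar/production/api 30 ap-south-1
# prints: aws logs put-retention-policy --log-group-name /hotstar/production/api --retention-in-days 30 --region ap-south-1
```

CloudWatch accepts only certain retention values; 30 and 90 days are both valid.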
