Linux Capabilities — Fine-Grained Container Privilege Control
What Are Linux Capabilities in Simple Terms?
In traditional Unix, a process is either unprivileged (UID > 0) or root (UID 0), and root can do everything. Linux capabilities split root's power into roughly 40 individual privileges (41 as of Linux 5.9) that can be granted or revoked independently.
Docker does not give containers full root capabilities, even when the container process runs as root inside the container. By default Docker grants only 14 capabilities and drops all the others, including dangerous ones such as CAP_SYS_ADMIN. You can go further with --cap-drop ALL.
Without capabilities:
    Root = can do EVERYTHING (mount filesystems, load kernel modules, change network interfaces, bypass file permissions...)

With capabilities:
    Each privilege is separate and controllable:
    CAP_NET_BIND_SERVICE = bind to ports < 1024
    CAP_SYS_PTRACE       = trace/debug processes
    CAP_SYS_ADMIN        = many admin operations
    CAP_CHOWN            = change file ownership
    etc.

Docker's Default Capability Set
    # Docker grants these 14 by default (every other capability is dropped):
    # CAP_SETPCAP, CAP_MKNOD, CAP_AUDIT_WRITE, CAP_CHOWN,
    # CAP_NET_RAW, CAP_DAC_OVERRIDE, CAP_FOWNER, CAP_FSETID,
    # CAP_KILL, CAP_SETGID, CAP_SETUID, CAP_NET_BIND_SERVICE,
    # CAP_SYS_CHROOT, CAP_SETFCAP

    # This is better than full root but still grants many capabilities.
    # Production containers should drop everything and add back only what is needed.

Dropping All Capabilities
    # Maximum security: drop everything
    docker run -d \
      --cap-drop ALL \
      --name payment-api \
      registry.razorpay.in/payment-api:v3.1.0

    # If nginx needs to bind to port 80 (< 1024):
    # (the official nginx image may also need CHOWN, SETGID, and SETUID so the
    # master process can prepare its directories and switch workers to an
    # unprivileged user; add them only if startup fails without them)
    docker run -d \
      --cap-drop ALL \
      --cap-add NET_BIND_SERVICE \
      --name nginx \
      nginx:1.25

    # Common legitimate --cap-add needs:
    # NET_BIND_SERVICE = bind to ports below 1024 (80, 443)
    # CHOWN            = change file ownership (some init scripts need this)
    # SYS_PTRACE       = process debugging (NEVER in production)
    # NET_ADMIN        = network configuration (monitoring agents)

In Docker Compose
    services:
      payment-api:
        cap_drop:
          - ALL
        cap_add:
          - NET_BIND_SERVICE
      nginx:
        cap_drop:
          - ALL
        cap_add:
          - NET_BIND_SERVICE
          - CHOWN  # nginx changes file ownership on startup

Checking Container Capabilities
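Each capability set in /proc/<pid>/status (CapEff is the effective set) is a hexadecimal bitmask with one bit per capability. On hosts where capsh is not installed, the decoding can be approximated in plain shell. The following is a minimal sketch, not a replacement for capsh: the function name `decode_caps` is hypothetical, and the name table assumes the bit numbering from linux/capability.h up to CAP_CHECKPOINT_RESTORE (bit 40), so any bits added by newer kernels are silently ignored.

```shell
#!/bin/sh
# Capability names in bit order (bit 0 first), following linux/capability.h.
# Assumption: the table stops at bit 40 (CAP_CHECKPOINT_RESTORE).
CAP_NAMES="chown dac_override dac_read_search fowner fsetid kill setgid setuid
setpcap linux_immutable net_bind_service net_broadcast net_admin net_raw
ipc_lock ipc_owner sys_module sys_rawio sys_chroot sys_ptrace sys_pacct
sys_admin sys_boot sys_nice sys_resource sys_time sys_tty_config mknod lease
audit_write audit_control setfcap mac_override mac_admin syslog wake_alarm
block_suspend audit_read perfmon bpf checkpoint_restore"

# decode_caps HEXMASK
# Prints the comma-separated names of the bits set in HEXMASK.
# Hypothetical helper; POSIX sh has no `local`, so the _dc_* variables are global.
decode_caps() {
  _dc_mask=$(( 0x$1 ))   # hex string -> integer
  _dc_bit=0
  _dc_out=""
  for _dc_name in $CAP_NAMES; do
    # Test whether bit number _dc_bit is set in the mask.
    if [ $(( (_dc_mask >> _dc_bit) & 1 )) -eq 1 ]; then
      _dc_out="${_dc_out:+$_dc_out,}cap_$_dc_name"
    fi
    _dc_bit=$(( _dc_bit + 1 ))
  done
  printf '%s\n' "$_dc_out"
}

decode_caps 0000000000000400
```

The final call decodes a mask with only bit 10 set and prints `cap_net_bind_service`, which is the value to expect in CapEff for a root process started with `--cap-drop ALL --cap-add NET_BIND_SERVICE`. An all-zero mask prints an empty line.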
    # Check current effective capabilities of a running container
    docker exec payment-api cat /proc/1/status | grep Cap
    # CapEff: 0000000000000000 <- zero capabilities (all dropped)

    # Decode a capability bitmask (00000000a80425fb is Docker's default set)
    capsh --decode=00000000a80425fb
    # Useful for understanding what capabilities a container currently has

COMMON MISTAKE / WARNING

**Security:** `CAP_SYS_ADMIN` is the most dangerous capability: it allows mounting filesystems, setting the hostname, managing swap, and dozens of other privileged operations. (Loading kernel modules is governed by the separate `CAP_SYS_MODULE`, which is equally unsuitable for production containers.) Never add `CAP_SYS_ADMIN` to production containers. If you think you need it, investigate whether a more specific capability satisfies the requirement.
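The warning above can be turned into an automated check, for example in a deployment script. The sketch below is only an illustration: the function name `find_dangerous_caps` and the `DANGEROUS_CAPS` deny-list are hypothetical, and the list contains just the two capabilities this guide says never to use in production.

```shell
#!/bin/sh
# Hypothetical deny-list based on the warnings in this guide; extend it to
# match your own policy.
DANGEROUS_CAPS="SYS_ADMIN SYS_PTRACE"

# find_dangerous_caps NAME...
# Prints every argument found on the deny-list, one per line.
# Returns 1 if anything matched, 0 otherwise.
# POSIX sh has no `local`, so the _fd_* variables are global.
find_dangerous_caps() {
  _fd_status=0
  for _fd_cap in "$@"; do
    # Normalize: upper-case, then strip an optional CAP_ prefix.
    _fd_cap=$(printf '%s' "$_fd_cap" | tr '[:lower:]' '[:upper:]')
    _fd_cap=${_fd_cap#CAP_}
    for _fd_bad in $DANGEROUS_CAPS; do
      if [ "$_fd_cap" = "$_fd_bad" ]; then
        printf '%s\n' "$_fd_cap"
        _fd_status=1
      fi
    done
  done
  return "$_fd_status"
}
```

Calling `find_dangerous_caps NET_BIND_SERVICE CAP_SYS_ADMIN` prints `SYS_ADMIN` and returns 1, so it can gate a deployment step. One possible source for the arguments, assuming the container exists locally, is `docker inspect --format '{{range .HostConfig.CapAdd}}{{.}} {{end}}' payment-api`.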