Burstable QoS — The Right Choice for Most Production Workloads
What is Burstable QoS in Simple Terms?
Burstable is the middle class of Kubernetes QoS. A Burstable pod has a guaranteed minimum — the resource request — but is allowed to use more than that when the node has spare capacity. It is the most practical and most commonly used QoS class in real production clusters.
Think of it as a mobile data plan with a base allocation and an optional burst. Your pod is guaranteed its 200m CPU request, but can use up to 1000m when other pods on the node are quiet. This balances protection with efficient node utilisation.
How a Pod Gets Burstable Classification
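A pod is Burstable when it does not qualify as Guaranteed but at least one of its containers sets a CPU or memory request or limit. In practice that means requests are set and differ from limits, or limits are missing altogether. The examples below show the common cases.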
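As a rough illustration of the classification rule, the sketch below reimplements it in POSIX shell and awk over a plain-text table. The function name `qos_class` and the five-column input format (container name, CPU request, CPU limit, memory request, memory limit, with `-` meaning unset) are assumptions made for this example. Kubernetes makes the real decision from the pod spec, and this is not its actual code.

```shell
# qos_class: read one line per container from stdin:
#   name cpu_request cpu_limit mem_request mem_limit   ("-" = not set)
# and print the QoS class the pod would get. Hypothetical helper written
# only to illustrate the rule; quantities are compared as plain strings.
qos_class() {
  awk '
    {
      cpu_req = $2; cpu_lim = $3; mem_req = $4; mem_lim = $5
      # A request that is omitted defaults to the limit when one is set.
      if (cpu_req == "-" && cpu_lim != "-") cpu_req = cpu_lim
      if (mem_req == "-" && mem_lim != "-") mem_req = mem_lim
      if (cpu_req != "-" || cpu_lim != "-" || mem_req != "-" || mem_lim != "-")
        any_set = 1
      # Guaranteed needs both limits set and equal to the requests.
      if (cpu_lim == "-" || mem_lim == "-" || cpu_req != cpu_lim || mem_req != mem_lim)
        not_guaranteed = 1
    }
    END {
      if (!any_set)            print "BestEffort"
      else if (not_guaranteed) print "Burstable"
      else                     print "Guaranteed"
    }'
}

# Requests and limits differ (the order-api case in this section).
printf 'order-api 200m 1000m 256Mi 512Mi\n' | qos_class
# One container with equal values plus a sidecar with different values.
printf 'main-app 500m 500m 512Mi 512Mi\nsidecar 100m 200m 64Mi 128Mi\n' | qos_class
```

Both calls print `Burstable`. Because the comparison is textual, equivalent quantities written differently (for example `500m` and `0.5`) would be treated as unequal here, which the real API server does not do.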

    # Example 1: requests and limits both set, with different values
    containers:
    - name: order-api
      resources:
        requests:
          cpu: "200m"       # Guaranteed minimum
          memory: "256Mi"   # Guaranteed minimum
        limits:
          cpu: "1000m"      # Can burst to 5x the request when the node is idle
          memory: "512Mi"   # Hard ceiling: OOMKilled if exceeded
    # QoS class: Burstable (requests != limits)

    # Example 2: only requests set, no limits
    containers:
    - name: worker
      resources:
        requests:
          cpu: "100m"
          memory: "128Mi"
        # No limits: can consume resources up to node capacity
    # QoS class: Burstable, BUT dangerous without limits
    # Don't do this in production; always set limits

    # Example 3: mixed containers (one with equal values, one without)
    containers:
    - name: main-app
      resources:
        requests:
          cpu: "500m"
          memory: "512Mi"
        limits:
          cpu: "500m"       # Equal: this container alone would qualify for Guaranteed
          memory: "512Mi"   # Equal: this container alone would qualify for Guaranteed
    - name: sidecar
      resources:
        requests:
          cpu: "100m"
          memory: "64Mi"
        limits:
          cpu: "200m"       # Different: this container rules out Guaranteed
          memory: "128Mi"   # Different: this container rules out Guaranteed
    # Entire pod QoS: Burstable (because sidecar has requests != limits)

    # Check the QoS class of a pod
    kubectl get pod order-api-7d9f8c-xkp2q -n production \
      -o jsonpath='{.status.qosClass}'
    # Output: Burstable

How Burstable Bursting Actually Works
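The CPU burst behaviour comes from how Linux cgroups enforce CPU. The request becomes a relative weight that only matters when CPU is contended, and the limit becomes a hard cap. When a node has free CPU cycles, the kernel lets Burstable containers use more than their request, up to their limit: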

    Node CPU: 16 cores

    14:00 - High traffic period:
    +------------------+
    | order-api pod    |
    | request: 200m    |
    | using:   950m    |  <- Bursting: other pods quiet
    | limit:   1000m   |
    +------------------+

    14:30 - All services spike:
    +------------------+  +------------------+  +------------------+
    | order-api        |  | payment-api      |  | notification-api |
    | request: 200m    |  | request: 200m    |  | request: 200m    |
    | using:   200m    |  | using:   200m    |  | using:   200m    |
    | (throttled back) |  | (throttled back) |  | (throttled back) |
    +------------------+  +------------------+  +------------------+
    Each pod falls back to its guaranteed request during contention

Under contention the kernel shares CPU in proportion to requests, so each pod can still count on its request but not on anything above it.

Memory burst works differently because memory cannot be throttled the way CPU can. If a container in a Burstable pod tries to use more than its memory limit, it is OOMKilled.
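To make the request/limit asymmetry concrete: on cgroup v1 nodes, the kubelet translates a CPU request into a relative weight (`cpu.shares`) and a CPU limit into a run-time quota per scheduling period (`cpu.cfs_quota_us`, with a default period of 100000 microseconds). The sketch below reproduces that arithmetic for the order-api values used above. The function names are invented for this example, very small values are not handled beyond the minimum share, and cgroup v2 nodes use different file names and a different weight scale.

```shell
# Hypothetical helpers mirroring the cgroup v1 CPU arithmetic.
# Input is millicores as a plain integer (200 means 200m).

# CPU request -> cpu.shares: a relative weight that only matters under contention.
cpu_shares() {
  shares=$(( $1 * 1024 / 1000 ))
  [ "$shares" -lt 2 ] && shares=2   # kernel minimum for cpu.shares
  echo "$shares"
}

# CPU limit -> cpu.cfs_quota_us: run time allowed per 100000us period.
cpu_quota_us() {
  echo $(( $1 * 100000 / 1000 ))
}

echo "request 200m  -> cpu.shares=$(cpu_shares 200)"
echo "limit   1000m -> cpu.cfs_quota_us=$(cpu_quota_us 1000)"
```

Shares are only a ratio between competing containers, which is why a request costs nothing while the node is idle. The quota is applied every period regardless of how busy the node is, which is why a limit can throttle a container even when spare CPU exists.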
The Right Way to Set Burstable Resources

    # Step 1: measure actual usage first
    # (kubectl top is a point-in-time snapshot; sample it over a representative period)
    kubectl top pods -n production -l app=order-api
    # NAME                     CPU(cores)   MEMORY(bytes)
    # order-api-7d9f8c-xkp2q   185m         198Mi
    # order-api-7d9f8c-ab1cd   210m         225Mi
    # order-api-7d9f8c-zx9lp   195m         207Mi

    # Step 2: calculate good values
    # Average CPU:    ~195m  -> set request to 200m
    # Peak CPU:       ~210m  -> set limit to 500m (2.5x the request, for burst headroom)
    # Average memory: ~210Mi -> set request to 256Mi (round up)
    # Peak memory:    ~225Mi -> set limit to 384Mi (peak + 70% headroom)
    # Memory headroom must be generous: exceeding the limit means an OOMKill

    # Result: well-configured Burstable pod
    containers:
    - name: order-api
      image: registry.swiggy.in/order-api:v3.1.0
      resources:
        requests:
          cpu: "200m"       # Based on observed average
          memory: "256Mi"   # Based on observed average + buffer
        limits:
          cpu: "500m"       # 2.5x burst capacity for traffic spikes
          memory: "384Mi"   # Peak observed + 70% headroom

Burstable Eviction: When Memory Pressure Hits
When a node runs low on memory and all BestEffort pods have already been evicted, the kubelet starts evicting Burstable pods that are using more than their request. Among pods of equal priority, it chooses which one to evict based on how far above its request each pod's current usage is:

    Node memory pressure: evicting Burstable pods

    order-api:   request=256Mi, using=380Mi (using 148% of request)
    payment-api: request=256Mi, using=320Mi (using 125% of request)
    worker:      request=256Mi, using=260Mi (using 101% of request)

    Eviction order:
    1. order-api evicted first (highest % over request)
    2. payment-api evicted second
    3. worker evicted last

This means a Burstable pod that stays at or below its request is nearly as safe as a Guaranteed pod during eviction events.
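The ordering in the example above can be reproduced with a few lines of shell. This is only a sketch of the ranking as this section describes it: `rank_evictions` is a hypothetical helper, it assumes every pod has a non-zero request, and it ignores pod priority, which the kubelet also takes into account.

```shell
# rank_evictions: read "name request_Mi usage_Mi" lines from stdin and print
# the pods ordered by usage as a percentage of request, highest first.
# Hypothetical helper; percentages are truncated to whole numbers.
rank_evictions() {
  awk '{ printf "%d %s\n", ($3 * 100) / $2, $1 }' |
    sort -rn |
    awk '{ print NR ". " $2 " (" $1 "% of request)" }'
}

rank_evictions <<'EOF'
worker 256 260
order-api 256 380
payment-api 256 320
EOF
```

With the three pods from the example, order-api is listed first at 148% of its request and worker last at 101%, matching the eviction order shown above.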
Checking Burstable Pod Behaviour in Production

    # List all Burstable pods in a namespace
    kubectl get pods -n production -o json | \
      jq -r '.items[] | select(.status.qosClass=="Burstable") | [.metadata.name, .status.qosClass] | @tsv'

    # Monitor whether pods are being CPU throttled (bursting to limit too often)
    # The path below is for cgroup v1; on cgroup v2 nodes read /sys/fs/cgroup/cpu.stat
    kubectl exec -it order-api-7d9f8c-xkp2q -n production -- \
      cat /sys/fs/cgroup/cpu/cpu.stat
    # nr_throttled 0     <- good, not hitting the CPU limit
    # nr_throttled 450   <- bad, CPU limit too low; increase it

    # Watch for pods approaching their memory limit
    kubectl top pods -n production -l app=order-api
    # If MEMORY usage is close to the limit you set, increase the limit
    # before an OOMKill happens

**Tip:** Burstable is the right choice for 80-90% of workloads in a production Kubernetes cluster. Use it for all stateless API pods, workers, and background services. The key is setting the request at average observed usage and the limit at peak observed usage plus safety headroom. For memory, allow at least 30-50% above peak; the worked example earlier uses about 70%. CPU limits can be more generous since throttling is survivable. Memory limits must be sized carefully, since exceeding them gets the container killed.
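The sizing rule (request near the observed average, memory limit at observed peak plus headroom) is simple arithmetic, sketched below in shell. `suggest_memory` is a hypothetical helper, the samples are the three memory readings from the `kubectl top` output earlier in this article, and the 70% figure is the headroom used in that worked example. Rounding the result to a tidy value such as 384Mi is left to the operator.

```shell
# suggest_memory: read observed memory samples (Mi, one per line) from stdin
# and print the average, the peak, and the peak plus a headroom percentage
# (rounded up). Hypothetical helper. Arg 1: headroom in percent.
suggest_memory() {
  awk -v headroom="$1" '
    { sum += $1; if ($1 > peak) peak = $1 }
    END {
      if (NR == 0) exit 1
      limit = peak * (100 + headroom) / 100
      rounded = int(limit)
      if (rounded < limit) rounded++
      printf "avg=%dMi peak=%dMi limit>=%dMi\n", sum / NR, peak, rounded
    }'
}

# Three order-api samples with 70% headroom over the peak.
printf '198\n225\n207\n' | suggest_memory 70
```

For these samples the sketch reports an average of 210Mi, a peak of 225Mi, and a minimum limit of 383Mi, which lines up with the 256Mi request and 384Mi limit chosen in the worked example.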
**Remember:** The memory limit in a Burstable pod is a hard wall: when a container crosses it, the kernel kills the process with no warning. CPU limits behave differently: the kernel throttles (slows down) the container instead of killing it. This asymmetry means memory limits need generous headroom above observed peak, while CPU limits are more forgiving because throttling is recoverable.
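A raw `nr_throttled` count is hard to judge on its own; the share of scheduling periods that were throttled is usually more telling. The sketch below computes that ratio from `cpu.stat` content, which lists `nr_periods` and `nr_throttled` as `key value` lines. `throttle_pct` is a hypothetical helper and the sample numbers are made up.

```shell
# throttle_pct: read cpu.stat content on stdin and print the percentage of
# CFS periods in which the container was throttled. Hypothetical helper.
throttle_pct() {
  awk '
    $1 == "nr_periods"   { periods = $2 }
    $1 == "nr_throttled" { throttled = $2 }
    END {
      if (periods > 0) printf "%.1f\n", throttled * 100 / periods
      else             print "0.0"
    }'
}

# Made-up sample in the "key value" format that cpu.stat uses.
printf 'nr_periods 3000\nnr_throttled 450\nthrottled_time 912345678\n' | throttle_pct
```

The sample prints `15.0`, meaning the container hit its CPU limit in 15% of periods. Against a live pod, the output of the `cat .../cpu.stat` command shown earlier could be piped into the same function (without the `-it` flags, which add terminal control characters to the output).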
**Common Mistake:** Setting requests equal to limits in the belief that it makes the pod safer. This creates Guaranteed QoS, which sounds better but wastes cluster capacity. If your pod usually uses 200m CPU but you set request=limit=1000m, the scheduler reserves 1000m on the node even though the pod only uses 200m. You waste 800m of capacity for every replica of this pod. Measure actual usage first and set limits based on real peak values.
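The cost of over-reserving grows with replica count. As a small worked example, the sketch below multiplies the gap between request and typical usage by the number of replicas. `idle_reservation` is a hypothetical helper, and the replica count of 10 is an invented figure used only for illustration.

```shell
# idle_reservation: millicores reserved by the scheduler but not typically used.
# Args: request_m typical_usage_m replicas. Hypothetical helper.
idle_reservation() {
  echo $(( ($1 - $2) * $3 ))
}

# request=limit=1000m, typical usage 200m, 10 replicas (replica count is made up).
echo "Guaranteed at 1000m: $(idle_reservation 1000 200 10)m reserved but idle"
# Burstable with request=200m: nothing is reserved beyond typical usage.
echo "Burstable at 200m:   $(idle_reservation 200 200 10)m reserved but idle"
```

With these numbers the Guaranteed configuration ties up 8000m, or eight full cores, that the scheduler cannot hand to other pods, while the Burstable configuration reserves only what the pods normally use.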
**Security:** Burstable pods without memory limits (`limits.memory` not set) are classified as Burstable but can consume all available memory on the node, effectively becoming a denial of service attack against other pods. Always set memory limits even if you set CPU limits generously. A Burstable pod with no memory limit is as dangerous as a BestEffort pod during a memory leak event.