BestEffort QoS — The Most Dangerous Pod Class in Production
What is BestEffort in Simple Terms?
BestEffort is the QoS class Kubernetes assigns to a pod when none of its containers sets any resource requests or limits. The name sounds acceptable, but in production it means the pod has zero guarantees and is first in line for eviction when its node runs low on memory. It is the lowest of the three QoS classes (not to be confused with a PriorityClass, which is a separate setting).
Think of it like this: on a crowded Mumbai local train, Guaranteed passengers have reserved seats, Burstable passengers have standing room, and BestEffort passengers are told "we will let you on if there is space, and we will push you off first if we need room."
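The three classes in the analogy follow a simple rule. As a minimal sketch, assuming a pod with a single container and only CPU and memory set, the classification can be written as a small shell function. `qos_class` is a hypothetical helper for illustration, not a kubectl feature, and it compares quantities as plain strings, so `1` and `1000m` would be treated as different values even though Kubernetes considers them equal.

```shell
#!/bin/sh
# Hypothetical helper: approximate the QoS class of a single-container pod.
# Arguments: request CPU, request memory, limit CPU, limit memory ("" = not set).
qos_class() {
  req_cpu=$1; req_mem=$2; lim_cpu=$3; lim_mem=$4

  # Nothing set anywhere: BestEffort
  if [ -z "$req_cpu$req_mem$lim_cpu$lim_mem" ]; then
    echo "BestEffort"
    return
  fi

  # Kubernetes defaults a missing request to the limit for that resource
  req_cpu=${req_cpu:-$lim_cpu}
  req_mem=${req_mem:-$lim_mem}

  # Guaranteed needs limits for both resources, each equal to its request
  if [ -n "$lim_cpu" ] && [ -n "$lim_mem" ] &&
     [ "$req_cpu" = "$lim_cpu" ] && [ "$req_mem" = "$lim_mem" ]; then
    echo "Guaranteed"
  else
    echo "Burstable"
  fi
}

qos_class "" "" "" ""                      # prints: BestEffort
qos_class "200m" "256Mi" "1000m" "512Mi"   # prints: Burstable
qos_class "500m" "512Mi" "500m" "512Mi"    # prints: Guaranteed
```

For real pods, `.status.qosClass` remains the authoritative answer, since Kubernetes evaluates every container in the pod.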
```
+------------------------------------------+
| Node Memory Pressure Event               |
| Available memory drops below threshold   |
+------------------------------------------+
                     |
                     v
+------------------------------------------+
| kubelet eviction order:                  |
|                                          |
| 1. BestEffort pods killed FIRST          | <- No resources configured
| 2. Burstable pods killed second          | <- requests < limits
| 3. Guaranteed pods killed last           | <- requests == limits
+------------------------------------------+
```

How a Pod Gets BestEffort Classification
BestEffort is assigned automatically when there is no resources block at all on any container:
```yaml
# This pod has BestEffort QoS. NEVER do this in production.
apiVersion: v1
kind: Pod
metadata:
  name: risky-api
  namespace: production
spec:
  containers:
    - name: api
      image: registry.swiggy.in/api:v2.1.0
      ports:
        - containerPort: 8080
      # No resources block at all
      # Kubernetes classifies this as BestEffort automatically
```

```shell
# Verify the QoS class of any pod
kubectl get pod risky-api -n production \
  -o jsonpath='{.status.qosClass}'
# Output: BestEffort

# Find ALL BestEffort pods in your cluster right now
kubectl get pods -A -o json | \
  jq '.items[] | select(.status.qosClass=="BestEffort") | {name: .metadata.name, namespace: .metadata.namespace}'
```

What Happens to BestEffort Pods at Swiggy Scale
At 7pm during the dinner rush, memory usage on Swiggy's nodes spikes as the HPA rapidly adds pods. The kubelet on each node monitors available memory continuously. When it drops below the hard eviction threshold (`memory.available<100Mi` by default, though many clusters configure a higher value), the kubelet starts evicting pods, and BestEffort pods are among the first to go. Hard evictions have no grace period: no warning and no graceful shutdown.
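When available memory crosses the threshold, the kubelet also sets the node's `MemoryPressure` condition to `True`, so it is possible to check which nodes are currently in that state. A minimal sketch: `pressured_nodes` is a hypothetical helper that reads `name<TAB>status` lines on standard input, and the commented `kubectl` pipeline shows one way to produce that input from a live cluster.

```shell
#!/bin/sh
# Hypothetical helper: print the names of nodes whose MemoryPressure status is "True".
# Input: one "node-name<TAB>status" pair per line on stdin.
pressured_nodes() {
  awk -F '\t' '$2 == "True" { print $1 }'
}

# Against a live cluster (requires kubectl access):
# kubectl get nodes -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.status.conditions[?(@.type=="MemoryPressure")].status}{"\n"}{end}' \
#   | pressured_nodes

# Sample input with made-up node names:
printf 'node-a\tFalse\nnode-b\tTrue\n' | pressured_nodes
# prints: node-b
```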
```
+------------------------------------------+
| Normal traffic: node at 60% memory       |
| BestEffort pod running fine              |
+------------------------------------------+
                     |  Traffic spike at 7pm
                     v
+------------------------------------------+
| Memory climbs: node at 92% memory        |
| Available drops below eviction threshold |
+------------------------------------------+
                     |
                     v
+------------------------------------------+
| kubelet kills BestEffort pod immediately | <- No graceful shutdown
| Exit code: 137 (SIGKILL)                 |    Data in memory lost
+------------------------------------------+
                     |
                     v
+------------------------------------------+
| If inside a Deployment:                  |
| Deployment creates replacement pod       |
| Pod restarts from scratch                |
+------------------------------------------+
```

Why BestEffort Pods Get Rejected in Quota Namespaces
If a namespace has a ResourceQuota set for CPU or memory, BestEffort pods are rejected outright at admission — they cannot even start:
```shell
# Namespace has ResourceQuota:
kubectl get quota -n production
# NAME         AGE   REQUEST                LIMIT
# prod-quota   5d    requests.cpu: 14/32    limits.cpu: 28/64

# Try to create a BestEffort pod (no resources block)
kubectl apply -f no-resources-pod.yaml -n production
# Error from server (Forbidden): pods "risky-api" is forbidden:
# failed quota: prod-quota: must specify limits.cpu for: api;
# must specify limits.memory for: api;
# must specify requests.cpu for: api;
# must specify requests.memory for: api
```

This rejection behaviour is actually a feature: it forces developers to set resource values, which is what you want in any production namespace.
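For reference, a quota like `prod-quota` could be defined roughly as follows. This is a sketch: the CPU values match the `kubectl get quota` output shown earlier, while the memory values and the file name `prod-quota.yaml` are placeholders chosen for illustration.

```shell
#!/bin/sh
# Write a ResourceQuota manifest to a file so it can be reviewed before applying.
cat > prod-quota.yaml <<'EOF'
apiVersion: v1
kind: ResourceQuota
metadata:
  name: prod-quota
  namespace: production
spec:
  hard:
    requests.cpu: "32"
    limits.cpu: "64"
    requests.memory: 64Gi    # placeholder value, not taken from the output above
    limits.memory: 128Gi     # placeholder value, not taken from the output above
EOF

# Apply it once reviewed (requires cluster access):
# kubectl apply -f prod-quota.yaml
```

Because the quota tracks requests and limits for both CPU and memory, any container that omits one of them is rejected at admission.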
How to Fix BestEffort — Add Proper Resources
```yaml
# Correct version: Burstable QoS (most appropriate for most services)
spec:
  containers:
    - name: api
      image: registry.swiggy.in/api:v2.1.0
      resources:
        requests:
          cpu: "200m"       # Guaranteed minimum; the scheduler reserves this
          memory: "256Mi"   # Guaranteed minimum
        limits:
          cpu: "1000m"      # Can burst up to 5x request under low node load
          memory: "512Mi"   # Hard ceiling; OOMKilled if exceeded
# QoS class: Burstable (requests != limits)
```

```shell
# After adding resources, verify QoS class changed
kubectl get pod api-pod -n production \
  -o jsonpath='{.status.qosClass}'
# Output: Burstable
```

Finding and Fixing BestEffort Pods in an Existing Cluster
```shell
# Step 1: Find all BestEffort pods in the cluster
kubectl get pods -A -o json | \
  jq -r '.items[] | select(.status.qosClass=="BestEffort") | [.metadata.namespace, .metadata.name] | @tsv' | \
  column -t

# Step 2: Find which Deployments produce BestEffort pods
# (a pod is BestEffort only when EVERY container has no requests and no limits)
kubectl get deployments -A -o json | \
  jq -r '.items[]
    | select(all(.spec.template.spec.containers[];
        (.resources.requests // {}) == {} and (.resources.limits // {}) == {}))
    | [.metadata.namespace, .metadata.name] | @tsv' | \
  column -t

# Step 3: Check what the pods actually use right now (requires metrics-server)
kubectl top pods -n production
# Use this output to set informed requests and limits
# Rule of thumb: requests = average observed, limits = peak observed + 30%

# Step 4: Apply a LimitRange to auto-inject defaults for any pods that slip through
kubectl apply -f - <<EOF
apiVersion: v1
kind: LimitRange
metadata:
  name: prevent-besteffort
  namespace: production
spec:
  limits:
    - type: Container
      defaultRequest:
        cpu: "100m"
        memory: "128Mi"
      default:
        cpu: "500m"
        memory: "512Mi"
EOF
# Now any pod created without resources gets Burstable defaults injected.
# New pods in this namespace can no longer be BestEffort; pods that already
# exist keep their class until they are recreated.
```

**Warning:** BestEffort pods on a multi-tenant cluster at Razorpay or PhonePe are a stability risk for everyone. One BestEffort pod with a memory leak can cause node-level memory pressure that evicts Burstable pods from completely different teams. Always apply a `ResourceQuota` and `LimitRange` to every production namespace; this structurally prevents new BestEffort pods from being created.
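The sizing rule of thumb from Step 3 (requests = average observed, limits = peak observed + 30%) is simple arithmetic. A minimal sketch in shell, assuming both inputs are whole numbers in Mi read off `kubectl top`; `suggest_memory` is a hypothetical helper and the sample figures are invented.

```shell
#!/bin/sh
# Hypothetical helper: turn observed memory usage (in Mi) into suggested values.
# $1 = average observed usage, $2 = peak observed usage.
suggest_memory() {
  avg=$1
  peak=$2
  # peak + 30%, rounded up to the next whole Mi using integer arithmetic
  limit=$(( (peak * 130 + 99) / 100 ))
  echo "requests.memory=${avg}Mi limits.memory=${limit}Mi"
}

suggest_memory 180 400
# prints: requests.memory=180Mi limits.memory=520Mi
```

A single `kubectl top` reading is only a snapshot, so the inputs are best taken from usage observed across at least one peak period.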
**Remember:** BestEffort is not inherently bad for every use case. It is perfectly acceptable for local development clusters, one-off debugging pods (`kubectl run debug --image=busybox`), and non-critical batch jobs in non-production namespaces. The rule is simple: never BestEffort in production namespaces.
**Common Mistake:** Assuming that a pod running fine today will keep running fine tomorrow. A BestEffort pod can run for weeks without issue on a lightly loaded cluster. Then a traffic spike, or a new team member deploying a memory-hungry service, causes a node pressure event and your BestEffort pod is evicted with no warning. The eviction is recorded in the pod's status (reason `Evicted`) and in Kubernetes events, but standard monitoring dashboards often give no explanation for it.
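Evicted pods do leave a trace: they remain in the API with phase `Failed` and show `Evicted` in the STATUS column of `kubectl get pods` until they are deleted or garbage-collected. A minimal sketch for spotting them, assuming the default `kubectl get pods -A` column layout (NAMESPACE, NAME, READY, STATUS, ...); `evicted_pods` is a hypothetical helper and the sample rows are made up.

```shell
#!/bin/sh
# Hypothetical helper: filter "kubectl get pods -A" output down to evicted pods.
# Skips the header row and prints "namespace/name" for rows whose STATUS is "Evicted".
evicted_pods() {
  awk 'NR > 1 && $4 == "Evicted" { print $1 "/" $2 }'
}

# Against a live cluster:
# kubectl get pods -A | evicted_pods

# Sample input (made-up rows):
evicted_pods <<'EOF'
NAMESPACE    NAME            READY   STATUS    RESTARTS   AGE
production   risky-api-x1    0/1     Evicted   0          3m
production   orders-7d9f     1/1     Running   0          5d
EOF
# prints: production/risky-api-x1
```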
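**Tip:** Add a Prometheus alert for BestEffort pods appearing in production namespaces. Use the query `kube_pod_status_qos_class{qos_class="BestEffort", namespace=~"production|payments|orders"} == 1` and fire a Slack warning immediately. The `== 1` comparison matters because kube-state-metrics typically exports a 0/1 series for every QoS class of every pod, so without it the query also matches pods that are not BestEffort. This catches misconfigured deployments before they cause problems during peak traffic.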
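As a sketch of what such an alert could look like, the snippet below writes a Prometheus rule file. It assumes kube-state-metrics is being scraped; the file name, group name, alert name, and severity label are illustrative choices, and routing to Slack would be configured separately in Alertmanager.

```shell
#!/bin/sh
# Write an illustrative Prometheus alerting rule to a file.
cat > besteffort-alert.rules.yaml <<'EOF'
groups:
  - name: qos-hygiene                      # hypothetical group name
    rules:
      - alert: BestEffortPodInProduction   # hypothetical alert name
        # "== 1" keeps only series whose class is actually BestEffort
        expr: 'kube_pod_status_qos_class{qos_class="BestEffort", namespace=~"production|payments|orders"} == 1'
        labels:
          severity: warning
        annotations:
          summary: 'BestEffort pod {{ $labels.pod }} in namespace {{ $labels.namespace }}'
EOF

# Validate with promtool if it is installed:
# promtool check rules besteffort-alert.rules.yaml
```

With no `for:` clause, the alert fires as soon as a matching pod is observed.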