What is BestEffort? | DevOps Dictionary

BestEffort QoS — The Most Dangerous Pod Class in Production

What is BestEffort in Simple Terms?

BestEffort is the QoS class Kubernetes assigns to a pod when none of its containers set any CPU or memory requests or limits. The name sounds acceptable ("best effort"), but in production it means the pod has no resource guarantees and is among the first to be evicted when its node runs low on memory. It is the lowest of the three QoS classes, which are separate from the scheduling priority set through a PriorityClass.

Think of it like this: on a crowded Mumbai local train, Guaranteed passengers have reserved seats, Burstable passengers have standing room, and BestEffort passengers are told "we will let you on if there is space, and we will push you off first if we need room."

◈ DIAGRAM

+------------------------------------------+
| Node Memory Pressure Event               |
| Available memory drops below threshold   |
+------------------------------------------+
                    |
                    v
+------------------------------------------+
| kubelet eviction order:                  |
|                                          |
| 1. BestEffort pods killed FIRST          | <- No resources configured
| 2. Burstable pods killed second          | <- requests < limits
| 3. Guaranteed pods killed last           | <- requests == limits
+------------------------------------------+

How a Pod Gets BestEffort Classification

BestEffort is assigned automatically when there is no resources block at all on any container:

YAML

# This pod has BestEffort QoS — NEVER do this in production
apiVersion: v1
kind: Pod
metadata:
  name: risky-api
  namespace: production
spec:
  containers:
    - name: api
      image: registry.swiggy.in/api:v2.1.0
      ports:
        - containerPort: 8080
      # No resources block at all
      # Kubernetes classifies this as BestEffort automatically

Bash

# Verify the QoS class of any pod
kubectl get pod risky-api -n production \
  -o jsonpath='{.status.qosClass}'
# Output: BestEffort
 
# Find ALL BestEffort pods in your cluster right now
kubectl get pods -A -o json | \
  jq '.items[] | select(.status.qosClass=="BestEffort") | 
  {name: .metadata.name, namespace: .metadata.namespace}'
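
The classification rule itself is small enough to sketch in shell. The function below is a simplified illustration for a single-container pod, not the kubelet's implementation: `qos_class` is a hypothetical helper, it compares quantities as plain strings (so `1` and `1000m` would count as different), and it ignores multi-container pods, where every container has to qualify.

```shell
# Simplified QoS classification for a single-container pod.
# Arguments: cpu_request mem_request cpu_limit mem_limit ("" means not set).
qos_class() {
  local cpu_req="$1" mem_req="$2" cpu_lim="$3" mem_lim="$4"

  # Nothing set at all: BestEffort
  if [ -z "$cpu_req$mem_req$cpu_lim$mem_lim" ]; then
    echo BestEffort
    return
  fi

  # Kubernetes defaults a missing request to the matching limit
  [ -z "$cpu_req" ] && cpu_req="$cpu_lim"
  [ -z "$mem_req" ] && mem_req="$mem_lim"

  # Both limits set and equal to the requests: Guaranteed
  if [ -n "$cpu_lim" ] && [ -n "$mem_lim" ] &&
     [ "$cpu_req" = "$cpu_lim" ] && [ "$mem_req" = "$mem_lim" ]; then
    echo Guaranteed
  else
    echo Burstable
  fi
}

qos_class ""   ""    ""    ""       # prints BestEffort
qos_class 200m 256Mi 1000m 512Mi    # prints Burstable
qos_class 500m 512Mi 500m  512Mi    # prints Guaranteed
```

The authoritative answer for a real pod is always the `.status.qosClass` field shown above; the sketch only makes the rule easier to reason about.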

What Happens to BestEffort Pods at Swiggy Scale

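At 7pm during the dinner rush, memory usage on Swiggy's nodes spikes as the HPA adds pods rapidly. The kubelet on each node monitors available memory continuously. When it drops below the hard eviction threshold (`memory.available<100Mi` by default), the kubelet evicts pods with no grace period, starting with those that use more memory than they requested. BestEffort pods request nothing, so they are always in that first group: no warning, no graceful shutdown timer.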

◈ DIAGRAM

+------------------------------------------+
| Normal traffic: node at 60% memory       |
| BestEffort pod running fine              |
+------------------------------------------+
                    |
          Traffic spike at 7pm
                    v
+------------------------------------------+
| Memory climbs: node at 92% memory        |
| Available drops below eviction threshold |
+------------------------------------------+
                    |
                    v
+------------------------------------------+
| kubelet kills BestEffort pod immediately | <- No graceful shutdown
| Exit code: 137 (SIGKILL)                 |    Data in memory lost
+------------------------------------------+
                    |
                    v
+------------------------------------------+
| If inside a Deployment:                  |
| Deployment creates replacement pod       |
| Pod restarts from scratch                |
+------------------------------------------+
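
Evicted pods are left behind in the `Failed` phase with `status.reason` set to `Evicted` until they are garbage collected, so a past eviction can usually be confirmed after the fact. A minimal sketch, assuming `kubectl` access to the cluster; `list_evicted` is a hypothetical helper name:

```shell
# List pods the kubelet has evicted, as "namespace name reason" rows.
list_evicted() {
  kubectl get pods -A \
    --field-selector=status.phase=Failed \
    --no-headers \
    -o custom-columns=NS:.metadata.namespace,NAME:.metadata.name,REASON:.status.reason |
    awk '$3 == "Evicted"'   # keep only rows whose reason column is Evicted
}

# Usage: list_evicted
```

For a single pod from that list, `kubectl describe pod <name> -n <namespace>` shows the eviction message recorded in its status.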

Why BestEffort Pods Get Rejected in Quota Namespaces

If a namespace has a ResourceQuota on CPU or memory requests or limits, and no LimitRange supplies defaults, pods without a resources block are rejected outright at admission and are never created:

Bash

# Namespace has ResourceQuota:
kubectl get quota -n production
# NAME             AGE   REQUEST                    LIMIT
# prod-quota       5d    requests.cpu: 14/32        limits.cpu: 28/64
 
# Try to create a BestEffort pod (no resources block)
kubectl apply -f no-resources-pod.yaml -n production
# Error from server (Forbidden): pods "risky-api" is forbidden:
# failed quota: prod-quota: must specify limits.cpu for: api;
# must specify limits.memory for: api;
# must specify requests.cpu for: api;
# must specify requests.memory for: api

This rejection behaviour is actually a feature — it forces developers to set resource values, which is the correct behaviour for any production namespace.
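
For reference, a quota that triggers this behaviour could look like the sketch below. The name `prod-quota` and the CPU totals are taken from the output above; the memory totals are placeholder assumptions, and `quota_manifest` is a hypothetical helper that only prints the manifest so it can be reviewed first.

```shell
# Print a ResourceQuota manifest for the given namespace.
quota_manifest() {
  local namespace="$1"
  cat <<EOF
apiVersion: v1
kind: ResourceQuota
metadata:
  name: prod-quota
  namespace: ${namespace}
spec:
  hard:
    requests.cpu: "32"
    limits.cpu: "64"
    requests.memory: 64Gi    # assumed value
    limits.memory: 128Gi     # assumed value
EOF
}

quota_manifest production
# To apply: quota_manifest production | kubectl apply -f -
```

Because this quota constrains all four values, every container must specify all four, or inherit them from a LimitRange, for its pod to be admitted.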

How to Fix BestEffort — Add Proper Resources

YAML

# Correct version — Burstable QoS (most appropriate for most services)
spec:
  containers:
    - name: api
      image: registry.swiggy.in/api:v2.1.0
      resources:
        requests:
          cpu: "200m"      # Guaranteed minimum — scheduler reserves this
          memory: "256Mi"  # Guaranteed minimum
        limits:
          cpu: "1000m"     # Can burst up to 5x request under low node load
          memory: "512Mi"  # Hard ceiling — OOMKilled if exceeded
      # QoS class: Burstable (requests != limits)

Bash

# After adding resources, verify QoS class changed
kubectl get pod api-pod -n production \
  -o jsonpath='{.status.qosClass}'
# Output: Burstable
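
For a pod managed by a Deployment, the same values can also be set with `kubectl set resources` instead of editing the manifest by hand. A sketch, assuming the Deployment is named `api` (a hypothetical name; the source only shows the pod). `patch_resources` is likewise a hypothetical wrapper:

```shell
# Set the requests and limits from the example above on a Deployment.
# Without a -c flag the values apply to every container in the pod template.
patch_resources() {
  local deployment="$1" namespace="$2"
  kubectl set resources deployment "${deployment}" -n "${namespace}" \
    --requests=cpu=200m,memory=256Mi \
    --limits=cpu=1000m,memory=512Mi
}

# Usage: patch_resources api production
```

This changes the pod template, so the Deployment rolls out replacement pods, which should then report Burstable. The same values still belong in the manifest under version control so that the file and the cluster do not drift apart.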

Finding and Fixing BestEffort Pods in an Existing Cluster

Bash

# Step 1 — Find all BestEffort pods in the cluster
kubectl get pods -A -o json | \
  jq -r '.items[] | select(.status.qosClass=="BestEffort") |
  [.metadata.namespace, .metadata.name] | @tsv' | \
  column -t
 
# Step 2 — Find which Deployments produce BestEffort pods
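# (matches only when NO container sets any requests or limits;
#  init containers are not checked here)
kubectl get deployments -A -o json | \
  jq -r '.items[] |
  select(all(.spec.template.spec.containers[];
    ((.resources.requests // {}) + (.resources.limits // {})) == {})) |
  [.metadata.namespace, .metadata.name] | @tsv' | \
  column -t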
 
# Step 3 — Check what resource values they actually use right now
kubectl top pods -n production   # needs metrics-server; shows current usage only
# Use this output, ideally sampled across peak hours, to set informed requests and limits
# Rule of thumb: requests = average observed, limits = peak observed + 30%
 
# Step 4 — Apply LimitRange to auto-inject defaults for any pods that slip through
kubectl apply -f - <<EOF
apiVersion: v1
kind: LimitRange
metadata:
  name: prevent-besteffort
  namespace: production
spec:
  limits:
    - type: Container
      defaultRequest:
        cpu: "100m"
        memory: "128Mi"
      default:
        cpu: "500m"
        memory: "512Mi"
EOF
# Containers created without resources now get these defaults injected (Burstable)
# New pods in this namespace can no longer be BestEffort;
# existing pods keep their class until they are recreated
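
The sizing rule in Step 3 is plain arithmetic, so it can be scripted. The sketch below assumes you have collected usage samples yourself, one number per line (for example memory in Mi). `suggest_resources` is a hypothetical helper and the sample values are made up for illustration.

```shell
# Read one numeric usage sample per line on stdin and print
# request = average, limit = peak + 30%, both rounded to the nearest integer.
suggest_resources() {
  awk '
    NF { sum += $1; n++; if ($1 > peak) peak = $1 }
    END {
      if (n == 0) { print "no samples" > "/dev/stderr"; exit 1 }
      printf "request=%d limit=%d\n", int(sum / n + 0.5), int(peak * 1.3 + 0.5)
    }'
}

# Four made-up memory samples in Mi: average 250, peak 400
printf '%s\n' 180 220 200 400 | suggest_resources
# prints: request=250 limit=520
```

Treat the result as a starting point and revisit it after the service has been through a few real traffic peaks.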

COMMON MISTAKE / WARNING
**Warning:** BestEffort pods on a multi-tenant cluster at Razorpay or PhonePe are a stability risk for everyone. One BestEffort pod with a memory leak can cause node-level memory pressure that evicts Burstable pods from completely different teams. Always apply a `ResourceQuota` and a `LimitRange` to every production namespace: the LimitRange injects defaults and the quota rejects any pod that still lacks them, so new BestEffort pods cannot be created there.

REMEMBER THIS
**Remember:** BestEffort is not inherently bad for every use case. It is perfectly acceptable for local development clusters, one-off debugging pods (`kubectl run debug --image=busybox`), and non-critical batch jobs in non-production namespaces. The rule is simple: never BestEffort in production namespaces.

COMMON MISTAKE / WARNING
**Common Mistake:** Assuming that a pod running fine today will keep running fine tomorrow. A BestEffort pod can run for weeks without issue on a lightly loaded cluster. Then a traffic spike or a new team member deploying a memory-hungry service causes a node pressure event — and your BestEffort pod vanishes with no warning and no explanation in standard monitoring dashboards.

PLACEMENT PRO TIP
**Tip:** Add a Prometheus alert for BestEffort pods appearing in production namespaces. Use the query `kube_pod_status_qos_class{qos_class="BestEffort", namespace=~"production|payments|orders"} == 1` (the metric comes from kube-state-metrics, which exports a 0/1 series for each class, hence the `== 1`) and fire a Slack warning immediately. This catches misconfigured deployments before they cause problems during peak traffic.
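
As a starting point, that query could be wrapped in a Prometheus alerting rule like the sketch below. The `== 1` comparison is there because kube-state-metrics exports one 0/1 series per QoS class. The group name, alert name, `for` duration and severity label are all assumptions to adapt, and `besteffort_alert_rule` is a hypothetical helper that only prints the rule file.

```shell
# Print a Prometheus rule file that fires when a BestEffort pod
# exists in one of the listed namespaces.
besteffort_alert_rule() {
  # Quoted 'EOF' keeps the shell from expanding $labels in the template
  cat <<'EOF'
groups:
  - name: qos-class
    rules:
      - alert: BestEffortPodInProduction
        expr: kube_pod_status_qos_class{qos_class="BestEffort", namespace=~"production|payments|orders"} == 1
        for: 5m
        labels:
          severity: warning
        annotations:
          summary: "BestEffort pod {{ $labels.namespace }}/{{ $labels.pod }}"
EOF
}

besteffort_alert_rule
# To save: besteffort_alert_rule > besteffort-alert.yml
```

Delivery to Slack is configured separately in Alertmanager; the rule only decides when the alert fires.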