kube-proxy: Extended Technical Detail
What is kube-proxy in Simple Terms?
When you create a Kubernetes Service with a ClusterIP, that IP address is virtual: no actual network interface has it. kube-proxy is the component that makes that virtual IP work by programming routing rules on every node, so traffic sent to the Service IP gets forwarded to a real, live pod IP.
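To make the forwarding step a little more concrete: in iptables mode, kube-proxy spreads new connections across a Service's endpoints with a chain of rules, each using a random-probability match. Rule i of n only sees traffic that the earlier rules did not take, so it has to match 1/(n-i+1) of what reaches it for every endpoint to end up with an equal share. The sketch below only reproduces that arithmetic; the function name `endpoint_probabilities` is made up for illustration and nothing here talks to a cluster.

```shell
# Print the per-rule match probability that gives each of N endpoints
# an equal share of new connections (illustrative arithmetic only).
endpoint_probabilities() {
  n="$1"
  LC_ALL=C awk -v n="$n" 'BEGIN {
    for (i = 1; i <= n; i++) {
      # Rule i is reached only if rules 1..i-1 did not match,
      # so it must take 1/(n-i+1) of the remaining traffic.
      printf "rule %d: probability %.5f\n", i, 1 / (n - i + 1)
    }
  }'
}

# Three endpoints, as in the payments-svc example
endpoint_probabilities 3
```

For three endpoints this yields 1/3, then 1/2, then 1: the last rule takes whatever is left.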
```text
+------------------------------------------+
| Service created: payments-svc            |
| ClusterIP: 10.96.45.200, port 3000       |  <- Virtual IP: no interface owns this
+------------------------------------------+
                     |
                     v
+------------------------------------------+
| kube-proxy watches API server            |  <- Detects new Service and its Endpoints
+------------------------------------------+
                     |
                     v
+------------------------------------------+
| kube-proxy writes rules on EVERY node    |
|                                          |
| 10.96.45.200:3000 -> 10.244.1.15:3000    |  <- pod-1 on mumbai-worker-1
|                   -> 10.244.2.8:3000     |  <- pod-2 on mumbai-worker-2
|                   -> 10.244.3.22:3000    |  <- pod-3 on mumbai-worker-3
+------------------------------------------+
                     |
                     v
+------------------------------------------+
| Any pod on any node reaches Service IP   |  <- Transparent load balancing
+------------------------------------------+
```

kube-proxy Modes: iptables vs IPVS
```text
+--------------------------+            +--------------------------------+
| iptables mode            |            | IPVS mode                      |
| (default)                |            | (high-performance)             |
|                          |            |                                |
| Linear chain scan        |  <------>  | Hash table lookup              |
| O(n) per new connection  |            | O(1) per new connection        |
|                          |            |                                |
| Sufficient up to         |            | Recommended for 1000+ Services |
| ~1000 Services           |            | (PhonePe, Razorpay scale)      |
|                          |            |                                |
| No choice of balancing   |            | Round-robin, least-conn,       |
| algorithm                |            | source-hash available          |
+--------------------------+            +--------------------------------+
```

Check kube-proxy Status and Mode
```bash
# kube-proxy runs as a DaemonSet: one pod per node
kubectl get pods -n kube-system -l k8s-app=kube-proxy -o wide

# Output:
# NAME               READY   STATUS    NODE
# kube-proxy-4xj9p   1/1     Running   mumbai-worker-1
# kube-proxy-7kmnz   1/1     Running   mumbai-worker-2
# kube-proxy-9plvw   1/1     Running   mumbai-worker-3

# Check which mode kube-proxy is running in
kubectl logs -n kube-system kube-proxy-4xj9p | grep -i "using\|proxier"
# Output: Using iptables Proxier

# View the iptables rules kube-proxy has written for a Service
# (run on a node, as root; -n prints numeric addresses so the grep can match the IP)
iptables -t nat -L KUBE-SERVICES -n | grep 10.96.45.200
```

Switching kube-proxy to IPVS Mode
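Before changing the mode, it can help to confirm that the IPVS kernel modules are present on each node, since kube-proxy falls back to iptables when they are missing. The helper below is a minimal sketch: `missing_ipvs_modules` is a hypothetical name, the module list is the one from the troubleshooting table in this section, and the function reads an `lsmod`-style listing on stdin so it can be tried without a real node. Modules compiled directly into the kernel do not appear in `lsmod`, so treat a "missing" result as a hint rather than proof.

```shell
# Print the IPVS-related modules that do not appear in an lsmod-style listing.
# The listing is read from stdin.
missing_ipvs_modules() {
  listing="$(cat)"
  for mod in ip_vs ip_vs_rr ip_vs_wrr ip_vs_sh; do
    # lsmod prints the module name in the first column
    if ! printf '%s\n' "$listing" |
        awk -v m="$mod" '$1 == m { found = 1 } END { exit !found }'; then
      printf '%s\n' "$mod"
    fi
  done
}

# Sample listing (made-up sizes) in which two modules are absent
sample='Module                  Size  Used by
ip_vs_rr               16384  0
ip_vs                 176128  2 ip_vs_rr
nf_conntrack          172032  1 ip_vs'

printf '%s\n' "$sample" | missing_ipvs_modules
```

On a node, the same function would be fed real data with `lsmod | missing_ipvs_modules`; empty output means all four modules were found.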
```bash
# Edit the kube-proxy ConfigMap to switch to IPVS
kubectl edit configmap kube-proxy -n kube-system
```

```yaml
# Find and update the mode field:
apiVersion: v1
kind: ConfigMap
metadata:
  name: kube-proxy
  namespace: kube-system
data:
  config.conf: |
    apiVersion: kubeproxy.config.k8s.io/v1alpha1
    kind: KubeProxyConfiguration
    mode: "ipvs"        # Change from "" or "iptables" to "ipvs"
    ipvs:
      scheduler: "rr"   # Round-robin (options: rr, lc, dh, sh, sed, nq)
```

```bash
# After editing the ConfigMap, restart all kube-proxy pods to pick up the change
kubectl rollout restart daemonset kube-proxy -n kube-system
kubectl rollout status daemonset kube-proxy -n kube-system

# Verify the switch worked. The restart replaces the pods, so the old pod
# names no longer exist; select the new pods by label instead.
kubectl logs -n kube-system -l k8s-app=kube-proxy --tail=-1 | grep -i "using\|proxier"
# Output: Using ipvs Proxier

# View the IPVS virtual server table (run on a node, as root)
ipvsadm -Ln | grep 10.96.45.200
```

How kube-proxy Handles Pod Churn
When pods are added or removed (a Deployment scales up or down, or a rolling update replaces pods), kube-proxy updates its rules as soon as it observes the change:
```text
+------------------------------------------+
| Deployment scales from 3 to 5 replicas   |  <- kubectl scale or HPA triggers
+------------------------------------------+
                     |
                     v
+------------------------------------------+
| Endpoints / EndpointSlices are updated   |  <- New pod IPs added once the pods are Ready
+------------------------------------------+
                     |
                     v
+------------------------------------------+
| kube-proxy detects Endpoints change      |  <- Watches API server continuously
+------------------------------------------+
                     |
                     v
+------------------------------------------+
| iptables/IPVS rules updated on all nodes |  <- New pods start receiving traffic
+------------------------------------------+
```

What kube-proxy Does NOT Handle
```text
+-----------------------------+
| kube-proxy handles          |  <- ClusterIP, NodePort, LoadBalancer (internal rules)
+-----------------------------+
| NOT handled by kube-proxy   |  <- Ingress traffic routing
+-----------------------------+
| NOT handled by kube-proxy   |  <- DNS resolution (that is CoreDNS)
+-----------------------------+
| NOT handled by kube-proxy   |  <- NetworkPolicy enforcement (that is CNI: Calico/Cilium)
+-----------------------------+
| NOT handled by kube-proxy   |  <- Pod-to-pod routing across nodes (that is CNI overlay)
+-----------------------------+
```

Troubleshooting Common kube-proxy Problems
| Problem | Symptom | Fix |
|---|---|---|
| Service ClusterIP unreachable | `curl 10.96.45.200:3000` times out from inside a pod | kube-proxy pod may be crashing; check `kubectl logs -n kube-system kube-proxy-xxxxx` |
| kube-proxy pod in CrashLoopBackOff | Logs show `failed to sync iptables rules` | Check whether the node has iptables available and the ip_tables kernel module loaded, starting from the output of `lsmod` |
| IPVS mode not activating | Logs show fallback to iptables despite config change | IPVS kernel modules not loaded; run `modprobe -a ip_vs ip_vs_rr ip_vs_wrr ip_vs_sh` on the node (`-a` is needed to load several modules in one call) |
| Stale routing after pod deletion | Traffic to deleted pod IP causes connection refused | kube-proxy sync delay; check the `--iptables-sync-period` flag (default 30s) and reduce it to 10s for high-churn workloads |
| NodePort not accessible externally | Internal ClusterIP works but NodePort times out | Firewall or security group blocking the NodePort range (30000-32767); open the range on the cloud provider |
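For the NodePort row, a quick sanity check is whether the port being tested falls inside the default range at all, before looking at firewalls. The function below is a small sketch with a hypothetical name, `in_default_nodeport_range`. It assumes the cluster still uses the default 30000-32767 range; clusters that override the range on the API server would need different bounds.

```shell
# Succeed if the given port is inside the default NodePort range (30000-32767).
# Returns 2 for input that is not a plain non-negative integer.
in_default_nodeport_range() {
  port="$1"
  case "$port" in
    ''|*[!0-9]*) return 2 ;;
  esac
  [ "$port" -ge 30000 ] && [ "$port" -le 32767 ]
}

# Example ports (hypothetical values)
for p in 30080 8080; do
  if in_default_nodeport_range "$p"; then
    echo "$p: inside the default NodePort range"
  else
    echo "$p: outside the default NodePort range"
  fi
done
```

A port outside the range points at a Service definition problem rather than a blocked firewall rule.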
Tip: For high-traffic clusters like PhonePe's payment processing nodes handling 1000+ Services, switch kube-proxy to IPVS mode. IPVS uses hash tables instead of linear iptables chains. At 5000 Services, iptables packet processing can add milliseconds of latency per request while IPVS stays at sub-millisecond.
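The difference between a linear scan and a hash lookup can be illustrated with a toy model. This is not how the kernel implements either mode: it only counts how many key comparisons it takes to find one Service among N entries when walking a list, versus a single probe of an associative array. The function name `lookup_cost` and the `svc-N` keys are invented for the example.

```shell
# Toy model: comparisons needed to find one Service among N entries.
lookup_cost() {
  n="$1"        # number of Services (hypothetical)
  target="$2"   # index of the Service being looked up, 1..n
  LC_ALL=C awk -v n="$n" -v t="$target" 'BEGIN {
    for (i = 1; i <= n; i++) { key[i] = "svc-" i; table["svc-" i] = i }
    want = "svc-" t
    # Linear scan: walk the list in order until the key matches.
    for (i = 1; i <= n; i++) { linear++; if (key[i] == want) break }
    # Hash lookup: a single probe of the associative array.
    if (want in table) hashed = 1
    printf "linear=%d hash=%d\n", linear, hashed
  }'
}

# Worst case for the scan: the Service is the last of 5000 entries
lookup_cost 5000 5000
```

In the worst case the scan cost grows with the number of Services, while the hash probe count stays constant.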
Remember: kube-proxy does not handle Ingress traffic. It only manages internal cluster Service routing (ClusterIP, NodePort, LoadBalancer backend rules). External traffic routing from the internet into the cluster is handled by the Ingress Controller (NGINX, Traefik) and the cloud provider's load balancer.
Security: On Razorpay's production cluster, kube-proxy's iptables rules are a critical security boundary. Any node compromise that gives an attacker root access allows them to modify the iptables rules on that node directly, redirecting Service traffic to attacker-controlled pods. Use the NodeRestriction admission plugin to limit what a compromised node's kubelet credentials can modify, and enable audit logging on the kube-proxy ConfigMap to detect tampering.
Common Mistake: Disabling kube-proxy entirely when adopting eBPF-based CNIs (like Cilium) without enabling Cilium's kube-proxy replacement mode first. Removing kube-proxy with no replacement leaves all Service ClusterIPs non-functional, which is a full cluster networking outage. Always enable kubeProxyReplacement: strict in the Cilium config (newer Cilium releases spell this value true) before removing the kube-proxy DaemonSet.
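A pre-flight check along these lines could guard the removal step. This is a sketch under stated assumptions: the function names are hypothetical, and it treats `strict` (the value named above) and `true` (the spelling newer Cilium releases are understood to use) as meaning that Cilium handles Services itself. It only inspects a string passed to it and does not contact a cluster.

```shell
# Succeed if the given kube-proxy-replacement value means Cilium handles Services.
cilium_replaces_kube_proxy() {
  case "$1" in
    strict|true) return 0 ;;
    *) return 1 ;;
  esac
}

# Print a go/no-go line for removing the kube-proxy DaemonSet.
safe_to_remove_kube_proxy() {
  if cilium_replaces_kube_proxy "$1"; then
    echo "ok: kube-proxy replacement is enabled ($1)"
  else
    echo "stop: kube-proxy replacement is not enabled (${1:-unset})"
    return 1
  fi
}

# Example values
safe_to_remove_kube_proxy strict
safe_to_remove_kube_proxy false || true
```

On a real cluster the value would have to be read from wherever Cilium's configuration lives. With a default Helm install that is probably the `kube-proxy-replacement` key of the `cilium-config` ConfigMap in `kube-system`, but that location is an assumption to verify against the installed version.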