Configuring Persistent Volumes and Storage Classes in Kubernetes | DevOps Network

### Overview and What You Will Learn

Containers are ephemeral by design: when a container restarts, everything written to its own filesystem is lost. For databases, message queues, and any other stateful workload, this is catastrophic. Kubernetes solves this through PersistentVolumes (PV), PersistentVolumeClaims (PVC), and StorageClasses, a three-layer abstraction that decouples storage provisioning from storage consumption, so that data survives pod restarts, rescheduling, and node failures.

By the end of this guide you will be able to:

* Understand the PV, PVC, and StorageClass relationship and provisioning lifecycle
* Create StorageClasses for dynamic volume provisioning on AWS, GCP, and on-prem clusters
* Write PersistentVolumeClaims and mount volumes correctly inside pod specs
* Configure volume access modes and reclaim policies for different production workloads
* Troubleshoot PVCs stuck in the Pending state and volume mount failures

### Why This Matters in Production

Zerodha's trading platform stores order books, trade history, and user portfolio data in PostgreSQL running on Kubernetes. If the PostgreSQL pod restarts without a PersistentVolume, every trade record since the last external backup is lost — a regulatory violation and a catastrophic user trust failure.

At Hotstar, the video transcoding pipeline writes intermediate encoded segments to shared storage that multiple pods must read simultaneously. The wrong access mode on the PVC causes silent data corruption or outright mount failures. Understanding storage configuration is not optional for any engineer running stateful workloads on Kubernetes.

### Core Principles

The three-layer storage abstraction and how the layers compose:

| Role | Object | What it does |
|:---|:---|:---|
| Cluster admin | StorageClass | Defines HOW storage is provisioned: AWS EBS, GCP PD, NFS, local disk |
| Developer | PersistentVolumeClaim | Requests WHAT storage is needed: size, access mode, storage class |
| Application | Pod spec | Mounts the PVC as a volume at a path |

All three layers meet at the PersistentVolume (PV): the actual provisioned storage unit, created automatically by a StorageClass or manually by an admin for static provisioning.

Access modes, the most misunderstood configuration in Kubernetes storage:

* **ReadWriteOnce (RWO)**: one node can mount the volume read-write at a time. Used for: databases, single-instance stateful apps. Supported by: AWS EBS, GCP Persistent Disk, Azure Disk.
* **ReadWriteMany (RWX)**: multiple nodes can mount the volume read-write simultaneously. Used for: shared file storage, media assets, ML datasets. Supported by: NFS, AWS EFS, GCP Filestore, Azure Files. NOT supported by: EBS, GCP PD, Azure Disk.
* **ReadOnlyMany (ROX)**: multiple nodes can mount the volume read-only simultaneously. Used for: shared config files, static assets.
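The access-mode rules above can be turned into a small pre-flight check before writing a PVC. The sketch below is a hand-written lookup that encodes only the combinations this guide states; the backend labels (`ebs`, `gce-pd`, `azure-disk`, `nfs`, `efs`, `filestore`, `azure-files`) and the function name are illustrative assumptions, not identifiers from Kubernetes.

```bash
# supports_mode BACKEND MODE
# Prints "yes", "no", or "unknown" based on the combinations listed in this guide.
# BACKEND labels are hypothetical shorthand, not Kubernetes provisioner names.
supports_mode() {
  case "$1:$2" in
    ebs:RWO|gce-pd:RWO|azure-disk:RWO)             echo yes ;;
    ebs:RWX|gce-pd:RWX|azure-disk:RWX)             echo no ;;
    nfs:RWX|efs:RWX|filestore:RWX|azure-files:RWX) echo yes ;;
    *)                                             echo unknown ;;  # not covered here
  esac
}
```

For example, `supports_mode ebs RWX` prints `no`, which is the case that leads to mount failures when several transcoding pods on different nodes share one claim.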

### Detailed Step-by-Step Practical Lab

#### Step 1 — Inspect Available StorageClasses

```bash
kubectl get storageclasses

# Example output on an AWS EKS cluster:
# NAME                PROVISIONER             RECLAIMPOLICY   VOLUMEBINDINGMODE
# gp2 (default)       ebs.csi.aws.com         Delete          WaitForFirstConsumer
# gp3-encrypted       ebs.csi.aws.com         Retain          WaitForFirstConsumer
# efs-sc              efs.csi.aws.com         Retain          Immediate

# Inspect a specific StorageClass for full configuration
kubectl describe storageclass gp3-encrypted
```
 
> 📌 **Remember:** `RECLAIMPOLICY: Delete` means the underlying cloud disk is permanently deleted when the PVC is deleted. `RECLAIMPOLICY: Retain` keeps the disk even after PVC deletion — always use Retain for production databases.
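To find volumes that are still at risk, it can help to list every PV whose reclaim policy is `Delete`. The following is a minimal sketch: the function name is an assumption, and it only filters two-column text (`NAME POLICY`), so it can be tried without a cluster.

```bash
# pvs_with_delete_policy: reads "NAME POLICY" rows on stdin and prints the
# names of volumes whose reclaim policy is Delete.
pvs_with_delete_policy() {
  awk '$2 == "Delete" { print $1 }'
}

# Intended use against a live cluster (not executed here):
#   kubectl get pv --no-headers \
#     -o custom-columns=NAME:.metadata.name,POLICY:.spec.persistentVolumeReclaimPolicy \
#     | pvs_with_delete_policy
```

An existing volume can then be switched with `kubectl patch pv <pv-name> -p '{"spec":{"persistentVolumeReclaimPolicy":"Retain"}}'`, which changes the policy on that PV without touching the StorageClass.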
 
#### Step 2 — Create Production StorageClasses
 
```yaml
# storageclasses.yaml: define storage tiers for different workload types

# Tier 1: Fast encrypted SSD for databases (AWS gp3)
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: gp3-encrypted
  annotations:
    storageclass.kubernetes.io/is-default-class: "false"
provisioner: ebs.csi.aws.com
parameters:
  type: gp3
  iops: "6000"                  # 6000 IOPS, good for PostgreSQL/MySQL
  throughput: "250"             # 250 MiB/s throughput
  encrypted: "true"             # Encrypt at rest, required for financial data
  kmsKeyId: "arn:aws:kms:ap-south-1:123456789:key/zerodha-ebs-key"  # Placeholder: use your own KMS key ARN
reclaimPolicy: Retain           # NEVER auto-delete production database disks
allowVolumeExpansion: true      # Allow resizing PVCs without downtime
volumeBindingMode: WaitForFirstConsumer  # Provision in same AZ as pod
---
# Tier 2: Shared file storage for media assets (AWS EFS, supports RWX)
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: efs-shared
provisioner: efs.csi.aws.com
parameters:
  provisioningMode: efs-ap
  fileSystemId: fs-0a1b2c3d4e5f6789   # Your EFS filesystem ID
  directoryPerms: "700"
reclaimPolicy: Retain
volumeBindingMode: Immediate
---
# Tier 3: Fast local NVMe for temporary high-performance workloads
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: local-nvme
provisioner: kubernetes.io/no-provisioner   # No dynamic provisioning: PVs are created manually
volumeBindingMode: WaitForFirstConsumer
reclaimPolicy: Delete           # Applies only to dynamically provisioned PVs; manual local PVs keep their own policy
```
 
```bash
kubectl apply -f storageclasses.yaml
```
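The `local-nvme` class uses `kubernetes.io/no-provisioner`, so nothing creates volumes for it automatically: an admin has to register each disk as a PersistentVolume. The sketch below renders such a manifest from four arguments. The function name and the example values (`nvme-pv-1`, `worker-1`, `/mnt/nvme0`) are assumptions for illustration; `Retain` is used here because nothing would clean up a manually created local disk.

```bash
# render_local_pv NAME NODE PATH SIZE
# Prints a PersistentVolume manifest bound to one node's local disk.
render_local_pv() {
  cat <<EOF
apiVersion: v1
kind: PersistentVolume
metadata:
  name: $1
spec:
  capacity:
    storage: $4
  volumeMode: Filesystem
  accessModes:
    - ReadWriteOnce
  persistentVolumeReclaimPolicy: Retain
  storageClassName: local-nvme
  local:
    path: $3
  nodeAffinity:              # required for local volumes: pins the PV to its node
    required:
      nodeSelectorTerms:
        - matchExpressions:
            - key: kubernetes.io/hostname
              operator: In
              values:
                - $2
EOF
}
```

A possible invocation is `render_local_pv nvme-pv-1 worker-1 /mnt/nvme0 500Gi | kubectl apply -f -`, repeated once per disk.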
 
#### Step 3 — Create PersistentVolumeClaims for Different Workloads
 
```yaml
# pvcs.yaml: storage claims for different production workloads

# PVC for PostgreSQL database: single node, high IOPS, encrypted
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: postgres-data-pvc
  namespace: production
  labels:
    app: postgres
    team: platform
spec:
  accessModes:
    - ReadWriteOnce           # Only one node mounts at a time, correct for databases
  storageClassName: gp3-encrypted
  resources:
    requests:
      storage: 100Gi          # Start with 100 GiB; can expand later without downtime
---
# PVC for Hotstar video asset storage: multiple pods read/write simultaneously
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: video-assets-pvc
  namespace: transcoding
spec:
  accessModes:
    - ReadWriteMany           # Multiple transcoding pods mount simultaneously
  storageClassName: efs-shared
  resources:
    requests:
      storage: 5Ti            # 5 TiB for video asset storage
---
# PVC for Redis cache persistence: small, fast, single node
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: redis-data-pvc
  namespace: production
spec:
  accessModes:
    - ReadWriteOnce
  storageClassName: gp3-encrypted
  resources:
    requests:
      storage: 20Gi
```
 
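```bash
kubectl apply -f pvcs.yaml

# Check PVC status. Claims on a WaitForFirstConsumer class (gp3-encrypted here)
# stay Pending until a pod that mounts them is scheduled (Step 4); after that
# they move from Pending to Bound.
kubectl get pvc -n production
# Example output once the consuming pods are running:
# NAME                STATUS    VOLUME                                     CAPACITY
# postgres-data-pvc   Bound     pvc-a1b2c3d4-e5f6-7890-abcd-ef1234567890   100Gi
# redis-data-pvc      Bound     pvc-b2c3d4e5-f6a7-8901-bcde-f12345678901   20Gi
```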
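When a namespace holds many claims, scanning the status column by eye gets tedious. As a rough sketch, the filter below prints only the claims that are not yet `Bound`; the function name is an assumption, and it relies solely on the first two columns (name and status) of `kubectl get pvc --no-headers` output.

```bash
# unbound_pvcs: reads `kubectl get pvc --no-headers` rows on stdin and prints
# "NAME STATUS" for every claim whose status is not Bound.
unbound_pvcs() {
  awk 'NF >= 2 && $2 != "Bound" { print $1, $2 }'
}

# Intended use against a live cluster (not executed here):
#   kubectl get pvc -n production --no-headers | unbound_pvcs
```

An empty result means every claim in the namespace is bound. A `Pending` entry on a `WaitForFirstConsumer` class is expected until its pod is scheduled.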
 
#### Step 4 — Mount PVCs into Pod Specs
 
```yaml
# deployment-postgres.yaml: PostgreSQL StatefulSet with persistent storage
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: postgres
  namespace: production
spec:
  serviceName: postgres
  replicas: 1
  selector:
    matchLabels:
      app: postgres
  template:
    metadata:
      labels:
        app: postgres
    spec:
      securityContext:
        fsGroup: 999            # Group ID applied to the volume; postgres is UID/GID 999 in the Debian-based official image
      containers:
        - name: postgres
          image: postgres:15.4
          env:
            - name: POSTGRES_DB
              value: zerodha_trading
            - name: POSTGRES_USER
              value: rahul
            - name: POSTGRES_PASSWORD
              valueFrom:
                secretKeyRef:
                  name: postgres-credentials
                  key: password
            - name: PGDATA
              value: /var/lib/postgresql/data/pgdata   # Subdirectory avoids lost+found issue
          ports:
            - containerPort: 5432
          resources:
            requests:
              cpu: "500m"
              memory: "1Gi"
            limits:
              cpu: "2"
              memory: "4Gi"
          volumeMounts:
            - name: postgres-storage
              mountPath: /var/lib/postgresql/data     # PostgreSQL data directory
      volumes:
        - name: postgres-storage
          persistentVolumeClaim:
            claimName: postgres-data-pvc              # Reference the PVC by name
```
 
```bash
kubectl apply -f deployment-postgres.yaml

# Verify volume is mounted inside the pod
kubectl exec -it postgres-0 -n production -- df -h /var/lib/postgresql/data
# Filesystem      Size  Used Avail Use% Mounted on
# /dev/nvme1n1     98G  156M   98G   1% /var/lib/postgresql/data
```
 
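> ⚠️ **Security:** Set `securityContext.fsGroup` to the group ID (GID) your application process runs with. `fsGroup` is a group ID, not a user ID; in the Debian-based official PostgreSQL image the `postgres` user has UID 999 and GID 999, which is why 999 works here. Without it, the mounted volume is typically owned by root and your application process may fail to write to it, causing a crash that looks like a storage failure but is actually a permissions issue.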
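Rather than guessing the group ID, it can be read from the running container and compared with the configured `fsGroup`. This is a minimal sketch that parses the text printed by the standard `id` command; both function names are assumptions, and other image variants may use a different GID than 999.

```bash
# gid_of ID_OUTPUT: extracts the numeric primary GID from a line such as
# "uid=999(postgres) gid=999(postgres) groups=999(postgres)".
gid_of() {
  printf '%s\n' "$1" | sed -n 's/.* gid=\([0-9][0-9]*\).*/\1/p'
}

# fsgroup_matches FSGROUP ID_OUTPUT: succeeds when the GID equals FSGROUP.
fsgroup_matches() {
  [ "$(gid_of "$2")" = "$1" ]
}

# Intended use against a live cluster (not executed here):
#   fsgroup_matches 999 "$(kubectl exec postgres-0 -n production -- id postgres)"
```

A non-zero exit status suggests the `fsGroup` in the pod spec and the user inside the image have drifted apart, for instance after switching image variants.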
 
#### Step 5 — Expand a PVC Without Downtime
 
When your database grows beyond the initial allocation:
 
```bash
# Verify the StorageClass supports volume expansion
kubectl get storageclass gp3-encrypted -o jsonpath='{.allowVolumeExpansion}'
# true

# Edit the PVC to request more storage
kubectl patch pvc postgres-data-pvc -n production \
  --type='json' \
  -p='[{"op":"replace","path":"/spec/resources/requests/storage","value":"200Gi"}]'

# Watch the expansion happen
kubectl get pvc postgres-data-pvc -n production -w
# The CAPACITY column stays at 100Gi until the resize finishes, then shows 200Gi.
# While it runs, `kubectl describe pvc` lists conditions such as Resizing and
# FileSystemResizePending.

# Only if the driver cannot resize the filesystem online: restart the pod so
# the filesystem resize can complete
kubectl rollout restart statefulset/postgres -n production
```
 
> 💡 **Tip:** Volume expansion only works in one direction — you can increase a PVC's size but never decrease it. Always start with a reasonable baseline and use `allowVolumeExpansion: true` on your StorageClass so you can grow without recreating the PVC.
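Because expansion is one-way, a small guard that refuses non-growing requests can catch typos before they reach the API server. The sketch below builds the same JSON Patch used in this step; the function names are assumptions, and the parser is deliberately limited to whole-number `Gi` and `Ti` quantities.

```bash
# to_gib QUANTITY: converts a whole-number Gi or Ti quantity (100Gi, 5Ti) to GiB.
to_gib() {
  case "$1" in
    *Gi) echo "${1%Gi}" ;;
    *Ti) echo $(( ${1%Ti} * 1024 )) ;;
    *)   echo "unsupported quantity: $1" >&2; return 1 ;;
  esac
}

# expand_patch CURRENT REQUESTED: prints a JSON Patch that raises the PVC
# storage request, or fails when REQUESTED is not larger than CURRENT.
expand_patch() {
  cur=$(to_gib "$1") || return 1
  new=$(to_gib "$2") || return 1
  if [ "$new" -le "$cur" ]; then
    echo "refusing: $2 is not larger than $1" >&2
    return 1
  fi
  printf '[{"op":"replace","path":"/spec/resources/requests/storage","value":"%s"}]\n' "$2"
}
```

The output can be passed straight to the patch command, for example `kubectl patch pvc postgres-data-pvc -n production --type='json' -p="$(expand_patch 100Gi 200Gi)"`.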
 
#### Step 6 — Troubleshoot PVC Stuck in Pending State
 
```bash
# PVC not moving from Pending to Bound
kubectl get pvc postgres-data-pvc -n production
# NAME                STATUS    VOLUME   CAPACITY   ACCESS MODES
# postgres-data-pvc   Pending                                      ← stuck

# Step 1: Describe the PVC for the reason
kubectl describe pvc postgres-data-pvc -n production
# Events:
#   Warning  ProvisioningFailed  storageclass.storage.k8s.io "gp3-encrypted" not found
#     → StorageClass name is wrong or not installed
#   Warning  ProvisioningFailed  failed to provision volume:
#     InvalidParameterValue: The iops parameter is not supported for volume type gp2
#     → Wrong parameters for the volume type
#   Normal   WaitForFirstConsumer  waiting for first consumer to be created
#     → VolumeBindingMode is WaitForFirstConsumer:
#       PVC will stay Pending until a pod tries to mount it. This is normal.

# Step 2: Check if the CSI driver is running
kubectl get pods -n kube-system | grep ebs-csi
# ebs-csi-controller-xxx   6/6     Running   0   5d
# ebs-csi-node-xxx         3/3     Running   0   5d

# Step 3: Check CSI driver logs for provisioning errors
kubectl logs -n kube-system \
  -l app=ebs-csi-controller \
  -c csi-provisioner \
  --tail=50
```
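The three event messages in this step map to three different next actions, and that mapping is easy to encode for a runbook or an alert handler. The sketch below is only a hypothetical classifier over the message text shown in this guide; the function name and the short labels it prints are assumptions.

```bash
# diagnose_pvc_event MESSAGE: maps a PVC event message to a likely cause.
#   normal               -> WaitForFirstConsumer, binds once a pod is scheduled
#   missing-storageclass -> storageClassName is wrong or the class is not installed
#   bad-parameters       -> the provisioner rejected the StorageClass parameters
#   unknown              -> not covered here; check the CSI controller logs
diagnose_pvc_event() {
  case "$1" in
    *"waiting for first consumer"*) echo normal ;;
    *storageclass*"not found"*)     echo missing-storageclass ;;
    *InvalidParameterValue*)        echo bad-parameters ;;
    *)                              echo unknown ;;
  esac
}
```

Only `unknown` needs the CSI log check from Step 3; the other results point to a fix in the PVC or StorageClass manifest.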
 
#### Step 7 — Implement Volume Snapshots for Backup
 
```yaml
# volume-snapshot.yaml: take a point-in-time snapshot of the PostgreSQL volume
apiVersion: snapshot.storage.k8s.io/v1
kind: VolumeSnapshot
metadata:
  name: postgres-snapshot-20250525
  namespace: production
spec:
  volumeSnapshotClassName: csi-aws-vsc
  source:
    persistentVolumeClaimName: postgres-data-pvc    # Snapshot this PVC
```

```bash
kubectl apply -f volume-snapshot.yaml

# Check snapshot status
kubectl get volumesnapshot -n production
# NAME                           READYTOUSE   SOURCEPVC           RESTORESIZE   AGE
# postgres-snapshot-20250525     true         postgres-data-pvc   100Gi         2m

# Restore from snapshot into a new PVC
kubectl apply -f - <<EOF
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: postgres-data-restored
  namespace: production
spec:
  accessModes:
    - ReadWriteOnce
  storageClassName: gp3-encrypted
  resources:
    requests:
      storage: 100Gi
  dataSource:
    name: postgres-snapshot-20250525
    kind: VolumeSnapshot
    apiGroup: snapshot.storage.k8s.io
EOF
```
 
> 📌 **Remember:** Volume snapshots are crash-consistent, not application-consistent. For PostgreSQL, always run `pg_dump` or use `pg_basebackup` for application-consistent backups. Use volume snapshots as a fast recovery complement, not as your only backup strategy.
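The snapshot manifest in this step hard-codes the date in its name, which does not scale to repeated snapshots. As a sketch under the assumption that claims follow the `<app>-data-pvc` naming used in this guide, the helper below renders the same manifest with the date passed in; the function name is hypothetical and the `csi-aws-vsc` class is taken from the example above.

```bash
# render_snapshot PVC NAMESPACE DATE
# Prints a VolumeSnapshot manifest named <app>-snapshot-<DATE>, where <app> is
# the PVC name with its "-data-pvc" suffix removed. DATE is expected as YYYYMMDD.
render_snapshot() {
  cat <<EOF
apiVersion: snapshot.storage.k8s.io/v1
kind: VolumeSnapshot
metadata:
  name: ${1%-data-pvc}-snapshot-$3
  namespace: $2
spec:
  volumeSnapshotClassName: csi-aws-vsc
  source:
    persistentVolumeClaimName: $1
EOF
}
```

One way to use it is `render_snapshot postgres-data-pvc production "$(date +%Y%m%d)" | kubectl apply -f -`, run from whatever scheduler triggers the snapshots.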
 
### Production Best Practices & Common Pitfalls
 
* Always set `PGDATA=/var/lib/postgresql/data/pgdata` (a subdirectory of the mount) for PostgreSQL on Kubernetes. If `PGDATA` points at the mount root `/var/lib/postgresql/data`, initialization can fail because the volume root contains a `lost+found` directory and PostgreSQL refuses to initialize a non-empty data directory.
* Tag your PVCs with team and application labels. At scale, identifying which PVC belongs to which application becomes impossible without consistent labelling.
* Set up automated volume snapshot schedules using Velero or the cloud provider's native snapshot scheduler. A 100GB PostgreSQL volume with no snapshots is a single point of failure.
* Spot-check PVC usage with `kubectl exec <pod> -- df -h`, and alert at 80% full using volume metrics such as the kubelet's `kubelet_volume_stats_used_bytes` and `kubelet_volume_stats_capacity_bytes`. Kubernetes does not automatically expand volumes, and a full disk makes writes fail, which usually takes the database down.
* Never share a single RWO PVC between pods that may run on different nodes. Only one node can mount it at a time: pods on that node can share the volume, but a pod scheduled to a different node will stay in Pending or ContainerCreating indefinitely.
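The 80% rule from the list above can be checked mechanically from `df` output. The sketch below reads the text on stdin and prints any filesystem at or above the given threshold; the function name is an assumption, and it expects the usual six-column layout where column 5 is the use percentage and column 6 the mount point.

```bash
# volumes_over THRESHOLD: reads `df -P` (or `df -h`) output on stdin and prints
# "MOUNTPOINT USE%" for each filesystem at or above THRESHOLD percent.
volumes_over() {
  awk -v limit="$1" 'NR > 1 {
    use = $5
    sub(/%/, "", use)                 # strip the percent sign before comparing
    if (use + 0 >= limit + 0) print $6, $5
  }'
}

# Intended use against a live cluster (not executed here):
#   kubectl exec postgres-0 -n production -- df -P /var/lib/postgresql/data | volumes_over 80
```

This is suitable for ad hoc checks or a simple cron probe; continuous alerting is better served by metrics than by `kubectl exec`.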
 
> 🔴 **Common Mistake:** Using `reclaimPolicy: Delete` on production database StorageClasses. When a developer accidentally runs `kubectl delete pvc postgres-data-pvc`, the underlying cloud disk and all its data are permanently deleted within seconds. Always use `Retain` for any storage containing production data.
 
### Quick Reference & Troubleshooting Commands
 
| Command | Purpose |
|:---|:---|
| `kubectl get pvc -n <ns>` | List all PVCs and their binding status |
| `kubectl describe pvc <name> -n <ns>` | Full PVC details and provisioning events |
| `kubectl get pv` | List all PersistentVolumes cluster-wide |
| `kubectl get storageclass` | List available StorageClasses |
| `kubectl describe storageclass <name>` | Full StorageClass configuration |
| `kubectl patch pvc <name> -n <ns> --type='json' -p='[...]'` | Expand PVC size |
| `kubectl exec <pod> -n <ns> -- df -h` | Check disk usage inside a pod |
| `kubectl get volumesnapshot -n <ns>` | List volume snapshots |
| `kubectl logs -n kube-system -l app=ebs-csi-controller -c csi-provisioner` | Debug CSI provisioning failures |
| `kubectl get events -n <ns> --field-selector reason=ProvisioningFailed` | Filter storage provisioning failures |
