Explore hands-on topics grouped by their parent concept. Filter by hub to find exactly what you need.
Deploy Lambda functions with correct memory, concurrency controls, VPC access, RDS Proxy integration, and cold start optimisation for production use.
Configure Application, Network, and Gateway Load Balancers with health checks, SSL termination, sticky sessions, and Auto Scaling Group integration.
Configure Auto Scaling Groups with Launch Templates, scaling policies, and CloudWatch alarms to automatically adjust EC2 fleet size based on real demand.
Query raw S3 data with Athena using SQL, convert CSV to Parquet with Glue ETL, and build a governed data lake with Lake Formation for cost-optimised analytics.
Deploy containerised applications on ECS with Fargate, configure Task Definitions, IAM roles, Auto Scaling, and integrate with ALB for production traffic.
Run production Kubernetes workloads on EKS with Managed Node Groups, Fargate profiles, EBS/EFS storage, and AWS-native IAM integration via IRSA.
Decouple services with SQS queues, fan-out messages with SNS, and process real-time streaming data with Kinesis Data Streams and Firehose.
Implement layered AWS security with KMS encryption, secret rotation, WAF rules, DDoS protection, and intelligent threat detection with GuardDuty.
Configure Route 53 hosted zones, routing policies, health checks, and hybrid DNS endpoints to control global traffic routing and automatic failover.
Design and build a production VPC with public and private subnets, Internet Gateway, NAT Gateway, Security Groups, and VPC Flow Logs from scratch.
Distribute content globally with CloudFront edge caching and route latency-sensitive traffic through the AWS private network using Global Accelerator.
Implement AWS IAM with users, groups, roles, and JSON policies following least-privilege principles, and audit with Credentials Report and Access Advisor.
Manage multi-account AWS environments with Organizations, SCPs, IAM Identity Center SSO, Permission Boundaries, and cross-account role assumptions.
Design and implement the four AWS disaster recovery strategies based on RPO and RTO requirements, from simple S3 backups to active-active multi-site.
Design DynamoDB tables with the right primary key, capacity mode, and access patterns, and use Streams, Global Tables, DAX, and TTL for production workloads.
Configure RDS Multi-AZ, Read Replicas, Aurora Global, and automated backups for production relational database workloads with zero-downtime operations.
Master S3 bucket policies, versioning, replication, storage classes, lifecycle rules, encryption, and performance optimisation for production workloads.
Build reliable automated testing gates in CI — unit tests with coverage enforcement, integration tests with real dependencies, and E2E smoke tests post-deployment.
Design production-grade multi-stage pipelines with gate checks, artifact promotion, parallel job execution, and environment-specific deployment steps.
Learn what CI/CD actually means, how pipelines are structured into stages and jobs, and why automated delivery reduces release risk at production scale.
Build a complete production CI/CD pipeline in GitHub Actions — Docker build, testing, image scanning, ECR push, and Kubernetes deployment in one workflow.
Configure GitLab CI/CD pipelines with .gitlab-ci.yml — covering runners, environments, artifacts, cache, rules, and deployment to Kubernetes clusters.
Write production-grade declarative Jenkinsfiles with multi-stage pipelines, parallel execution, shared libraries, Docker agent builds, and credential management.
Implement blue-green and canary deployment strategies inside CI/CD pipelines — automating traffic switching, monitoring gates, and rollback triggers.
Use the four DORA metrics to measure deployment frequency, lead time, change failure rate, and MTTR in production.
Build environment promotion pipelines that automatically deploy to dev and staging, then require manual approval for production — with proper config management.
Implement GitOps using ArgoCD so Git is the single source of truth and the cluster always matches the repo state.
Secure CI/CD pipelines with OIDC authentication, proper secret management, and least-privilege runner permissions in production.
Implement fast rollback strategies for failed deployments — kubectl rollout undo, ArgoCD sync to prior revision, and automated metric-based rollback triggers.
Build a complete Docker CI/CD pipeline: multi-stage builds, vulnerability scanning with Trivy, push to ECR, and zero-downtime deploy using GitHub Actions.
Define and run multi-container applications with Docker Compose — covering the compose file schema, service configuration, and essential CLI commands.
Configure service health checks and correct startup ordering in Docker Compose so apps never start before their dependencies are ready.
Apply production-grade Compose patterns: restart policies, CPU/memory limits, environment separation, and config management for real deployments.
Build a complete local development stack with Docker Compose — API, PostgreSQL, Redis, and RabbitMQ — with hot reloading and clean config.
Use Docker Compose inside GitHub Actions to run integration tests against real databases, caches, and queues before every merge.
Manage persistent data in Docker using named volumes, bind mounts, and tmpfs — and learn exactly which one to use for production vs development.
Configure CPU and memory limits on Docker containers using cgroups to prevent resource exhaustion and ensure stable multi-container host performance.
Master the complete Docker container lifecycle — creating, running, stopping, restarting, and removing containers with confidence in production.
Learn how the Docker daemon, CLI client, containerd runtime, and runc work together to create and manage containers on a Linux host.
Implement a production image tagging strategy using semantic versions and git SHAs, and manage images across Docker Hub, ECR, and private registries.
Use multi-stage Dockerfiles to produce minimal production images by separating build dependencies from runtime artifacts, reducing image size by up to 90%.
Reduce Docker image sizes from gigabytes to megabytes using layer squashing, minimal base images, cache mounts, and build output analysis.
Write lean, secure, and cache-efficient Dockerfiles for production using layer ordering, .dockerignore, non-root users, and minimal base images.
Diagnose and fix common Docker container failures using logs, exec, inspect, and stats — the same debugging workflow used in production environments.
Understand Docker's embedded DNS server, how containers resolve each other by name inside user-defined networks, and how this maps to Kubernetes CoreDNS.
Master Docker's four network drivers and understand when to use each for local development, single-host production, and multi-host cluster deployments.
Scan Docker images for CVEs and vulnerabilities using Trivy and Snyk, integrate scanning into CI/CD pipelines, and enforce policies that block unsafe images.
Configure Docker log drivers, set rotation limits to protect disk space, and ship container logs to centralised systems like Loki or CloudWatch.
Keep credentials out of Docker images and Compose files using Docker Secrets, BuildKit secret mounts, and environment variable best practices.
Harden Docker containers for production by running as non-root, dropping Linux capabilities, and using read-only filesystems to reduce the attack surface.
Configure NGINX Ingress Controllers on Kubernetes to route production HTTP and HTTPS traffic with SSL termination, path routing, and rate limiting.
Learn how to correctly set CPU and memory requests and limits on Kubernetes pods to prevent OOMKills, CPU throttling, and noisy neighbour problems in production.
Diagnose and fix Kubernetes pod networking failures, DNS resolution issues, and CNI plugin misconfigurations using kubectl, netshoot, and network policy debugging tools.
Learn how to deploy new application versions to Kubernetes with zero downtime using rolling updates, blue-green switching, and canary traffic splitting.
Secure Kubernetes pods using Pod Security Standards, securityContext settings, and runtime controls that prevent privilege escalation and container breakout attacks.
By default, every pod in a Kubernetes cluster can talk to every other pod — regardless of namespace, team, or sensitivity. Network Policies are Kubernetes firewall rules that restrict this. They define which pods are allowed to send traffic to which other pods, and which external IPs can reach your services. Without them, a compromised frontend pod can directly connect to your production database.
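As a rough illustration of the restriction described above, the sketch below builds a NetworkPolicy manifest as a plain Python dictionary and prints it as JSON, which kubectl accepts alongside YAML. The policy name, the `prod` namespace, the `app: database` and `app: backend` labels, and port 5432 are hypothetical placeholders chosen for the example.

```python
import json


def build_db_ingress_policy(namespace="prod"):
    """Return a NetworkPolicy that lets only backend pods reach database pods.

    All names and labels here are illustrative assumptions.
    """
    return {
        "apiVersion": "networking.k8s.io/v1",
        "kind": "NetworkPolicy",
        "metadata": {"name": "db-allow-backend-only", "namespace": namespace},
        "spec": {
            # The policy applies to pods labelled app=database.
            "podSelector": {"matchLabels": {"app": "database"}},
            "policyTypes": ["Ingress"],
            "ingress": [
                {
                    # Only pods labelled app=backend may connect, and only on 5432.
                    "from": [{"podSelector": {"matchLabels": {"app": "backend"}}}],
                    "ports": [{"protocol": "TCP", "port": 5432}],
                }
            ],
        },
    }


policy = build_db_ingress_policy()
print(json.dumps(policy, indent=2))
```

Once a policy selects the database pods for ingress, traffic that no rule allows is dropped, so a frontend pod without the `app: backend` label can no longer connect. Enforcement depends on the cluster running a network plugin that supports NetworkPolicy.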
Set up a complete Kubernetes monitoring stack with Prometheus for metrics collection, Grafana for dashboards, and Alertmanager for notifications to Slack and PagerDuty.
Master all Kubernetes Service types — ClusterIP for internal traffic, NodePort for node-level access, and LoadBalancer for production external exposure with real examples.
Configure Kubernetes HPA to automatically scale pod replicas based on CPU, memory, and custom metrics to handle traffic spikes without manual intervention.
Diagnose and fix ImagePullBackOff and ErrImagePull errors in Kubernetes caused by registry authentication failures, incorrect image names, and network restrictions.
Master standard practices for troubleshooting Kubernetes pod OOMKilled and CrashLoopBackOff errors.
Implement Kubernetes RBAC with Roles, ClusterRoles, and ServiceAccounts to enforce least-privilege access across multi-team production clusters.
Securely manage Kubernetes Secrets and ConfigMaps in production using HashiCorp Vault, secret injection, encryption at rest, and RBAC access controls.
Configure PersistentVolumes, PersistentVolumeClaims, and StorageClasses in Kubernetes to provide durable storage for stateful workloads in production.
A Pod Disruption Budget (PDB) is a policy that tells Kubernetes the minimum number of pods that must stay running during voluntary disruptions that go through the eviction API, such as node drains and cluster upgrades. Without one, a node drain can evict all pods of a service at the same time, causing a complete outage.
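The arithmetic behind a `minAvailable` budget can be sketched in a few lines of Python. This is a simplified model for building intuition, not the controller's actual code; the function name and parameters are hypothetical, and `total_pods` stands in for the replica count the budget is resolved against.

```python
import math


def disruptions_allowed(healthy_pods, total_pods, min_available):
    """Return how many pods may be evicted right now under a minAvailable budget.

    min_available is either an integer or a percentage string such as "50%".
    """
    if isinstance(min_available, str) and min_available.endswith("%"):
        # Percentages are resolved against the total and rounded up.
        required = math.ceil(total_pods * int(min_available[:-1]) / 100)
    else:
        required = int(min_available)
    # Evictions are refused once the healthy count would drop below the minimum.
    return max(0, healthy_pods - required)


print(disruptions_allowed(healthy_pods=3, total_pods=3, min_available=2))      # 1
print(disruptions_allowed(healthy_pods=2, total_pods=3, min_available=2))      # 0
print(disruptions_allowed(healthy_pods=7, total_pods=7, min_available="50%"))  # 3
```

With three healthy replicas and `minAvailable: 2`, a drain can evict one pod and must then wait for a replacement to become healthy before evicting the next.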
Configure liveness, readiness, and startup probes in Kubernetes to eliminate downtime during rolling deployments and protect production traffic from unhealthy pods.
Kubernetes Jobs and CronJobs run tasks that are meant to complete, not run forever like a web server. A Job runs a pod to completion and exits cleanly. A CronJob runs it on a schedule. Database migrations, report generation, and data pipelines at companies like Swiggy or Razorpay that need Kubernetes-level reliability typically run as one of these two.
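To make the relationship between the two concrete, here is a minimal sketch that builds a CronJob manifest as a Python dictionary: the CronJob wraps a Job template, and the Job template wraps an ordinary pod spec. The job name, schedule, image, and command are hypothetical values picked for illustration.

```python
import json


def build_cronjob(name, schedule, image, command):
    """Return a CronJob manifest that runs `command` in `image` on `schedule`."""
    return {
        "apiVersion": "batch/v1",
        "kind": "CronJob",
        "metadata": {"name": name},
        "spec": {
            "schedule": schedule,           # standard five-field cron expression
            "concurrencyPolicy": "Forbid",  # skip a run if the previous one is still going
            "jobTemplate": {
                "spec": {
                    "backoffLimit": 3,      # retries before the Job is marked failed
                    "template": {
                        "spec": {
                            # Job pods must use OnFailure or Never, not Always.
                            "restartPolicy": "OnFailure",
                            "containers": [
                                {"name": name, "image": image, "command": command}
                            ],
                        }
                    },
                }
            },
        },
    }


# Hypothetical nightly report at 02:00.
cronjob = build_cronjob(
    name="nightly-report",
    schedule="0 2 * * *",
    image="example/report-runner:1.0",
    command=["python", "generate_report.py"],
)
print(json.dumps(cronjob, indent=2))
```

Taking only the contents of `jobTemplate` and giving it `kind: Job` yields the one-off version of the same task.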
A Kubernetes pod can run more than one container. Two patterns govern how these containers are used: Init Containers run sequentially before your app starts — used for setup tasks and dependency checks. Sidecars run alongside your app for the entire pod lifetime — used for logging, proxying, and metrics collection. Understanding both patterns eliminates entire categories of startup bugs and observability gaps.
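The two patterns can be seen side by side in a single pod spec. The sketch below assembles one as a Python dictionary, assuming a hypothetical `db` Service to wait for and made-up image names. It uses the classic sidecar form, where the sidecar is simply a second entry under `containers`.

```python
import json


def build_pod_with_init_and_sidecar():
    """Return a Pod manifest with one init container and one logging sidecar."""
    shared_logs = {"name": "logs", "mountPath": "/var/log/app"}
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {"name": "web"},
        "spec": {
            # Init containers run one after another and must all succeed
            # before any container in `containers` starts.
            "initContainers": [
                {
                    "name": "wait-for-db",
                    "image": "busybox:1.28",
                    "command": ["sh", "-c", "until nslookup db; do sleep 2; done"],
                }
            ],
            # The app and the sidecar start together and share the logs volume.
            "containers": [
                {"name": "app", "image": "example/web:1.0", "volumeMounts": [shared_logs]},
                {"name": "log-shipper", "image": "example/log-shipper:1.0", "volumeMounts": [shared_logs]},
            ],
            "volumes": [{"name": "logs", "emptyDir": {}}],
        },
    }


pod = build_pod_with_init_and_sidecar()
print(json.dumps(pod, indent=2))
```

The shared `emptyDir` volume is what lets the sidecar read log files that the app container writes.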
Deploy and manage stateful database workloads on Kubernetes using StatefulSets with stable network identities, ordered scaling, and persistent storage.
Understand the Unix permission model and manage file permissions and ownership on production Linux servers using chmod, chown, and ACLs.
Master Linux filesystem hierarchy, absolute and relative paths, and the navigation commands used on every production server.
Manage software installation, updates, and removal across Debian and Red Hat Linux distributions using apt, yum, and dnf in production environments.
Manage Linux users, groups, and sudo access to enforce least-privilege access control on production servers shared across engineering teams.
Configure production Linux firewalls using iptables and ufw — implement default-deny rules, protect SSH, handle Docker port exposure, and persist rules across reboots.
Master SSH key pair generation, ~/.ssh/config setup, bastion host access, port forwarding, and production sshd hardening for secure Linux server management.
Apply a production security hardening baseline to Linux servers covering SSH, users, packages, kernel parameters, fail2ban, and automated compliance scanning with lynis.
Use Linux networking tools to configure interfaces, diagnose connectivity failures, inspect DNS, capture packets, and test HTTP in production environments.
Master Linux process inspection and control using ps, top, htop, kill, and signals to diagnose and manage production server workloads.
Master systemd for production service management — start, stop, enable services, write unit files for Node.js apps, and diagnose failures with journalctl.
Diagnose CPU, memory, disk I/O, and network bottlenecks on production Linux servers using vmstat, iostat, free, iotop, ss, and dstat.
Schedule reliable automated tasks on Linux servers using cron and systemd timers with proper logging, error handling, and production-grade automation patterns.
Master Linux environment variables, PATH management, dotfile configuration, shell startup file ordering, and direnv for consistent production and development environments.
Process and transform text on Linux servers using grep, awk, sed, cut, sort, uniq, and jq — the tools for parsing logs, extracting fields, and analysing JSON in production.
Write production-grade shell scripts for DevOps — deployment pipelines, health checks, retry logic, Slack notifications, and CI/CD automation patterns used at scale.
Write reliable Bash scripts from scratch — variables, quoting, conditionals, loops, functions, and production error handling with set -euo pipefail and trap.
Provision a production-ready AWS infrastructure stack with Terraform — VPC, subnets, security groups, EC2 instances, S3 buckets, and RDS databases.
Use Terraform data sources to read existing cloud resources — fetching AMI IDs, VPC details, and secrets without importing them into your state file.
Master Terraform HCL syntax — writing reusable variables, capturing outputs, using locals for calculations, and writing expressions and conditionals.
Write your first Terraform configuration from scratch — installing Terraform, configuring the AWS provider, and provisioning real infrastructure.
Set up Atlantis to automate Terraform plan and apply through GitHub pull requests, giving every team member a safe, auditable way to change infrastructure.
Automate Terraform with GitHub Actions — running terraform plan on pull requests and terraform apply on merge, with OIDC authentication and PR comments.
Detect and fix infrastructure drift — when someone changes a resource in the AWS console and your Terraform state no longer matches reality.
Use the Terraform Registry to find battle-tested community modules for AWS, GCP, and Azure — and learn how to publish and version your own private modules.
Organise Terraform code for large teams — directory-per-environment patterns, Terraform workspaces, and Terragrunt for DRY multi-environment infrastructure.
Write your first Terraform module from scratch — the right directory structure, input variables, outputs, and the patterns platform teams use at scale.
Import AWS resources created manually or outside Terraform into your state file so Terraform can manage them going forward without recreating them.
Configure Terraform remote state on AWS S3 with DynamoDB locking so your whole team can safely run Terraform without overwriting each other's changes.
Understand Terraform state — what the tfstate file contains, why it is the source of truth for your infrastructure, and how to inspect and manage it safely.