Understanding CI/CD — Pipelines, Stages, and the Delivery Lifecycle | DevOps Network

Overview and What You Will Learn

Before CI/CD, shipping software was an event. Teams would accumulate changes for weeks, merge everything together, cross their fingers, and deploy on a Friday night. When something broke — and something always broke — engineers spent the weekend debugging. This model does not scale. It does not scale to the deployment frequency Razorpay needs (dozens per day), to the team size Hotstar operates at (hundreds of engineers), or to the risk tolerance of a payment company where downtime costs real money every minute.

CI/CD — Continuous Integration and Continuous Delivery — replaces the deployment event with a deployment process. Every code change is automatically built, tested, and prepared for delivery. Releases become routine rather than risky.

By the end of this topic you will:

Explain precisely how Continuous Integration (CI), Continuous Delivery (CD), and Continuous Deployment (CDP) differ
Read a pipeline and identify every stage, job, trigger, and artifact
Apply the fail-fast principle to pipeline stage ordering
Understand why pipeline-as-code matters for team workflows
Recognise the cost of a slow or flaky pipeline

Why This Matters in Production

A 45-minute pipeline is a broken team workflow. When engineers wait 45 minutes to know if their code works, they context-switch away, start something new, and forget what they were debugging when the failure finally arrives. The cognitive cost of slow pipelines is enormous.

A 10-minute pipeline is a superpower. Engineers get near-instant feedback, fix issues while the context is fresh, and deploy with confidence multiple times a day. At Zerodha, fast pipelines are what make high deployment frequency possible — and high deployment frequency is what keeps each individual release small, low-risk, and easy to roll back.

Core Principles

CI vs CD vs Continuous Deployment:

◈ DIAGRAM

+------------------------------------------+
| Continuous Integration (CI)              |
|                                          |
| Every commit is automatically:           |
|   - Built (compiled / packaged)          |
|   - Tested (unit + integration tests)    |
|   - Validated (lint, format, scan)       |
|                                          |
| Goal: detect integration problems fast   |
+------------------------------------------+
                    |
                    v
+------------------------------------------+
| Continuous Delivery (CD)                 |
|                                          |
| Every change that passes CI is:          |
|   - Packaged as a deployable artifact    |
|   - Deployed to staging automatically    |
|   - Ready for production deployment      |
|                                          |
| Human still approves production deploy   |
+------------------------------------------+
                    |
                    v
+------------------------------------------+
| Continuous Deployment (CDP)              |
|                                          |
| Every change that passes all gates is:   |
|   - Deployed to production automatically |
|   - No manual approval gate              |
|                                          |
| Requires: excellent test coverage,       |
| fast rollback, feature flags             |
+------------------------------------------+
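
The three boxes differ only in how far a passing change travels without a person stepping in. The sketch below models that as a small function; the `Mode` enum, the `furthest_stage` name, and the stage labels are illustrative assumptions, not part of any CI tool.

```python
from enum import Enum


class Mode(Enum):
    CI = "continuous integration"
    DELIVERY = "continuous delivery"
    DEPLOYMENT = "continuous deployment"


def furthest_stage(mode, checks_passed, human_approved=False):
    """Return how far a single change travels under the given practice."""
    if not checks_passed:
        return "rejected"      # a failed gate stops the change in every mode
    if mode is Mode.CI:
        return "verified"      # built and tested, nothing is deployed
    if mode is Mode.DELIVERY:
        # staging is automatic, production waits for a person
        return "production" if human_approved else "staging"
    return "production"        # continuous deployment has no manual gate


for mode in Mode:
    print(mode.name, "->", furthest_stage(mode, checks_passed=True))
# → CI -> verified
# → DELIVERY -> staging
# → DEPLOYMENT -> production
```

The only input that separates Continuous Delivery from Continuous Deployment is `human_approved`, which is exactly the manual gate shown in the diagram.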

Pipeline anatomy — every component explained:

◈ DIAGRAM

Trigger                                    Artifact
   |                                          |
   v                                          v
[Git push] -> [Build] -> [Test] -> [Scan] -> [Deploy]
                |            |         |
              Job 1        Job 2     Job 3
           compile      unit-test  trivy-scan
                         Job 4
                        lint        <- parallel with Job 2

The fail-fast principle in practice:

◈ DIAGRAM

Fastest checks first -- most expensive checks last:
 
1. Lint and format (10 seconds)    <- cheapest, catches typos
2. Unit tests (2 minutes)          <- fast, catches logic errors
3. Build Docker image (3 minutes)  <- medium, needs successful tests
4. Integration tests (5 minutes)   <- slower, needs running service
5. Security scan (4 minutes)       <- parallel with integration
6. Deploy to staging (2 minutes)   <- only after all above pass
 
Total time: ~12 minutes to staging
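
The ~12 minute total is the length of the longest dependency chain, not the sum of every stage, because the security scan overlaps with the integration tests. A minimal sketch of that arithmetic follows. The durations come from the list above; the dependency edges (each stage waiting on the one before it, with the scan and integration tests both waiting on the image build) are an assumption about how the stages are wired.

```python
# name: (duration in seconds, stages that must finish first)
STAGES = {
    "lint": (10, []),
    "unit-tests": (120, ["lint"]),
    "build-image": (180, ["unit-tests"]),
    "integration-tests": (300, ["build-image"]),
    "security-scan": (240, ["build-image"]),   # runs alongside integration tests
    "deploy-staging": (120, ["integration-tests", "security-scan"]),
}


def finish_times(stages):
    """Earliest finish time of each stage, in seconds after the trigger."""
    finished = {}

    def finish(name):
        if name not in finished:
            duration, deps = stages[name]
            start = max((finish(dep) for dep in deps), default=0)
            finished[name] = start + duration
        return finished[name]

    for name in stages:
        finish(name)
    return finished


times = finish_times(STAGES)
total = max(times.values())
print(f"{total // 60}m {total % 60}s")  # → 12m 10s
```

Summing every stage would give 16m 10s; running the 4-minute scan in parallel with the 5-minute integration tests removes it from the critical path.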

Detailed Step-by-Step Practical Lab

Milestone 1 — Read a real pipeline definition

YAML

## .github/workflows/ci.yaml -- read this and understand every line
name: Payment API CI/CD
 
on:
  push:
    branches: [main]           ## trigger: push to main
  pull_request:
    branches: [main]           ## trigger: PR targeting main
 
jobs:
  ## Job 1: Build Docker image
  build:
    runs-on: ubuntu-latest
    outputs:
      image-tag: ${{ steps.tag.outputs.tag }}
    steps:
      - uses: actions/checkout@v4
      - id: tag
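        run: echo "tag=${{ github.sha }}" >> "$GITHUB_OUTPUT"
      ## Assumes ECR_REGISTRY is set in a workflow-level `env:` block and that an
      ## earlier step has logged Docker in to that registry (neither is shown here).
      ## Tag with the full registry path so the push below can find the image.
      - run: docker build -t ${{ env.ECR_REGISTRY }}/payment-api:${{ github.sha }} .
      - run: docker push ${{ env.ECR_REGISTRY }}/payment-api:${{ github.sha }}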
 
  ## Job 2: Unit tests (runs after build)
  unit-test:
    needs: build              ## dependency: build must complete first
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - run: npm ci && npm test
 
  ## Job 3: Lint (runs PARALLEL with unit-test, both need build)
  lint:
    needs: build
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - run: npm ci && npm run lint
 
  ## Job 4: Deploy (runs after BOTH unit-test AND lint)
  deploy:
    needs: [unit-test, lint]  ## both must pass
    if: github.ref == 'refs/heads/main'
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4   ## deploy.sh lives in the repo, so check it out first
      - run: ./deploy.sh staging ${{ needs.build.outputs.image-tag }}
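
The `needs:` keys define a dependency graph, and the order in which jobs can start falls out of that graph. The sketch below groups the four jobs from this workflow into "waves" that can run in parallel. The `execution_waves` helper is a hypothetical illustration of the idea, not how the GitHub Actions scheduler is implemented.

```python
def execution_waves(needs):
    """Group jobs into waves; every job in a wave can start at the same time."""
    remaining = {job: set(deps) for job, deps in needs.items()}
    done = set()
    waves = []
    while remaining:
        # a job is ready once everything it needs has finished
        ready = sorted(job for job, deps in remaining.items() if deps <= done)
        if not ready:
            raise ValueError("dependency cycle or unknown job in needs")
        waves.append(ready)
        done.update(ready)
        for job in ready:
            del remaining[job]
    return waves


# The `needs:` relationships from the workflow above
NEEDS = {
    "build": [],
    "unit-test": ["build"],
    "lint": ["build"],
    "deploy": ["unit-test", "lint"],
}

print(execution_waves(NEEDS))
# → [['build'], ['lint', 'unit-test'], ['deploy']]
```

Reading a pipeline this way makes the parallelism explicit: `lint` and `unit-test` share a wave because neither appears in the other's `needs:` list.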

Milestone 2 — Trace a commit through the pipeline

Bash

## 1. Engineer pushes a commit
git add payments/processor.js
git commit -m "fix: handle timeout in payment processor"
git push origin main
 
## 2. GitHub receives the push event
## Webhook fires -> GitHub Actions scheduler receives event
## Pipeline starts within seconds
 
## 3. Build job starts
## ubuntu-latest VM spins up
## Checks out code
## docker build -t payment-api:abc1234 .
## Pushes to ECR: 123456789012.dkr.ecr.ap-south-1.amazonaws.com/payment-api:abc1234
 
## 4. unit-test and lint start IN PARALLEL
## Two separate ubuntu-latest VMs spin up simultaneously
## unit-test: runs 847 tests in 2m 14s -- PASS
## lint: checks code style in 23s -- PASS
 
## 5. deploy job starts
## Waits for BOTH unit-test AND lint to complete
## Connects to EKS cluster
## helm upgrade payment-api --set image.tag=abc1234
## Verifies pods are healthy
 
## 6. Slack notification
## "Deployment SUCCESS: payment-api abc1234 to staging"
 
## Total time: 8 minutes 43 seconds from push to staging

Milestone 3 — Understand pipeline triggers

YAML

on:
  ## Push trigger: runs on commits to main
  push:
    branches: [main]
    ## Path filter: skip pipeline if only docs changed
    paths:
      - 'src/**'
      - 'Dockerfile'
      - 'package*.json'
 
  ## PR trigger: runs on pull requests
  pull_request:
    branches: [main]
    types: [opened, synchronize, reopened]
 
  ## Schedule trigger: nightly security scan
  schedule:
    - cron: '0 21 * * *'  ## 2:30 AM IST daily
 
  ## Manual trigger with inputs
  workflow_dispatch:
    inputs:
      environment:
        type: choice
        options: [staging, production]
        required: true
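
GitHub Actions evaluates `schedule` cron expressions in UTC, which is why `0 21 * * *` corresponds to 2:30 AM IST. The conversion can be checked with a few lines; the helper name and the fixed sample date are arbitrary choices for illustration.

```python
from datetime import datetime, timedelta, timezone

# India Standard Time is UTC+05:30 with no daylight saving
IST = timezone(timedelta(hours=5, minutes=30), "IST")


def utc_cron_to_ist(hour, minute):
    """Convert the hour/minute fields of a UTC cron entry to IST."""
    utc_time = datetime(2024, 1, 1, hour, minute, tzinfo=timezone.utc)
    return utc_time.astimezone(IST)


run_at = utc_cron_to_ist(21, 0)
print(run_at.strftime("%H:%M %Z"))                   # → 02:30 IST
print(run_at.date() > datetime(2024, 1, 1).date())   # → True (it is already the next day)
```

Because the local run time lands on the following calendar day, a day-of-week field in the cron expression would also need shifting if the job should only run on specific IST days.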

Milestone 4 — Work with pipeline artifacts

YAML

jobs:
  build:
    runs-on: ubuntu-latest      ## every job needs a runner
    steps:
      - uses: actions/checkout@v4
      - name: Build application
        run: npm ci && npm run build
 
      ## Artifact 1: built application files
      - uses: actions/upload-artifact@v4
        with:
          name: dist-files
          path: dist/
          retention-days: 7
 
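  test:
    needs: build
    runs-on: ubuntu-latest
    steps:
      ## Check out first: package.json and the test files live in the repo
      - uses: actions/checkout@v4
      - run: npm ci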
      ## Download the built files (not rebuilding from source)
      - uses: actions/download-artifact@v4
        with:
          name: dist-files
          path: dist/
 
      - name: Run tests against built artifacts
        run: npm test
 
      ## Artifact 2: test results, kept so they can be downloaded or fed to a reporting tool
      - uses: actions/upload-artifact@v4
        if: always()          ## upload even when tests fail
        with:
          name: test-results
          path: test-results/junit.xml
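
Once `junit.xml` has been uploaded, any later job or local script can summarise it. The sketch below assumes the common JUnit XML layout, with `<testcase>` elements that contain a `<failure>` or `<error>` child when they fail. The sample document and the `summarize_junit` name are made up for illustration, and real reporters may add further fields.

```python
import xml.etree.ElementTree as ET

# Hypothetical report in the common JUnit XML layout
SAMPLE = """<testsuite name="payments" tests="3" failures="1">
  <testcase classname="processor" name="charges card"/>
  <testcase classname="processor" name="handles timeout">
    <failure message="Expected: true, Received: false"/>
  </testcase>
  <testcase classname="processor" name="refunds card"/>
</testsuite>"""


def summarize_junit(xml_text):
    """Count test cases and list the names of those that failed or errored."""
    root = ET.fromstring(xml_text)
    cases = list(root.iter("testcase"))
    failed = [
        case.get("name")
        for case in cases
        if case.find("failure") is not None or case.find("error") is not None
    ]
    return {"total": len(cases), "failed": failed}


print(summarize_junit(SAMPLE))
# → {'total': 3, 'failed': ['handles timeout']}
```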

Milestone 5 — Diagnose and fix a failing pipeline

Bash

## Step 1: See which job failed
gh run list --workflow=ci.yaml --limit 5
## STATUS  TITLE                        BRANCH  EVENT  ID
## X       fix: payment timeout         main    push   9876543
 
## Step 2: See which step failed
gh run view 9876543
## Jobs:
## X unit-test  (3m 12s)
##   * checkout  -- pass
##   * npm ci    -- pass
##   * npm test  -- FAIL
 
## Step 3: Read the failure
gh run view 9876543 --log | grep -A 10 "FAIL"
## FAIL src/payments/processor.test.js
## * timeout handler not called when connection fails
## Expected: true
## Received: false
 
## Step 4: Fix locally and verify
npm test src/payments/processor.test.js  ## run just the failing test
## Investigate and fix the test
## Verify the fix
 
## Step 5: Push the fix
git add .
git commit -m "fix: processor timeout test -- mock timer correctly"
git push
## New pipeline run starts automatically

Milestone 6 — Measure pipeline health

Bash

## Pipeline speed: how long does each job take?
## GitHub Actions > workflow run > each job shows duration
 
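## Check pipeline pass rate over time
## (`status` only says whether a run has finished; `conclusion` says whether it passed)
gh run list --workflow=ci.yaml --limit 50 --json status,conclusion | jq '[.[] | select(.status == "completed") | .conclusion] | group_by(.) | map({conclusion: .[0], count: length})'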
 
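## Find the slowest jobs (timestamps are ISO 8601 strings, so convert them before subtracting)
gh run view RUN_ID --json jobs | jq '.jobs[] | select(.status == "completed") | {name: .name, seconds: ((.completedAt | fromdateiso8601) - (.startedAt | fromdateiso8601))}'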
 
## Identify flaky tests (tests that sometimes pass, sometimes fail)
## A run that only went green after a re-run (attempt > 1) is a strong flakiness signal:
gh run list --workflow=ci.yaml --limit 100 --json attempt,conclusion,headSha,displayTitle | jq '[.[] | select(.attempt > 1 and .conclusion == "success")]'
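
The same numbers can be computed from saved `gh ... --json` output when a script is easier to maintain than a jq one-liner. This is a minimal sketch: the sample payloads are hand-written, the field names (`conclusion`, `startedAt`, `completedAt`) follow the queries above, and the function names are illustrative.

```python
import json
from collections import Counter
from datetime import datetime


def pass_rate(runs_json):
    """Fraction of finished runs whose conclusion is 'success'."""
    runs = json.loads(runs_json)
    # runs still in progress have an empty conclusion, so skip them
    counts = Counter(run["conclusion"] for run in runs if run.get("conclusion"))
    finished = sum(counts.values())
    return counts["success"] / finished if finished else 0.0


def job_seconds(job):
    """Duration of one job from its ISO 8601 start and end timestamps."""
    start = datetime.fromisoformat(job["startedAt"].replace("Z", "+00:00"))
    end = datetime.fromisoformat(job["completedAt"].replace("Z", "+00:00"))
    return (end - start).total_seconds()


# Hand-written samples shaped like the gh output
RUNS = '[{"conclusion": "success"}, {"conclusion": "failure"}, {"conclusion": "success"}, {"conclusion": "success"}, {"conclusion": ""}]'
JOB = {"name": "unit-test", "startedAt": "2024-01-01T12:00:00Z", "completedAt": "2024-01-01T12:02:14Z"}

print(pass_rate(RUNS))     # → 0.75
print(job_seconds(JOB))    # → 134.0
```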

Production Best Practices and Common Pitfalls

| Mistake | Problem | Fix |
| --- | --- | --- |
| One giant job with 20 steps | Cannot run steps in parallel, slow | Split into multiple jobs with dependencies |
| No path filters on triggers | Pipeline runs on README changes | Add `paths:` filter to push triggers |
| Rebuilding image per environment | Untested image goes to production | Build once, promote same digest everywhere |
| No artifact retention policy | Old artifacts consume storage quota | Set `retention-days` on all artifacts |
| Ignoring flaky tests | Team ignores pipeline failures | Quarantine flaky tests, fix or delete them |
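
Quarantining flaky tests starts with identifying them. One workable definition is a test that both passed and failed on the same commit, since the code did not change between those results. The sketch below applies that rule to a hypothetical history of `(commit, test name, passed)` records; how such a history is collected depends on the test reporter and is not shown.

```python
from collections import defaultdict


def flaky_tests(history):
    """Names of tests that both passed and failed on the same commit."""
    outcomes = defaultdict(set)
    for sha, test, passed in history:
        outcomes[(sha, test)].add(passed)
    return sorted({test for (sha, test), seen in outcomes.items() if seen == {True, False}})


# Hypothetical results gathered from several runs and re-runs
HISTORY = [
    ("abc1234", "timeout handler", False),
    ("abc1234", "timeout handler", True),    # same commit, different result: flaky
    ("abc1234", "refund flow", True),
    ("def5678", "currency rounding", False),
    ("def5678", "currency rounding", False),  # fails consistently: a real bug, not flaky
]

print(flaky_tests(HISTORY))
# → ['timeout handler']
```

Keeping consistent failures out of the flaky list matters: those point at real defects and should block the pipeline, not be quarantined.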

Quick Reference and Troubleshooting Commands

| Task | Command |
| --- | --- |
| List recent pipeline runs | `gh run list --workflow=ci.yaml` |
| View run details | `gh run view RUN_ID` |
| View run logs | `gh run view RUN_ID --log` |
| Re-run failed jobs only | `gh run rerun RUN_ID --failed` |
| Cancel a running pipeline | `gh run cancel RUN_ID` |
| Trigger manually | `gh workflow run ci.yaml` |
| Watch run in progress | `gh run watch RUN_ID` |

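**Tip:** Add `paths-ignore: ['**.md', 'docs/**']` to your push trigger. Documentation changes should not trigger a full build, test, and deploy cycle. Every unnecessary pipeline run wastes runner minutes and creates noise in the deployment history. GitHub Actions does not allow `paths` and `paths-ignore` on the same event, so if the trigger already has a `paths:` filter, add exclusions there with `!` patterns (for example `'!docs/**'`) instead.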
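
With `paths-ignore`, the workflow is skipped only when every changed file matches an ignore pattern; a single non-matching file is enough to run it. The sketch below models that rule with a deliberately simplified matcher that understands only `*` (any characters except `/`) and `**` (any characters). GitHub's real filter syntax supports more than this, so treat it as an approximation for reasoning, not a reimplementation.

```python
import re


def to_regex(pattern):
    """Translate a simplified filter pattern (only * and **) into a regex."""
    parts = []
    i = 0
    while i < len(pattern):
        if pattern.startswith("**", i):
            parts.append(".*")        # ** crosses directory separators
            i += 2
        elif pattern[i] == "*":
            parts.append("[^/]*")     # * stays within one path segment
            i += 1
        else:
            parts.append(re.escape(pattern[i]))
            i += 1
    return re.compile("".join(parts))


def workflow_runs(changed_files, paths_ignore):
    """True if at least one changed file is NOT covered by paths-ignore."""
    regexes = [to_regex(p) for p in paths_ignore]
    return any(
        not any(regex.fullmatch(path) for regex in regexes)
        for path in changed_files
    )


IGNORE = ["**.md", "docs/**"]
print(workflow_runs(["README.md", "docs/guide/intro.txt"], IGNORE))  # → False
print(workflow_runs(["README.md", "src/app.js"], IGNORE))            # → True
```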

**Remember:** The pipeline is not a safety net for bad code — it is a verification system for good code. If your team is relying on the pipeline to catch bugs that code review should catch, the pipeline will become slow and flaky as the test suite grows to cover every edge case that should have been caught in review.

**Security:** Never print environment variables or secrets in pipeline logs. Secret masking only redacts exact matches of the registered value, so a secret that has been base64-encoded, JSON-escaped, or otherwise transformed can still appear in the log. Audit your pipeline logs regularly, and in GitHub Actions run `echo "::add-mask::$SECRET_VALUE"` before any dynamically generated sensitive value could be printed.
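
A toy model shows why masking alone is not enough. Here `mask` stands in for a log masker that redacts exact occurrences of a known secret; it is a simplification for illustration, not GitHub's implementation, and the secret value is made up.

```python
import base64


def mask(log_line, secret):
    """Toy masker: redact exact occurrences of the secret, nothing else."""
    return log_line.replace(secret, "***")


secret = "s3cr3t-token"   # made-up value for the demonstration
encoded = base64.b64encode(secret.encode()).decode()

plain_line = f"token={secret}"
encoded_line = f"auth header: {encoded}"

print(mask(plain_line, secret))                       # → token=***
print(mask(encoded_line, secret) == encoded_line)     # → True (nothing was masked)
print(base64.b64decode(encoded).decode() == secret)   # → True (and it decodes straight back)
```

The encoded line passes through untouched even though anyone reading the log can decode it, which is why transformed values need to be registered for masking as well.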

**Common Mistake:** Treating the pipeline as healthy because it is green, even though it is slow. A green pipeline that takes 45 minutes is still a failing pipeline; it is failing on a different dimension. Set time budgets for each stage (Build < 5 min, Test < 8 min, Deploy < 3 min) and treat violations as bugs to fix, not acceptable trade-offs.
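
Time budgets are only useful if something checks them. A minimal sketch of such a check, using the three budgets quoted above; the stage names, the measured durations, and the `budget_violations` helper are illustrative assumptions.

```python
# Budgets from the text, in minutes; a stage must finish in LESS than its budget
BUDGET_MINUTES = {"build": 5, "test": 8, "deploy": 3}


def budget_violations(durations_minutes, budgets=BUDGET_MINUTES):
    """Map each over-budget stage to (actual minutes, budget minutes)."""
    return {
        stage: (took, budgets[stage])
        for stage, took in durations_minutes.items()
        if stage in budgets and took >= budgets[stage]
    }


# Hypothetical measurements from one pipeline run
measured = {"build": 4.2, "test": 9.5, "deploy": 2.0}
print(budget_violations(measured))
# → {'test': (9.5, 8)}
```

Running a check like this after each pipeline run, and failing or alerting on a non-empty result, turns a slow stage into a visible bug.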
