Overview and What You Will Learn
GitHub Actions is the default CI/CD choice for many teams building on GitHub. It requires no external infrastructure, integrates directly with pull requests, and has a marketplace of thousands of pre-built actions covering most common tasks. A basic pipeline (build, test, scan, deploy) can be running in under an hour.
By the end of this lab you will:
- Write a complete multi-job GitHub Actions workflow from scratch
- Configure OIDC authentication with AWS — eliminating long-lived credentials
- Build and push Docker images to ECR with layer caching
- Deploy to Kubernetes using Helm with environment protection gates
- Set up matrix builds for multi-version testing
- Create reusable workflows to share pipeline logic across repositories
Why This Matters in Production
At CRED, every pull request triggers a pipeline that runs 800 tests, builds a Docker image, scans it for vulnerabilities, and posts a deployment preview URL as a PR comment — all automatically. Engineers get instant feedback without leaving GitHub. When the PR merges, the same pipeline deploys to staging automatically and waits for approval before production. This is what modern CI/CD looks like.
Core Principles
GitHub Actions workflow structure:

    +------------------------------------------+
    | .github/workflows/deploy.yaml            |
    +------------------------------------------+
                         |
    +------------------------------------------+
    | on: (triggers)                           |
    |   push: branches: [main]                 |
    |   pull_request: branches: [main]         |
    +------------------------------------------+
                         |
    +------------------------------------------+
    | jobs:                                    |
    |                                          |
    |   build -------> test -------> deploy    |
    |     |                                    |
    |     +-> lint (parallel)                  |
    +------------------------------------------+

    Each job:
      runs-on: ubuntu-latest
      steps:
        - uses: actions/checkout@v4
        - run: your commands here

Detailed Step-by-Step Practical Lab
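As a warm-up before the milestones, the job graph in the diagram above can be sketched as a minimal workflow. This is an illustrative skeleton rather than the lab's final pipeline: the `echo` commands are placeholders, and the only point is that `needs` turns independent jobs into the build, then test and lint, then deploy ordering.

```yaml
# .github/workflows/deploy.yaml (illustrative skeleton; commands are placeholders)
name: Skeleton

on:
  push:
    branches: [main]
  pull_request:
    branches: [main]

jobs:
  build:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - run: echo "build goes here"

  test:
    runs-on: ubuntu-latest
    needs: build                 # waits for build to succeed
    steps:
      - uses: actions/checkout@v4
      - run: echo "tests go here"

  lint:
    runs-on: ubuntu-latest
    needs: build                 # same dependency as test, so the two run in parallel
    steps:
      - uses: actions/checkout@v4
      - run: echo "lint goes here"

  deploy:
    runs-on: ubuntu-latest
    needs: [build, test, lint]   # runs only after every listed job succeeds
    steps:
      - run: echo "deploy goes here"
```

Jobs without `needs` start immediately and in parallel. A job can read another job's outputs only when that job appears in its own `needs` list, which is why `deploy` lists `build` explicitly.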
Milestone 1 — Basic workflow structure

    # .github/workflows/ci.yaml
    name: CI Pipeline

    on:
      push:
        branches: [main]
      pull_request:
        branches: [main]

    # Permissions for OIDC token generation
    permissions:
      contents: read
      id-token: write
      pull-requests: write  # for posting PR comments

    env:
      AWS_REGION: ap-south-1
      ECR_REGISTRY: 123456789.dkr.ecr.ap-south-1.amazonaws.com
      IMAGE_NAME: payment-api
      K8S_CLUSTER: mumbai-prod-cluster

    jobs:
      build:
        name: Build and Push Image
        runs-on: ubuntu-latest
        outputs:
          image-tag: ${{ steps.meta.outputs.version }}
        steps:
          - name: Checkout code
            uses: actions/checkout@v4

          - name: Configure AWS credentials via OIDC
            uses: aws-actions/configure-aws-credentials@v4
            with:
              role-to-assume: arn:aws:iam::123456789:role/github-deploy-role
              aws-region: ${{ env.AWS_REGION }}

          - name: Login to ECR
            id: login-ecr
            uses: aws-actions/amazon-ecr-login@v2

          - name: Set up Docker Buildx
            uses: docker/setup-buildx-action@v3

          - name: Generate image metadata
            id: meta
            uses: docker/metadata-action@v5
            with:
              images: ${{ env.ECR_REGISTRY }}/${{ env.IMAGE_NAME }}
              tags: |
                type=sha,prefix=,format=short
                type=ref,event=branch

          - name: Build and push
            uses: docker/build-push-action@v5
            with:
              context: .
              push: true
              tags: ${{ steps.meta.outputs.tags }}
              labels: ${{ steps.meta.outputs.labels }}
              # GitHub Actions layer cache
              cache-from: type=gha
              cache-to: type=gha,mode=max

Milestone 2 — Testing with coverage gate

      test:
        name: Run Tests
        runs-on: ubuntu-latest
        needs: build
        steps:
          - uses: actions/checkout@v4

          - name: Setup Node.js
            uses: actions/setup-node@v4
            with:
              node-version: '20'
              cache: 'npm'  # caches the npm download cache (not node_modules) between runs

          - name: Install dependencies
            run: npm ci  # faster and stricter than npm install

          - name: Run tests with coverage
            # json-summary writes coverage/coverage-summary.json, which the next step reads
            run: npm test -- --coverage --coverageReporters=json-summary --coverageReporters=text-summary

          - name: Check coverage threshold
            run: |
              COVERAGE=$(jq '.total.lines.pct' coverage/coverage-summary.json)
              echo "Line coverage: $COVERAGE%"
              if (( $(echo "$COVERAGE < 80" | bc -l) )); then
                echo "::error::Coverage $COVERAGE% is below 80% threshold"
                exit 1
              fi

          - name: Upload coverage report
            uses: actions/upload-artifact@v4
            if: always()  # upload even when tests fail
            with:
              name: coverage-report
              path: coverage/
              retention-days: 7

      lint:
        name: Lint and Format Check
        runs-on: ubuntu-latest
        needs: build  # runs in parallel with test
        steps:
          - uses: actions/checkout@v4
          - uses: actions/setup-node@v4
            with:
              node-version: '20'
              cache: 'npm'
          - run: npm ci
          - run: npm run lint
          - run: npm run format:check

Milestone 3 — Security scanning

      security-scan:
        name: Security Scan
        runs-on: ubuntu-latest
        needs: build
        # Job-level permissions replace the workflow-level block;
        # upload-sarif needs security-events: write
        permissions:
          contents: read
          id-token: write
          security-events: write
        steps:
          # Semgrep scans the working tree, so the source must be checked out
          - uses: actions/checkout@v4

          - name: Configure AWS credentials
            uses: aws-actions/configure-aws-credentials@v4
            with:
              role-to-assume: arn:aws:iam::123456789:role/github-deploy-role
              aws-region: ${{ env.AWS_REGION }}

          - name: Login to ECR
            uses: aws-actions/amazon-ecr-login@v2

          # Scan Docker image for vulnerabilities
          - name: Run Trivy vulnerability scan
            uses: aquasecurity/trivy-action@master  # pin to a release tag or commit SHA in real pipelines
            with:
              image-ref: ${{ env.ECR_REGISTRY }}/${{ env.IMAGE_NAME }}:${{ needs.build.outputs.image-tag }}
              format: 'sarif'
              output: 'trivy-results.sarif'
              # Fail on HIGH or CRITICAL vulnerabilities
              severity: 'HIGH,CRITICAL'
              exit-code: '1'

          # Upload results to GitHub Security tab
          - name: Upload Trivy scan results
            uses: github/codeql-action/upload-sarif@v3
            if: always()
            with:
              sarif_file: 'trivy-results.sarif'

          # SAST scan with Semgrep
          - name: Run Semgrep SAST
            uses: semgrep/semgrep-action@v1
            with:
              config: 'p/nodejs p/secrets'

Milestone 4 — Deploy to staging

      deploy-staging:
        name: Deploy to Staging
        runs-on: ubuntu-latest
        # build is listed so that needs.build.outputs is available in this job
        needs: [build, test, lint, security-scan]
        # Only deploy from main branch
        if: github.ref == 'refs/heads/main'
        environment:
          name: staging
          url: https://staging.payment.internal
        steps:
          - uses: actions/checkout@v4

          - name: Configure AWS credentials
            uses: aws-actions/configure-aws-credentials@v4
            with:
              role-to-assume: arn:aws:iam::123456789:role/github-deploy-role
              aws-region: ${{ env.AWS_REGION }}

          - name: Update kubeconfig
            run: |
              aws eks update-kubeconfig \
                --name ${{ env.K8S_CLUSTER }} \
                --region ${{ env.AWS_REGION }}

          - name: Deploy with Helm
            run: |
              helm upgrade --install payment-api ./charts/payment-api \
                --namespace payment-api-staging \
                --create-namespace \
                --values ./charts/payment-api/values-staging.yaml \
                --set image.tag=${{ needs.build.outputs.image-tag }} \
                --atomic \
                --timeout 5m \
                --wait

          - name: Smoke test staging
            run: |
              sleep 10
              curl -sf https://staging.payment.internal/health
              echo "Staging deployment healthy"

          - name: Notify Slack
            if: always()
            uses: 8398a7/action-slack@v3
            with:
              status: ${{ job.status }}
              fields: repo,message,commit,author
            env:
              SLACK_WEBHOOK_URL: ${{ secrets.SLACK_WEBHOOK }}

Milestone 5 — Production deployment with approval gate

      deploy-production:
        name: Deploy to Production
        runs-on: ubuntu-latest
        # build is listed so that needs.build.outputs is available in this job
        needs: [build, deploy-staging]
        # Production environment has required reviewers configured in
        # GitHub Settings > Environments > production > Protection rules
        environment:
          name: production
          url: https://api.payment.razorpay.com
        steps:
          - uses: actions/checkout@v4

          - name: Configure AWS credentials
            uses: aws-actions/configure-aws-credentials@v4
            with:
              role-to-assume: arn:aws:iam::123456789:role/github-deploy-prod-role
              aws-region: ${{ env.AWS_REGION }}

          - name: Update kubeconfig for production cluster
            run: |
              aws eks update-kubeconfig \
                --name mumbai-prod-cluster \
                --region ${{ env.AWS_REGION }}

          - name: Deploy to production
            run: |
              helm upgrade --install payment-api ./charts/payment-api \
                --namespace payment-api-production \
                --values ./charts/payment-api/values-production.yaml \
                --set image.tag=${{ needs.build.outputs.image-tag }} \
                --atomic \
                --timeout 10m \
                --wait

Milestone 6 — Matrix builds and reusable workflows
Matrix build: test across multiple Node versions.

      test-matrix:
        runs-on: ubuntu-latest
        strategy:
          matrix:
            node-version: [18, 20, 21]
        steps:
          - uses: actions/checkout@v4
          - uses: actions/setup-node@v4
            with:
              node-version: ${{ matrix.node-version }}
          - run: npm ci && npm test

Reusable workflow, stored as .github/workflows/deploy-env.yaml and called by other workflows with inputs:

    # .github/workflows/deploy-env.yaml
    on:
      workflow_call:
        inputs:
          environment:
            required: true
            type: string
          image-tag:
            required: true
            type: string
        secrets:
          AWS_ROLE_ARN:
            required: true

    jobs:
      deploy:
        runs-on: ubuntu-latest
        environment: ${{ inputs.environment }}
        steps:
          - uses: actions/checkout@v4
          - name: Deploy
            run: |
              helm upgrade --install payment-api ./charts/payment-api \
                --set image.tag=${{ inputs.image-tag }}

Calling the reusable workflow from a job in another workflow:

      deploy-staging:
        needs: build  # required so that needs.build.outputs resolves
        uses: ./.github/workflows/deploy-env.yaml
        with:
          environment: staging
          image-tag: ${{ needs.build.outputs.image-tag }}
        secrets:
          AWS_ROLE_ARN: ${{ secrets.STAGING_AWS_ROLE_ARN }}

Production Best Practices and Common Pitfalls
| Scenario | Wrong | Correct |
|---|---|---|
| AWS credentials | Store access keys as secrets | Use OIDC role assumption |
| Action versions | uses: actions/checkout@v4 | Pin to SHA for security |
| Cache strategy | No caching | cache: 'npm' in setup actions |
| Concurrent deploys | Allow multiple deploy runs | Use concurrency to cancel stale runs |
| Secrets in logs | echo $SECRET | Never echo secrets, use masking |
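
Two rows of the table are easier to see in workflow form. The sketch below pins an action to a commit and masks a value generated at runtime. The SHA is deliberately left as a placeholder (look up the real one for the release you want), and `fetch-token.sh` is a hypothetical script standing in for whatever produces the sensitive value.

```yaml
steps:
  # Pin to a full commit SHA and keep the tag in a trailing comment for readability.
  # <full-40-character-commit-sha> is a placeholder, not a real SHA.
  - uses: actions/checkout@<full-40-character-commit-sha> # v4

  - name: Fetch a short-lived token
    run: |
      # fetch-token.sh is hypothetical; substitute your own command.
      TOKEN="$(./scripts/fetch-token.sh)"
      # Register the value with the runner so later log output shows *** instead.
      echo "::add-mask::$TOKEN"
      # Expose it to later steps without printing it.
      echo "API_TOKEN=$TOKEN" >> "$GITHUB_ENV"
```

Values read from the `secrets` context are masked automatically; `::add-mask::` is for sensitive values the workflow creates or fetches itself.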
Quick Reference and Troubleshooting Commands
| Task | Command |
|---|---|
| Trigger workflow manually | gh workflow run deploy.yaml |
| View workflow runs | gh run list --workflow=deploy.yaml |
| View run logs | gh run view RUN_ID --log |
| Cancel a run | gh run cancel RUN_ID |
| Re-run failed jobs | gh run rerun RUN_ID --failed |
| View secrets (names only) | gh secret list |
| Set a secret | gh secret set SECRET_NAME |
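
`gh workflow run` only works for workflows that declare a `workflow_dispatch` trigger. A minimal sketch follows, assuming a hypothetical `environment` input that is not part of the lab's workflow:

```yaml
# Added to the `on:` block of .github/workflows/deploy.yaml
on:
  push:
    branches: [main]
  workflow_dispatch:
    inputs:
      environment:              # hypothetical input name
        description: Target environment
        type: choice
        options: [staging, production]
        default: staging
```

With this in place, `gh workflow run deploy.yaml -f environment=staging` starts a run, and the chosen value is available inside the workflow as `${{ inputs.environment }}`.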
**Tip:** Add `concurrency` to your deployment workflow to cancel in-progress runs when a new commit is pushed. `concurrency: { group: deploy-${{ github.ref }}, cancel-in-progress: true }` ensures that only the latest commit gets deployed; stale deploys are cancelled automatically.
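
Written in block form, the inline mapping from the tip looks like the first fragment below. The job-level fragment after it is an optional variation and an assumption on my part, not something the lab prescribes: for production it may be preferable to let a rollout finish rather than cancel it midway.

```yaml
# Top level of the deployment workflow, alongside `on:` and `jobs:`
concurrency:
  group: deploy-${{ github.ref }}   # one group per branch or tag
  cancel-in-progress: true          # a newer run in the group cancels the older one

jobs:
  deploy-production:
    # Variation (assumption): do not interrupt a running production deploy;
    # a newer run in this group waits instead of cancelling it.
    concurrency:
      group: deploy-production
      cancel-in-progress: false
    runs-on: ubuntu-latest
    steps:
      - run: echo "deploy goes here"
```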
**Remember:** OIDC requires a trust policy in AWS IAM that allows your specific GitHub repository and branch to assume the role. The trust policy must name the `token.actions.githubusercontent.com` issuer and include a `sub` condition matching your repo, for example `repo:razorpay/payment-api:ref:refs/heads/main`. Without this, OIDC authentication will fail.
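
One way to express such a trust policy is as a CloudFormation resource. This is a sketch under stated assumptions: CloudFormation is my choice of tooling rather than the lab's, the GitHub OIDC provider is assumed to be registered in the account already, and the permissions policies for the role are omitted.

```yaml
# CloudFormation sketch; assumes the OIDC provider for
# token.actions.githubusercontent.com already exists in this account.
AWSTemplateFormatVersion: "2010-09-09"
Resources:
  GitHubDeployRole:
    Type: AWS::IAM::Role
    Properties:
      RoleName: github-deploy-role
      AssumeRolePolicyDocument:
        Version: "2012-10-17"
        Statement:
          - Effect: Allow
            Principal:
              Federated: !Sub "arn:aws:iam::${AWS::AccountId}:oidc-provider/token.actions.githubusercontent.com"
            Action: sts:AssumeRoleWithWebIdentity
            Condition:
              StringEquals:
                # Audience requested by aws-actions/configure-aws-credentials
                "token.actions.githubusercontent.com:aud": sts.amazonaws.com
                # Only workflow runs on main in this repository may assume the role
                "token.actions.githubusercontent.com:sub": "repo:razorpay/payment-api:ref:refs/heads/main"
```

One detail to check against your own setup: a job that declares an `environment` presents a subject of the form `repo:<owner>/<repo>:environment:<name>` instead of the branch form, so roles assumed by environment-gated deploy jobs need a `sub` value (or a `StringLike` pattern) that matches that shape.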
**Security:** Use separate IAM roles for staging and production deployments, each with its own permission scope. The staging role should only have permission to update the staging EKS namespace. The production role needs more permissions but should be restricted to production resources only. Never use a single role for all environments.
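
A possible way to keep the two roles apart in the workflow is to store each role ARN as an environment-level variable, so a job can only see the ARN for the environment it targets. In this sketch `DEPLOY_ROLE_ARN` is a hypothetical variable name, assumed to be defined separately under each environment in the repository settings.

```yaml
jobs:
  deploy-staging:
    runs-on: ubuntu-latest
    environment: staging
    permissions:
      id-token: write
      contents: read
    steps:
      - uses: aws-actions/configure-aws-credentials@v4
        with:
          # Resolves to the staging-only role because the job targets `staging`.
          role-to-assume: ${{ vars.DEPLOY_ROLE_ARN }}
          aws-region: ap-south-1

  deploy-production:
    runs-on: ubuntu-latest
    needs: deploy-staging
    environment: production
    permissions:
      id-token: write
      contents: read
    steps:
      - uses: aws-actions/configure-aws-credentials@v4
        with:
          # Same variable name, different value: the production-only role.
          role-to-assume: ${{ vars.DEPLOY_ROLE_ARN }}
          aws-region: ap-south-1
```

Because the production environment carries the required-reviewer rule, the production role ARN is only resolved after the approval gate has passed.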
**Common Mistake:** Forgetting `if: always()` on the Slack notification step. By default, a step is skipped when a previous step has failed. If your deploy step fails, the Slack notification is skipped too, and the team is not alerted. Use `if: always()` on notification steps so that they run regardless of the outcome of earlier steps.
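
The difference between the status conditions can be sketched with three steps. The notification step mirrors the one used in the staging job; the diagnostics step is a hypothetical addition included only to contrast `failure()` with `always()`.

```yaml
steps:
  - name: Deploy with Helm
    run: helm upgrade --install payment-api ./charts/payment-api --atomic --wait

  # Runs whether the deploy succeeded, failed, or the run was cancelled.
  - name: Notify Slack
    if: always()
    uses: 8398a7/action-slack@v3
    with:
      status: ${{ job.status }}
    env:
      SLACK_WEBHOOK_URL: ${{ secrets.SLACK_WEBHOOK }}

  # Hypothetical extra step: runs only when an earlier step has failed.
  - name: Collect diagnostics on failure
    if: failure()
    run: kubectl get pods --namespace payment-api-staging
```

A step with no `if:` behaves as if it had `if: success()`, which is why an unconditioned notification never fires after a failed deploy.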