GitOps
GitOps is a way of operating infrastructure and applications where Git is the single source of truth for the desired state of a system, and a software agent running inside the target environment continuously pulls that state and reconciles the live system to match it.
The Four GitOps Principles
- Declarative: the system's desired state is expressed as data (Kubernetes manifests, Helm values, Kustomize overlays), never as a sequence of imperative commands like `kubectl scale` run by a human.
- Versioned and immutable: the desired state is stored in Git, giving full history, diffs between any two points in time, and the ability to revert with a single `git revert` rather than reconstructing what changed.
- Pulled automatically: an agent inside the cluster (ArgoCD, Flux) pulls the desired state on its own schedule. Nothing external pushes into the cluster.
- Continuously reconciled: the agent does not just apply state once at deploy time; it keeps comparing live state to desired state forever and corrects drift whenever it appears.
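To make the first principle concrete, here is a minimal sketch of a replica count expressed as data rather than as a `kubectl scale` command. The name `example-app`, its labels, and the image reference are hypothetical placeholders, not taken from any real repository.

```yaml
# Hypothetical Deployment; every name and the image are illustrative assumptions.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: example-app
spec:
  replicas: 3              # scaling becomes a reviewed Git change to this number
  selector:
    matchLabels:
      app: example-app
  template:
    metadata:
      labels:
        app: example-app
    spec:
      containers:
        - name: example-app
          image: registry.example.com/example-app:1.0.0   # placeholder image
```

Changing `replicas` in a commit leaves a diff, an author, and a timestamp behind, and undoing it is a `git revert` of that commit.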
GitOps vs Push-Based CD
    +----------------------------+    +----------------------------+
    | PUSH MODEL (traditional)   |    | PULL MODEL (GitOps)        |
    |                            |    |                            |
    | CI has cluster creds       |    | Agent pulls from Git       |
    | CI runs kubectl apply      |    | Agent applies + heals      |
    | No record outside CI logs  |    | Git history is the record  |
    +----------------------------+    +----------------------------+

In a push model, your CI system needs network access and credentials into every cluster it deploys to, which is a meaningful blast radius if CI is ever compromised. In a pull model, the cluster only needs read access to a Git repository; nothing external needs write access to the cluster at all.
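As a rough illustration of the pull model, the sketch below shows an Argo CD `Application` that tells the in-cluster agent which repository, branch, and path to pull from. The repository URL, application name, path, and namespaces are assumptions made up for this example.

```yaml
# Sketch of an Argo CD Application; repo URL, names, and path are hypothetical.
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: example-app-production
  namespace: argocd                 # assumes Argo CD is installed in this namespace
spec:
  project: default
  source:
    repoURL: https://git.example.com/org/config-repo.git   # hypothetical repository
    targetRevision: main            # branch the agent tracks
    path: overlays/production       # directory of manifests to render and apply
  destination:
    server: https://kubernetes.default.svc   # the cluster the agent itself runs in
    namespace: example-app
```

Nothing in this definition hands cluster credentials to CI: the agent reads from the repository and applies locally.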
App Repo vs Config Repo Separation
Most GitOps setups split the application source code from the rendered deployment manifests into two repositories:
- App repo — application source code. CI builds and tests it, then builds and pushes a container image.
- Config repo — Kubernetes manifests, Helm values, or Kustomize overlays. CI opens a pull request here to bump the image tag after a successful build; ArgoCD or Flux only ever watches this repo.
    app-repo/                   config-repo/
      src/                        base/
      Dockerfile                    deployment.yaml
      .github/workflows/            service.yaml
        build.yml                 overlays/
                                    staging/
                                    production/

CRED's platform team, for example, structures deploys this way: a merge to app-repo/main triggers a build, and the resulting image digest is what gets written into a PR against config-repo. A human reviews that PR before the digest reaches the production overlay.
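One way the image bump in the config repo could look is a Kustomize overlay that pins the image by digest. This is a sketch under assumptions: the file location, the image name, and the all-zero digest are placeholders, and the real digest would be written by CI in the pull request.

```yaml
# overlays/production/kustomization.yaml (illustrative; image name and digest are placeholders)
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
resources:
  - ../../base
images:
  - name: registry.example.com/example-app   # must match the image name used in base/deployment.yaml
    # Placeholder digest; CI would replace this line in the pull request.
    digest: sha256:0000000000000000000000000000000000000000000000000000000000000000
```

Pinning by digest rather than by tag means the reviewed diff identifies exactly one image, so a tag that is later re-pushed cannot change what runs in production.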
Drift Detection
Because the GitOps agent continuously reconciles, any manual change to the
cluster — a kubectl edit during an incident, a change made directly in
a cloud console — is detected on the next reconciliation pass. Depending on
configuration, the agent either reports the drift (manual sync mode) or
automatically reverts it (automated self-heal mode).
    syncPolicy:
      automated:
        selfHeal: true   # auto-revert drift back to Git state
        prune: true      # delete resources removed from Git

**Tip:** During a declared incident, pause self-heal before making any manual emergency change, or the GitOps agent will quietly undo your fix a few minutes later. Record the emergency change in Git afterward so the two states converge again.
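Pausing self-heal during an incident might look like the following change to the same Argo CD sync policy. Treat it as a sketch rather than an official incident procedure; how the change is made (a Git commit, the UI, or the CLI) depends on how the Application itself is managed.

```yaml
# Same syncPolicy with self-heal paused for the duration of an incident (sketch).
syncPolicy:
  automated:
    selfHeal: false   # drift shows up as OutOfSync but is not reverted
    prune: true
```

With automated sync still enabled, a new commit to the watched path should still trigger a sync and overwrite the manual change. Removing the `automated` block entirely switches the Application to manual sync, which is the more conservative option while a manual fix is in place.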
Audit Trail and Compliance
Every change to production state has a Git commit: an author, a timestamp, and — if branch protection requires it — a reviewer's approval. For teams operating in regulated environments (payments, fintech, anything Razorpay or PhonePe-adjacent), this Git history doubles as the compliance audit log for "who changed what in production, and who approved it" without any separate change-management tooling.
**Remember:** GitOps gives you an audit trail of *intended* state changes (what was committed), not necessarily of every action the reconciliation agent itself took. Pair Git history with the agent's own operational logs (ArgoCD's application events, for example) if you need a full picture of exactly when a change was applied to the live cluster.
**Security:** The GitOps agent's service account inside the cluster typically has broad apply permissions across the namespaces it manages. Anyone who can merge to the config repo's protected branch effectively has that level of access to the cluster, so treat config repo write access with the same care as direct cluster admin access.
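If the agent is Argo CD, one possible way to narrow what a merge to the config repo can reach is an `AppProject` that restricts source repositories and destination namespaces. The project name, repository URL, and namespace below are hypothetical, and this limits what Argo CD will deploy for Applications in the project; it does not by itself change the Kubernetes RBAC of the agent's service account.

```yaml
# Sketch of an Argo CD AppProject; project name, repo URL, and namespace are assumptions.
apiVersion: argoproj.io/v1alpha1
kind: AppProject
metadata:
  name: example-team
  namespace: argocd                 # assumes Argo CD is installed in this namespace
spec:
  sourceRepos:
    - https://git.example.com/org/config-repo.git   # only this repo may be deployed from
  destinations:
    - server: https://kubernetes.default.svc
      namespace: example-app        # only this namespace may be targeted
```

An Application opts in by setting `spec.project: example-team` instead of `default`.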
**Common Mistake:** Running GitOps for most changes but keeping one manual `kubectl apply` escape hatch "just for emergencies." That escape hatch is the side door that breaks the entire audit and self-healing guarantee GitOps is supposed to provide. Emergency changes should still go through Git, just on a faster review path.
Troubleshooting Reference
| Symptom | Check | What to Look For |
|---|---|---|
| Cluster state not matching Git | Agent's sync status (`argocd app get`) | Reconciliation paused, or sync policy set to manual |
| Manual fix keeps disappearing | Sync policy on the Application/Kustomization | selfHeal/auto-apply enabled, working as designed |
| New manifest never appears in cluster | Agent's source path configuration | Path or branch in the agent config doesn't match where the file was committed |