
CI/CD & DevOps

From Git internals and continuous integration pipelines to deployment strategies, infrastructure as code, and immutable infrastructure -- the full lifecycle of modern software delivery.

01 / Git Internals

Git's Object Model & Workflows

Git is a content-addressable filesystem. Every piece of data is stored as an object identified by its SHA-1 hash. Understanding the four object types and how they compose is the key to mastering Git.

Blob

Stores file contents. No filename, no metadata -- just raw bytes identified by SHA-1. Two files with identical content share one blob.

Tree

A directory listing. Maps filenames to blob hashes (and sub-trees). Represents a snapshot of the project's directory structure.

Commit

Points to a tree (project snapshot), parent commit(s), author, committer, and message. Forms the nodes in Git's DAG.

Tag

An annotated pointer to a commit (or any object). Includes tagger identity, date, and a message. Lightweight tags are just refs.
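
Content addressing is easy to verify from outside Git: a blob's ID is just the SHA-1 of a small header plus the raw file bytes. A minimal Python sketch, using only the standard library:

```python
import hashlib

def git_blob_hash(content: bytes) -> str:
    """Compute the ID Git assigns a blob: sha1(b"blob <size>\\0" + content)."""
    header = b"blob " + str(len(content)).encode() + b"\x00"
    return hashlib.sha1(header + content).hexdigest()

# Identical content always hashes to the same ID, so Git stores it once.
print(git_blob_hash(b"hello\n"))  # matches `git hash-object` for the same bytes
```

Because the hash covers only the header and content, renaming a file changes nothing in the blob -- only the tree that references it.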

The DAG, Refs & Index

Commits form a Directed Acyclic Graph (DAG) -- each commit points to its parent(s), never creating cycles. Branches and tags are simply refs: mutable pointers (files in .git/refs/) to commit hashes. HEAD is a symbolic ref pointing to the current branch.

The index (staging area) sits between your working directory and the repository. git add writes blobs and updates the index; git commit converts the index into a tree and creates a commit object.

Git object relationships: Working Dir --git add--> Index (Stage) --git commit--> Commit (DAG). Each commit points to a tree, which contains blobs.

Merge vs Rebase

Aspect              | Merge                                         | Rebase
History             | Preserves full branch topology (merge commit) | Produces a linear history
Conflict resolution | Once, in the merge commit                     | Per-commit during replay
Shared branches     | Safe to use anytime                           | Never rebase published/shared commits
Use case            | Feature integration, preserving context       | Cleaning up local work before merging
Git Bisect

git bisect performs a binary search through commit history to find the exact commit that introduced a bug. You mark a known-good and known-bad commit, and Git checks out the midpoint for you to test. This reduces an N-commit search to O(log N) steps. It can be fully automated with git bisect run <test-script>.
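
The search itself is ordinary binary search. A small Python sketch of what bisect does, assuming commits are ordered oldest to newest, the first is good, the last is bad, and the bug persists once introduced:

```python
def first_bad_commit(commits, is_bad):
    """Binary search for the commit that introduced a bug, like `git bisect`.

    Assumes commits[0] is good, commits[-1] is bad, and badness
    persists once introduced (monotonic history).
    """
    lo, hi = 0, len(commits) - 1
    # invariant: commits[lo] is good, commits[hi] is bad
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if is_bad(commits[mid]):
            hi = mid
        else:
            lo = mid
    return commits[hi]

history = [f"c{i}" for i in range(100)]
# hypothetical test: the bug appeared in commit c73
print(first_bad_commit(history, lambda c: int(c[1:]) >= 73))  # c73
```

Each iteration halves the lo..hi window, so 100 commits take at most 7 probes.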
02 / Continuous Integration

CI: Fast Feedback on Every Commit

Continuous Integration is the practice of merging developer work into a shared mainline frequently -- ideally multiple times per day. Each integration triggers an automated pipeline that builds, tests, and reports results within minutes.

CI Pipeline Flow: Commit --> Build --> Unit Tests --> Static Analysis --> Report

The Fast Feedback Loop

The core principle: developers should know within 10 minutes if their change broke something. This requires fast builds, parallelized test suites, and caching of dependencies and build artifacts. If CI is slow, developers batch changes and integration becomes painful.

Static Analysis (SAST)

Static Application Security Testing analyzes source code without executing it. Tools like SonarQube, Semgrep, and CodeQL scan for security vulnerabilities (SQL injection, XSS, hardcoded secrets), code smells, and style violations. SAST runs early in the pipeline because it needs no running environment.

Dependency Scanning (SCA)

Software Composition Analysis examines your dependency tree (package-lock.json, requirements.txt, go.sum) against known vulnerability databases (CVE/NVD). Tools like Dependabot, Snyk, and Trivy alert on vulnerable transitive dependencies and can auto-generate upgrade PRs.
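
The core of an SCA check is a lookup of pinned versions against a vulnerability database. A toy Python sketch -- the hard-coded one-entry `VULN_DB` is a hypothetical stand-in, though CVE-2018-18074 is a real advisory affecting requests below 2.20.0:

```python
# Hypothetical in-memory vulnerability DB, keyed by (package, version).
# Real tools query CVE/NVD or vendor feeds and match version *ranges*.
VULN_DB = {
    ("requests", "2.19.0"): "CVE-2018-18074",
}

def scan_lockfile(lines):
    """Return (name, version, cve) for every vulnerable pinned dependency."""
    findings = []
    for line in lines:
        name, sep, version = line.strip().partition("==")
        if sep and (name, version) in VULN_DB:
            findings.append((name, version, VULN_DB[(name, version)]))
    return findings

print(scan_lockfile(["requests==2.19.0", "flask==2.3.0"]))
```

Real scanners also walk transitive dependencies, which is where most vulnerable packages hide.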

CI Best Practices

Keep the mainline always green. If a build breaks, fixing it takes priority over new features. Use branch protection rules to require passing CI before merge. Run linting and formatting checks as the first pipeline step -- they're fast and catch the most common issues.

03 / Continuous Delivery & Deployment

CD: From Build to Production

Continuous Delivery ensures that code is always in a deployable state and that releasing to production is a one-click decision. Continuous Deployment goes further: every change that passes the pipeline is automatically deployed with no human gate.

Aspect             | Continuous Delivery                 | Continuous Deployment
Production release | Manual approval / button click      | Fully automated
Risk tolerance     | Lower -- human reviews each release | Higher -- relies on automated quality gates
Feedback speed     | Fast (hours)                        | Fastest (minutes)
Prerequisite       | Solid CI + staging environment      | Solid CI + canary/rollback automation

Pipeline Stages

Typical CD Pipeline: CI (build+test) --> Artifact Store --> Staging --> Approval Gate --> Production

Artifact Management

Build artifacts (Docker images, JARs, binaries) are stored in a versioned registry (Docker Hub, ECR, Artifactory, Nexus). Each artifact is immutable and tagged with the commit SHA or semantic version. The same artifact that passes staging is the one deployed to production -- never rebuild for a different environment.

Environment Promotion

Code progresses through environments: dev -> staging -> production. Configuration differs per environment (via env vars or config files), but the artifact stays the same. This guarantees that what you tested is what you deploy. Promotion can be automated (continuous deployment) or gated by manual approval.

Common Pitfall

Rebuilding artifacts for each environment introduces drift. A test that passes in staging may fail in production if the binary is different. Always promote the same artifact and inject environment-specific config externally.
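
The "promote, don't rebuild" rule can be sketched as digest bookkeeping: promotion copies a content digest between environments and never produces new bytes. A minimal Python illustration (the in-memory `registry` and environment names are hypothetical stand-ins for a real artifact store):

```python
import hashlib

# Hypothetical registry: environment name -> digest of its deployed artifact.
registry = {}

def push(env: str, artifact: bytes) -> str:
    """Store an artifact's content digest for an environment."""
    digest = "sha256:" + hashlib.sha256(artifact).hexdigest()
    registry[env] = digest
    return digest

def promote(src: str, dst: str) -> str:
    """Promote by copying the digest -- never rebuild for the target env."""
    registry[dst] = registry[src]
    return registry[dst]

artifact = b"app binary bytes, built exactly once by CI"
push("staging", artifact)
promote("staging", "production")
assert registry["production"] == registry["staging"]  # identical bits everywhere
```

Container registries make this concrete: re-tagging an image (`:staging` -> `:prod`) reuses the same immutable digest.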
04 / Deployment Strategies

Rolling, Blue-Green, Canary & More

How you roll out changes to production determines your blast radius when things go wrong. Each strategy trades off between speed, safety, and infrastructure cost.

Rolling Update

Replace instances one-by-one (or batch-by-batch). Zero downtime, but old and new versions coexist briefly. Default in Kubernetes Deployments.

Blue-Green

Run two identical environments. Deploy to the idle one ("green"), test it, then switch the load balancer. Instant rollback by switching back. Requires 2x infrastructure.

Canary

Route a small percentage of traffic (1-5%) to the new version. Monitor error rates and latency. Gradually increase traffic if metrics look good. Rollback is trivial.

Feature Flags

Deploy code to all users but toggle features on/off at runtime. Decouples deployment from release. Enables A/B testing and gradual rollouts without redeploying.
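
Canary routing and percentage feature flags share one mechanism: hash a stable key into buckets so each user consistently lands on one side. A Python sketch (the `new_checkout` flag name is made up):

```python
import hashlib

def in_rollout(user_id: str, flag: str, percent: float) -> bool:
    """Sticky percentage rollout: hash (flag, user) into one of 10,000 buckets.

    The same user always lands in the same bucket, so a 5% canary serves
    a stable 5% slice of users rather than a random 5% of requests.
    """
    h = hashlib.sha256(f"{flag}:{user_id}".encode()).hexdigest()
    bucket = int(h, 16) % 10000
    return bucket < percent * 100

# Roughly 5% of users see the new checkout flow.
hits = sum(in_rollout(f"user-{i}", "new_checkout", 5.0) for i in range(10_000))
print(hits)
```

Keying the hash on the flag name as well as the user ID means different flags carve out independent 5% slices, rather than always hitting the same unlucky users.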

Database Migrations

Schema changes are the hardest part of zero-downtime deployments. The key principle: make migrations backward-compatible. Use the expand-contract pattern:

Expand-Contract Migration Pattern
1. Add new column
2. Deploy code (reads both)
3. Backfill data
4. Drop old column

Never Do This

Renaming a column or changing its type in a single migration will break running application instances that expect the old schema. Always expand first (add new), migrate data, update code, then contract (remove old).
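
The expand-contract steps can be walked through concretely with SQLite (the `users` table and column names are illustrative; the contract step is left as a comment because `DROP COLUMN` requires SQLite 3.35+):

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, fullname TEXT)")
db.execute("INSERT INTO users (fullname) VALUES ('Ada Lovelace')")

# 1. Expand: add the new column alongside the old one (backward-compatible --
#    running instances that only know `fullname` are unaffected).
db.execute("ALTER TABLE users ADD COLUMN display_name TEXT")

# 2. Deploy code that writes both columns and prefers display_name on read.

# 3. Backfill: copy existing rows while old instances still read fullname.
db.execute("UPDATE users SET display_name = fullname WHERE display_name IS NULL")

row = db.execute("SELECT fullname, display_name FROM users").fetchone()
print(row)  # ('Ada Lovelace', 'Ada Lovelace')

# 4. Contract: only after no running code reads fullname.
# db.execute("ALTER TABLE users DROP COLUMN fullname")
```

Each step is independently deployable and independently reversible, which is exactly what a rolling deployment needs.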
05 / Infrastructure as Code

IaC: Terraform, CloudFormation & GitOps

Infrastructure as Code (IaC) means defining servers, networks, and services in version-controlled configuration files instead of clicking through consoles. This makes infrastructure reproducible, reviewable, and auditable.

Terraform

Declarative, provider-agnostic. Uses HCL. Maintains a state file that maps config to real resources. Plan before apply. Multi-cloud support.

CloudFormation

AWS-native IaC. JSON/YAML templates. Manages stacks with automatic rollback on failure. Deep AWS integration but single-cloud only.

Pulumi

Define infrastructure in real programming languages (TypeScript, Python, Go). Full IDE support, loops, conditionals. Uses a state backend like Terraform.

GitOps

Git as the single source of truth. ArgoCD/Flux watch a repo and reconcile cluster state to match. Pull-based model. Audit trail is the commit log.

Terraform State

Terraform's state file (terraform.tfstate) is a JSON mapping between your HCL resources and real-world infrastructure IDs. It enables terraform plan to diff desired vs. actual state. State must be stored remotely (S3, GCS, Terraform Cloud) with locking (DynamoDB) to prevent concurrent modifications.

# Terraform remote state backend
terraform {
  backend "s3" {
    bucket         = "my-terraform-state"
    key            = "prod/terraform.tfstate"
    region         = "us-east-1"
    dynamodb_table = "terraform-locks"
    encrypt        = true
  }
}

GitOps: ArgoCD & Flux

GitOps applies the Git workflow to infrastructure. The desired state of your Kubernetes cluster is declared in a Git repository. A controller (ArgoCD or Flux) continuously watches the repo and reconciles the cluster to match. Drift detection is automatic -- if someone manually changes the cluster, the controller reverts it.

GitOps Reconciliation Loop: Git Push --> ArgoCD detects diff --> Sync to cluster --> Healthy state

Key Insight

GitOps uses a pull model: the cluster pulls desired state from Git, rather than CI pushing deployments into the cluster. This means the cluster never needs inbound network access from CI, improving security.
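
At its core the controller runs a diff-and-act loop. A simplified Python sketch of one reconciliation pass (resource names and specs are hypothetical; real controllers compare live Kubernetes objects against rendered manifests):

```python
def reconcile(desired: dict, actual: dict) -> list:
    """One pass of a GitOps-style reconciliation: diff desired vs actual.

    `desired` is what Git declares; `actual` is what the cluster reports.
    Returns the actions a controller would take to converge them.
    """
    actions = []
    for name, spec in desired.items():
        if name not in actual:
            actions.append(("create", name))
        elif actual[name] != spec:
            actions.append(("revert-drift", name))  # manual change -> put it back
    for name in actual:
        if name not in desired:
            actions.append(("delete", name))        # orphan not declared in Git
    return actions

desired = {"web": {"replicas": 3}, "api": {"replicas": 2}}
actual = {"web": {"replicas": 5}, "worker": {"replicas": 1}}  # drifted + orphan
print(reconcile(desired, actual))
```

Running this loop continuously is what makes drift correction automatic: a manual `kubectl edit` shows up as a diff on the next pass and gets reverted.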
06 / Immutable Infrastructure & Configuration Management

Immutable Infra & Ansible

Two contrasting philosophies for managing servers: treat them as cattle (immutable, replaceable) or pets (mutable, carefully maintained). Modern practice strongly favors the cattle approach.

Aspect    | Mutable (Pets)                              | Immutable (Cattle)
Updates   | SSH in, run commands, patch in place        | Build new image, replace old instances
Drift     | High risk -- servers diverge over time      | None -- every instance is from the same image
Rollback  | Difficult (undo scripts, hope for the best) | Trivial (deploy previous image)
Debugging | Check each server individually              | All instances identical, reproduce locally
Tools     | Ansible, Chef, Puppet                       | Packer + Terraform, Docker, AMIs

Immutable Infrastructure

With immutable infrastructure, you never modify a running server. Instead, you bake a new machine image (AMI, Docker image) with every change, test it, then replace running instances. This eliminates configuration drift and makes rollback as simple as pointing to the previous image version.

Immutable Deployment Flow: Code Change --> Build Image (Packer/Docker) --> Test Image --> Replace Instances

Configuration Management with Ansible

Ansible is an agentless configuration management tool that uses SSH to push configuration to servers. It uses YAML playbooks to declare desired state. While it's traditionally a "mutable infrastructure" tool, it's still widely used for provisioning base images and for environments where full immutability isn't practical.

# Ansible playbook example
- hosts: webservers
  become: yes
  tasks:
    - name: Install nginx
      apt:
        name: nginx
        state: present

    - name: Copy config
      template:
        src: nginx.conf.j2
        dest: /etc/nginx/nginx.conf
      notify: restart nginx

  handlers:
    - name: restart nginx
      service:
        name: nginx
        state: restarted

Modern Best Practice

Use Ansible (or similar) to build your base images, then use Terraform to deploy those images as immutable infrastructure. This combines the strengths of both: declarative provisioning for image creation, and immutable replacement for deployment.

Test Yourself

Question 01
In Git's object model, which object type stores a directory listing that maps filenames to blob hashes?
A tree object maps filenames (and permissions) to blob hashes and sub-tree hashes, representing a directory snapshot. Blobs store only file contents with no name metadata.

Question 02
Why should you never rebase commits that have been pushed to a shared branch?
Rebase replays commits with new SHA-1 hashes. If others have based work on the original commits, their histories diverge, leading to duplicated commits and painful conflict resolution.

Question 03
What is the time complexity of finding a bug-introducing commit using git bisect across N commits?
Git bisect performs a binary search through commit history. Each step halves the search space, so it requires at most log2(N) steps to find the offending commit.

Question 04
What does SAST stand for, and when does it run in a CI pipeline?
SAST (Static Application Security Testing) analyzes source code, bytecode, or binaries without executing the program. It runs early in CI because it needs no running environment, catching vulnerabilities like SQL injection and XSS at the code level.

Question 05
What is the key difference between Continuous Delivery and Continuous Deployment?
In Continuous Delivery, every change is proven deployable but a human decides when to release. In Continuous Deployment, every change that passes the automated pipeline goes to production automatically with no manual gate.

Question 06
In a blue-green deployment, how is rollback performed?
Blue-green keeps the previous environment running. Rollback is instant: just point the load balancer back to the "blue" (old) environment. No rebuild or redeployment needed.

Question 07
Why should you avoid rebuilding artifacts for each environment (dev, staging, prod)?
Each build can produce subtly different output (different dependency resolution, timestamps, non-deterministic compilation). Promoting the exact same artifact guarantees that what you tested is what you deploy.

Question 08
What is the purpose of Terraform's state file?
The state file is a JSON document that records the mapping between your HCL resource declarations and the real infrastructure (e.g., AWS resource IDs). This allows terraform plan to diff desired state against actual state and determine what changes to make.

Question 09
What distinguishes GitOps from traditional CI/CD push-based deployments?
GitOps uses a pull model: a controller like ArgoCD or Flux runs inside the cluster, watches a Git repo, and reconciles cluster state to match. This eliminates the need for CI to have direct cluster access, improving security.

Question 10
What is the main advantage of immutable infrastructure over mutable (in-place patching)?
With mutable infrastructure, each server accumulates unique patches and changes over time, causing "configuration drift" where nominally identical servers behave differently. Immutable infrastructure eliminates this by replacing rather than modifying servers.