## Control Plane & Worker Nodes
A Kubernetes cluster has two halves: the control plane that makes global decisions (scheduling, detecting failures, responding to events) and worker nodes that run your actual application containers.
### Control Plane
kube-apiserver is the front door -- every kubectl command, every internal component, and every webhook hits this REST API. It validates requests, persists state to etcd, and serves as the hub for all cluster communication.
etcd is the single source of truth: a distributed key-value store holding all cluster state. Losing etcd without backups means losing the cluster.
kube-scheduler watches for newly created Pods with no assigned node, then picks the best node based on resource requirements, affinity rules, taints, and other constraints.
kube-controller-manager runs a collection of control loops (Deployment controller, ReplicaSet controller, Node controller, etc.) that continuously reconcile desired state with actual state.
### Worker Nodes
kubelet is the node agent. It takes PodSpecs from the API server and ensures the described containers are running and healthy. It reports node status back to the control plane.
kube-proxy maintains network rules on each node, implementing the Service abstraction by programming iptables or IPVS rules so that traffic to a Service ClusterIP reaches the right Pod.
Container runtime (containerd, CRI-O) does the actual work of pulling images, creating containers, and managing their lifecycle via the Container Runtime Interface (CRI).
## Pods, Deployments, Services & Ingress
A Pod is the smallest deployable unit -- one or more containers sharing the same network namespace and storage volumes. Pods are ephemeral; they get an IP but that IP dies with the Pod.
A Deployment manages a ReplicaSet, which in turn manages Pods. It gives you declarative updates, rollback history, and scaling. You almost never create Pods directly.
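A minimal Deployment manifest might look like the following sketch (the name `web`, the `app: web` label, and the `nginx:1.27` image are illustrative choices, not fixed conventions):

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: web                  # illustrative name
spec:
  replicas: 3                # the Deployment keeps 3 Pods running via a ReplicaSet
  selector:
    matchLabels:
      app: web               # must match the Pod template labels below
  template:
    metadata:
      labels:
        app: web
    spec:
      containers:
      - name: web
        image: nginx:1.27    # example image
        ports:
        - containerPort: 80
```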
### Service Types
Because Pod IPs are transient, a Service provides a stable virtual IP (ClusterIP) and DNS name that load-balances traffic across matching Pods.
| Service Type | Scope | Use Case |
|---|---|---|
| ClusterIP | Internal only | Default; inter-service communication within the cluster |
| NodePort | External via node IP:port | Dev/testing; exposes a static port (30000-32767) on every node |
| LoadBalancer | External via cloud LB | Production; provisions a cloud load balancer automatically |
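A ClusterIP Service that load-balances across the example Deployment above might look like this (name and labels are again illustrative):

```yaml
apiVersion: v1
kind: Service
metadata:
  name: web-svc              # DNS name: web-svc.<namespace>.svc.cluster.local
spec:
  type: ClusterIP            # the default; omitting type has the same effect
  selector:
    app: web                 # traffic is routed to Pods carrying this label
  ports:
  - port: 80                 # port the Service listens on
    targetPort: 80           # container port on the backend Pods
```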
### Ingress
Ingress sits in front of Services and provides HTTP/HTTPS routing -- host-based and path-based rules, TLS termination, and more. It requires an Ingress Controller (NGINX, Traefik, or a cloud-native one) to function.
```yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: app-ingress
spec:
  rules:
  - host: app.example.com
    http:
      paths:
      - path: /api
        pathType: Prefix
        backend:
          service:
            name: api-svc
            port:
              number: 80
```
## ConfigMap, Secret, Storage & Specialized Workloads
ConfigMap holds key-value config data injected as environment variables or mounted as files. Not for secrets -- the data is stored in plaintext.
Secret holds base64-encoded sensitive data (passwords, tokens). Not encrypted at rest by default -- enable EncryptionConfiguration or use an external secret manager.
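As a sketch, a ConfigMap and Secret with illustrative names and keys (note the Secret value is merely base64 of `s3cr3t`, encoding rather than encryption):

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: app-config           # illustrative name
data:
  LOG_LEVEL: "info"          # consumed via envFrom or a volume mount
---
apiVersion: v1
kind: Secret
metadata:
  name: app-secret           # illustrative name
type: Opaque
data:
  DB_PASSWORD: czNjcjN0      # base64("s3cr3t") -- encoding, not encryption
```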
PersistentVolume (PV) is a cluster-level storage resource. PersistentVolumeClaim (PVC) is a user's request for storage. Decouples storage provisioning from consumption.
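A minimal PVC requesting 1Gi might look like this (the name and the `standard` StorageClass are assumptions; with dynamic provisioning, a matching PV is created automatically):

```yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: data-pvc             # illustrative name
spec:
  accessModes:
  - ReadWriteOnce            # mountable read-write by a single node at a time
  storageClassName: standard # cluster-specific; an assumption here
  resources:
    requests:
      storage: 1Gi
```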
StatefulSet is like a Deployment but gives each Pod a stable hostname (pod-0, pod-1) and its own persistent storage. Essential for databases and other stateful apps.
DaemonSet ensures a copy of a Pod runs on every node (or a labeled subset). Used for log collectors, monitoring agents, and CNI plugins.
Job runs a Pod to completion. CronJob schedules Jobs on a cron schedule. Use for batch processing, backups, and periodic tasks.
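A sketch of a nightly backup CronJob (the schedule, name, image, and arguments are all placeholders):

```yaml
apiVersion: batch/v1
kind: CronJob
metadata:
  name: nightly-backup       # illustrative name
spec:
  schedule: "0 2 * * *"      # 02:00 every day, standard cron syntax
  jobTemplate:
    spec:
      template:
        spec:
          restartPolicy: OnFailure         # Jobs require OnFailure or Never
          containers:
          - name: backup
            image: backup-tool:latest      # placeholder image
            args: ["--target", "s3://backups"]   # placeholder args
```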
Namespace provides virtual cluster partitioning: a scope for names, resource quotas, and RBAC policies. Default namespaces: default, kube-system, kube-public.
## Pod Networking, CNI & Service Mesh
Kubernetes networking has one fundamental rule: every Pod gets its own IP, and any Pod can reach any other Pod without NAT. This flat network model is implemented by a CNI plugin.
### CNI Plugins
| Plugin | Approach | Strengths |
|---|---|---|
| Calico | BGP routing or VXLAN overlay | Mature, strong NetworkPolicy support, scales to thousands of nodes |
| Cilium | eBPF-based dataplane | High performance, L7 visibility, transparent encryption, identity-based policies |
### Network Policies
By default, all Pods can talk to all Pods. A NetworkPolicy is a firewall rule scoped to a namespace that restricts ingress and/or egress traffic based on Pod labels, namespace selectors, or CIDR blocks.
```yaml
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: deny-all-ingress
spec:
  podSelector: {}            # applies to all pods in namespace
  policyTypes:
  - Ingress                  # deny all inbound by default
```
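With that deny-all baseline in place, traffic is then opened selectively. A sketch of a policy admitting only Pods labeled `app: frontend` (the labels and port are illustrative):

```yaml
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-frontend       # illustrative name
spec:
  podSelector:
    matchLabels:
      app: api               # the Pods being protected
  policyTypes:
  - Ingress
  ingress:
  - from:
    - podSelector:
        matchLabels:
          app: frontend      # only these Pods may connect
    ports:
    - protocol: TCP
      port: 8080             # illustrative port
```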
### Service Mesh
A service mesh (Istio, Linkerd) adds a sidecar proxy to every Pod, giving you mutual TLS, traffic shaping, retries, circuit breaking, and distributed tracing -- all without changing application code.
## Autoscaling, Resource Management & Pod Placement
### Autoscaling
HPA (Horizontal Pod Autoscaler) adds or removes Pod replicas based on CPU, memory, or custom metrics. VPA (Vertical Pod Autoscaler) adjusts resource requests/limits on existing Pods. Cluster Autoscaler adds or removes nodes when Pods cannot be scheduled or nodes are underutilized.
| Autoscaler | What it Scales | When to Use |
|---|---|---|
| HPA | Pod count (horizontal) | Stateless workloads with variable traffic |
| VPA | Pod CPU/memory (vertical) | When you don't know right-sized requests; typically not used with HPA on same metric |
| Cluster Autoscaler | Node count | When pending Pods exist due to insufficient cluster capacity |
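A sketch of an HPA targeting 70% average CPU utilization on the example Deployment from earlier (the name, bounds, and threshold are illustrative):

```yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: web-hpa              # illustrative name
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: web                # illustrative target
  minReplicas: 2
  maxReplicas: 10
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 70   # scale out when average CPU exceeds 70% of requests
```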
### Resource Requests & Limits
requests are what the scheduler uses to find a node with enough capacity. limits are the hard ceiling enforced by the kubelet -- exceed a memory limit and the container gets OOMKilled; exceed a CPU limit and it gets throttled. Together they determine the Pod's QoS class (see the fragment below):
Guaranteed -- requests == limits for all containers. Highest priority, last to be evicted.
Burstable -- at least one container sets a request or limit, but the Pod doesn't qualify as Guaranteed (e.g., requests < limits). Medium priority.
BestEffort -- no requests or limits set. First to be evicted under pressure.
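A container spec fragment illustrating the two fields (values are illustrative; because requests equal limits here, this Pod would be classed Guaranteed):

```yaml
# fragment of a Pod/Deployment container spec
containers:
- name: app
  image: app:1.0             # placeholder image
  resources:
    requests:                # the scheduler uses these for placement
      cpu: "500m"
      memory: "256Mi"
    limits:                  # kubelet-enforced ceiling
      cpu: "500m"            # exceeding this throttles the container
      memory: "256Mi"        # exceeding this OOMKills the container
```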
### Scheduling Constraints
Taints & Tolerations: A node taint repels Pods unless the Pod has a matching toleration. Used to reserve nodes (e.g., GPU nodes, dedicated tenant nodes).
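For example, a node tainted with `dedicated=gpu:NoSchedule` (an illustrative key/value) only admits Pods carrying a matching toleration:

```yaml
# fragment of a Pod spec; the node-side taint is applied with:
#   kubectl taint nodes <node> dedicated=gpu:NoSchedule
tolerations:
- key: "dedicated"
  operator: "Equal"
  value: "gpu"
  effect: "NoSchedule"       # tolerates the taint; untolerating Pods are repelled
```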
Node Affinity: Schedule Pods to nodes matching label expressions (required or preferred). Like a more expressive nodeSelector.
Pod Affinity / Anti-Affinity: Co-locate Pods together (affinity) or spread them apart (anti-affinity) based on topology domains (zone, node). Anti-affinity is critical for HA -- ensuring replicas land on different nodes or zones.
PodDisruptionBudget (PDB): Limits how many Pods in a Deployment can be down simultaneously during voluntary disruptions (node drains, cluster upgrades). For example, minAvailable: 2 ensures at least 2 replicas stay running.
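A sketch combining the last two ideas, assuming the `app: web` label from earlier: anti-affinity spreads replicas across nodes, and a PDB keeps two of them alive through drains:

```yaml
# fragment of a Pod template spec
affinity:
  podAntiAffinity:
    requiredDuringSchedulingIgnoredDuringExecution:
    - labelSelector:
        matchLabels:
          app: web
      topologyKey: kubernetes.io/hostname   # at most one replica per node
---
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: web-pdb              # illustrative name
spec:
  minAvailable: 2            # voluntary disruptions must leave 2 Pods running
  selector:
    matchLabels:
      app: web
```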
## Rolling, Blue-Green & Canary
### Rolling Update (default)
Kubernetes gradually replaces old Pods with new ones, controlled by maxUnavailable and maxSurge (see the fragment below). Zero downtime by default, and easy rollback with kubectl rollout undo.
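The knobs live in the Deployment spec (values here are illustrative):

```yaml
# fragment of a Deployment spec
strategy:
  type: RollingUpdate
  rollingUpdate:
    maxSurge: 1              # at most 1 extra Pod above the desired count
    maxUnavailable: 0        # never drop below the desired count: zero downtime
```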
### Blue-Green
Run two identical environments (blue = current, green = new). Once the green environment is verified, switch the Service selector to point at green. Instant rollback by switching back. Downside: requires double the resources during transition.
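A sketch of the cutover, assuming both Deployments label their Pods with an illustrative `version` label: the Service selector picks the live environment, and editing `version: blue` to `version: green` flips all traffic at once.

```yaml
apiVersion: v1
kind: Service
metadata:
  name: app-svc              # illustrative name
spec:
  selector:
    app: web
    version: blue            # change to "green" to cut over instantly
  ports:
  - port: 80
    targetPort: 80
```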
### Canary
Route a small percentage of traffic (e.g., 5%) to the new version. Monitor error rates and latency. Gradually increase traffic if metrics look good. Requires either weighted Service routing (via Istio, Linkerd, or Argo Rollouts) or manual ReplicaSet manipulation.
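Without a mesh, a rough canary can be approximated with replica ratios: two Deployments share the labels a Service selects, so traffic splits roughly in proportion to Pod counts. A sketch with illustrative names, where 1 canary replica alongside 9 stable replicas receives about 10% of traffic:

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: web-canary           # illustrative name
spec:
  replicas: 1                # vs. replicas: 9 on the stable Deployment
  selector:
    matchLabels:
      app: web               # same label the Service selects
      track: canary
  template:
    metadata:
      labels:
        app: web
        track: canary        # distinguishes canary Pods for monitoring
    spec:
      containers:
      - name: web
        image: app:2.0       # the new version under test
```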
| Strategy | Downtime | Resource Cost | Rollback Speed |
|---|---|---|---|
| Rolling | None | Low (gradual) | Moderate (roll forward/back) |
| Blue-Green | None | High (2x resources) | Instant (switch selector) |
| Canary | None | Low-Medium | Fast (shift traffic back) |