Interactive guide to metrics-driven scaling, target utilization, and how HPA changes replica counts automatically.
Horizontal Pod Autoscaling is a control loop that reacts to observed metrics. It depends on a metrics pipeline and on realistic workload resource settings.
Core Model
Understand the Concept First
Repository YAML Files:
k8s/labs/workloads/hpa.yaml — HorizontalPodAutoscaler targeting a workload with CPU-based scale rules.
k8s/labs/workloads/app-hpa.yaml — Sample app Deployment paired with HPA-friendly resource requests for labs.
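The repository files themselves are not reproduced here, but a minimal HorizontalPodAutoscaler of the kind hpa.yaml describes might look like the following sketch. All names, replica bounds, and the 70% threshold are illustrative assumptions, not the repository's actual values:

```yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: app-hpa                  # illustrative name
spec:
  scaleTargetRef:                # the workload this HPA controls
    apiVersion: apps/v1
    kind: Deployment
    name: app                    # assumed Deployment name
  minReplicas: 2                 # never scale below this
  maxReplicas: 10                # never scale above this
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 70   # target: average CPU at 70% of requests
```

The autoscaling/v2 API shown here is the current stable version; it expresses targets through the metrics list rather than the older targetCPUUtilizationPercentage field.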
Automatic scaling
HPA adjusts replica counts in response to changing resource usage or other supported metrics.
Metrics-dependent
Built-in autoscaling typically depends on Metrics Server for CPU and memory metrics.
Request-aware
CPU utilization targets are meaningful only when resource requests are set sensibly, because the controller computes utilization as actual usage divided by the container's CPU request.
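Because utilization is measured against requests, the Deployment being scaled must declare them, as the lab's app-hpa.yaml pairing suggests. A minimal sketch (image, names, and values are illustrative assumptions):

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: app                      # assumed Deployment name
spec:
  replicas: 2
  selector:
    matchLabels:
      app: app
  template:
    metadata:
      labels:
        app: app
    spec:
      containers:
      - name: app
        image: nginx:1.27        # placeholder image
        resources:
          requests:
            cpu: 200m            # HPA utilization = usage / this request
            memory: 128Mi
          limits:
            cpu: 500m
```

With a 200m request and a 70% utilization target, the HPA aims to keep average per-pod CPU usage around 140m; pods with no CPU request make the percentage undefined and the HPA reports an error instead of scaling.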
Lifecycle Flow
Autoscaling Control Loop
HPA is not a one-time action. The controller re-evaluates the workload on a fixed interval (15 seconds by default) and compares the observed metrics against the target each cycle.
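The decision the loop applies on each cycle follows the documented HPA formula, desiredReplicas = ceil(currentReplicas × currentMetricValue / desiredMetricValue). A small sketch of that arithmetic:

```python
import math

def desired_replicas(current_replicas: int,
                     current_utilization: float,
                     target_utilization: float) -> int:
    """Core HPA scaling rule:
    desired = ceil(current * currentMetric / targetMetric)."""
    return math.ceil(current_replicas * current_utilization / target_utilization)

# 4 pods averaging 90% CPU against a 60% target -> scale out to 6
print(desired_replicas(4, 90, 60))   # 6
# 6 pods averaging 20% against a 60% target -> scale in to 2
print(desired_replicas(6, 20, 60))   # 2
```

In practice the real controller also clamps the result to minReplicas/maxReplicas, applies a tolerance band around the target, and respects stabilization windows, so small metric fluctuations do not cause constant rescaling.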