
Jobs and Batch Processing

Interactive guide to run-to-completion workloads, retries, completions, and parallel batch execution in Kubernetes.

A Job is about successful completion, not continuous availability. That changes how Kubernetes measures success, handles retries, and cleans up Pods.

Core Model

Understand the Concept First

Repository YAML Files:
  • k8s/labs/workloads/jobs.yaml: Job manifest demonstrating run-to-completion work with Pod template and completion settings.

Run to completion

Jobs are the right workload for migrations, reports, calculations, and one-time administrative tasks.

Retry-aware

When a Pod fails, the Job retries the work until the number of failures exceeds backoffLimit (6 by default), waiting progressively longer between attempts.

Parallel capable

A Job can run a single Pod or several Pods in parallel, controlled by spec.parallelism and spec.completions.

Lifecycle Flow

Job Controller Lifecycle

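1. Create Job

The Job declares a Pod template plus success and retry settings.

2. Launch Pods

The Job controller creates Pods to execute the batch task.

3. Track success

Each Pod that exits successfully counts toward the desired number of completions (spec.completions).

4. Retry failures

Failed runs are retried until backoffLimit is exceeded. With restartPolicy: OnFailure, the kubelet restarts the failed container inside the same Pod; with restartPolicy: Never, the Job controller creates a replacement Pod.

5. Finish and retain state

The Job is marked Complete once the required completions are reached, or Failed once backoffLimit is exceeded. The finished Job object stays in the cluster so its status can be inspected until it is deleted.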

Jobs are controller-driven just like Deployments, but the success metric is completions rather than continuously running replicas.
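
Because a finished Job is retained by default, cleanup is a separate decision. The manifest below is a minimal sketch of one option, assuming a cluster version where the ttlSecondsAfterFinished field is available. The Job name, image tag, and the one-hour value are illustrative assumptions, not settings taken from this guide's repository.

```yaml
apiVersion: batch/v1
kind: Job
metadata:
  name: cleanup-demo                # hypothetical name
spec:
  ttlSecondsAfterFinished: 3600     # assumed value: delete the Job and its Pods 1 hour after it finishes
  template:
    spec:
      containers:
      - name: task
        image: busybox:1.36         # assumed image, any short-lived container works
        command: ["sh", "-c", "echo done"]
      restartPolicy: Never
```

Without this field, the Job and its Pods remain until someone deletes them, which is useful for debugging but accumulates objects over time.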
Visual Diagrams

Interactive Job Patterns

Simple Job Lifecycle

Diagram (timeline): kubectl apply creates the Job resource in the API server; the Job controller creates and manages a Pod that executes the task; the Pod finishes successfully with exit code 0; the Job reaches the completed state and its state is retained.

Parallel Job Execution (parallelism: 3)

Diagram: with completions: 6 and parallelism: 3, the Job controller creates up to 3 Pods at a time. Pods 1 to 3 run concurrently; as each one completes, a new Pod starts (Pod 4, then Pods 5 and 6) until all 6 completions are reached.
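
The parallel pattern above can be expressed with two fields on the Job spec. This is a minimal sketch: the Job name, the busybox image, and the sleep-based workload are placeholders chosen for illustration.

```yaml
apiVersion: batch/v1
kind: Job
metadata:
  name: parallel-demo               # hypothetical name
spec:
  completions: 6                    # the Job is complete after 6 Pods succeed
  parallelism: 3                    # at most 3 Pods run at the same time
  template:
    spec:
      containers:
      - name: worker
        image: busybox:1.36         # assumed image
        command: ["sh", "-c", "echo processing one work item; sleep 10"]
      restartPolicy: Never
```

Watching kubectl get pods -l job-name=parallel-demo while this runs should show no more than three Pods active at once.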

Failure Handling with Backoff Limit

Diagram: with backoffLimit: 2 and restartPolicy: Never, Attempt 1 runs and fails with exit code 1 (Retry 1), Attempt 2 runs and fails with exit code 1 (Retry 2), and Attempt 3 runs and fails. backoffLimit is the maximum number of retries, so once both retries are used up the Job stops retrying and is marked Failed. Because restartPolicy is Never, each failure results in a new Pod.
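
To observe this behavior directly, a Job whose container always exits non-zero is enough. The sketch below uses a hypothetical Job name and an assumed busybox image. Since backoffLimit counts retries, a value of 2 should yield up to three failed Pods in total before the Job gives up.

```yaml
apiVersion: batch/v1
kind: Job
metadata:
  name: always-fails                # hypothetical name
spec:
  backoffLimit: 2                   # allow 2 retries after the first failure
  template:
    spec:
      containers:
      - name: fail
        image: busybox:1.36         # assumed image
        command: ["sh", "-c", "echo simulated failure; exit 1"]
      restartPolicy: Never          # each failure produces a new Pod instead of an in-place restart
```

Once the limit is exceeded, kubectl describe job always-fails should report a Failed condition with the reason BackoffLimitExceeded.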
YAML and Commands

Examples You Can Recognize Quickly

Basic Job
apiVersion: batch/v1
kind: Job
metadata:
  name: pi
spec:
  template:
    spec:
      containers:
      - name: pi
        image: perl
        command: ["perl", "-Mbignum=bpi", "-wle", "print bpi(2000)"]
      restartPolicy: Never
  backoffLimit: 4

Useful Commands
kubectl get jobs
kubectl describe job pi
kubectl get pods -l job-name=pi
kubectl logs -l job-name=pi
Decision Guide

Job vs Deployment

Feature            Job                         Deployment
Purpose            Run to completion           Keep application running
Lifecycle          Ends when work finishes     Runs indefinitely
Success criteria   Completion count reached    Desired replicas remain available
Restart policy     Never or OnFailure          Always (Pods are replaced continuously)

If the workload should stop when the work is done, it should usually be a Job rather than a Deployment.
Use It Well

Practice and Real-World Thinking

Database migrations

Run schema changes before or during rollout in a controlled way.
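
One way to keep a migration controlled is to forbid automatic retries and bound the run time, so a failed schema change is investigated rather than re-run blindly. The manifest below is a sketch under stated assumptions: the image name, the migrate command, and the ten-minute deadline are all hypothetical and should be replaced with your own migration tooling.

```yaml
apiVersion: batch/v1
kind: Job
metadata:
  name: db-migrate                  # hypothetical name
spec:
  backoffLimit: 0                   # do not retry a failed migration automatically
  activeDeadlineSeconds: 600        # assumed limit: fail the Job if it runs longer than 10 minutes
  template:
    spec:
      containers:
      - name: migrate
        image: registry.example.com/myapp-migrations:1.0.0   # hypothetical image
        command: ["./migrate", "up"]                          # hypothetical command
      restartPolicy: Never
```

Whether retries are safe depends on whether the migration itself is idempotent; if it is, a small non-zero backoffLimit may be reasonable.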

Batch reports

Generate output files, analytics, or long-running calculations in Pods that exist only for the duration of the work.

Administrative automation

Backups, repair tasks, and one-time system operations fit naturally into Jobs.