3-State Node Lifecycle

Ready (Schedulable) → Cordoned (SchedulingDisabled) → Drained (Empty & Disabled) → Ready (Schedulable)
Cordon
- Blocks new pods
- Existing pods run
- Reversible instantly
- No disruption
Drain
- Evicts all pods
- Respects PDB
- Graceful shutdown
- Auto cordons node
Uncordon
- Re-enables scheduling
- Pods can schedule
- No auto migration
- Post-maintenance
Pod Migration Timeline

Before Drain: worker-node-1 runs P1, P2, P3, P4; worker-node-2 runs P5, P6.

During Drain: P1-P4 are evicted from worker-node-1 and rescheduled onto worker-node-2 alongside P5 and P6.

After Drain: worker-node-1 is empty; worker-node-2 runs P1-P6.
Note: Drain respects PodDisruptionBudget (PDB) to ensure application availability. DaemonSet pods are never evicted: with --ignore-daemonsets they are skipped and remain on the node; without it, drain refuses to proceed on a node running DaemonSet pods.
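For reference, a minimal PodDisruptionBudget sketch that would keep at least one replica available during a drain. The name and the app: nginx selector are hypothetical, chosen to match the nginx pod used later on this page:

```yaml
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: nginx-pdb
spec:
  minAvailable: 1        # eviction may not reduce available replicas below this
  selector:
    matchLabels:
      app: nginx         # hypothetical label; match it to your workload
```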
Cordon a Node (Block New Pods)
kubectl cordon worker-node-1
```
NAME            STATUS                     ROLES     AGE   VERSION
master          Ready                      control   10d   v1.27.4
worker-node-1   Ready,SchedulingDisabled   <none>    10d   v1.27.4
worker-node-2   Ready                      <none>    10d   v1.27.4
```
Drain a Node (Evict All Pods)
kubectl drain worker-node-1 --ignore-daemonsets --force
Flags:
- --ignore-daemonsets - Skip DaemonSet pods (system components)
- --force - Force eviction for pods without controllers
- --delete-emptydir-data - Delete pods with emptyDir volumes
- --grace-period=30 - Graceful termination period (seconds)
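Before evicting anything for real, recent kubectl versions can preview a drain with a client-side dry run. A minimal sketch using this page's example node (the kubectl-availability check is only there so the script degrades gracefully outside a cluster):

```shell
# Hypothetical node name taken from this page's examples.
node="worker-node-1"

if command -v kubectl >/dev/null 2>&1; then
  # Lists the pods that would be evicted, without actually evicting them.
  kubectl drain "$node" --ignore-daemonsets --dry-run=client
else
  echo "kubectl not found; skipping dry run for $node"
fi
```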
Verify Pod Migration
kubectl get pods -o wide
```
NAME         READY   STATUS    RESTARTS   AGE   NODE
sample-pod   1/1     Running   0          2m    worker-node-2
```
Uncordon a Node (Re-enable Scheduling)
kubectl uncordon worker-node-1
```
NAME            STATUS   ROLES     AGE   VERSION
master          Ready    control   10d   v1.27.4
worker-node-1   Ready    <none>    10d   v1.27.4
worker-node-2   Ready    <none>    10d   v1.27.4
```
Example Pod YAML
```yaml
apiVersion: v1
kind: Pod
metadata:
  name: sample-pod
spec:
  containers:
  - name: nginx
    image: nginx:1.23
    ports:
    - containerPort: 80
```
Deploy & Verify
```
kubectl apply -f sample-pod.yaml
kubectl get pods -o wide
```
Best Practices & Common Scenarios
Pre-Drain Checklist
- Verify cluster has sufficient capacity for migrated pods
- Check PodDisruptionBudgets (PDBs) won't block eviction
- Review DaemonSets - use the --ignore-daemonsets flag
- Ensure StatefulSets have proper replica management
- Backup critical pod configurations before draining
Node Maintenance Workflow
```
# Step 1: Cordon the node
kubectl cordon worker-node-1

# Step 2: Drain workloads
kubectl drain worker-node-1 --ignore-daemonsets --force

# Step 3: Perform maintenance (OS upgrade, patches, etc.)
# ... maintenance operations ...

# Step 4: Uncordon when ready
kubectl uncordon worker-node-1

# Step 5: Verify node is schedulable
kubectl get nodes
kubectl describe node worker-node-1
```
Use Case: Rolling Node Upgrades
```
# Upgrade worker-node-1
kubectl drain worker-node-1 --ignore-daemonsets --delete-emptydir-data
# Upgrade Kubernetes version or OS
kubectl uncordon worker-node-1

# Wait for node to be Ready, then repeat for next node
kubectl drain worker-node-2 --ignore-daemonsets --delete-emptydir-data
# Upgrade worker-node-2
kubectl uncordon worker-node-2
```
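The two-node sequence above generalizes naturally to a loop. A sketch assuming the node names from this page, with the actual upgrade left as a placeholder and kubectl wait used to block until each node reports Ready (the kubectl-availability check lets the script run harmlessly outside a cluster):

```shell
# Roll through workers one at a time so the cluster keeps serving traffic.
for node in worker-node-1 worker-node-2; do
  if ! command -v kubectl >/dev/null 2>&1; then
    echo "kubectl not found; skipping $node"
    continue
  fi
  kubectl drain "$node" --ignore-daemonsets --delete-emptydir-data
  # ... upgrade the OS or kubelet on $node here ...
  kubectl uncordon "$node"
  # Wait until the node reports Ready before draining the next one.
  kubectl wait --for=condition=Ready "node/$node" --timeout=300s
done
```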
Troubleshooting Common Issues
Issue: Drain hangs or fails
- Check PodDisruptionBudget constraints: kubectl get pdb
- Review pods blocking eviction: kubectl get pods --field-selector spec.nodeName=worker-node-1
- Use --force for orphaned pods without controllers
- Increase grace period: --grace-period=60

Issue: Evicted pods stuck in Pending
- Check cluster resource availability: kubectl top nodes
- Verify node affinity/taints aren't blocking scheduling
- Review scheduler events: kubectl get events --sort-by='.lastTimestamp'

Issue: DaemonSet pods are not evicted
- This is expected behavior - DaemonSets run on all nodes
- Use --ignore-daemonsets to skip them during drain
- DaemonSets (monitoring, logging agents) should stay active
Critical Reminder: Always verify node status with kubectl get nodes and pod distribution with kubectl get pods -o wide after each operation. Never drain all nodes simultaneously - maintain cluster availability.
Summary Table
| Command | Purpose | Effect on Pods |
|---|---|---|
| kubectl cordon | Mark node unschedulable | Existing pods unaffected |
| kubectl drain | Evict pods & mark unschedulable | Running pods rescheduled |
| kubectl uncordon | Mark node schedulable | Allows new pod scheduling |