
Kubernetes Cordon & Drain Workflow

Node Maintenance & Pod Migration Guide

3-State Node Lifecycle

Ready (Schedulable) → Cordoned (SchedulingDisabled) → Drained (Empty & Disabled) → Ready (Schedulable)

Cordon

  • Blocks new pods
  • Existing pods run
  • Reversible instantly
  • No disruption

Drain

  • Evicts all pods
  • Respects PDB
  • Graceful shutdown
  • Auto cordons node

Uncordon

  • Re-enables scheduling
  • New pods can land on the node
  • Existing pods are not migrated back automatically
  • Run after maintenance is complete
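Under the hood, cordoning simply edits the Node object: it sets `spec.unschedulable: true`, and the `node.kubernetes.io/unschedulable:NoSchedule` taint is added so the scheduler skips the node. A minimal sketch of detecting that state — `NODE_JSON` below is an abridged, assumed sample of `kubectl get node worker-node-1 -o json` output after cordoning, and the string check stands in for a real jsonpath query:

```shell
# Abridged sample Node object as it looks after `kubectl cordon`:
# spec.unschedulable is set and the NoSchedule taint is present.
NODE_JSON='{"spec":{"unschedulable":true,"taints":[{"key":"node.kubernetes.io/unschedulable","effect":"NoSchedule"}]}}'

# Crude string check standing in for
# `kubectl get node worker-node-1 -o jsonpath='{.spec.unschedulable}'`
if echo "$NODE_JSON" | grep -q '"unschedulable":true'; then
  STATUS=cordoned
else
  STATUS=schedulable
fi
echo "$STATUS"
```

Uncordoning reverses both changes, which is why it is instant and disruption-free.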

Pod Migration Timeline

Before Drain:
  worker-node-1: P1, P2, P3, P4
  worker-node-2: P5, P6

During Drain:
  worker-node-1: P1, P2, P3, P4 (terminating)
  worker-node-2: P5, P6 + P1, P2, P3, P4 (starting)

After Drain:
  worker-node-1: (empty)
  worker-node-2: P5, P6, P1, P2, P3, P4
Note: Drain respects PodDisruptionBudget (PDB) to ensure application availability. DaemonSets remain on the node unless --ignore-daemonsets is used.
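The PodDisruptionBudget the note refers to is an ordinary API object. A minimal sketch — the `nginx-pdb` name and `app: nginx` selector are assumptions for illustration — that keeps at least one matching pod running during voluntary disruptions such as a drain:

```yaml
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: nginx-pdb        # hypothetical name
spec:
  minAvailable: 1        # eviction is blocked if it would drop below this
  selector:
    matchLabels:
      app: nginx         # assumed label on the protected pods
```

If evicting a pod would violate `minAvailable`, drain retries the eviction until the budget allows it, which is the usual reason a drain appears to hang.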
1. Cordon a Node (Block New Pods)
kubectl cordon worker-node-1
kubectl get nodes
NAME             STATUS                     ROLES    AGE    VERSION
master           Ready                      control  10d    v1.27.4
worker-node-1    Ready,SchedulingDisabled   <none>   10d    v1.27.4
worker-node-2    Ready                      <none>   10d    v1.27.4
2. Drain a Node (Evict All Pods)
kubectl drain worker-node-1 --ignore-daemonsets --force
Flags:
  • --ignore-daemonsets - Skip DaemonSet pods (system components)
  • --force - Evict pods that are not managed by a controller
  • --delete-emptydir-data - Evict pods using emptyDir volumes (their local data is lost)
  • --grace-period=30 - Override the pods' graceful termination period (seconds)
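As a sketch, the flags above can be collected into a small helper so every node is drained with identical settings; the `drain_cmd` name and the chosen flag values are assumptions for this example, not a standard tool:

```shell
# Hypothetical helper: build one consistent drain command per node.
drain_cmd() {
  local node="$1"
  printf 'kubectl drain %s --ignore-daemonsets --delete-emptydir-data --grace-period=30' "$node"
}

CMD=$(drain_cmd worker-node-1)
echo "$CMD"   # review the command before running it, e.g. via: eval "$CMD"
```

Printing the command before executing it makes the drain settings auditable in change reviews.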
3. Verify Pod Migration
kubectl get pods -o wide
NAME          READY   STATUS    RESTARTS   AGE   NODE
sample-pod    1/1     Running   0          2m    worker-node-2
4. Uncordon a Node (Re-enable Scheduling)
kubectl uncordon worker-node-1
kubectl get nodes
NAME             STATUS    ROLES    AGE    VERSION
master           Ready     control  10d    v1.27.4
worker-node-1    Ready     <none>   10d    v1.27.4
worker-node-2    Ready     <none>   10d    v1.27.4

Example Pod YAML

apiVersion: v1
kind: Pod
metadata:
  name: sample-pod
spec:
  containers:
  - name: nginx
    image: nginx:1.23
    ports:
    - containerPort: 80
Deploy & Verify
kubectl apply -f sample-pod.yaml
kubectl get pods -o wide

Best Practices & Common Scenarios

Pre-Drain Checklist
  • Verify cluster has sufficient capacity for migrated pods
  • Check PodDisruptionBudgets (PDBs) won't block eviction
  • Review DaemonSets - use --ignore-daemonsets flag
  • Ensure StatefulSets have proper replica management
  • Backup critical pod configurations before draining
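The checklist can be turned into a reviewable command list. This sketch only prints the commands rather than executing them, so the plan can be eyeballed first; the node name `worker-node-1` is an assumption:

```shell
NODE=worker-node-1

# Pre-drain checks, printed rather than executed so the list is auditable.
CHECKS="kubectl get pdb --all-namespaces
kubectl get pods --field-selector spec.nodeName=${NODE} -o wide
kubectl describe node ${NODE}
kubectl drain ${NODE} --ignore-daemonsets --dry-run=client"

echo "$CHECKS"
N=$(echo "$CHECKS" | wc -l)
```

The `--dry-run=client` drain at the end reports what would be evicted without touching the cluster.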
🔄 Node Maintenance Workflow
# Step 1: Cordon the node
kubectl cordon worker-node-1

# Step 2: Drain workloads
kubectl drain worker-node-1 --ignore-daemonsets --force

# Step 3: Perform maintenance (OS upgrade, patches, etc.)
# ... maintenance operations ...

# Step 4: Uncordon when ready
kubectl uncordon worker-node-1

# Step 5: Verify node is schedulable
kubectl get nodes
kubectl describe node worker-node-1
📋 Use Case: Rolling Node Upgrades
# Upgrade worker-node-1
kubectl drain worker-node-1 --ignore-daemonsets --delete-emptydir-data
# Upgrade Kubernetes version or OS
kubectl uncordon worker-node-1

# Wait for node to be Ready, then repeat for next node
kubectl drain worker-node-2 --ignore-daemonsets --delete-emptydir-data
# Upgrade worker-node-2
kubectl uncordon worker-node-2
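The per-node sequence above can be wrapped in a loop. This is a sketch, not a turnkey script: by default `RUN=echo` makes it print the commands instead of executing them (set `RUN=` to run for real), and the node list is an assumption for this example cluster:

```shell
RUN=${RUN:-echo}   # default: print commands only; set RUN= to actually execute
NODES="worker-node-1 worker-node-2"

PLAN=$(
  for node in $NODES; do
    $RUN kubectl drain "$node" --ignore-daemonsets --delete-emptydir-data
    # ... perform the OS/kubelet upgrade on $node here ...
    $RUN kubectl uncordon "$node"
    # Block until the node reports Ready before moving to the next one
    $RUN kubectl wait --for=condition=Ready "node/$node" --timeout=300s
  done
)
echo "$PLAN"
```

Waiting for the Ready condition between nodes keeps at least one worker schedulable at all times, matching the reminder below about never draining all nodes simultaneously.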
🚨 Troubleshooting Common Issues
Issue: Drain hangs or fails
  • Check PodDisruptionBudget constraints: kubectl get pdb
  • Review pods blocking eviction: kubectl get pods --field-selector spec.nodeName=worker-node-1
  • Use --force for orphaned pods without controllers
  • Increase grace period: --grace-period=60
Issue: Pods not rescheduling
  • Check cluster resource availability: kubectl top nodes
  • Verify node affinity/taints aren't blocking scheduling
  • Review scheduler events: kubectl get events --sort-by='.lastTimestamp'
Issue: DaemonSet pods remain after drain
  • This is expected behavior - DaemonSets run on all nodes
  • Use --ignore-daemonsets to skip them during drain
  • DaemonSets (monitoring, logging agents) should stay active
Critical Reminder: Always verify node status with kubectl get nodes and pod distribution with kubectl get pods -o wide after each operation. Never drain all nodes simultaneously - maintain cluster availability.
Summary Table

Command           | Purpose                         | Effect on Pods
kubectl cordon    | Mark node unschedulable         | Existing pods unaffected
kubectl drain     | Evict pods & mark unschedulable | Running pods rescheduled
kubectl uncordon  | Mark node schedulable           | Allows new pod scheduling