By default, the scheduler picks any node with enough CPU and memory. You have no control over which node a Pod lands on, whether related Pods end up together, or whether replicas spread out.
Control where Pods land in your cluster — attract them to specific nodes, co-locate them with other Pods, or spread them apart for high availability.
Kubernetes provides three affinity mechanisms: Node Affinity (attract Pods to nodes by node labels), Pod Affinity (co-locate Pods together), and Pod Anti-Affinity (spread Pods apart). Each supports hard requirements and soft preferences.
Lab files:

- `k8s/labs/scheduling/affinity/node-affinity.yaml` — Pod with required node affinity matching node labels (e.g. env=production).
- `k8s/labs/scheduling/affinity/node-affinity-2.yaml` — Pod with required node affinity targeting nodes labeled env=staging.
- `k8s/labs/scheduling/affinity/pod-anti-affinity.yaml` — Pod anti-affinity to spread Pods across nodes or topology domains.
- `k8s/labs/scheduling/affinity/pod-anti-affinity-2.yaml` — Second anti-affinity example with different constraints.
Attract a Pod to nodes matching specific node labels (e.g. env=production, disktype=ssd, gpu=true). It's like `nodeSelector`, but far more expressive.
Schedule a Pod on the same node (or zone) as another Pod. Use when a cache sidecar needs to sit alongside the app, or a logging agent must co-locate for low latency.
Prevent a Pod from landing on a node (or zone) that already has a matching Pod. Critical for spreading database replicas or web frontends across failure domains.
Node Affinity = "I want a seat in the VIP section" (choose your area by room labels).
Pod Affinity = "Seat me next to my colleague" (co-locate with a specific person).
Pod Anti-Affinity = "Don't seat two managers at the same table" (spread leaders across tables for broader coverage).
Required (requiredDuringScheduling...) = "I absolutely must sit in VIP — if VIP is full, I'll wait in the lobby (Pending) rather than sit elsewhere."
Preferred (preferredDuringScheduling...) = "I'd like VIP, but if it's full, I'll take any available seat."
What it matches: Node labels (characteristics of the machine)
Spec path: spec.affinity.nodeAffinity
When to use:
Topology key: Not used — rules match node labels directly.
What it matches: Labels of other Pods already running
Spec path: spec.affinity.podAffinity
When to use:
Topology key: Required — defines the "same location" boundary (hostname, zone, region).
What it matches: Labels of other Pods already running
Spec path: spec.affinity.podAntiAffinity
When to use:
Topology key: Required — defines the "different location" boundary.
requiredDuringSchedulingIgnoredDuringExecution
The rule must be satisfied. If no node qualifies, the Pod stays Pending indefinitely. Use for non-negotiable placement needs.
preferredDuringSchedulingIgnoredDuringExecution
The scheduler tries to satisfy the rule. If it can't, the Pod is still scheduled on the best available node. Uses a weight (1-100) to rank preferences.
Common topology keys:

- `kubernetes.io/hostname` — per-node granularity (most common).
- `topology.kubernetes.io/zone` — per availability zone (for HA across zones).
- `topology.kubernetes.io/region` — per region.
nodeAffinity: env In [production]. The scheduler evaluates each node's labels. Only Worker Node 1 matches, so the Pod lands there. If no node had matched, the Pod would stay Pending.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: production-pod
spec:
  affinity:
    nodeAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
        nodeSelectorTerms:
        - matchExpressions:
          - key: env
            operator: In
            values:
            - production
  containers:
  - name: nginx
    image: nginx
```
The Pod must land on a node with label env=production. If no such node exists, the Pod stays Pending.
```yaml
apiVersion: v1
kind: Pod
metadata:
  name: preferred-ssd-pod
spec:
  affinity:
    nodeAffinity:
      preferredDuringSchedulingIgnoredDuringExecution:
      - weight: 80
        preference:
          matchExpressions:
          - key: disktype
            operator: In
            values:
            - ssd
      - weight: 20
        preference:
          matchExpressions:
          - key: env
            operator: In
            values:
            - production
  containers:
  - name: app
    image: nginx
```
Scheduler scores nodes: SSD nodes get +80, production nodes get +20. Best match wins, but the Pod still schedules even if neither matches.
```yaml
apiVersion: v1
kind: Pod
metadata:
  name: cache-pod
spec:
  affinity:
    podAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
      - labelSelector:
          matchExpressions:
          - key: app
            operator: In
            values:
            - myapp
        topologyKey: kubernetes.io/hostname
  containers:
  - name: redis
    image: redis:7
```
This Redis cache Pod must land on the same node as a Pod labeled app=myapp. The topologyKey = hostname means "same node".
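For the rule above to be satisfiable, a Pod labeled app=myapp must already be running somewhere in the cluster. A minimal sketch of such a target Pod (the name `myapp-pod` and the nginx image are illustrative placeholders):

```yaml
# Hypothetical target Pod: its app=myapp label is what the
# cache Pod's podAffinity labelSelector matches against.
apiVersion: v1
kind: Pod
metadata:
  name: myapp-pod          # illustrative name
  labels:
    app: myapp             # the label the affinity rule selects
spec:
  containers:
  - name: app
    image: nginx           # stand-in application image
```

Once this Pod is scheduled, the cache Pod's required affinity resolves to the same node; if no app=myapp Pod exists, the cache Pod stays Pending.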
```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: web-frontend
spec:
  replicas: 3
  selector:
    matchLabels:
      app: web
  template:
    metadata:
      labels:
        app: web
    spec:
      affinity:
        podAntiAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
          - labelSelector:
              matchExpressions:
              - key: app
                operator: In
                values:
                - web
            topologyKey: kubernetes.io/hostname
      containers:
      - name: nginx
        image: nginx
```
Each replica must be on a different node. With 3 replicas you need at least 3 nodes. If only 2 nodes exist, the third replica stays Pending.
```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: ha-database
spec:
  replicas: 3
  selector:
    matchLabels:
      app: database
  template:
    metadata:
      labels:
        app: database
    spec:
      affinity:
        nodeAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            nodeSelectorTerms:
            - matchExpressions:
              - key: env
                operator: In
                values:
                - production
        podAntiAffinity:
          preferredDuringSchedulingIgnoredDuringExecution:
          - weight: 100
            podAffinityTerm:
              labelSelector:
                matchExpressions:
                - key: app
                  operator: In
                  values:
                  - database
              topologyKey: topology.kubernetes.io/zone
      containers:
      - name: postgres
        image: postgres:16
```
Must be on production nodes (hard node affinity). Prefers to spread replicas across different zones (soft anti-affinity). Classic HA database pattern.
| Operator | Meaning | Example | Available In |
|---|---|---|---|
| In | Label value is one of the listed values | key: env, values: [production, staging] | Node + Pod affinity |
| NotIn | Label value is NOT one of the listed values | key: env, values: [dev] — excludes dev | Node + Pod affinity |
| Exists | Label key exists (any value) | key: gpu — any node that has a gpu label | Node + Pod affinity |
| DoesNotExist | Label key must NOT exist | key: temporary — exclude temporary nodes | Node + Pod affinity |
| Gt | Label value greater than (integer) | key: cpu-cores, values: [4] — nodes with >4 cores | Node affinity only |
| Lt | Label value less than (integer) | key: cpu-cores, values: [16] — nodes with <16 cores | Node affinity only |
| Topology Key | Scope | Use Case |
|---|---|---|
| kubernetes.io/hostname | Per node | Spread replicas across individual nodes (most common) |
| topology.kubernetes.io/zone | Per availability zone | Spread across AZs for zone-level HA |
| topology.kubernetes.io/region | Per region | Spread across regions (multi-region clusters) |
| Custom label key | Custom domain | Any node label can serve as a topology key (e.g. rack) |
matchExpressions within a single term are ANDed together. Multiple nodeSelectorTerms are ORed — the node must match at least one term.

| Aspect | Node Affinity | Pod Affinity | Pod Anti-Affinity |
|---|---|---|---|
| Purpose | Attract Pods to specific nodes | Co-locate Pods together | Spread Pods apart |
| Matches against | Node labels | Pod labels | Pod labels |
| Topology key | Not used | Required | Required |
| Operators | In, NotIn, Exists, DoesNotExist, Gt, Lt | In, NotIn, Exists, DoesNotExist | In, NotIn, Exists, DoesNotExist |
| Required? | Yes | Yes | Yes |
| Preferred? | Yes (with weights 1-100) | Yes (with weights 1-100) | Yes (with weights 1-100) |
| Common use case | GPU nodes, SSD nodes, env isolation | Cache with app, sidecar co-location | HA replicas, zone spread |
| Performance note | Lightweight — checks node labels | Heavier — scans existing Pods | Heavier — scans existing Pods |
| Feature | NodeSelector | Node Affinity |
|---|---|---|
| Complexity | Simple key=value match | Rich expressions with 6 operators |
| Soft constraints | No — always required | Yes — preferred with weights |
| Multiple rules | AND only | AND within a term, OR across terms |
| Best for | Simple placement needs | Complex, multi-criteria scheduling |
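The AND-within-a-term, OR-across-terms semantics can be shown in a single spec. A sketch (the label keys are illustrative): a node qualifies if it matches either term, and within each term every expression must hold:

```yaml
# Hypothetical Pod spec fragment illustrating AND vs OR semantics.
spec:
  affinity:
    nodeAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
        nodeSelectorTerms:
        # Term 1: env=production AND disktype=ssd
        # (matchExpressions within one term are ANDed)
        - matchExpressions:
          - key: env
            operator: In
            values:
            - production
          - key: disktype
            operator: In
            values:
            - ssd
        # Term 2: any node with a gpu label
        # (terms are ORed: matching either term is enough)
        - matchExpressions:
          - key: gpu
            operator: Exists
```

A plain nodeSelector cannot express this: it only ANDs exact key=value matches.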
| Action | Command |
|---|---|
| Label a node | kubectl label nodes worker-1 env=production |
| Remove a node label | kubectl label nodes worker-1 env- |
| Show node labels | kubectl get nodes --show-labels |
| Show specific label column | kubectl get nodes -L env,disktype |
| Check Pod placement | kubectl get pods -o wide |
| View affinity rules on a Pod | kubectl get pod NAME -o yaml \| grep -A 25 affinity |
| Which node is a Pod on? | kubectl get pod NAME -o jsonpath='{.spec.nodeName}' |
| Pods sorted by node | kubectl get pods -o wide --sort-by=.spec.nodeName |
| Scheduling events | kubectl get events --sort-by=.metadata.creationTimestamp \| grep -i schedul |
| Describe Pod (FailedScheduling) | kubectl describe pod NAME |
| Show Pod labels | kubectl get pods --show-labels |
| Node topology labels | kubectl get nodes -L topology.kubernetes.io/zone |
Label GPU nodes with hardware=gpu. Use required node affinity so ML training Pods only land on GPU nodes and never waste time on CPU-only machines.
Deploy a 3-replica StatefulSet with required pod anti-affinity on kubernetes.io/hostname. Each replica is guaranteed a separate node.
Use pod affinity to place a Redis sidecar on the same node as the app Pod. Eliminates network hops for cache reads.
Use preferred pod anti-affinity with topology.kubernetes.io/zone to spread web frontend replicas across availability zones.
Label nodes env=production or env=staging. Use required node affinity to ensure production workloads never land on staging infrastructure.
Use preferred node affinity with weight: 80 for disktype=ssd. The database gets SSD when available but can still run on HDD if SSD nodes are full.
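The GPU scenario above can be sketched end to end. Assuming a node named worker-gpu-1 (illustrative) and a training image as a stand-in, label the node and pin the Pod with required node affinity:

```yaml
# First label the GPU node (node name is illustrative):
#   kubectl label nodes worker-gpu-1 hardware=gpu
apiVersion: v1
kind: Pod
metadata:
  name: ml-training          # illustrative name
spec:
  affinity:
    nodeAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
        nodeSelectorTerms:
        - matchExpressions:
          - key: hardware
            operator: In
            values:
            - gpu
  containers:
  - name: trainer
    image: pytorch/pytorch   # stand-in training image
```

Because the rule is required, this Pod stays Pending until a hardware=gpu node exists, which is exactly the behavior you want for GPU-only workloads.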