Scheduling Pods onto Nodes

By default the scheduler places each pod on any node with enough free resources. Steer that decision with node labels and nodeSelector, affinity and anti-affinity rules, taints and tolerations, and topology spread constraints that distribute pods across failure domains.

Steer pods with nodeSelector

Why: sometimes a pod must land on a particular kind of node: one with an SSD, a GPU, or in a certain zone. Label the nodes, then add a nodeSelector so the scheduler considers only nodes carrying every listed label. If no node matches, the pod stays Pending. It is the simplest placement control; affinity (next) is the flexible version.

# First label a node:  kubectl label node <name> disktype=ssd
spec:
  nodeSelector:
    disktype: ssd             # only schedule onto nodes labelled so
  containers:
    - name: app
      image: myapp:1.0
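
As a fuller sketch, the fragment above can sit inside a complete Pod manifest. Multiple nodeSelector entries are ANDed, so a node must carry all of them. The pod name ssd-app and the zone value us-east-1a are placeholders assumed for illustration; topology.kubernetes.io/zone is a well-known label that cloud providers usually set on nodes.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: ssd-app                              # hypothetical name
spec:
  nodeSelector:
    disktype: ssd                            # custom label from kubectl label above
    topology.kubernetes.io/zone: us-east-1a  # assumed zone value; substitute your own
  containers:
    - name: app
      image: myapp:1.0
```

If the pod stays Pending, check that at least one node carries both labels.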

Affinity and anti-affinity

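Why: affinity expresses richer rules than a flat label match. Node affinity is a more expressive nodeSelector: it supports operators such as In, NotIn, and Exists. Pod anti-affinity keeps replicas apart. Here the rule says "do not put two web pods on the same node", so losing one node cannot take out every replica. Rules under requiredDuringSchedulingIgnoredDuringExecution are hard: a pod that cannot satisfy them stays Pending, so this rule caps you at one web pod per node. Rules under preferredDuringSchedulingIgnoredDuringExecution are best-effort.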

spec:
  affinity:
    podAntiAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
        - labelSelector:
            matchLabels:
              app: web
          topologyKey: kubernetes.io/hostname   # one web pod per node
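
Node affinity and the best-effort form of anti-affinity follow the same required/preferred split. The sketch below assumes the disktype label from the previous section; the extra value nvme, the zone us-east-1a, and the weights are illustrative assumptions, not recommendations.

```yaml
spec:
  affinity:
    nodeAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:    # hard: node must match
        nodeSelectorTerms:
          - matchExpressions:
              - key: disktype
                operator: In
                values: ["ssd", "nvme"]                  # nvme is a hypothetical label value
      preferredDuringSchedulingIgnoredDuringExecution:   # soft: scored, not enforced
        - weight: 50                                     # 1-100; higher weighs more
          preference:
            matchExpressions:
              - key: topology.kubernetes.io/zone
                operator: In
                values: ["us-east-1a"]                   # assumed zone value
    podAntiAffinity:
      preferredDuringSchedulingIgnoredDuringExecution:   # soft version of the rule above
        - weight: 100
          podAffinityTerm:
            labelSelector:
              matchLabels:
                app: web
            topologyKey: kubernetes.io/hostname
  containers:
    - name: app
      image: myapp:1.0
```

With the preferred anti-affinity term, extra web replicas can still share a node when there are more replicas than nodes, instead of staying Pending.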

Taints and tolerations

Why: a taint works the opposite way from a selector: instead of a pod choosing nodes, the node REPELS every pod that does not explicitly tolerate the taint. Use it to reserve nodes: taint a GPU node so that only workloads carrying the matching toleration can land there. Control-plane nodes are typically tainted this way (kubeadm does so by default) to keep your apps off them.

# Taint a node: repel every pod without a matching toleration
kubectl taint nodes gpu-1 dedicated=gpu:NoSchedule

# Remove the taint later (note the trailing minus)
kubectl taint nodes gpu-1 dedicated=gpu:NoSchedule-

A pod that tolerates the taint

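Why: only pods carrying a matching toleration may schedule onto a tainted node. With operator: Equal, the toleration must match the taint's key, value, and effect. A toleration only permits scheduling onto the tainted node; it does not attract the pod there, so the pod can still land on any untainted node. To dedicate hardware to specific workloads, taint the node, tolerate the taint in the pod, and also pin the pod to those nodes with a nodeSelector or node affinity.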

spec:
  tolerations:
    - key: dedicated
      operator: Equal
      value: gpu
      effect: NoSchedule       # matches the taint set above
  containers:
    - name: trainer
      image: gpu-job:1.0
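
A toleration by itself lets the pod onto the tainted node but does not send it there. One way to complete the pairing is sketched below. It assumes the GPU node also carries a label dedicated=gpu, for example applied with kubectl label node gpu-1 dedicated=gpu; that label is an assumption here, since tainting a node does not create one.

```yaml
spec:
  nodeSelector:
    dedicated: gpu             # assumed node label; pins the pod to the GPU nodes
  tolerations:
    - key: dedicated
      operator: Equal
      value: gpu
      effect: NoSchedule       # lets the pod past the taint on those nodes
  containers:
    - name: trainer
      image: gpu-job:1.0
```

The taint keeps other workloads off the node, and the nodeSelector keeps this workload on it. To tolerate the key regardless of its value, use operator: Exists and omit value.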

Spread across failure domains

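Why: even spread beats clustering. Topology spread constraints ask the scheduler to distribute the pods matched by labelSelector evenly across the domains named by topologyKey (zones or nodes), so an outage in one domain removes only a fraction of your replicas. maxSkew is the largest allowed difference in matching-pod count between the fullest and the emptiest domain. With whenUnsatisfiable: DoNotSchedule, a pod whose placement would exceed maxSkew stays Pending; ScheduleAnyway turns the constraint into a preference.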

spec:
  topologySpreadConstraints:
    - maxSkew: 1
      topologyKey: topology.kubernetes.io/zone
      whenUnsatisfiable: DoNotSchedule
      labelSelector:
        matchLabels:
          app: web
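
Constraints can be combined, and they belong in the pod template when the pods come from a Deployment. The sketch below is one possible arrangement: a hard spread across zones plus a soft spread across nodes. The Deployment name, replica count, and image are assumptions for illustration.

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: web                                    # hypothetical name
spec:
  replicas: 6                                  # assumed replica count
  selector:
    matchLabels:
      app: web
  template:
    metadata:
      labels:
        app: web                               # must match both labelSelectors below
    spec:
      topologySpreadConstraints:
        - maxSkew: 1
          topologyKey: topology.kubernetes.io/zone
          whenUnsatisfiable: DoNotSchedule     # hard: keep zones within 1 pod of each other
          labelSelector:
            matchLabels:
              app: web
        - maxSkew: 1
          topologyKey: kubernetes.io/hostname
          whenUnsatisfiable: ScheduleAnyway    # soft: prefer an even spread across nodes
          labelSelector:
            matchLabels:
              app: web
      containers:
        - name: web
          image: myapp:1.0
```

Assuming three zones that all have room, the hard constraint settles these six replicas at two per zone, because any other split would leave two zones more than one pod apart.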