Scaling & Load Balancing

Make your app survive traffic spikes and failures: define a VM Scale Set that grows and shrinks automatically with autoscale rules, and spread traffic across instances with a load balancer and health probes.

The goal: handle more traffic, survive failures

One VM cannot handle a traffic spike or survive its own failure. The fix: run several identical VMs behind a load balancer that spreads traffic across them, and let a Scale Set keep the right number healthy. Why: your app stays up when a VM dies and grows when demand rises, with no manual intervention.

This lesson builds, in order:
1. a VM Scale Set (a fleet of identical VMs from one model)
2. a Load Balancer (created automatically with the scale set)
3. Autoscale rules (change the VM count based on load)

echo "Reuse learn-rg and my-vnet from earlier lessons."

VM Scale Sets — a managed fleet of VMs

A Virtual Machine Scale Set (VMSS) launches and manages a group of identical VMs from one definition. Why: it is the unit of scaling. You set a capacity and Azure keeps that many running, replacing any that fail. Creating the scale set also creates a public load balancer, so the front door is wired at the same time; the names web-vmssLB and web-vmssLBBEPool used later in this lesson come from that step.

Create a scale set of 2 VMs behind a new load balancer

az vmss create \
  --resource-group learn-rg \
  --name web-vmss \
  --image Ubuntu2204 \
  --vm-sku Standard_B1s \
  --instance-count 2 \
  --vnet-name my-vnet --subnet public \
  --admin-username azureuser --generate-ssh-keys \
  --upgrade-policy-mode automatic
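
A quick way to confirm the fleet came up is to count the instances that finished provisioning. The helper below is only a sketch: count_ready_instances is a hypothetical name, and it assumes each entry returned by az vmss list-instances carries a provisioningState field that reads Succeeded once the VM is ready.

```shell
# Hypothetical helper (not part of the Azure CLI): count the instances
# in a scale set whose provisioning has finished.
# Usage: count_ready_instances <resource-group> <scale-set-name>
count_ready_instances() {
  az vmss list-instances \
    --resource-group "$1" \
    --name "$2" \
    --query "length([?provisioningState=='Succeeded'])" \
    --output tsv
}

# Example call, once the create command above has returned:
#   count_ready_instances learn-rg web-vmss
```

If the number is lower than the instance count you asked for, the remaining VMs are probably still being created; running the helper again after a minute is usually enough.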

Load balancer & health probes

The load balancer gives clients one address and spreads requests across healthy instances. A health probe sends a request to each instance at a regular interval; an instance that stops replying is pulled from rotation and receives no new connections until it answers again. Why: clients are steered away from a broken VM shortly after it fails, and you can replace instances without anyone noticing.
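
The effect of a probe can be pictured with a small filter. This is a simplified model, not how Azure implements it: healthy_instances is a hypothetical function, the instance names are made up, and it treats a single non-200 answer as unhealthy, whereas a real probe waits for several consecutive failures.

```shell
# Simplified model of a health probe (illustration only).
# Input on stdin: one "<instance> <status>" pair per line, where status is
# the HTTP code the probe received (000 stands for "no reply").
# Output: the instances that stay in rotation (status 200).
healthy_instances() {
  while read -r name status; do
    [ "$status" = "200" ] && echo "$name"
  done
  return 0
}

# web-1 did not reply, so only web-0 and web-2 remain in rotation.
printf '%s\n' "web-0 200" "web-1 000" "web-2 200" | healthy_instances
# prints:
#   web-0
#   web-2
```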

Add a health probe on port 80

az network lb probe create --resource-group learn-rg \
  --lb-name web-vmssLB --name http-probe \
  --protocol Http --port 80 --path /

Add a rule forwarding inbound port 80 to the backend pool, using the probe

az network lb rule create --resource-group learn-rg \
  --lb-name web-vmssLB --name http-rule \
  --protocol Tcp --frontend-port 80 --backend-port 80 \
  --probe-name http-probe \
  --backend-pool-name web-vmssLBBEPool
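
To try the rule from outside, you need the load balancer's public address. The sketch below assumes the public IP resource is named after the scale set with an LBPublicIP suffix, in line with the web-vmssLB and web-vmssLBBEPool names used above; lb_public_ip is a hypothetical helper, and if the lookup fails, az network public-ip list shows the actual name.

```shell
# Hypothetical helper: print the public IP address of the load balancer
# that was created with the scale set.
# Assumption: the public IP resource is called <scale-set-name>LBPublicIP.
# Usage: lb_public_ip <resource-group> <scale-set-name>
lb_public_ip() {
  az network public-ip show \
    --resource-group "$1" \
    --name "${2}LBPublicIP" \
    --query ipAddress \
    --output tsv
}

# Example call (only answers if a web server listens on port 80 on the instances):
#   curl -s -o /dev/null -w '%{http_code}\n' "http://$(lb_public_ip learn-rg web-vmss)/"
```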

Autoscale rules — grow and shrink automatically

An autoscale setting changes the instance count based on a metric. You set a min, a max, and rules: add a VM when average CPU is high, remove one when it is low. Why: you pay for capacity only when traffic justifies it, and the app keeps up during spikes.

Define the min/max/default for the scale set

az monitor autoscale create --resource-group learn-rg \
  --resource web-vmss --resource-type Microsoft.Compute/virtualMachineScaleSets \
  --name web-autoscale --min-count 2 --max-count 6 --count 2

Scale OUT by 1 when average CPU over 5 minutes exceeds 70%

az monitor autoscale rule create --resource-group learn-rg \
  --autoscale-name web-autoscale \
  --condition "Percentage CPU > 70 avg 5m" --scale out 1

Scale IN by 1 when average CPU over 5 minutes drops below 30%

az monitor autoscale rule create --resource-group learn-rg \
  --autoscale-name web-autoscale \
  --condition "Percentage CPU < 30 avg 5m" --scale in 1
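
Taken together, the two rules and the min/max limits behave roughly like the function below. It is a simplified model rather than Azure's autoscale engine: next_count is a hypothetical name, CPU is given as a whole-number percentage, and cooldown periods are ignored.

```shell
# Simplified model of the autoscale setting above (illustration only).
# Args: <current instance count> <average CPU percent over the window>
# Thresholds and limits mirror the commands above:
#   scale out above 70, scale in below 30, never below 2 or above 6.
next_count() {
  count=$1
  cpu=$2
  min=2
  max=6
  if [ "$cpu" -gt 70 ] && [ "$count" -lt "$max" ]; then
    count=$((count + 1))
  elif [ "$cpu" -lt 30 ] && [ "$count" -gt "$min" ]; then
    count=$((count - 1))
  fi
  echo "$count"
}

next_count 2 85   # prints 3: busy, so one VM is added
next_count 6 85   # prints 6: busy, but already at the maximum
next_count 3 20   # prints 2: idle, so one VM is removed
next_count 2 20   # prints 2: idle, but already at the minimum
next_count 4 50   # prints 4: between the thresholds, no change
```

The gap between 30 and 70 is deliberate: with thresholds close together, a scale-out could lower the average enough to trigger an immediate scale-in, and the count would flap.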