Temperature, Top-P, and Top-K decide how the model picks each next word — from rigidly predictable to wildly creative. Run one prompt across the range and watch the personality change.
Why: at each step the model ranks every possible next word by probability; the sampling settings decide how boldly it strays from the top choice. When: understand this once and all three parameters below stop being mysterious. Where: these live in the playground Parameters panel.
The model is always choosing the NEXT word from a ranked list:
"The sky is ___" -> blue (78%) clear (9%) grey (5%) ...
Sampling parameters control how often it picks something other
than the top word. Low = safe and repetitive. High = varied and
risky. Run the prompts below to feel the difference.
Why: temperature is the master creativity dial. Low values make the model almost deterministic; high values make it adventurous and occasionally incoherent. When: use ~0 for facts, extraction, and code; ~0.7-1.0 for brainstorming and creative writing. How: run this prompt at 0 and again at 1.
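To make the temperature dial concrete, here is a minimal sketch of how it is commonly applied: the model's raw scores are divided by the temperature before being turned into probabilities. The scores for "The sky is ___" and the helper name `apply_temperature` are made up for illustration; real models and playgrounds may handle temperature 0 differently.

```python
import math

def apply_temperature(logits, temperature):
    """Turn raw scores into probabilities, sharpened or flattened by temperature."""
    if temperature <= 0:
        # Assumption: treat 0 as greedy, putting all probability on the top word.
        best = max(logits, key=logits.get)
        return {word: (1.0 if word == best else 0.0) for word in logits}
    scaled = {word: score / temperature for word, score in logits.items()}
    peak = max(scaled.values())  # subtract the max for numerical stability
    exps = {word: math.exp(score - peak) for word, score in scaled.items()}
    total = sum(exps.values())
    return {word: value / total for word, value in exps.items()}

# Hypothetical raw scores for "The sky is ___" (not taken from a real model).
logits = {"blue": 4.0, "clear": 1.8, "grey": 1.2, "falling": 0.2}

for t in (0, 0.7, 1.0, 1.5):
    probs = apply_temperature(logits, t)
    print(t, {word: round(p, 2) for word, p in probs.items()})
```

As the temperature rises, the share held by the top word shrinks and the unlikely words gain ground, which is where the extra variety (and risk) comes from.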
temperature = 0: Nearly deterministic; almost always picks the top word. Best for extraction, classification, math, and code where you want the same answer every time.
temperature = 0.7: A balanced default, varied but still coherent. Good for chat and general writing.
temperature = 1.0+: Creative and surprising, with more risk of going off-topic or making things up. Good for brainstorming names, slogans, and fiction.
Give me a name for a coffee shop on a rainy harbour.
Run this 3 times at temperature 0 -> nearly identical answers.
Run this 3 times at temperature 1 -> three different vibes.
Why: Top-P keeps only the smallest set of top words whose probabilities add up to at least P, then samples from those. Unlike a fixed count, this cap adapts to how confident the model is at each step. When: lower it (e.g. 0.5) to keep answers on-topic; 1.0 considers everything. How: most teams tune temperature OR Top-P, not both at once.
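The Top-P rule described above can be sketched in a few lines. This is an illustrative toy, not any particular vendor's implementation: the probabilities are invented (extending the "The sky is ___" example), and `top_p_filter` and `sample` are hypothetical helper names.

```python
import random

def top_p_filter(probs, top_p):
    """Keep the smallest set of top words whose cumulative probability reaches top_p."""
    ranked = sorted(probs.items(), key=lambda item: item[1], reverse=True)
    kept, cumulative = [], 0.0
    for word, p in ranked:
        kept.append((word, p))
        cumulative += p
        if cumulative >= top_p:
            break
    total = sum(p for _, p in kept)
    # Renormalise so the surviving words sum to 1 again.
    return {word: p / total for word, p in kept}

def sample(probs, rng):
    """Draw one word according to its probability."""
    words = list(probs)
    return rng.choices(words, weights=[probs[w] for w in words], k=1)[0]

# Hypothetical next-word probabilities for "The sky is ___".
probs = {"blue": 0.78, "clear": 0.09, "grey": 0.05, "dark": 0.04, "falling": 0.04}

rng = random.Random(0)  # fixed seed so repeated runs of this script match
for p in (0.5, 0.9, 1.0):
    eligible = top_p_filter(probs, p)
    print(p, list(eligible), "->", sample(eligible, rng))
```

With these numbers, top_p = 0.5 leaves only "blue" eligible because it already covers more than half the probability mass, while top_p = 1.0 keeps every word in play.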
top_p = 1.0: Considers the full distribution of next words. The default.
top_p = 0.5: Only the most probable words that together make up 50% of the probability mass are eligible. Tightens and focuses the output.
Set temperature to 1.0, then compare top_p = 1.0 vs top_p = 0.3:
"Continue the story: The lighthouse keeper opened the door and ___"
At 0.3 the continuation stays safe and predictable; at 1.0 it can
go anywhere.
Why: Top-K is the simplest cap: only the K most likely next words are ever considered. When: use it as a blunt limit on randomness; many modern APIs prefer Top-P, but Top-K still appears in Google and open-source models. How: small K (e.g. 5) is conservative, large K is loose.
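As a rough sketch of the Top-K cap described above: sort the candidates, keep the first K, renormalise, and sample. The probabilities and the helper name `top_k_filter` are hypothetical, chosen only to illustrate the idea.

```python
import random

def top_k_filter(probs, k):
    """Keep only the k most likely words and renormalise their probabilities."""
    ranked = sorted(probs.items(), key=lambda item: item[1], reverse=True)[:k]
    total = sum(p for _, p in ranked)
    return {word: p / total for word, p in ranked}

# Hypothetical next-word probabilities for "The sky is ___".
probs = {"blue": 0.78, "clear": 0.09, "grey": 0.05, "dark": 0.04, "falling": 0.04}

rng = random.Random(0)

# k = 1 is greedy: only the top word survives, so every draw is the same.
greedy = top_k_filter(probs, 1)
picks_k1 = {rng.choices(list(greedy), weights=list(greedy.values()))[0] for _ in range(20)}
print("k=1:", picks_k1)

# A larger k lets less likely words back in.
loose = top_k_filter(probs, 3)
picks_k3 = {rng.choices(list(loose), weights=list(loose.values()))[0] for _ in range(20)}
print("k=3:", picks_k3)
```

Note that K is a fixed count regardless of how the probability is spread, which is the main difference from Top-P.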
top_k = 1: Greedy; always the single most likely word. In effect the same as temperature 0.
top_k = 40: A common default; samples from the 40 most likely words, balancing variety and coherence.
If your playground exposes top_k, set top_k = 1 and run any
prompt twice: the answers should be identical (greedy decoding).
Then set top_k = 50 and run again: variety returns.