Sampling Parameters

Temperature, Top-P, and Top-K decide how the model picks each next word — from rigidly predictable to wildly creative. Run one prompt across the range and watch the personality change.

How the model picks words

Why: at each step the model assigns a probability to every possible next word (strictly, every token, which may be a word or a piece of one); the sampling settings decide how boldly it strays from the top choice. When: understand this once and all three parameters below stop being mysterious. Where: these live in the playground Parameters panel.

The model is always choosing the NEXT word from a ranked list:

  "The sky is ___"   ->   blue (78%)  clear (9%)  grey (5%)  ...

Sampling parameters control how often it picks something other
than the top word. Low = safe and repetitive. High = varied and
risky. Run the prompts below to feel the difference.
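
A minimal sketch of the two extremes, always taking the top word versus drawing a word according to its probability. The first three probabilities come from the example above; the "other" bucket is a hypothetical stand-in for the remaining 8% that the example elides, and the function names are made up for illustration.

```python
import random

# Probabilities for "The sky is ___". "other" is a hypothetical bucket
# standing in for all the words the example above leaves out.
next_word_probs = {"blue": 0.78, "clear": 0.09, "grey": 0.05, "other": 0.08}

def pick_greedy(probs):
    """Always return the single most likely word."""
    return max(probs, key=probs.get)

def pick_sampled(probs, rng):
    """Draw one word at random, weighted by its probability."""
    words = list(probs)
    weights = [probs[w] for w in words]
    return rng.choices(words, weights=weights, k=1)[0]

rng = random.Random(0)  # fixed seed so repeated runs give the same draws
print("greedy :", pick_greedy(next_word_probs))
print("sampled:", [pick_sampled(next_word_probs, rng) for _ in range(10)])
```

Greedy picking never leaves "blue"; weighted sampling mostly lands on it but occasionally returns a less likely word, which is the behaviour the parameters below tune.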

Temperature

Why: temperature is the master creativity dial — low values make the model almost deterministic, high values make it adventurous and occasionally incoherent. When: use ~0 for facts, extraction, and code; ~0.7-1.0 for brainstorming and creative writing. How: run this prompt at 0 and again at 1.

Attributes

temperature = 0 — Nearly deterministic — almost always picks the top word. Best for extraction, classification, math, and code where you want the same answer every time.

temperature = 0.7 — A balanced default — varied but still coherent. Good for chat and general writing.

temperature = 1.0+ — Creative and surprising, with more risk of going off-topic or making things up. Good for brainstorming names, slogans, and fiction.

Give me a name for a coffee shop on a rainy harbour.

Run this 3 times at temperature 0  -> nearly identical answers.
Run this 3 times at temperature 1  -> three different vibes.
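
One common way temperature is applied is to divide the model's raw scores (logits) by the temperature before turning them into probabilities. The sketch below assumes that scheme; the logit values and the function name are hypothetical, and real APIs may differ in detail. A temperature of exactly 0 cannot be divided by, so implementations typically treat it as "always pick the top word"; this sketch simply rejects it.

```python
import math

def softmax_with_temperature(logits, temperature):
    """Turn raw scores into probabilities, sharpened or flattened by temperature."""
    if temperature <= 0:
        raise ValueError("temperature must be > 0 in this sketch")
    scaled = [x / temperature for x in logits]
    top = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - top) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical raw scores for three candidate words.
logits = [2.0, 1.0, 0.5]

for t in (0.1, 0.7, 1.5):
    probs = softmax_with_temperature(logits, t)
    print(f"temperature={t}: {[round(p, 3) for p in probs]}")
```

At low temperature nearly all the probability piles onto the top word; as temperature rises the distribution flattens and the other words get a real chance.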

Top-P (nucleus sampling)

Why: Top-P keeps only the smallest set of top words whose probabilities add up to at least P, then samples from those. Unlike a fixed count, the size of that set shrinks when the model is confident and grows when it is unsure. When: lower it (e.g. 0.5) to keep answers on-topic; 1.0 considers everything. How: most teams tune temperature OR Top-P, not both at once.

Attributes

top_p = 1.0 — Considers the full distribution of next words. A common default.

top_p = 0.5 — Only the most probable words that together make up 50% of the probability mass are eligible. Tightens and focuses the output.

Set temperature to 1.0, then compare top_p = 1.0 vs top_p = 0.3:

"Continue the story: The lighthouse keeper opened the door and ___"

At 0.3 the continuation stays safe and predictable; at 1.0 it can
go anywhere.
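
A minimal sketch of the Top-P cut, assuming the usual definition: rank words by probability, keep adding them until the running total reaches P, then renormalize what is left. The word list and probabilities are invented for illustration, and `top_p_filter` is a hypothetical helper, not any particular library's API.

```python
def top_p_filter(probs, p):
    """Keep the smallest set of top words whose probabilities sum to at least p."""
    ranked = sorted(probs.items(), key=lambda kv: kv[1], reverse=True)
    kept, cumulative = [], 0.0
    for word, prob in ranked:
        kept.append((word, prob))
        cumulative += prob
        if cumulative >= p:
            break
    total = sum(prob for _, prob in kept)
    # Renormalize so the surviving probabilities sum to 1 again.
    return {word: prob / total for word, prob in kept}

# Hypothetical next-word probabilities after "...opened the door and ___".
story_probs = {"stepped": 0.40, "saw": 0.25, "froze": 0.15, "smiled": 0.12, "sang": 0.08}

for p in (0.3, 0.5, 1.0):
    print(f"top_p={p}: {sorted(top_p_filter(story_probs, p))}")
```

With these made-up numbers, top_p = 0.3 leaves only the single top word, 0.5 leaves two, and 1.0 leaves all five eligible.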

Top-K

Why: Top-K is the simplest cap — only the K most likely next words are ever considered. When: use it as a blunt limit on randomness; many modern APIs prefer Top-P, but Top-K still appears in Google and open-source models. How: small K (e.g. 5) is conservative, large K is loose.

Attributes

top_k = 1 — Greedy: always the single most likely word. In practice this behaves like temperature 0.

top_k = 40 — A common default — samples from the 40 most likely words, balancing variety and coherence.

If your playground exposes top_k, set top_k = 1 and run any
prompt twice. The answers should be identical (greedy decoding),
though some hosted models can still show small differences.

Then set top_k = 50 and run again — variety returns.
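
A minimal sketch of the Top-K cut: keep the K most likely words, drop the rest, and renormalize. The probabilities are invented for illustration and `top_k_filter` is a hypothetical helper, not a specific library call.

```python
def top_k_filter(probs, k):
    """Keep only the k most likely words and renormalize their probabilities."""
    if k < 1:
        raise ValueError("k must be at least 1")
    ranked = sorted(probs.items(), key=lambda kv: kv[1], reverse=True)[:k]
    total = sum(prob for _, prob in ranked)
    return {word: prob / total for word, prob in ranked}

# Hypothetical next-word probabilities for some prompt.
candidate_probs = {"stepped": 0.40, "saw": 0.25, "froze": 0.15, "smiled": 0.12, "sang": 0.08}

for k in (1, 3, 5):
    print(f"top_k={k}: {sorted(top_k_filter(candidate_probs, k))}")
```

With k = 1 only the top word survives, so sampling has nothing left to choose between; larger k lets progressively less likely words back in.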
