Guidance Library Quietly Became Essential

Plot twist: while everyone's writing JSON validators that retry three times, Microsoft's Guidance library just forces the model to emit valid tokens. The model literally cannot break the schema.

The Setup

Guidance hooks into the logit stream of a local model and masks out tokens that would violate your constraint. Want JSON? Mask anything non-JSON. Want a regex? Mask anything that doesn't match. Output is guaranteed-valid by construction.

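from guidance import models, gen, select

# Load a local GGUF model through llama.cpp (point this at whatever file you have on disk)
llm = models.LlamaCpp("./ghost-pepper-7b.Q4_K_M.gguf")

# Appending to the model runs generation; each gen/select fills its slot under its constraint
lm = llm + f"""
Customer: my roof is leaking after the storm
Severity (1-5): {gen("severity", regex=r"[1-5]")}
Department: {select(["roofing", "general", "emergency"], name="dept")}
"""

# Captured values are read back by name
print(lm["severity"], lm["dept"])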
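To make the masking idea concrete, here is a toy sketch in plain Python. It is not Guidance's internals: the vocabulary, the scores, and the `constrained_pick` helper are all made up for illustration, and it picks greedily where a real decoder would usually renormalize and sample.

```python
import math

def constrained_pick(logits: dict[str, float], allowed: set[str]) -> str:
    """Return the highest-scoring token among those the constraint allows."""
    # Disallowed tokens get -inf, so they can never win the argmax.
    masked = {
        tok: (score if tok in allowed else -math.inf)
        for tok, score in logits.items()
    }
    best = max(masked, key=masked.get)
    if masked[best] == -math.inf:
        raise ValueError("constraint allows no token in the vocabulary")
    return best

# Hypothetical scores for the next token after "Severity (1-5): ".
toy_logits = {"high": 3.2, "7": 2.5, "4": 1.9, "2": 0.4, "\n": 0.1}

# The regex [1-5] permits exactly these single-character tokens.
allowed_by_regex = set("12345")

unconstrained = max(toy_logits, key=toy_logits.get)
constrained = constrained_pick(toy_logits, allowed_by_regex)
print(unconstrained, constrained)
```

Left alone, the toy model would answer "high"; with the mask applied, the best it can do is "4". The model's preference among the valid tokens is still respected, it just never gets to choose an invalid one.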

The Money Pattern

Combine `gen` with a regex and `select` over a literal list, and you've replaced an entire format-validation layer. No retries, no Pydantic catch blocks, no "the model returned malformed JSON" Slack pings at 2am.

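import guidance
from guidance import gen, select

@guidance
def claim_form(lm, transcript: str):
    # Plain concatenation keeps regex braces and "\n" out of f-string expressions,
    # which older Python versions reject.
    lm += f"Transcript: {transcript}\n"
    # Postcode: exactly four digits
    lm += "Postcode: " + gen("postcode", regex=r"[0-9]{4}") + "\n"
    # Damage: one of a fixed set of labels
    lm += "Damage: " + select(["hail", "wind", "flood", "fire"], name="damage") + "\n"
    # Notes: free text, capped at 80 tokens and stopped at the first newline
    lm += "Notes: " + gen("notes", max_tokens=80, stop="\n")
    return lm

# llm is the models.LlamaCpp instance loaded earlier
result = llm + claim_form("caller at 4870, hail, garage shed flattened")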

The Catch

Guidance needs to control the logits at every decoding step, which in practice means a local model. Read-only logprobs aren't enough, and OpenAI and Anthropic don't give you the token-mask hook, so this is a llama.cpp / Transformers / vLLM play. On hosted APIs you're limited to whatever structured-output mode the provider ships, or you're back to retry loops.
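For contrast, this is roughly the retry loop you end up writing when you can't mask tokens. It is a sketch under assumptions: `call_model` stands in for whatever hosted client you use, and the flaky fake model exists only so the example runs offline.

```python
import json
from typing import Callable

def extract_with_retries(
    call_model: Callable[[str], str],
    prompt: str,
    max_attempts: int = 3,
) -> dict:
    """Call the model until its reply parses as a JSON object, or give up."""
    last_error: Exception | None = None
    for _ in range(max_attempts):
        reply = call_model(prompt)
        try:
            parsed = json.loads(reply)
        except json.JSONDecodeError as exc:
            last_error = exc
            continue  # malformed JSON: pay for another call
        if isinstance(parsed, dict):
            return parsed
        last_error = TypeError(f"expected a JSON object, got {type(parsed).__name__}")
    raise RuntimeError(f"no valid JSON after {max_attempts} attempts") from last_error

def make_flaky_model() -> Callable[[str], str]:
    """Hypothetical model that fails twice before returning valid JSON."""
    replies = iter([
        "Sure! Here is the JSON: {",
        '{"severity": 4',
        '{"severity": 4, "dept": "roofing"}',
    ])
    return lambda prompt: next(replies)

ticket = extract_with_retries(make_flaky_model(), "Classify: my roof is leaking")
print(ticket)
```

Every failed attempt is a full extra generation, and the loop can still run out of attempts. Token-level constraints remove both problems for the format, though not for the content.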

The Verdict

If you're running local models, which on an M4 Mac is finally pleasant, Guidance is the difference between a flaky JSON pipeline and one whose output always matches the schema. The constraint guarantees shape, not truth, so pair it with Pydantic for typing on the outside and you've got an extractor that can't fail on format. Quietly essential.
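Here is a minimal sketch of the "typing on the outside" step. To stay dependency-free it uses a stdlib dataclass as a stand-in for a Pydantic model, and the `captured` dict is a hypothetical substitute for values you would read off the Guidance result by name (`result["postcode"]` and so on).

```python
import re
from dataclasses import dataclass

ALLOWED_DAMAGE = ("hail", "wind", "flood", "fire")

@dataclass(frozen=True)
class Claim:
    """Typed view of the captured fields; a stand-in for a Pydantic model."""
    postcode: str
    damage: str
    notes: str

    def __post_init__(self) -> None:
        # Re-check the same constraints the decoder enforced, so the type is
        # safe to construct from any source, not only constrained output.
        if re.fullmatch(r"[0-9]{4}", self.postcode) is None:
            raise ValueError(f"bad postcode: {self.postcode!r}")
        if self.damage not in ALLOWED_DAMAGE:
            raise ValueError(f"bad damage type: {self.damage!r}")

# Hypothetical captures, shaped like the fields of the claim form.
captured = {"postcode": "4870", "damage": "hail", "notes": "garage shed flattened"}

claim = Claim(**captured)
print(claim)
```

With constrained decoding in front, the checks in `__post_init__` should never fire on model output; they are there for everything else that might construct a `Claim`.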
