DSPy 3.0: Stop Prompting, Start Programming

Plot twist: the best prompt for your task wasn't written by a vibes engineer on Twitter. It was compiled by an optimizer at 2am.

The Setup

DSPy treats prompts like code: you declare a signature, hand it a metric, and let an optimizer search for the prompt that maximises your eval score. v3.0 cleaned up the signature DSL so you can actually read it without crying.
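If "search for the prompt" sounds abstract, here is the idea boiled down to plain Python. This is a toy sketch, not how DSPy's optimizers work internally: `fake_model`, `score_prompt`, and `pick_best_prompt` are hypothetical names, the "model" is a keyword stub so the code runs offline, and the examples are made up.

```python
from typing import Callable, Sequence, Tuple

Example = Tuple[str, str]  # (claim_notes, expected_severity)

def fake_model(prompt: str, claim_notes: str) -> str:
    """Hypothetical stand-in for an LLM call so the sketch runs offline."""
    notes = claim_notes.lower()
    # The stub only escalates roof damage if the prompt mentions "structural".
    if "structural" in prompt.lower() and "roof" in notes:
        return "severe"
    if "dent" in notes:
        return "moderate"
    return "minor"

def exact_match(expected: str, predicted: str) -> bool:
    """The metric: did the prediction match the label, ignoring case and spaces?"""
    return expected.strip().lower() == predicted.strip().lower()

def score_prompt(prompt: str, examples: Sequence[Example],
                 model: Callable[[str, str], str] = fake_model) -> float:
    """Fraction of examples the model gets right under this prompt."""
    hits = sum(exact_match(label, model(prompt, notes)) for notes, label in examples)
    return hits / len(examples)

def pick_best_prompt(candidates: Sequence[str],
                     examples: Sequence[Example]) -> Tuple[float, str]:
    """Score every candidate prompt and return (best_score, best_prompt)."""
    scored = ((score_prompt(p, examples), p) for p in candidates)
    return max(scored, key=lambda pair: pair[0])

# Made-up labelled examples and candidate prompts, purely for illustration.
examples = [
    ("Small dent on car bonnet.", "moderate"),
    ("Roof collapsed over the kitchen.", "severe"),
    ("Scratched letterbox.", "minor"),
]
candidates = [
    "Rate the claim.",
    "Rate the claim. Treat structural damage as severe.",
]

best_score, best_prompt = pick_best_prompt(candidates, examples)
print(best_score, best_prompt)
```

A real optimizer proposes candidates far more cleverly than a hard-coded list, but the contract is the same: no metric and no examples, no search.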

pip install dspy
# Bring your own eval dataset. Seriously.

The Money Pattern

Declare what you want. Let DSPy figure out how to ask. The compile step costs API calls, but you only pay once per compile, and the resulting prompt is usually 20–40% better, as measured by your metric, than whatever you hand-wrote on a Tuesday.

import dspy

dspy.configure(lm=dspy.LM("anthropic/claude-sonnet-4-5"))

class TriageClaim(dspy.Signature):
    """Decide severity of a Queensland hail claim."""
    claim_notes: str = dspy.InputField()
    severity: str = dspy.OutputField(desc="one of: minor, moderate, severe")

triage = dspy.ChainOfThought(TriageClaim)

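# Compile against a labelled set.
# load_labels() is your own helper (not part of DSPy); it should yield
# (claim_notes, severity) pairs.
trainset = [dspy.Example(claim_notes=n, severity=s).with_inputs("claim_notes")
            for n, s in load_labels()]

def metric(ex, pred, trace=None):
    # Normalise case and whitespace so "Severe " still counts as "severe".
    return ex.severity.strip().lower() == pred.severity.strip().lower()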

optimizer = dspy.MIPROv2(metric=metric, auto="light")
compiled = optimizer.compile(triage, trainset=trainset)
print(compiled(claim_notes="Cracked skylights, dents on north roof.").severity)
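Whether the compiled program actually beats the uncompiled one is something to measure on examples the optimizer never saw. Below is a minimal sketch of that check in plain Python. `split_examples` and `accuracy` are hypothetical helpers, and the two "programs" are stubs standing in for `triage` and `compiled`; it assumes a program is callable with `claim_notes=` and returns an object with a `.severity` attribute, as in the snippet above. DSPy also ships its own `dspy.Evaluate` helper, which is likely the better choice in real code.

```python
import random
from types import SimpleNamespace

def split_examples(examples, dev_fraction=0.25, seed=0):
    """Shuffle once with a fixed seed, then hold out a slice for evaluation."""
    shuffled = list(examples)
    random.Random(seed).shuffle(shuffled)
    n_dev = max(1, int(len(shuffled) * dev_fraction))
    return shuffled[n_dev:], shuffled[:n_dev]  # (train, dev)

def metric(ex, pred, trace=None):
    return ex.severity.strip().lower() == pred.severity.strip().lower()

def accuracy(program, devset, metric):
    """Share of held-out examples on which the metric passes."""
    hits = sum(bool(metric(ex, program(claim_notes=ex.claim_notes))) for ex in devset)
    return hits / len(devset)

# Stub "programs" so the sketch runs offline (assumptions, not DSPy objects).
def baseline(claim_notes):
    return SimpleNamespace(severity="minor")

def improved(claim_notes):
    notes = claim_notes.lower()
    if "roof" in notes:
        return SimpleNamespace(severity="severe")
    if "dent" in notes:
        return SimpleNamespace(severity="moderate")
    return SimpleNamespace(severity="minor")

# Made-up held-out examples, purely for illustration.
devset = [
    SimpleNamespace(claim_notes="Roof sheeting torn off.", severity="severe"),
    SimpleNamespace(claim_notes="Dents across the bonnet.", severity="moderate"),
    SimpleNamespace(claim_notes="Scuffed paint on fence.", severity="minor"),
    SimpleNamespace(claim_notes="Chipped garden gnome.", severity="minor"),
]

train, dev = split_examples(range(8))
print(len(train), len(dev))
print(accuracy(baseline, devset, metric), accuracy(improved, devset, metric))
```

Split before you compile, hand only the training slice to the optimizer, and compare both programs on the held-out slice.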

The Catch

Optimizers cost real money — every compile run hammers your provider. The learning curve is steeper than "write a prompt and pray", and if your eval set is garbage the optimizer will confidently produce a garbage prompt. Evals first, always.
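"Evals first" can start with a five-minute audit before you spend a cent on compiling. The sketch below is one way to do it, assuming your labels are `(claim_notes, severity)` pairs; `audit_labels` and `ALLOWED` are hypothetical names, and the sample rows are invented.

```python
from collections import Counter, defaultdict

ALLOWED = {"minor", "moderate", "severe"}

def audit_labels(pairs):
    """Report class balance, invalid labels, repeats, and conflicting labels."""
    pairs = list(pairs)
    label_counts = Counter(label for _, label in pairs)
    invalid = sorted({label for _, label in pairs if label not in ALLOWED})

    # Group labels by normalised text to catch repeats and disagreements.
    labels_by_text = defaultdict(set)
    for notes, label in pairs:
        labels_by_text[notes.strip().lower()].add(label)

    conflicts = sorted(t for t, labels in labels_by_text.items() if len(labels) > 1)
    repeated_rows = len(pairs) - len(labels_by_text)
    return {
        "label_counts": dict(label_counts),
        "invalid_labels": invalid,
        "conflicting_texts": conflicts,
        "repeated_rows": repeated_rows,
    }

# Invented rows that trip each check.
sample = [
    ("Dented gutter.", "minor"),
    ("dented gutter.", "moderate"),   # same text, different label
    ("Roof torn off.", "severe"),
    ("Roof torn off.", "severe"),     # exact repeat
    ("Broken window.", "Severe"),     # label not in ALLOWED (wrong case)
]

report = audit_labels(sample)
print(report)
```

If this report looks ugly, fix the data before touching the optimizer. It will faithfully optimise towards whatever your labels say, contradictions included.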

The Verdict

Hand-tuning prompts in 2026 is going to look the way hand-tuning hyperparameters did in 2018 — quaint, expensive, and a tell that you haven't read the docs. DSPy isn't hype, it's the next default.
