Optimizers wrote a better prompt than you did. Yes, really.
Plot twist: the best prompt for your task wasn't written by a vibes engineer on Twitter. It was compiled by an optimizer at 2am.
The Setup
DSPy treats prompts like code: you declare a signature, hand it a metric, and let an optimizer search for the prompt that maximises your eval score. v3.0 cleaned up the signature DSL so you can actually read it without crying.
pip install dspy
# Bring your own eval dataset. Seriously.
The Money Pattern
Declare what you want. Let DSPy figure out how to ask. The compile step costs API calls, but you pay once per compile, and the resulting prompt usually scores 20–40% better on your metric than whatever you hand-wrote on a Tuesday.
import dspy

dspy.configure(lm=dspy.LM("anthropic/claude-sonnet-4-5"))

class TriageClaim(dspy.Signature):
    """Decide severity of a Queensland hail claim."""
    claim_notes: str = dspy.InputField()
    severity: str = dspy.OutputField(desc="one of: minor, moderate, severe")

triage = dspy.ChainOfThought(TriageClaim)

# Compile against a labelled set.
# load_labels() is your own loader; it should yield (claim_notes, severity) pairs.
trainset = [dspy.Example(claim_notes=n, severity=s).with_inputs("claim_notes")
            for n, s in load_labels()]

def metric(ex, pred, trace=None):
    return ex.severity == pred.severity

optimizer = dspy.MIPROv2(metric=metric, auto="light")
compiled = optimizer.compile(triage, trainset=trainset)
print(compiled(claim_notes="Cracked skylights, dents on north roof.").severity)
The Catch
Optimizers cost real money — every compile run hammers your provider. The learning curve is steeper than "write a prompt and pray", and if your eval set is garbage the optimizer will confidently produce a garbage prompt. Evals first, always.
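A cheap way to act on "evals first" before you spend a cent on compiling: hold out a dev slice the optimizer never sees, then check what a lazy always-guess-the-most-common-label baseline scores on it. The sketch below is plain Python, not DSPy API. The helper names `split_labels` and `majority_baseline`, the 30% hold-out, and the sample rows are all made up for illustration, and it assumes your labels arrive as (claim_notes, severity) pairs.

```python
import random
from collections import Counter


def split_labels(rows, dev_fraction=0.3, seed=0):
    """Shuffle (claim_notes, severity) pairs and hold out a dev slice.

    Returns (train_rows, dev_rows). The dev slice is for scoring only;
    it should never be passed to the optimizer.
    """
    rows = list(rows)
    if len(rows) < 2:
        raise ValueError("need at least two labelled rows to split")
    random.Random(seed).shuffle(rows)  # seeded, so the split is repeatable
    n_dev = max(1, int(len(rows) * dev_fraction))
    return rows[n_dev:], rows[:n_dev]


def majority_baseline(train_rows, dev_rows):
    """Accuracy on dev of always predicting the most common training label."""
    most_common = Counter(label for _, label in train_rows).most_common(1)[0][0]
    hits = sum(1 for _, label in dev_rows if label == most_common)
    return most_common, hits / len(dev_rows)


# Hypothetical sample rows, standing in for your real labelled claims.
sample_rows = [
    ("Two dents on bonnet.", "minor"),
    ("Small chip in windscreen.", "minor"),
    ("Gutter dented, no leaks.", "minor"),
    ("Cracked skylights, dents on north roof.", "moderate"),
    ("Roof tiles broken, water in ceiling.", "severe"),
    ("Carport collapsed onto vehicle.", "severe"),
]

train_rows, dev_rows = split_labels(sample_rows)
label, accuracy = majority_baseline(train_rows, dev_rows)
print(f"always guessing {label!r} scores {accuracy:.0%} on {len(dev_rows)} dev rows")
```

If the compiled program can't clearly beat that number on the dev slice, suspect the labels before you blame the optimizer.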
The Verdict
Hand-tuning prompts in 2026 is going to look the way hand-tuning hyperparameters did in 2018 — quaint, expensive, and a tell that you haven't read the docs. DSPy isn't hype, it's the next default.