Lorebooks, persistent characters, and a UI that gets it
In case you've been living under a rock: the local AI scene has quietly forked into two camps, the coding nerds running Ollama and the writing nerds running KoboldCpp. Spoiler: the writing nerds are having way more fun.
The Setup
KoboldCpp is llama.cpp under the hood with a single-binary UI built for roleplay, story generation, and long-form writing. No Docker, no Python venv, no nonsense — download one file and run it on your M4 Mac.
# grab the binary, point it at any GGUF
./koboldcpp --model models/midnight-miqu-70b-q4.gguf \
  --contextsize 16384 \
  --gpulayers 99 \
  --port 5001
# open http://localhost:5001 and you're in the Kobold Lite UI
The Money Pattern
Lorebooks are the killer feature (Kobold Lite's menus call them World Info). You define entities once (characters, factions, settings) and Kobold injects each one into context only when one of its keywords shows up in the recent text. It's basically poor-man's RAG, but for vibes.
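If the keyword trigger sounds hand-wavy, here's roughly what it amounts to, as a minimal Python sketch. This is not KoboldCpp's actual code: `LoreEntry`, `triggered_entries`, `build_prompt`, and the `scan_chars` window are all made-up names for illustration, and the matching is a plain case-insensitive substring check, which is an assumption about the simplest possible version.

```python
from dataclasses import dataclass


@dataclass
class LoreEntry:
    keys: list[str]  # trigger keywords (hypothetical structure)
    content: str     # text to inject when any keyword matches


def triggered_entries(entries, recent_text):
    """Return the entries with at least one key found in recent_text (case-insensitive)."""
    haystack = recent_text.lower()
    return [e for e in entries if any(k.lower() in haystack for k in e.keys)]


def build_prompt(entries, story, scan_chars=2000):
    """Prepend triggered lore to the story, scanning only the last scan_chars characters."""
    active = triggered_entries(entries, story[-scan_chars:])
    lore = "\n".join(e.content for e in active)
    return f"{lore}\n\n{story}" if lore else story


lorebook = [
    LoreEntry(["Highrun"], "A frontier town run by smugglers."),
    LoreEntry(["Vex's dagger"], "Stolen from a Duke. Cursed."),
]

# Only the Highrun entry fires; the dagger lore stays out of context.
print(build_prompt(lorebook, "The party rides toward Highrun at dusk."))
```

The point of the design is context budget: lore that nobody has mentioned costs zero tokens, so a lorebook can grow far past what would fit in the window all at once. The two sample entries are borrowed from the `world_info` list on the Vex card.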
{
  "char_name": "Vex",
  "description": "A half-elf rogue with a fondness for arson and bad decisions.",
  "personality": "sarcastic, loyal, allergic to authority",
  "scenario": "The party is camped outside Highrun Pass.",
  "first_mes": "Vex flicks open a knife. 'So. Who's robbing us first?'",
  "world_info": [
    { "keys": ["Highrun"], "content": "A frontier town run by smugglers." },
    { "keys": ["Vex's dagger"], "content": "Stolen from a Duke. Cursed." }
  ]
}
The Catch
This is not a coding tool. The UI assumes you're writing fiction, not refactoring TypeScript. There's no agent loop, no tool calling, no MCP. If you're trying to ship a SaaS, you're in the wrong repo.
The Verdict
For Dungeon Masters, novelists, and anyone who wants a local AI that doesn't feel like a corporate help desk, KoboldCpp is genuinely the best thing going. Pair it with a 70B Miqu finetune and a decent lorebook and you've got a co-writer that runs offline forever. Niche, but unbeatable in its niche.