Bun 1.2 Has A Native LLM Module

In case you missed it, Bun has spent the past year quietly eating Node's lunch on developer experience. Plot twist: Bun 1.2 just shipped a built-in LLM module that loads GGUF weights directly. No FFI bindings to write, no package to install, just an import.

The Setup

The new built-in lives at `bun:llm` and wraps llama.cpp under the hood. Drop a GGUF file on disk, point at it, and you're streaming tokens in twelve lines.

import { llm } from "bun:llm";

const model = await llm.load("./models/qwen2.5-7b-instruct-q4_k_m.gguf", {
  contextSize: 8192,
  gpuLayers: 35, // M4 Metal offload
});

const stream = model.stream({
  prompt: "write me a zod schema for an invoice",
  maxTokens: 512,
});

for await (const token of stream) {
  process.stdout.write(token);
}
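
If you want the whole response as one string rather than printing it token by token, a small helper does the job. This is a minimal sketch that assumes the stream yields plain string chunks, which is what the loop above implies. `collectTokens` and `fakeStream` are names made up for illustration, not part of any Bun API.

```typescript
// Assumption: the stream is an async iterable of string chunks.
async function collectTokens(stream: AsyncIterable<string>): Promise<string> {
  const parts: string[] = [];
  for await (const token of stream) {
    parts.push(token);
  }
  // Join once at the end instead of concatenating inside the loop.
  return parts.join("");
}

// Hypothetical stand-in stream, so the helper can be tried without a model.
async function* fakeStream(): AsyncGenerator<string> {
  yield "z.object(";
  yield "{})";
}
```

Swap `fakeStream()` for the real stream and `await collectTokens(...)` hands you the full text to parse or write to disk.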

The Money Pattern

The killer use case is local-first scripts. I've got a CSV cleanup pipeline that runs against Pipedrive exports, and it now lives as a single `bun run clean.ts` with the model loaded inline. No Ollama daemon, no HTTP hop: inference runs in the same process as the script, on the M4.

// bun run clean.ts
import { llm } from "bun:llm";

const model = await llm.load("./models/llama-3.2-3b.gguf");
// Bun.file() returns a lazy file handle; .text() reads it as a string.
const csv = await Bun.file("./leads.csv").text();

const cleaned = await model.complete({
  prompt: `normalise these company names to title case:\n${csv}`,
  maxTokens: 2048,
});

await Bun.write("./leads-clean.csv", cleaned);
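
One wrinkle with the script above: a large export pasted into a single prompt can blow past the model's context window. A rough way around that is to send the rows in batches and repeat the header line in each one. The sketch below is plain TypeScript with no model calls. `chunkCsv` and the batch size are my own illustrative choices, and the line splitting is naive, so quoted fields containing newlines would need a real CSV parser.

```typescript
// Split a CSV string into batches of rows, repeating the header in each batch.
// Naive line splitting: does not handle newlines inside quoted fields.
function chunkCsv(csv: string, rowsPerChunk: number): string[] {
  if (!Number.isInteger(rowsPerChunk) || rowsPerChunk < 1) {
    throw new RangeError("rowsPerChunk must be a positive integer");
  }
  // Accept both LF and CRLF line endings and drop empty lines.
  const lines = csv.split(/\r?\n/).filter((line) => line.length > 0);
  if (lines.length === 0) return [];

  const [header, ...rows] = lines;
  const chunks: string[] = [];
  for (let i = 0; i < rows.length; i += rowsPerChunk) {
    chunks.push([header, ...rows.slice(i, i + rowsPerChunk)].join("\n"));
  }
  // A header-only file has no rows, so it yields no chunks.
  return chunks;
}
```

Each chunk can then go through its own completion call, with the header dropped from every cleaned batch after the first before the results are stitched back together.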

The Catch

It's flagged experimental, the API is going to churn, and it only takes ggml-compatible (GGUF) weights: no safetensors, no MLX. Production deploys should still go through a real inference server. This is a scripting hammer, not a serving layer.
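
Since the format restriction is the easiest one to trip over, it can be worth failing fast before handing a file to the loader. GGUF files start with the four ASCII bytes `GGUF`, so a cheap sanity check might look like the sketch below. `looksLikeGguf` is a hypothetical helper, and it only inspects the magic bytes, not the format version or tensor layout.

```typescript
// GGUF files begin with the ASCII bytes "GGUF".
const GGUF_MAGIC = [0x47, 0x47, 0x55, 0x46];

// Returns true when the first four bytes match the GGUF magic number.
function looksLikeGguf(header: Uint8Array): boolean {
  return (
    header.length >= GGUF_MAGIC.length &&
    GGUF_MAGIC.every((byte, i) => header[i] === byte)
  );
}
```

In a Bun script the header bytes could come from something like `new Uint8Array(await Bun.file(path).slice(0, 4).arrayBuffer())`. A safetensors file fails the check because it opens with an 8-byte header length rather than a magic string.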

The Verdict

For local dev scripts and one-shot data jobs this is a game changer. Bun keeps shipping features that should have been in Node a decade ago. If you're not at least kicking the tyres on Bun for side projects in 2026, you're working harder than you need to.

