Deno Just Added llama.cpp Bindings

🦕🦙🔌

Bun started a fight and Deno just answered

If you've been living under a rock for forty-eight hours, Bun shipped a native LLM module on Monday. Plot twist: Deno 2.1 dropped on Wednesday with official llama.cpp FFI bindings. The runtime wars are getting spicy.

The Setup

Deno's approach is slightly different: they're shipping a typed FFI wrapper around the system llama.cpp shared lib instead of bundling it. Cleaner architecturally, but you have to install llama.cpp yourself before anything runs.

// main.ts — run with: deno serve --allow-ffi --allow-read main.ts
import { LlamaModel } from "jsr:@deno/llama";

const model = await LlamaModel.load({
  path: "./models/qwen2.5-7b.gguf",
  contextSize: 8192,
});

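export default {
  // `deno serve` calls this handler for every incoming request.
  async fetch(req: Request): Promise<Response> {
    // Expects a JSON body of the form {"prompt": "..."}.
    const { prompt } = await req.json();
    // Pass the token stream straight through as the response body.
    const stream = model.stream({ prompt, maxTokens: 512 });
    return new Response(stream, {
      headers: { "content-type": "text/event-stream" },
    });
  },
};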
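
One thing the handler above skips is input checking: a GET request or a malformed body will throw inside `req.json()`. Here's a minimal sketch of a guard you could put in front of it. The `readPrompt` helper is my own hypothetical name, not part of the bindings, and it only uses the standard `Request` and `Response` APIs.

```typescript
// Hypothetical helper (not part of the bindings): returns the prompt string,
// or a ready-made error Response explaining what was wrong with the request.
async function readPrompt(req: Request): Promise<string | Response> {
  if (req.method !== "POST") {
    return new Response("Use POST", {
      status: 405,
      headers: { allow: "POST" },
    });
  }

  let body: unknown;
  try {
    body = await req.json();
  } catch {
    return new Response("Body must be valid JSON", { status: 400 });
  }

  // Accept only a non-empty string under the "prompt" key.
  const prompt = (body as { prompt?: unknown } | null)?.prompt;
  if (typeof prompt !== "string" || prompt.trim() === "") {
    return new Response('Expected {"prompt": "<non-empty string>"}', {
      status: 400,
    });
  }
  return prompt;
}
```

In the handler you'd call it first, return the result as-is if it's a `Response`, and otherwise hand the string to `model.stream`.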

The Money Pattern

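`deno serve` plus a streaming LLM endpoint is basically a one-file Ollama replacement. I spun this up locally on the M4 and had a working SSE endpoint feeding a Tailwind 4 chat UI in under twenty minutes. The permissions model helps too: the script only gets the FFI and file-read access you grant on the command line, though keep in mind that native code loaded through --allow-ffi runs outside Deno's sandbox.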

# install llama.cpp via brew first
brew install llama.cpp

# then start the server on port 8080
deno serve --allow-ffi --allow-read --port 8080 main.ts

# stream a completion
curl -N -X POST localhost:8080 \
  -H "content-type: application/json" \
  -d '{"prompt":"write a haiku about TypeScript"}'
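
If you want the same thing from a chat UI instead of curl, here's a rough sketch of a client. Big assumption: it treats the response as standard SSE framing (`data:` lines, events separated by a blank line), which I haven't confirmed is what the bindings emit. `SseDataParser` and `streamCompletion` are my own hypothetical names, and lone-CR line endings aren't handled.

```typescript
// Incremental parser for the "data:" fields of a Server-Sent Events stream.
// Feed it decoded text chunks; it returns the payloads of every event
// completed so far and buffers any partial event for the next call.
class SseDataParser {
  private buffer = "";

  push(chunk: string): string[] {
    // Normalize CRLF so every event boundary is a plain "\n\n".
    this.buffer = (this.buffer + chunk).replace(/\r\n/g, "\n");
    const events: string[] = [];
    let end: number;
    while ((end = this.buffer.indexOf("\n\n")) !== -1) {
      const frame = this.buffer.slice(0, end);
      this.buffer = this.buffer.slice(end + 2);
      // Keep only data lines; drop "data:" and one optional leading space.
      const dataLines = frame
        .split("\n")
        .filter((line) => line.startsWith("data:"))
        .map((line) => line.slice(5).replace(/^ /, ""));
      if (dataLines.length > 0) events.push(dataLines.join("\n"));
    }
    return events;
  }
}

// POSTs a prompt and calls onData for each event payload as it arrives.
async function streamCompletion(
  url: string,
  prompt: string,
  onData: (text: string) => void,
): Promise<void> {
  const res = await fetch(url, {
    method: "POST",
    headers: { "content-type": "application/json" },
    body: JSON.stringify({ prompt }),
  });
  if (!res.ok || !res.body) {
    throw new Error(`Request failed with status ${res.status}`);
  }

  const parser = new SseDataParser();
  const decoder = new TextDecoder();
  const reader = res.body.getReader();
  while (true) {
    const { done, value } = await reader.read();
    if (done) break;
    // stream: true keeps multi-byte characters intact across chunk boundaries.
    for (const data of parser.push(decoder.decode(value, { stream: true }))) {
      onData(data);
    }
  }
}
```

Something like `streamCompletion("http://localhost:8080", "write a haiku about TypeScript", (t) => console.log(t))` would be the equivalent of the curl call above. If the stream turns out to be raw text rather than SSE frames, skip the parser and use the decoded chunks directly.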

The Catch

Deno's still playing catch-up here. The FFI bindings work great on macOS and Linux, but Windows is rough. Bun's approach of bundling llama.cpp is more zero-setup. And honestly, most of the LLM ecosystem is still writing tutorials for Bun first.

The Verdict

If you're already on Deno for the permissions and JSR ecosystem, this is great news — you no longer need to shell out to a Python sidecar. If you're picking a runtime from scratch for LLM work, Bun is still the smoother ride. Either way, Node 24 is looking lonely.
