Falcon 3 Just Went Open Weight

If you've been living under a rock since Falcon 2 launched and underwhelmed, TII just shipped Falcon 3 in 7B, 40B and 180B sizes under a permissive license. The UAE is back in the open-weight game and this time the benches don't make you laugh.

The Setup

Falcon 3 is on Hugging Face, the weights download to my M4 Mac without a clickwrap dance, and the tokenizer finally handles code without exploding. The 7B is the daily-driver size: it fits in 8GB VRAM at Q4.

huggingface-cli download tiiuae/Falcon3-7B-Instruct \
  --local-dir ./models/falcon-3-7b

# or pull a GGUF for llama.cpp / ollama
ollama pull falcon3:7b
ollama run falcon3:7b "write a postgres rls policy for tenant_id"
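
Once the model is pulled, Ollama also serves it over a local HTTP API, which is handy for scripting. Here is a minimal Python sketch, assuming Ollama is running on its default port (11434) and the falcon3:7b tag from above is installed. The helper names build_request and ask are mine, not part of any library.

```python
import json
import urllib.request

# Assumption: Ollama is running locally on its default port.
OLLAMA_URL = "http://localhost:11434/api/generate"


def build_request(model: str, prompt: str) -> urllib.request.Request:
    """Build a non-streaming POST request for Ollama's /api/generate endpoint."""
    body = json.dumps({"model": model, "prompt": prompt, "stream": False})
    return urllib.request.Request(
        OLLAMA_URL,
        data=body.encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def ask(model: str, prompt: str, timeout: float = 120.0) -> str:
    """Send the prompt and return the generated text from the JSON reply."""
    with urllib.request.urlopen(build_request(model, prompt), timeout=timeout) as resp:
        return json.loads(resp.read())["response"]
```

With the server up, ask("falcon3:7b", "write a postgres rls policy for tenant_id") should return the same kind of answer as the ollama run line, just as a Python string.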

The Money Pattern

Behold: the chat template just works in transformers. No custom tokenizer surgery, no jinja gymnastics. I plugged it into a tiny Astro 5 + Netlify Functions endpoint for a Rebuild Relief internal tool and it shipped in an afternoon.

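from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

MODEL_ID = "tiiuae/Falcon3-7B-Instruct"

tok = AutoTokenizer.from_pretrained(MODEL_ID)
model = AutoModelForCausalLM.from_pretrained(
    MODEL_ID,
    torch_dtype=torch.bfloat16,
    device_map="auto",  # needs the accelerate package installed
)

msgs = [
    {"role": "system", "content": "You are a SQL expert."},
    {"role": "user", "content": "write me a CTE for daily active tenants"},
]
# add_generation_prompt=True appends the assistant header so the model
# answers instead of continuing the user turn.
inputs = tok.apply_chat_template(
    msgs,
    add_generation_prompt=True,
    return_dict=True,
    return_tensors="pt",
).to(model.device)
out = model.generate(**inputs, max_new_tokens=256)
# Decode only the newly generated tokens, not the echoed prompt.
new_tokens = out[0][inputs["input_ids"].shape[-1]:]
print(tok.decode(new_tokens, skip_special_tokens=True))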
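
The endpoint itself isn't shown in this post, and mine ran on Astro + Netlify Functions rather than Python. As a rough sketch of the handler logic only, here is a Python version. The request shape ({"question": "..."}), the function names, and the injected generate callable are all assumptions for illustration.

```python
import json
from typing import Callable

SYSTEM_PROMPT = "You are a SQL expert."

Messages = list[dict[str, str]]


def build_messages(question: str) -> Messages:
    """Wrap a question in the role/content list that apply_chat_template expects."""
    return [
        {"role": "system", "content": SYSTEM_PROMPT},
        {"role": "user", "content": question},
    ]


def handle_request(raw_body: str, generate: Callable[[Messages], str]) -> tuple[int, str]:
    """Return (status_code, json_body) for a body like {"question": "..."}."""
    try:
        question = json.loads(raw_body).get("question", "")
    except (json.JSONDecodeError, AttributeError):
        # Not JSON at all, or JSON that is not an object.
        return 400, json.dumps({"error": "body must be a JSON object"})
    if not isinstance(question, str) or not question.strip():
        return 400, json.dumps({"error": "missing 'question'"})
    # generate is whatever runs the model, e.g. a wrapper around model.generate.
    answer = generate(build_messages(question.strip()))
    return 200, json.dumps({"answer": answer})
```

Passing the model call in as generate keeps the handler testable without loading 7B weights: swap in a stub for tests and the real thing in production.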

The Catch

Plot twist: Falcon 3 trails Llama 3.3 70B and Qwen 3 on the standard benches. The 7B is competitive with Llama 3 8B, not the 70B class. The 180B is a beast, but you need an 8xH100 box to serve it usefully, which kills the hobbyist story.
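
The hardware claims in this post come down to back-of-envelope arithmetic: parameters times bits per weight. Here is a rough sketch that counts weights only and ignores KV cache and activations. The figure of about 4.5 bits per weight for a 4-bit GGUF quant is my assumption, not a measured number.

```python
def weight_memory_gb(params_billion: float, bits_per_weight: float) -> float:
    """Approximate size of the weights alone, in GB (1 GB = 1e9 bytes)."""
    return params_billion * bits_per_weight / 8


# Assumption: ~4.5 bits/weight is a typical effective rate for 4-bit quants.
for name, params, bits in [
    ("7B @ Q4", 7, 4.5),
    ("7B @ bf16", 7, 16),
    ("180B @ bf16", 180, 16),
]:
    print(f"{name}: ~{weight_memory_gb(params, bits):.1f} GB")
```

By this estimate the 7B at Q4 is about 3.9 GB, which leaves room for context in 8GB. The 180B in bf16 is about 360 GB of weights before any KV cache, so it only fits comfortably on a multi-GPU box (assuming 80 GB cards, eight of them give 640 GB).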

The Verdict

Falcon 3 is the most credible TII release in two years. The 7B is a fine daily driver, the 40B is genuinely useful for self-hosted RAG, and the license actually lets you ship. Won't dethrone Llama or Qwen on raw benches, but for sovereign-deploy or multilingual workloads it's a real option.
