Vector + BM25, batteries included, GraphQL still flexing
Plot twist: Weaviate v1.30 quietly made hybrid search the default behavior instead of an opt-in flag. Combined with the built-in embedding modules, you can stand up a real RAG pipeline without writing a line of embedding code.
The Setup
Spin up Weaviate, tell it which embedding module to use (`text2vec-openai`, `text2vec-cohere`, `text2vec-transformers`), and it handles the vectorisation on ingest and query. No more two-step embed-then-upsert dance.
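To make the "no embed step" point concrete, here is a rough sketch of what the client side can look like against Weaviate's REST API (`/v1/schema` and `/v1/objects`). The `Claim` class and its `description` and `region` properties are borrowed from the query later in this post. The base URL and the helper names (`claim_class_definition`, `claim_object`, `post_json`, `ingest_example`) are assumptions made for illustration, not anything Weaviate ships.

```python
import json
import urllib.request

# Assumption: Weaviate is reachable locally on the port mapped in this post.
WEAVIATE_URL = "http://localhost:8080"


def claim_class_definition(vectorizer: str = "text2vec-transformers") -> dict:
    """Class definition for an illustrative `Claim` class.

    Naming a vectorizer module here is what lets Weaviate embed the text
    properties on ingest, so the client never computes vectors itself.
    """
    return {
        "class": "Claim",
        "vectorizer": vectorizer,
        "properties": [
            {"name": "description", "dataType": ["text"]},
            {"name": "region", "dataType": ["text"]},
        ],
    }


def claim_object(description: str, region: str) -> dict:
    """Object payload with plain properties only; no `vector` field is sent."""
    return {
        "class": "Claim",
        "properties": {"description": description, "region": region},
    }


def post_json(path: str, payload: dict) -> dict:
    """POST a JSON payload to the Weaviate REST API and decode the response."""
    request = urllib.request.Request(
        WEAVIATE_URL + path,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))


def ingest_example() -> None:
    """Create the class, then insert one object (needs a running instance)."""
    post_json("/v1/schema", claim_class_definition())
    post_json("/v1/objects", claim_object("Roof leak after hailstorm", "qld"))
```

Nothing here touches an embedding model; that work moves into the server, which is wired up with plain configuration: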
{`# docker-compose snippet
# Assumes a t2v-transformers inference service is defined alongside this one.
services:
  weaviate:
    image: cr.weaviate.io/semitechnologies/weaviate:1.30.0
    environment:
      DEFAULT_VECTORIZER_MODULE: text2vec-transformers
      ENABLE_MODULES: text2vec-transformers,reranker-transformers
      TRANSFORMERS_INFERENCE_API: http://t2v-transformers:8080
    ports:
      - "8080:8080"`}
The Money Pattern
The hybrid query blends vector similarity with BM25, weighted by an `alpha` parameter: `alpha: 1` is pure semantic, `alpha: 0` is pure keyword, and 0.5 weights the two equally, which is a sane starting point. GraphQL is still Weaviate's love language and it shows.
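If the `alpha` weighting feels abstract, a toy version may help. The sketch below min-max normalises two made-up score sets and blends them as `alpha * vector + (1 - alpha) * bm25`. It is a simplified illustration of the idea, not Weaviate's actual fusion code, and the function names (`min_max`, `hybrid_scores`) and sample scores are invented for the example.

```python
from __future__ import annotations


def min_max(scores: dict[str, float]) -> dict[str, float]:
    """Scale raw scores to [0, 1] so vector and BM25 scores are comparable."""
    if not scores:
        return {}
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        # All scores equal: treat every document as a full match.
        return {doc: 1.0 for doc in scores}
    return {doc: (s - lo) / (hi - lo) for doc, s in scores.items()}


def hybrid_scores(
    vector: dict[str, float],
    bm25: dict[str, float],
    alpha: float = 0.5,
) -> list[tuple[str, float]]:
    """Blend normalised scores: alpha=1 is pure vector, alpha=0 is pure BM25."""
    v, k = min_max(vector), min_max(bm25)
    fused = {
        doc: alpha * v.get(doc, 0.0) + (1 - alpha) * k.get(doc, 0.0)
        for doc in set(v) | set(k)
    }
    # Highest score first; ties broken by document id for stable output.
    return sorted(fused.items(), key=lambda item: (-item[1], item[0]))


# Made-up scores: "a" is the best semantic match, "b" the best keyword match.
vector_hits = {"a": 0.75, "b": 0.5, "c": 0.25}
bm25_hits = {"b": 7.0, "c": 4.0, "d": 1.0}

for alpha in (0.0, 0.5, 1.0):
    print(alpha, hybrid_scores(vector_hits, bm25_hits, alpha))
```

Sliding `alpha` from 0 to 1 moves the top spot from the keyword winner to the semantic winner. In Weaviate itself you write none of this; you just set `alpha` in the query: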
{`{
  Get {
    Claim(
      hybrid: {
        query: "roof leak after hailstorm in Brisbane"
        alpha: 0.5
      }
      where: { path: ["region"], operator: Equal, valueText: "qld" }
      limit: 5
    ) {
      description
      region
      _additional { score explainScore }
    }
  }
}`}
The Catch
Weaviate is heavier than Chroma or LanceDB. You're running a standalone server (it's written in Go, so no JVM, but with `text2vec-transformers` you're also running a separate inference container), you're learning their schema DSL, and the GraphQL API is great until you need to debug it at 1am. Ops cost is real: this isn't a "drop it in a Lambda" tool.
The Verdict
For teams already running real infra who want hybrid search and embedding pipelines as one product, v1.30 makes Weaviate one of the most batteries-included vector DBs on the market. If you're solo on a laptop, it's overkill. If you're shipping a serious search product, it's worth the ops tax.