Understand rerankers. Then watch one run in your browser.
A reranker re-scores your retrieved candidates so the most relevant passages rise to the top. Learn how it works, compare the popular models, and try a cross-encoder live — with zero API cost and nothing leaving your machine.
Runs on transformers.js · 100% client-side · no key required
Start here
Three short guides take you from “what is a reranker?” to a working reranking stage in your RAG pipeline.
What is a reranker?
The two-stage retrieval pattern, why order matters, and where reranking fits.
Cross-encoder vs bi-encoder
Why bi-encoders are fast and cross-encoders are accurate — and how to use both.
How to add reranking to RAG
Retrieve wide, rerank, keep the best. With code, top-k tips and latency trade-offs.
Rerank in your browser, right now
Paste a query and a few candidate passages. A real cross-encoder downloads once, caches, and scores every pair locally — you watch the ranking reshuffle in milliseconds. No server, no API key, no data leaving the page.
- Real model weights via transformers.js + ONNX Runtime Web
- Zero API cost and zero abuse risk — it’s all on your device
- See exactly how scores reorder your retrieval results
Compare the rerank models
Hosted APIs and open-weight models, side by side — quality, latency, languages and cost.
bge-reranker
Open-weight rerankers from BAAI. Self-host for free, strong multilingual options.
Cohere Rerank
A mature hosted rerank API with strong multilingual quality and simple integration.
Jina Reranker
Hosted API and open weights, including tiny models small enough to run in a browser.
Voyage Rerank
Hosted rerankers tuned for retrieval quality, with domain-specific variants.
Reranking in one diagram
        query ─┐
               ▼
       ┌───────────────┐   top 100   ┌────────────────┐   top 5   ┌─────┐
       │   Retriever   │────────────▶│    Reranker    │──────────▶│ LLM │
       │ (bi-encoder / │ candidates  │ (cross-encoder │ best few  │     │
       │  BM25, fast)  │             │  scores pairs) │           └─────┘
       └───────────────┘             └────────────────┘
        recall-oriented              precision-oriented
Retrieve wide for recall, rerank for precision, send only the best to the model.
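The same flow can be sketched in a few lines of TypeScript. This is a minimal illustration under stated assumptions, not the code behind the demo: `Scorer`, `rerank`, and `overlapScore` are hypothetical names, and the term-overlap scorer is only a stand-in for a real cross-encoder so the example runs without downloading a model.

```typescript
// Hypothetical types and names for illustration; not taken from any library.
type Scorer = (query: string, passage: string) => number;

interface Ranked {
  passage: string;
  score: number;
}

// Second stage: score every (query, passage) pair, sort by score
// descending, and keep only the top `keep` passages for the LLM.
function rerank(
  query: string,
  candidates: string[],
  score: Scorer,
  keep: number
): Ranked[] {
  return candidates
    .map((passage) => ({ passage, score: score(query, passage) }))
    .sort((a, b) => b.score - a.score)
    .slice(0, keep);
}

// Stand-in scorer (assumption): the fraction of query terms found in the
// passage. A real pipeline would call a cross-encoder model here instead.
const overlapScore: Scorer = (query, passage) => {
  const terms = query.toLowerCase().split(/\s+/).filter(Boolean);
  if (terms.length === 0) return 0;
  const words = new Set(passage.toLowerCase().split(/\W+/));
  return terms.filter((t) => words.has(t)).length / terms.length;
};

// Pretend these came back from a wide, recall-oriented first stage.
const candidates = [
  "Bi-encoders embed the query and each passage separately.",
  "A reranker scores each query and passage pair together.",
  "BM25 is a lexical ranking function.",
];

const top = rerank("reranker scores pair", candidates, overlapScore, 2);
console.log(top.map((r) => `${r.score.toFixed(2)}  ${r.passage}`).join("\n"));
```

Passing the scorer in as a function keeps the reranking stage independent of the model: swapping the overlap stand-in for a cross-encoder call changes one argument, not the pipeline.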