Reranker · rerank model · rerank for RAG

Understand rerankers. Then watch one run in your browser.

A reranker re-scores your retrieved candidates so the most relevant passages rise to the top. Learn how it works, compare the popular models, and try a cross-encoder live — with zero API cost and nothing leaving your machine.

Runs on transformers.js · 100% client-side · no key required

Start here

Three short guides take you from “what is a reranker?” to a working reranking stage in your RAG pipeline.

The fun part

Rerank in your browser, right now

Paste a query and a few candidate passages. A real cross-encoder downloads once, is cached by your browser, and scores every query-passage pair locally, so you can watch the ranking reshuffle in milliseconds. No server, no API key, no data leaving the page.

  • Real model weights via transformers.js + ONNX Runtime Web
  • Zero API cost and zero abuse risk — it’s all on your device
  • See exactly how scores reorder your retrieval results

Open the demo →

Compare the rerank models

Hosted APIs and open-weight models, side by side — quality, latency, languages and cost.

See the full comparison →

Reranking in one diagram

query ─┐
       ▼
┌───────────────┐   top 100   ┌────────────────┐   top 5   ┌─────┐
│  Retriever    │────────────▶│   Reranker     │──────────▶│ LLM │
│ (bi-encoder / │  candidates │ (cross-encoder │  best few │     │
│  BM25, fast)  │             │  scores pairs) │           └─────┘
└───────────────┘             └────────────────┘
  recall-oriented               precision-oriented

Retrieve wide for recall, rerank for precision, send only the best to the model.
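
The two stages in the diagram can be sketched in a few lines of TypeScript. This is a minimal illustration, not the demo's actual code: `retrieveThenRerank`, `topK`, `overlapScore` and `jaccardScore` are hypothetical names, and both toy scorers are stand-ins. A real pipeline would plug BM25 or a bi-encoder into the first slot and a cross-encoder into the second.

```typescript
// Sketch of the retrieve-then-rerank pipeline from the diagram above.
// All names here are hypothetical; the scorers are toy stand-ins.

type Scorer = (query: string, passage: string) => number;

interface Ranked {
  passage: string;
  score: number;
}

// Score every passage against the query and keep the k highest.
function topK(query: string, passages: string[], score: Scorer, k: number): Ranked[] {
  return passages
    .map((passage) => ({ passage, score: score(query, passage) }))
    .sort((a, b) => b.score - a.score)
    .slice(0, k);
}

// Stage 1 retrieves wide (recall); stage 2 re-scores only those
// candidates (precision) and keeps the best few for the LLM.
function retrieveThenRerank(
  query: string,
  corpus: string[],
  retrieveScore: Scorer,
  rerankScore: Scorer,
  retrieveK = 100,
  finalK = 5,
): Ranked[] {
  const candidates = topK(query, corpus, retrieveScore, retrieveK).map((r) => r.passage);
  return topK(query, candidates, rerankScore, finalK);
}

// Lowercase alphanumeric tokens, so "cross-encoder" becomes ["cross", "encoder"].
const tokens = (s: string): string[] => s.toLowerCase().match(/[a-z0-9]+/g) || [];

// Toy stand-in for a fast retriever: number of query tokens found in the passage.
const overlapScore: Scorer = (q, p) => {
  const passageTokens = new Set(tokens(p));
  return tokens(q).filter((t) => passageTokens.has(t)).length;
};

// Toy stand-in for a cross-encoder: Jaccard similarity of the two token sets.
const jaccardScore: Scorer = (q, p) => {
  const queryTokens = new Set(tokens(q));
  const passageTokens = new Set(tokens(p));
  const shared = Array.from(queryTokens).filter((t) => passageTokens.has(t)).length;
  const union = queryTokens.size + passageTokens.size - shared;
  return union === 0 ? 0 : shared / union;
};

const corpus: string[] = [
  "A reranker is a cross-encoder that scores query and passage pairs.",
  "BM25 is a fast lexical retriever.",
  "Bananas are rich in potassium.",
  "Reranker: cross-encoder.",
  "A bi-encoder embeds the query and each passage separately.",
];

// Retrieve the top 3 candidates, then keep the best 2 after reranking.
const results = retrieveThenRerank(
  "cross-encoder reranker", corpus, overlapScore, jaccardScore, 3, 2,
);
for (const r of results) {
  console.log(r.score.toFixed(2), r.passage);
}
```

Passing the scorers in as functions keeps the pipeline shape independent of any particular model, so the second scorer can be swapped for a real cross-encoder call without touching the rest. With these toy scorers, the first stage cannot separate the two passages that contain every query token, and the second stage moves the tighter match to the top.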