Voyage Rerank
Hosted API · High precision · Domain-specific variants
Voyage AI built its rerankers with a single goal: maximise retrieval precision. rerank-2 consistently ranks among the top models on BEIR and MTEB retrieval tasks, and the company offers domain-specific variants for code and financial documents, which makes it a strong choice when generic quality isn't enough.
Available models
| Model | Context | Best for |
|---|---|---|
| rerank-2 | 16K tokens | General-purpose flagship; top BEIR scores |
| rerank-2-lite | 8K tokens | Faster, lower cost; good quality |
| rerank-lite-1 | 4K tokens | Legacy lite model |
The 16K context window on rerank-2 is notably large — useful for reranking long legal, medical or financial documents without chunking.
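Before sending long documents, a rough pre-check can flag the ones that probably will not fit. The sketch below assumes about 4 characters per token, which is only a rule of thumb for English text and not Voyage's tokenizer, and it assumes the query and each document share the window. The function names and the 16,000-token default are illustrative.

```python
def estimate_tokens(text: str, chars_per_token: float = 4.0) -> int:
    """Very rough token estimate (assumption: ~4 characters per token)."""
    return max(1, round(len(text) / chars_per_token))


def oversized_documents(
    query: str, documents: list[str], context_limit: int = 16_000
) -> list[int]:
    """Return indices of documents whose estimated query + document size exceeds the limit."""
    query_tokens = estimate_tokens(query)
    return [
        i
        for i, doc in enumerate(documents)
        if query_tokens + estimate_tokens(doc) > context_limit
    ]
```

Documents flagged this way are candidates for chunking or truncation; for exact counts, use the provider's own tokenizer rather than this heuristic.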
Pricing
| Model | Price |
|---|---|
| rerank-2 | ~$0.05 / 1M tokens |
| rerank-2-lite | ~$0.02 / 1M tokens |
| Free trial | 200M tokens included on sign-up |
Voyage uses token-based pricing, which is cost-effective at high volume. Check the Voyage AI website for current rates.
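To see what token-based pricing means for a two-stage pipeline, here is a back-of-the-envelope estimator. It uses the approximate prices from the table above and assumes that billed tokens per request are the query tokens multiplied by the number of documents, plus the total document tokens. Treat that formula as an assumption and confirm it against Voyage's documentation; the function name is illustrative.

```python
# Approximate prices in USD per 1M tokens, copied from the table above.
PRICE_PER_MILLION_USD = {"rerank-2": 0.05, "rerank-2-lite": 0.02}


def rerank_cost_usd(
    model: str, query_tokens: int, doc_tokens: list[int], requests: int = 1
) -> float:
    """Estimate reranking cost for `requests` identical calls.

    Assumption: billed tokens = query_tokens * number_of_documents + sum(doc_tokens).
    """
    billed_per_request = query_tokens * len(doc_tokens) + sum(doc_tokens)
    return billed_per_request * requests / 1_000_000 * PRICE_PER_MILLION_USD[model]


# Example: a 20-token query against 50 candidates of 300 tokens each, 1M requests.
monthly = rerank_cost_usd("rerank-2", 20, [300] * 50, requests=1_000_000)
```

Under these assumptions each request bills 16,000 tokens, so a million requests come to about $800 on rerank-2 and about $320 on rerank-2-lite.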
Quick start
Python
pip install voyageai
import voyageai
vo = voyageai.Client(api_key="YOUR_API_KEY")
query = "How do I add reranking to my RAG pipeline?"
documents = [
    "Rerankers score each query-passage pair with a cross-encoder.",
    "BM25 is a classical keyword-based retrieval method.",
    "London is the capital of the United Kingdom.",
    "Two-stage retrieval: retrieve 50 candidates, rerank to top 5.",
]
result = vo.rerank(
    query=query,
    documents=documents,
    model="rerank-2",
    top_k=3,
)
for r in result.results:
    print(f"{r.relevance_score:.4f} {documents[r.index][:80]}")
REST (curl)
curl https://api.voyageai.com/v1/rerank \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "rerank-2",
    "query": "How do I add reranking to my RAG pipeline?",
    "documents": ["Rerankers score...", "BM25 is..."],
    "top_k": 3
  }'
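The same REST call can be assembled from Python's standard library when the SDK is not an option. This is a minimal sketch that mirrors the curl request above field for field; the helper name `build_rerank_request` is illustrative, and the response is left unparsed because its exact shape should be checked against the API reference.

```python
import json
import urllib.request

VOYAGE_RERANK_URL = "https://api.voyageai.com/v1/rerank"


def build_rerank_request(
    api_key: str,
    query: str,
    documents: list[str],
    model: str = "rerank-2",
    top_k: int = 3,
) -> urllib.request.Request:
    """Build (but do not send) a POST request equivalent to the curl example."""
    payload = {"model": model, "query": query, "documents": documents, "top_k": top_k}
    return urllib.request.Request(
        VOYAGE_RERANK_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


# Sending it needs a valid key and network access:
# with urllib.request.urlopen(build_rerank_request("YOUR_API_KEY", query, documents)) as resp:
#     body = json.load(resp)
```

Separating request construction from sending keeps the payload easy to inspect and log before any tokens are spent.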
In a RAG pipeline
import voyageai
vo = voyageai.Client(api_key="YOUR_API_KEY")
def rag_answer(query: str, vector_db, llm) -> str:
    # Stage 1: retrieve wide (vector_db.search is assumed to return a list of passage strings)
    candidates = vector_db.search(query, top_k=50)
    # Stage 2: rerank tight
    result = vo.rerank(query=query, documents=candidates, model="rerank-2", top_k=5)
    # Map each result back to its passage via the original index
    top5 = [candidates[r.index] for r in result.results]
    # Stage 3: generate (llm.complete is a placeholder for your LLM client)
    return llm.complete("Context:\n" + "\n\n".join(top5) + f"\n\nQ: {query}")
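The rerank stage above only relies on an object with a `rerank` method whose results expose `index` and `relevance_score`, so it can be exercised offline with a stand-in. The sketch below is one way to do that for unit tests; `FakeReranker`, `FakeResult`, `FakeReranking` and `top_passages` are hypothetical names, and the word-overlap score is a toy substitute that has nothing to do with how Voyage scores passages.

```python
from dataclasses import dataclass


@dataclass
class FakeResult:
    index: int              # position of the passage in the input list
    relevance_score: float  # toy score in [0, 1]


@dataclass
class FakeReranking:
    results: list


class FakeReranker:
    """Hypothetical offline stand-in that mimics the shape of vo.rerank()."""

    def rerank(self, query, documents, model="rerank-2", top_k=None):
        query_words = set(query.lower().split())
        scored = []
        for i, doc in enumerate(documents):
            doc_words = set(doc.lower().split())
            # Fraction of query words that also appear in the passage
            overlap = len(query_words & doc_words) / max(1, len(query_words))
            scored.append(FakeResult(index=i, relevance_score=overlap))
        scored.sort(key=lambda r: r.relevance_score, reverse=True)
        return FakeReranking(results=scored[:top_k])  # top_k=None keeps everything


def top_passages(query, candidates, reranker, top_k=5):
    """Stage 2 of the pipeline, with the reranker passed in as a dependency."""
    result = reranker.rerank(query=query, documents=candidates, model="rerank-2", top_k=top_k)
    return [candidates[r.index] for r in result.results]
```

Passing the reranker in as an argument means the real `voyageai.Client` can be swapped in for production while tests run without an API key or network access.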
Pros and cons
Pros
- Top-tier BEIR retrieval precision scores
- 16K token context — great for long documents
- Competitive token-based pricing
- 200M free tokens on sign-up
- Works seamlessly with Voyage embeddings
- Clean Python SDK
Cons
- Hosted-only — no open weights
- Smaller community than Cohere or bge
- SDK is Python-only (REST for other languages)
- No dedicated multilingual model (the general-purpose models accept multilingual input but are not marketed primarily for it)