Voyage Rerank

Hosted API · Voyage AI

Hosted API · High precision · Domain-specific variants

Voyage AI built its rerankers with a single goal: maximise retrieval precision. The flagship rerank-2 model consistently ranks among the highest on BEIR and MTEB retrieval tasks, and the company offers domain-specific variants for code and financial documents, which makes it a strong choice when generic quality isn't enough.

On this page

Available models
Pricing
Quick start
Pros and cons

Available models

Model	Context	Best for
`rerank-2`	16K tokens	General-purpose flagship; top BEIR scores
`rerank-2-lite`	8K tokens	Faster, lower cost; good quality
`rerank-lite-1`	4K tokens	Legacy lite model

The 16K context window on rerank-2 is notably large — useful for reranking long legal, medical or financial documents without chunking.
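Before sending long documents, it can help to check which ones are likely to exceed the window. The sketch below is a rough pre-flight check, not an exact one: it assumes the limit applies to the query plus a single document, and it uses a crude 4-characters-per-token heuristic. Both `approx_tokens` and `oversized_documents` are hypothetical helper names; use the provider's tokenizer when you need exact counts.

```python
# Context sizes taken from the table above (in tokens).
CONTEXT_TOKENS = {"rerank-2": 16_000, "rerank-lite-1": 4_000}


def approx_tokens(text: str, chars_per_token: float = 4.0) -> int:
    """Very rough token estimate (assumption: ~4 characters per token)."""
    return int(len(text) / chars_per_token) + 1


def oversized_documents(query: str, documents: list[str], model: str = "rerank-2") -> list[int]:
    """Return indices of documents that probably do not fit next to the query."""
    budget = CONTEXT_TOKENS[model] - approx_tokens(query)
    return [i for i, doc in enumerate(documents) if approx_tokens(doc) > budget]


docs = ["A short clause.", "x" * 80_000]  # second one is ~20K estimated tokens
print(oversized_documents("termination notice period", docs))
```

Documents flagged this way are candidates for chunking or truncation; everything else can go to the reranker whole.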

Pricing

Model	Price
`rerank-2`	~$0.05 / 1M tokens
`rerank-2-lite`	~$0.02 / 1M tokens
Free trial	200M tokens included on sign-up

Voyage uses token-based pricing, which is cost-effective at high volume. Check the Voyage AI website for current rates.
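To get a feel for what token-based pricing means for a workload, here is a back-of-the-envelope estimator. It uses the approximate prices from the table above and assumes the query is counted once for every document it is paired with, so the tokens per request are (query tokens + document tokens) × number of documents. That counting rule and the helper name `estimate_rerank_cost` are assumptions for illustration; the free trial allowance is not subtracted.

```python
# Approximate prices from the table above, in USD per 1M tokens.
PRICE_PER_MILLION = {"rerank-2": 0.05, "rerank-2-lite": 0.02}


def estimate_rerank_cost(
    model: str,
    query_tokens: int,
    doc_tokens: int,
    num_docs: int,
    num_requests: int = 1,
) -> float:
    """Rough USD cost, assuming the query is counted once per document."""
    tokens_per_request = (query_tokens + doc_tokens) * num_docs
    total_tokens = tokens_per_request * num_requests
    return total_tokens / 1_000_000 * PRICE_PER_MILLION[model]


# 100,000 requests, each reranking 50 passages of ~300 tokens against a ~20-token query.
cost = estimate_rerank_cost("rerank-2", 20, 300, 50, num_requests=100_000)
print(f"${cost:.2f}")
```

Under these assumptions the candidate count per query drives cost almost linearly, so trimming stage-one retrieval from 50 to 25 candidates roughly halves the bill.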

Quick start

Python

pip install voyageai

import voyageai

vo = voyageai.Client(api_key="YOUR_API_KEY")

query = "How do I add reranking to my RAG pipeline?"
documents = [
    "Rerankers score each query-passage pair with a cross-encoder.",
    "BM25 is a classical keyword-based retrieval method.",
    "London is the capital of the United Kingdom.",
    "Two-stage retrieval: retrieve 50 candidates, rerank to top 5.",
]

result = vo.rerank(
    query=query,
    documents=documents,
    model="rerank-2",
    top_k=3,
)

for r in result.results:
    print(f"{r.relevance_score:.4f}  {documents[r.index][:80]}")

REST (curl)

curl https://api.voyageai.com/v1/rerank \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "rerank-2",
    "query": "How do I add reranking to my RAG pipeline?",
    "documents": ["Rerankers score...", "BM25 is..."],
    "top_k": 3
  }'

In a RAG pipeline

import voyageai

vo = voyageai.Client(api_key="YOUR_API_KEY")

def rag_answer(query: str, vector_db, llm) -> str:
    # vector_db and llm are placeholders for your own retriever and LLM client;
    # search() is assumed to return a list of passage strings.
    # Stage 1: retrieve wide
    candidates = vector_db.search(query, top_k=50)
    # Stage 2: rerank tight
    result = vo.rerank(query=query, documents=candidates, model="rerank-2", top_k=5)
    top5 = [candidates[r.index] for r in result.results]
    # Stage 3: generate
    context = "\n\n".join(top5)
    return llm.complete(f"Context:\n{context}\n\nQ: {query}")
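The wiring between stages can be exercised without an API key by passing the reranker in as a function. The sketch below is an offline stand-in, not the Voyage API: `overlap_rerank` is a hypothetical scorer based on shared words, and `Scored` only mimics the `index` and `relevance_score` fields used above. In production you would swap in a thin wrapper around `vo.rerank`.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Scored:
    index: int              # position in the original candidate list
    relevance_score: float  # higher means more relevant


def overlap_rerank(query: str, documents: List[str], top_k: int) -> List[Scored]:
    """Hypothetical stand-in reranker: fraction of query words found in the document."""
    q_words = set(query.lower().split())
    scored = [
        Scored(i, len(q_words & set(doc.lower().split())) / max(len(q_words), 1))
        for i, doc in enumerate(documents)
    ]
    scored.sort(key=lambda s: s.relevance_score, reverse=True)
    return scored[:top_k]


def build_context(
    query: str,
    candidates: List[str],
    rerank: Callable[[str, List[str], int], List[Scored]],
    top_k: int = 5,
) -> str:
    """Stage 2 only: rerank candidates and join the best ones into a prompt context."""
    if not candidates:  # nothing retrieved, so skip the rerank call entirely
        return ""
    results = rerank(query, candidates, top_k)
    return "\n\n".join(candidates[r.index] for r in results)


candidates = [
    "London is the capital of the United Kingdom.",
    "Rerankers score each query passage pair.",
    "Add reranking to a RAG pipeline in two stages.",
]
print(build_context("add reranking to my RAG pipeline", candidates, overlap_rerank, top_k=1))
```

Injecting the reranker this way also makes it easy to compare models, or to fall back to the stage-one ordering if the hosted call fails.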

Pros and cons

Pros

Top-tier BEIR retrieval precision scores
16K token context — great for long documents
Competitive token-based pricing
200M free tokens on sign-up
Works seamlessly with Voyage embeddings
Clean Python SDK

Cons

Hosted-only — no open weights
Smaller community than Cohere or bge
SDK is Python-only (REST for other languages)
No dedicated multilingual model (the general-purpose model handles multiple languages but is not marketed as multilingual)
