Voyage Rerank

Hosted API · Voyage AI

Hosted API · High precision · Domain-specific variants

Voyage AI built its rerankers with a single goal: maximising retrieval precision. rerank-2 consistently ranks among the top models on BEIR and MTEB retrieval tasks, and the company offers domain-specific variants for code and financial documents, which makes it a strong choice when generic quality isn't enough.

Available models

Model | Context | Best for
rerank-2 | 16K tokens | General-purpose flagship; top BEIR scores
rerank-2-lite | 16K tokens | Faster, lower cost; good quality
rerank-lite-1 | 4K tokens | Legacy lite model

The 16K context window on rerank-2 is notably large — useful for reranking long legal, medical or financial documents without chunking.
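To judge whether a long document is likely to fit the 16K window before sending it, a rough pre-check can help. The sketch below assumes about four characters per token, a common rule of thumb for English text and not Voyage's actual tokenizer, and it assumes the query and document share the same window. The names `estimate_tokens` and `fits_context` are hypothetical.

```python
# Rough pre-check: does a query + document pair plausibly fit the context window?
# ASSUMPTION: ~4 characters per token (English rule of thumb), not Voyage's tokenizer.
# ASSUMPTION: the query and the document count against the same window.
CONTEXT_TOKENS = 16_000  # rerank-2 context length from the table above
CHARS_PER_TOKEN = 4      # heuristic; real token counts will differ


def estimate_tokens(text: str) -> int:
    """Approximate token count from character length, rounding up."""
    return -(-len(text) // CHARS_PER_TOKEN)


def fits_context(query: str, document: str, limit: int = CONTEXT_TOKENS) -> bool:
    """Return True if the estimated query + document tokens stay within the limit."""
    return estimate_tokens(query) + estimate_tokens(document) <= limit


query = "What are the termination clauses?"
short_doc = "Either party may terminate with 30 days notice. " * 100   # ~4,800 characters
long_doc = short_doc * 20                                              # ~96,000 characters

print(fits_context(query, short_doc))  # likely fits
print(fits_context(query, long_doc))   # likely needs chunking or truncation
```

Because the estimate is only a heuristic, treat borderline results as "check with a real tokenizer" rather than as a guarantee.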

Pricing

Model | Price
rerank-2 | ~$0.05 / 1M tokens
rerank-2-lite | ~$0.02 / 1M tokens
Free trial | 200M tokens included on sign-up

Voyage uses token-based pricing, which is cost-effective at high volume. Check the Voyage AI website for current rates.
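To make the token-based pricing concrete, here is a back-of-the-envelope estimate. It assumes billed tokens are the query tokens counted once per document plus all document tokens; verify that formula and the approximate prices against Voyage's current documentation. The function names are hypothetical.

```python
# Back-of-the-envelope cost estimate for token-based reranking.
# ASSUMPTION: billed tokens = query tokens counted once per document + all document tokens.
# Prices are the approximate USD rates per 1M tokens listed in the pricing table above.
PRICE_PER_MILLION = {"rerank-2": 0.05, "rerank-2-lite": 0.02}


def billed_tokens(query_tokens: int, doc_tokens: list[int]) -> int:
    """Total tokens for one rerank request under the assumption above."""
    return query_tokens * len(doc_tokens) + sum(doc_tokens)


def estimate_cost(model: str, query_tokens: int, doc_tokens: list[int]) -> float:
    """Estimated USD cost of one rerank request."""
    return billed_tokens(query_tokens, doc_tokens) / 1_000_000 * PRICE_PER_MILLION[model]


# Example: one 20-token query reranked against 50 candidates of 500 tokens each.
per_request = estimate_cost("rerank-2", query_tokens=20, doc_tokens=[500] * 50)
print(f"tokens per request:   {billed_tokens(20, [500] * 50)}")
print(f"cost per request:     ${per_request:.6f}")
print(f"cost per 1M requests: ${per_request * 1_000_000:,.2f}")
```

Under these assumptions, candidate count and passage length drive the bill far more than query length does.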

Quick start

Python

pip install voyageai

import voyageai

vo = voyageai.Client(api_key="YOUR_API_KEY")

query = "How do I add reranking to my RAG pipeline?"
documents = [
    "Rerankers score each query-passage pair with a cross-encoder.",
    "BM25 is a classical keyword-based retrieval method.",
    "London is the capital of the United Kingdom.",
    "Two-stage retrieval: retrieve 50 candidates, rerank to top 5.",
]

result = vo.rerank(
    query=query,
    documents=documents,
    model="rerank-2",
    top_k=3,
)

for r in result.results:
    print(f"{r.relevance_score:.4f}  {documents[r.index][:80]}")

REST (curl)

curl https://api.voyageai.com/v1/rerank \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "rerank-2",
    "query": "How do I add reranking to my RAG pipeline?",
    "documents": ["Rerankers score...", "BM25 is..."],
    "top_k": 3
  }'
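The same REST call can be made without the SDK. The sketch below builds the request with Python's standard library, reusing only the endpoint, headers and JSON fields from the curl example; `build_rerank_request` is a hypothetical helper name, and the request is constructed but not sent.

```python
import json
import urllib.request

API_URL = "https://api.voyageai.com/v1/rerank"  # endpoint from the curl example


def build_rerank_request(api_key: str, query: str, documents: list[str],
                         model: str = "rerank-2", top_k: int = 3) -> urllib.request.Request:
    """Build (but do not send) a POST request mirroring the curl call above."""
    payload = {"model": model, "query": query, "documents": documents, "top_k": top_k}
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


request = build_rerank_request(
    api_key="YOUR_API_KEY",
    query="How do I add reranking to my RAG pipeline?",
    documents=["Rerankers score...", "BM25 is..."],
)
print(request.get_method(), request.full_url)

# To send it (requires a valid key and network access):
# with urllib.request.urlopen(request) as response:
#     body = json.load(response)
```

Keeping request construction separate from sending makes the payload easy to inspect or unit-test before any tokens are billed.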

In a RAG pipeline

import voyageai

vo = voyageai.Client(api_key="YOUR_API_KEY")

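def rag_answer(query: str, vector_db, llm) -> str:
    # vector_db and llm are placeholders for your own retriever and LLM client.
    # Stage 1: retrieve wide. vo.rerank expects plain strings, so candidates
    # must be a list of passage texts.
    candidates = vector_db.search(query, top_k=50)
    # Stage 2: rerank tight, keeping the 5 most relevant passages.
    result = vo.rerank(query=query, documents=candidates, model="rerank-2", top_k=5)
    top5 = [candidates[r.index] for r in result.results]
    # Stage 3: generate an answer grounded in the reranked context.
    context = "\n\n".join(top5)
    return llm.complete(f"Context:\n{context}\n\nQ: {query}")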
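In practice a vector database often returns records with IDs and metadata rather than bare strings. The following sketch shows one way to rerank such records and keep their metadata. Everything here is an assumption made for illustration: the record shape, the `rerank_records` helper, and the `(index, score)` convention for the reranking function. `overlap_rerank` is a toy word-overlap stand-in so the example runs offline; it is not a cross-encoder.

```python
from typing import Callable

# A rerank function takes (query, texts, top_k) and returns (index, score) pairs, best first.
# This shape is a hypothetical convention for this sketch.
RerankFn = Callable[[str, list[str], int], list[tuple[int, float]]]


def rerank_records(query: str, records: list[dict], rerank_fn: RerankFn,
                   top_k: int = 5, text_key: str = "text") -> list[dict]:
    """Rerank dict records by their text field and attach the score to each kept record."""
    if not records:
        return []  # nothing to rerank; skip the call entirely
    texts = [record[text_key] for record in records]
    ranked = rerank_fn(query, texts, min(top_k, len(texts)))
    return [{**records[index], "rerank_score": score} for index, score in ranked]


def overlap_rerank(query: str, texts: list[str], top_k: int) -> list[tuple[int, float]]:
    """Offline stand-in: scores by shared lowercase words. NOT a cross-encoder."""
    query_words = set(query.lower().split())
    scored = [(i, float(len(query_words & set(t.lower().split())))) for i, t in enumerate(texts)]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)[:top_k]


# With the Voyage SDK, an adapter could look like this (fields as used in the quick start):
# voyage_rerank = lambda q, docs, k: [
#     (r.index, r.relevance_score)
#     for r in vo.rerank(query=q, documents=docs, model="rerank-2", top_k=k).results
# ]

records = [
    {"id": "a", "text": "London is the capital of the United Kingdom."},
    {"id": "b", "text": "Two-stage retrieval: retrieve 50 candidates, rerank to top 5."},
    {"id": "c", "text": "Add reranking to a RAG pipeline after retrieval."},
]
top = rerank_records("How do I add reranking to my RAG pipeline?", records, overlap_rerank, top_k=2)
print([record["id"] for record in top])
```

Passing the reranker in as a function keeps the pipeline testable without an API key and makes it easy to swap models later.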

Pros and cons

Pros

  • Top-tier BEIR retrieval precision scores
  • 16K token context — great for long documents
  • Competitive token-based pricing
  • 200M free tokens on sign-up
  • Works seamlessly with Voyage embeddings
  • Clean Python SDK

Cons

  • Hosted-only — no open weights
  • Smaller community than Cohere or bge
  • SDK is Python-only (REST for other languages)
  • No dedicated multilingual model (the general model handles multiple languages but is not marketed as multilingual)

See reranking in action

Our demo runs a cross-encoder in your browser — no API key, no cost, same reranking logic.
