Cohere Rerank

Hosted API · Cohere

Hosted APIMultilingualFree tier

Cohere Rerank was one of the first commercial reranking APIs and remains among the most widely deployed. The current generation — rerank-v3.5 — delivers strong multilingual precision on BEIR benchmarks, a clean REST + SDK interface, and a free tier generous enough for development and low-volume production.

Available models

Model IDLanguagesContextNotes
rerank-v3.5100+ langs4096 tokens/docCurrent best; multilingual flagship
rerank-english-v3.0English4096 tokens/docSlightly faster, English-only
rerank-multilingual-v3.0100+ langs4096 tokens/docPrevious multilingual generation

Use rerank-v3.5 by default. Downgrade to rerank-english-v3.0 only if you're English-only and every millisecond matters.

Pricing

TierPriceLimits
Free$01,000 API calls/month
Pay-as-you-go~$2 / 1,000 searchesNo limit
EnterpriseCustomSLA, private deployment options

Check the official Cohere pricing page for current rates — figures above may be outdated.

Quick start

Python

pip install cohere
import cohere

co = cohere.Client("YOUR_API_KEY")

query = "How do I add reranking to my RAG pipeline?"
documents = [
    "Rerankers score each query-passage pair jointly with a cross-encoder.",
    "BM25 is a classical keyword-based retrieval method.",
    "London is the capital of the United Kingdom.",
    "Two-stage retrieval: retrieve 50 candidates, rerank to top 5.",
]

results = co.rerank(
    model="rerank-v3.5",
    query=query,
    documents=documents,
    top_n=3,
)

for r in results.results:
    print(f"{r.relevance_score:.4f}  {documents[r.index][:80]}")

Node.js

import Cohere from "cohere-ai";
const co = new Cohere.Client({ token: "YOUR_API_KEY" });

const result = await co.rerank({
  model: "rerank-v3.5",
  query: "How do I add reranking to my RAG pipeline?",
  documents: ["...", "..."],
  topN: 3,
});

result.results.forEach((r) => console.log(r.relevanceScore, documents[r.index]));

REST (curl)

curl https://api.cohere.com/v2/rerank \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "rerank-v3.5",
    "query": "How do I add reranking to my RAG pipeline?",
    "documents": ["Rerankers score query-passage pairs...", "BM25 is..."],
    "top_n": 3
  }'

Pros and cons

Pros

  • Mature, production-grade API with an SLA
  • Top multilingual BEIR scores
  • SDK support: Python, Node, Java, Go, curl
  • Generous free tier for prototyping
  • 4096-token document context window
  • Simple, predictable pricing

Cons

  • Closed weights — you depend on Cohere
  • Per-call cost adds up at high volume
  • No self-hosted option (for most plans)
  • Latency is network-bound (~100–200 ms)

Try reranking without an API key

Our demo runs a cross-encoder entirely in your browser — no signup, no cost.

Open the demo →

Other models