Jina Reranker
Open weights · Hosted API · Free tier · Browser-runnable tiny
Jina AI's reranker family stands out for offering both open-weight models you can self-host and a hosted API that serves the same models. The v1-tiny variant is small enough to run in the browser via transformers.js, which is what powers this site's live demo.
Model variants
| Model | Size | Languages | Notes |
|---|---|---|---|
| jina-reranker-v2-base-multilingual | ~278 MB | 100+ langs | Flagship; strong multilingual BEIR |
| jina-reranker-v1-base-en | ~278 MB | English | Good English baseline |
| jina-reranker-v1-tiny-en | ~33 MB | English | Tiny; runs in browser / on edge |
Pricing
| Tier | Price |
|---|---|
| Free tier | 1M tokens/month free, no credit card |
| Pay-as-you-go | ~$0.018 / 1M tokens |
Token-based pricing is friendlier for long documents than per-call pricing. Check the Jina AI website for current rates.
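To make the token-based pricing concrete, here is a rough cost sketch using the pay-as-you-go figure from the table. The function name `estimate_cost`, the default query length, and the rule that every (query, document) pair is billed as query tokens plus document tokens are all assumptions for illustration; actual billing may count tokens differently.

```python
PRICE_PER_MILLION_USD = 0.018  # approximate pay-as-you-go rate from the table above


def estimate_cost(requests_per_month: int, docs_per_request: int,
                  tokens_per_doc: int, tokens_per_query: int = 20) -> tuple[int, float]:
    """Return (total_tokens, cost_usd) for a month of reranking traffic.

    Assumption: each (query, document) pair is billed as
    tokens_per_query + tokens_per_doc. This is a guess, not a documented rule.
    """
    tokens_per_request = docs_per_request * (tokens_per_query + tokens_per_doc)
    total_tokens = requests_per_month * tokens_per_request
    cost_usd = total_tokens / 1_000_000 * PRICE_PER_MILLION_USD
    return total_tokens, cost_usd


# 10,000 requests, 20 candidates each, ~500 tokens per candidate
tokens, cost = estimate_cost(10_000, 20, 500)
print(f"{tokens:,} tokens, about ${cost:.2f}")
```

Under these assumptions, 10,000 requests with 20 candidates of about 500 tokens each come to 104 million tokens, or roughly $1.87 at the listed rate.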
Quick start
Hosted API (Python)
import requests


def rerank(query: str, documents: list[str], top_n: int = 5) -> list[str]:
    """Return the top_n documents, most relevant first, using the hosted API."""
    resp = requests.post(
        "https://api.jina.ai/v1/rerank",
        headers={"Authorization": "Bearer YOUR_KEY", "Content-Type": "application/json"},
        json={
            "model": "jina-reranker-v2-base-multilingual",
            "query": query,
            "documents": documents,
            "top_n": top_n,
        },
        timeout=30,
    )
    resp.raise_for_status()  # surface HTTP errors instead of failing on a missing "results" key
    return [documents[r["index"]] for r in resp.json()["results"]]
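If you also want the scores rather than just the reordered documents, a small parsing helper along these lines may be useful. It is a sketch only: the helper name `ranked_with_scores` is hypothetical, and the `relevance_score` field is an assumption about the response shape (only `results` and `index` appear in the snippet above), so check it against a real response. The example runs on a hand-written stand-in response, not live API output.

```python
from typing import Any


def ranked_with_scores(response: dict[str, Any],
                       documents: list[str]) -> list[tuple[str, float]]:
    """Pair each returned document with its score, best first."""
    pairs = [
        # "relevance_score" is an assumed field name; verify against a real response
        (documents[r["index"]], float(r["relevance_score"]))
        for r in response.get("results", [])
    ]
    return sorted(pairs, key=lambda pair: pair[1], reverse=True)


# Hand-written stand-in for an API response, for illustration only
fake_response = {
    "results": [
        {"index": 1, "relevance_score": 0.91},
        {"index": 0, "relevance_score": 0.12},
    ]
}
docs = ["Bananas are yellow.", "Rerankers reorder search results."]
top = ranked_with_scores(fake_response, docs)
```

Keeping the parsing separate from the HTTP call makes it easy to unit-test against recorded responses.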
Self-hosted (sentence-transformers)
from sentence_transformers import CrossEncoder

# Open weights: same model, self-hosted
model = CrossEncoder(
    "jinaai/jina-reranker-v2-base-multilingual",
    trust_remote_code=True,
    max_length=1024,
)
# `query` (str) and `documents` (list[str]) are assumed to be defined by the caller
scores = model.predict([(query, doc) for doc in documents])
ranked = sorted(zip(scores, documents), key=lambda pair: pair[0], reverse=True)
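To give the self-hosted path the same `top_n` behaviour as the hosted call, one option is a thin wrapper around any pair-scoring function. This is a sketch under assumptions: `rerank_local` and `overlap_scorer` are hypothetical names, and the word-overlap scorer is only a stand-in so the example runs without downloading a model. In practice you would pass `model.predict` from the snippet above.

```python
from typing import Callable, Sequence


def rerank_local(score_fn: Callable[[list[tuple[str, str]]], Sequence[float]],
                 query: str, documents: list[str],
                 top_n: int = 5) -> list[tuple[str, float]]:
    """Score every (query, document) pair and return the top_n, best first."""
    pairs = [(query, doc) for doc in documents]
    scores = score_fn(pairs)
    ranked = sorted(zip(documents, scores), key=lambda item: item[1], reverse=True)
    return [(doc, float(score)) for doc, score in ranked[:top_n]]


def overlap_scorer(pairs: list[tuple[str, str]]) -> list[float]:
    """Stand-in scorer: counts shared words. Not a real reranker."""
    return [float(len(set(q.lower().split()) & set(d.lower().split())))
            for q, d in pairs]


results = rerank_local(
    overlap_scorer,
    "open weight reranker",
    ["a hosted api", "an open weight reranker model", "weight loss tips"],
    top_n=2,
)
```

Because the scorer is injected, the same wrapper works for the cross-encoder, a stub in tests, or a different model later.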
Browser / edge use
The v1-tiny model (33 MB quantised) loads via transformers.js in under 10 seconds on a typical broadband connection and runs scoring at ~200 ms for a 10-candidate batch. This is what powers our demo:
// transformers.js (ES module in the browser)
import { AutoTokenizer, AutoModelForSequenceClassification }
  from "https://cdn.jsdelivr.net/npm/@huggingface/transformers@3";

const modelId = "jinaai/jina-reranker-v1-tiny-en";
// The tokenizer takes no dtype; quantisation applies to the model weights only
const tokenizer = await AutoTokenizer.from_pretrained(modelId);
const model = await AutoModelForSequenceClassification.from_pretrained(
  modelId, { dtype: "q8" }
);

// One (query, document) pair per candidate; query, doc1 and doc2 are strings defined elsewhere
const inputs = tokenizer([query, query], {
  text_pair: [doc1, doc2],
  padding: true, truncation: true,
});
const { logits } = await model(inputs);
// sigmoid maps each raw logit to a 0-1 relevance score
const scores = logits.sigmoid().tolist();
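The final `sigmoid()` call is what turns raw logits into 0-1 relevance scores. As a language-neutral illustration (written in Python to match the other examples on this page, with made-up logit values), the sketch below shows the mapping and why it does not change the ranking:

```python
import math


def sigmoid(logit: float) -> float:
    """Map a raw logit to a score strictly between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-logit))


logits = [2.0, 0.0, -3.0]  # illustrative values, not real model output
scores = [sigmoid(x) for x in logits]

# Sigmoid is monotonic, so sorting by score gives the same order as sorting by logit
order_by_score = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
order_by_logit = sorted(range(len(logits)), key=lambda i: logits[i], reverse=True)
```

So if you only need the ordering, the raw logits are enough; the sigmoid matters when you want a bounded score to display or threshold.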
Pros and cons
Pros
- Dual-mode: same model as API or self-hosted weights
- Tiny variant runs in the browser — unique in the space
- Generous free tier (1M tokens/month, no card needed)
- Strong multilingual quality on v2
- Token-based pricing suits long documents
Cons
- v1-tiny is English-only and lower quality
- Smaller company than Cohere — less ecosystem tooling
- Self-hosted requires trust_remote_code=True
- Token pricing can be opaque for short passages
jina-reranker-v1-tiny-en powers this demo
See it score your own passages live in the browser — no API key, no data leaving the page.
Open the demo →