Jina Reranker

Open weights + Hosted API · Jina AI

Open weights · Hosted API · Free tier · Browser-runnable tiny

Jina AI's reranker family stands out for offering both open-weight models you can self-host and a hosted API that serves the same models. The v1-tiny variant is small enough to run in the browser via transformers.js, which is what powers this site's live demo.

On this page

Model variants
Pricing
Quick start
Browser / edge use
Pros and cons

Model variants

Model	Parameters	Languages	Notes
`jina-reranker-v2-base-multilingual`	~278M	100+ languages	Flagship; strong multilingual retrieval quality
`jina-reranker-v1-base-en`	~137M	English	Good English baseline
`jina-reranker-v1-tiny-en`	~33M	English	Tiny; runs in browser / on edge

Pricing

Tier	Price
Free tier	1M tokens/month free, no credit card
Pay-as-you-go	~$0.018 / 1M tokens

Token-based pricing is friendlier for long documents than per-call pricing. Check the Jina AI website for current rates.
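To make the token-based pricing concrete, here is a rough cost estimator. It is a sketch under stated assumptions: the rates are the approximate figures from the table above and may be out of date, the free allowance is assumed to be deducted once per month, and the query is assumed to be counted once for every document it is paired with. The function name and parameters are illustrative only.

```python
FREE_TOKENS_PER_MONTH = 1_000_000   # free-tier allowance quoted in the table above
USD_PER_MILLION_TOKENS = 0.018      # approximate pay-as-you-go rate quoted above


def estimate_monthly_cost(requests_per_month: int, docs_per_request: int,
                          avg_doc_tokens: int, avg_query_tokens: int = 20) -> float:
    """Rough monthly cost in USD.

    Assumption: each (query, document) pair is billed for the tokens
    of both the query and the document.
    """
    tokens_per_request = docs_per_request * (avg_doc_tokens + avg_query_tokens)
    total_tokens = requests_per_month * tokens_per_request
    # Assumption: the free allowance is subtracted before billing starts.
    billable_tokens = max(0, total_tokens - FREE_TOKENS_PER_MONTH)
    return billable_tokens / 1_000_000 * USD_PER_MILLION_TOKENS


# 10,000 requests, 20 candidates each, ~480-token documents, ~20-token queries.
print(round(estimate_monthly_cost(10_000, 20, 480, 20), 3))
```

Under these assumptions the workload above sends 100M tokens a month, of which 99M are billable.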

Quick start

Hosted API (Python)

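import os
import requests

def rerank(query: str, documents: list[str], top_n: int = 5) -> list[str]:
    """Return the top_n documents, most relevant first."""
    resp = requests.post(
        "https://api.jina.ai/v1/rerank",
        headers={
            # Read the key from the environment instead of hard-coding it.
            # JINA_API_KEY is a variable name chosen for this example.
            "Authorization": f"Bearer {os.environ['JINA_API_KEY']}",
            "Content-Type": "application/json",
        },
        json={
            "model": "jina-reranker-v2-base-multilingual",
            "query": query,
            "documents": documents,
            "top_n": top_n,
        },
        timeout=30,
    )
    resp.raise_for_status()  # fail loudly on auth, quota, or server errors
    # Each result carries the index of the document in the input list.
    return [documents[r["index"]] for r in resp.json()["results"]]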
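The function above discards the scores. If you want to keep them, for example to drop weak matches before passing context to an LLM, a small post-processing helper along these lines may help. It assumes each result carries a `relevance_score` field alongside `index`; check the API response you actually receive. The helper name `scored_results` and the `min_score` cutoff are illustrative, and the sample response is hand-written, not real API output.

```python
def scored_results(response: dict, documents: list[str],
                   min_score: float = 0.0) -> list[tuple[str, float]]:
    """Pair each returned document with its score, dropping low-scoring hits."""
    pairs = []
    for item in response.get("results", []):
        score = item["relevance_score"]  # assumed field name
        if score >= min_score:
            pairs.append((documents[item["index"]], score))
    # Sort explicitly rather than relying on the order of the response.
    return sorted(pairs, key=lambda pair: pair[1], reverse=True)


# Hand-written sample in the assumed response shape (not real API output).
sample = {"results": [{"index": 2, "relevance_score": 0.91},
                      {"index": 0, "relevance_score": 0.12}]}
docs = ["alpha", "beta", "gamma"]
print(scored_results(sample, docs, min_score=0.5))
```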

Self-hosted (sentence-transformers)

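from sentence_transformers import CrossEncoder

# Open weights: same model as the hosted API, run locally.
# trust_remote_code=True executes model code downloaded from the Hub.
model = CrossEncoder("jinaai/jina-reranker-v2-base-multilingual",
                     trust_remote_code=True, max_length=1024)

# Placeholder inputs; substitute your own query and candidates.
query = "How do I reset my password?"
documents = ["Go to Settings > Security to reset your password.",
             "Our office is closed on public holidays."]

scores = model.predict([(query, doc) for doc in documents])
# Sort by score only, highest first.
ranked = sorted(zip(scores, documents), key=lambda pair: pair[0], reverse=True)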
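Sorting (score, document) pairs loses each document's original position, which you often need to map results back to IDs or metadata. The sketch below keeps the index and returns only the best few candidates. `top_n` is a hypothetical helper, not part of sentence-transformers, and it works on any sequence of numeric scores.

```python
from typing import Sequence


def top_n(scores: Sequence[float], documents: Sequence[str],
          n: int = 5) -> list[tuple[int, str, float]]:
    """Return (original_index, document, score) for the n highest scores."""
    if len(scores) != len(documents):
        raise ValueError("scores and documents must have the same length")
    # sorted() is stable, so tied scores keep their original relative order.
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return [(i, documents[i], float(scores[i])) for i in order[:n]]


# Made-up scores for illustration.
print(top_n([0.1, 0.9, 0.5], ["a", "b", "c"], n=2))
```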

Browser / edge use

The v1-tiny model (33 MB quantised) loads via transformers.js in under 10 seconds on a typical broadband connection and runs scoring at ~200 ms for a 10-candidate batch. This is what powers our demo:

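// transformers.js (ES module in the browser)
import { AutoTokenizer, AutoModelForSequenceClassification }
  from "https://cdn.jsdelivr.net/npm/@huggingface/transformers@3";

const modelId = "jinaai/jina-reranker-v1-tiny-en";
// dtype selects the quantised weights; it applies to the model, not the tokenizer.
const tokenizer = await AutoTokenizer.from_pretrained(modelId);
const model = await AutoModelForSequenceClassification.from_pretrained(
  modelId, { dtype: "q8" }
);

// Placeholder inputs; substitute your own query and candidates.
const query = "How do I reset my password?";
const docs = [
  "Go to Settings > Security to reset your password.",
  "Our office is closed on public holidays.",
];

// Pair the query with every candidate: one (query, doc) row per document.
const inputs = tokenizer(new Array(docs.length).fill(query), {
  text_pair: docs,
  padding: true, truncation: true,
});
const { logits } = await model(inputs);
// logits has shape [docs.length, 1]; flatten to one score per document.
const scores = logits.sigmoid().tolist().map(([score]) => score);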
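The final step applies a sigmoid to the raw logits so that each score lands between 0 and 1. For readers more at home in Python, the same transformation can be sketched as follows. This assumes the model emits a single logit per (query, document) pair, as the snippet above implies. The logit values are made up for illustration.

```python
import math


def sigmoid(logit: float) -> float:
    """Map a raw logit to a score in (0, 1), avoiding overflow for large |logit|."""
    if logit >= 0:
        return 1.0 / (1.0 + math.exp(-logit))
    e = math.exp(logit)
    return e / (1.0 + e)


def scores_from_logits(logits: list[list[float]]) -> list[float]:
    """Flatten a [batch, 1] logits array into one score per document."""
    return [sigmoid(row[0]) for row in logits]


example_logits = [[2.0], [-1.0], [0.0]]  # made-up values, not model output
print(scores_from_logits(example_logits))
```

The sigmoid is monotonic, so it never changes the ranking. It only makes the scores easier to threshold and compare.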

Pros and cons

Pros

Dual-mode: same model as API or self-hosted weights
Tiny variant runs in the browser, which is rare among rerankers
Generous free tier (1M tokens/month, no card needed)
Strong multilingual quality on v2
Token-based pricing suits long documents

Cons

v1-tiny is English-only and lower quality
Smaller company than Cohere — less ecosystem tooling
Self-hosting v2 requires trust_remote_code=True, which runs model code downloaded from the Hub
Token pricing can be opaque for short passages

jina-reranker-v1-tiny-en powers this demo

See it score your own passages live in the browser — no API key, no data leaving the page.
