Massive Text Embedding Benchmark (MTEB): 56+ tasks across 8 categories (retrieval, semantic textual similarity, classification, clustering, reranking, pair classification, summarization, bitext mining), covering 112 languages. It is the de facto standard benchmark for evaluating text embedding models.
MTEB Score is the reported evaluation metric for MTEB. Codesota tracks published model scores on this metric so readers can compare state-of-the-art results across sources and model families.
Higher is better
Muted rows were not state of the art when published — an earlier or same-year result already scored better.
Avg Score is also reported for MTEB and is tracked the same way: higher is better, and muted rows follow the same rule.
| Rank | Model | Trust | Score | Year | Reference |
|---|---|---|---|---|---|
| 01 | NV-Embed-v2 | verified | 72.31 | 2024 | Paper |
| 02 | GTE-Qwen2-7B-instruct | verified | 72.05 | 2024 | Source |
| 03 | voyage-3-large | verified | 70.32 | 2025 | Source |
| 04 | E5-Mistral-7B-instruct | verified | 66.63 | 2024 | Source |
| 05 | jina-embeddings-v3 | verified | 65.18 | 2024 | Paper |
| 06 | text-embedding-3-large | verified | 64.60 | 2024 | Source |
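
The muted-row rule stated above the table (a row is muted when an earlier or same-year result already scored better) can be sketched in a few lines of Python. This is an illustrative reading of that rule, not Codesota's actual implementation: the `Entry` class and the `is_muted` and `muted_models` helpers are hypothetical names, the comparison uses only the publication year because that is all the table provides, ties are assumed not to mute a row, and only the six rows listed here are considered.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Entry:
    """One leaderboard row (hypothetical structure, for illustration only)."""
    model: str
    score: float
    year: int


# Rows copied from the table above.
ENTRIES = [
    Entry("NV-Embed-v2", 72.31, 2024),
    Entry("GTE-Qwen2-7B-instruct", 72.05, 2024),
    Entry("voyage-3-large", 70.32, 2025),
    Entry("E5-Mistral-7B-instruct", 66.63, 2024),
    Entry("jina-embeddings-v3", 65.18, 2024),
    Entry("text-embedding-3-large", 64.60, 2024),
]


def is_muted(entry: Entry, entries: list[Entry]) -> bool:
    """Return True if a result from the same or an earlier year scored strictly higher.

    Assumption: year is the only date information available, and a tie
    on score does not mute a row.
    """
    return any(
        other.year <= entry.year and other.score > entry.score
        for other in entries
    )


def muted_models(entries: list[Entry]) -> list[str]:
    """Names of all rows that would be muted, in their original order."""
    return [e.model for e in entries if is_muted(e, entries)]


if __name__ == "__main__":
    for name in muted_models(ENTRIES):
        print(name)
```

Under these assumptions, every row except NV-Embed-v2 would be muted: the 2024 entries are beaten by a same-year result, and voyage-3-large (2025) is beaten by earlier 2024 results. The real page may draw on more results than the six shown here, so its muting could differ.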