00 - Named Entity Recognition

Named entity recognition task router

NER extracts entity spans and types from text. It is useful only when spans, offsets, and stable schemas matter. Use fine-tuned token classifiers for scale, GLiNER-style models for custom labels, and LLMs when the schema changes often.

Benchmark: CoNLL-2003 - OntoNotes - MultiNERD
Current pick: Fine-tuned DeBERTa v3
person - org - place
01 - Explainer

What this task measures.

Named entity recognition finds text spans and assigns schema labels such as PERSON, ORG, location, product, statute, disease, or ticker. The hard part is not only spotting names; it is returning stable offsets, resolving boundary ambiguity, and matching the entity taxonomy your downstream system expects.
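To make "stable offsets" concrete, here is a minimal sketch of a span record and a check that its offsets reproduce the surface text. The `Entity` class, the `check_offsets` helper, and the sample sentence are illustrative assumptions, not part of any particular library.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Entity:
    start: int  # character offset, inclusive
    end: int    # character offset, exclusive
    label: str  # schema label, e.g. "PERSON" or "ORG"
    text: str   # surface string the extractor claims to have found


def check_offsets(source: str, entity: Entity) -> bool:
    """Return True if the offsets are in range and slice out exactly entity.text."""
    if not (0 <= entity.start < entity.end <= len(source)):
        return False
    return source[entity.start:entity.end] == entity.text


# Hypothetical example sentence and spans.
source = "Ada Lovelace joined Acme Corp in London."
good = Entity(0, 12, "PERSON", "Ada Lovelace")
bad = Entity(20, 24, "ORG", "Acme Corp")  # boundary cut short: slices "Acme"

print(check_offsets(source, good), check_offsets(source, bad))
```

A check like this is cheap enough to run on every prediction, which is one way to catch boundary drift before it reaches a downstream system.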

02 - Benchmarks

Use a benchmark ladder.

One leaderboard rarely captures the task. Use the canonical benchmark for lineage, then add harder or more domain-specific checks before choosing a model.

| Benchmark | Role | Metric | Caveat |
| --- | --- | --- | --- |
| CoNLL-2003 | Classic English NER | Entity-level F1 | Narrow four-label newswire schema; useful for lineage, weak for modern entity extraction. |
| OntoNotes 5.0 | Broader English schema | Span F1 | More entity types, but still not enough for legal, medical, finance, or product catalogs. |
| MultiNERD | Multilingual NER | Macro / micro F1 | Better language coverage; still requires local checks for aliases and domain terms. |
| Local gold set | Production gate | Span F1 + schema error rate | The only reliable way to measure custom labels and costly false positives. |
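The span-level F1 listed in the metric column can be sketched as follows. This is a simplified exact-match scorer over `(start, end, label)` tuples, written for illustration; it is not the official scoring script of any benchmark above, and the `span_f1` name and sample spans are assumptions.

```python
def span_f1(gold: set, pred: set) -> dict:
    """Exact-match span F1: a prediction counts only if start, end, and label all match."""
    tp = len(gold & pred)
    precision = tp / len(pred) if pred else 0.0
    recall = tp / len(gold) if gold else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"precision": precision, "recall": recall, "f1": f1}


# Hypothetical gold and predicted spans for one sentence.
gold = {(0, 12, "PERSON"), (20, 29, "ORG"), (33, 39, "LOC")}
pred = {(0, 12, "PERSON"), (20, 24, "ORG"), (33, 39, "LOC")}  # ORG boundary is too short

scores = span_f1(gold, pred)
print(scores)
```

Under exact matching, the truncated ORG span counts as both a false positive and a false negative, which is why boundary errors are costly in this metric.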
03 - Evaluation

What to compare.

The public benchmark is a shortlist signal. Production choice still depends on latency, cost, domain drift, and how expensive mistakes are.

| Axis | Value | Why it matters |
| --- | --- | --- |
| Canonical benchmark | CoNLL-2003 | Classic English PERSON, ORG, LOC, MISC benchmark; useful but narrow. |
| Broader schema | OntoNotes / MultiNERD | More entity types and multilingual coverage for modern extraction. |
| Production metric | Span F1 + schema error rate | Wrong boundaries and wrong labels break downstream knowledge graphs. |
| Failure mode | Domain entities missed | Company tickers, statutes, products, drugs, and aliases need local examples. |
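The page does not define "schema error rate". One plausible reading, used here purely as an assumption, is the share of predictions whose label is either outside the allowed schema or wrong on a correctly bounded span. The sketch below separates that from boundary errors; the `ALLOWED` set, the `classify_errors` helper, and the category names are all hypothetical.

```python
from collections import Counter

ALLOWED = {"PERSON", "ORG", "LOC", "MISC"}  # hypothetical schema for illustration


def overlaps(a_start, a_end, b_start, b_end):
    """True if two half-open character ranges share at least one character."""
    return a_start < b_end and b_start < a_end


def classify_errors(gold, pred, allowed=ALLOWED):
    """Bucket each predicted (start, end, label) span by error type."""
    gold_by_bounds = {(s, e): lab for s, e, lab in gold}
    counts = Counter()
    for s, e, lab in pred:
        if lab not in allowed:
            counts["out_of_schema"] += 1
        elif (s, e) in gold_by_bounds:
            counts["correct" if gold_by_bounds[(s, e)] == lab else "wrong_label"] += 1
        elif any(overlaps(s, e, gs, ge) for gs, ge, _ in gold):
            counts["wrong_boundary"] += 1
        else:
            counts["spurious"] += 1
    # Gold spans that no prediction touches at all.
    counts["missed"] = sum(
        1 for gs, ge, _ in gold
        if not any(overlaps(s, e, gs, ge) for s, e, _ in pred)
    )
    return counts


gold = [(0, 12, "PERSON"), (20, 29, "ORG"), (33, 39, "LOC")]
pred = [(0, 12, "ORG"), (20, 24, "ORG"), (33, 39, "CITY"), (13, 19, "MISC")]

counts = classify_errors(gold, pred)
# Assumed definition: label-level mistakes as a fraction of all predictions.
schema_error_rate = (counts["out_of_schema"] + counts["wrong_label"]) / len(pred)
print(dict(counts), schema_error_rate)
```

Reporting the buckets alongside span F1 shows whether a model is failing on boundaries or on the taxonomy, two problems that usually call for different fixes.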
04 - Routing

Pick by task shape.

Known schema at scale

Fine-tuned encoder

Fast, cheap, stable offsets, and strong F1 when labels are fixed.

Custom labels quickly

GLiNER / span model

Extract new entity types without a full annotation project.

Messy document extraction

LLM structured output

Useful when labels are semantic and context-heavy, but validate spans.

Regulated pipeline

Audited token classifier

Offsets, deterministic behavior, and testable schemas matter.
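For the LLM route, "validate spans" can be sketched as a grounding step: accept an entity only if its text appears verbatim in the source and its label is in the schema, then compute the offsets yourself rather than trusting model-reported ones. The JSON shape, the `ground_entities` helper, and the sample output are assumptions for illustration, not the format of any specific LLM API.

```python
import json


def ground_entities(source: str, llm_json: str, allowed: set) -> list:
    """Map LLM-proposed entities to character offsets; drop anything unverifiable."""
    grounded = []
    cursor = 0  # search forward first so repeated mentions map to later occurrences
    for item in json.loads(llm_json):
        text, label = item.get("text", ""), item.get("label")
        if not text or label not in allowed:
            continue  # empty text or label outside the schema
        start = source.find(text, cursor)
        if start == -1:
            start = source.find(text)  # fall back to the first occurrence anywhere
        if start == -1:
            continue  # surface form not in the source: paraphrased or hallucinated
        end = start + len(text)
        grounded.append({"start": start, "end": end, "label": label, "text": text})
        cursor = end
    return grounded


source = "Ada Lovelace joined Acme Corp in London."
# Hypothetical model output: one paraphrased name and one out-of-schema label.
llm_json = json.dumps([
    {"text": "Ada Lovelace", "label": "PERSON"},
    {"text": "ACME Corporation", "label": "ORG"},
    {"text": "London", "label": "CITY"},
    {"text": "Acme Corp", "label": "ORG"},
])

entities = ground_entities(source, llm_json, allowed={"PERSON", "ORG", "LOC"})
print(entities)
```

The forward-moving cursor assumes the model lists entities in document order; when that does not hold, the fallback search still finds a match but may attach a repeated mention to the wrong occurrence.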

05 - Related

Need implementation details?

Open the lower-level explainer for architecture, code examples, and implementation options.

Open NER explainer ->