Scene Text Detection
Detecting text regions in natural scene images
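Scene text detection localizes text instances (words or lines) in photographs of natural scenes, typically as boxes, quadrilaterals, or polygons. Below are the standard benchmarks used to evaluate models, each with its top tracked result. The reported metric varies by benchmark (precision, f-measure, accuracy, or fps), so entries are not directly comparable across datasets.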
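Precision, recall, and f-measure for detection are usually computed by matching predicted regions to ground-truth regions and counting the matches. The sketch below illustrates that idea under simplifying assumptions: axis-aligned boxes instead of quadrilaterals or polygons, an IoU threshold of 0.5, and greedy one-to-one matching. The function names `iou` and `detection_scores` are illustrative only. Official benchmark protocols define their own matching rules and may give different numbers.

```python
from typing import List, Tuple

# Assumed box format for this sketch: (x_min, y_min, x_max, y_max).
Box = Tuple[float, float, float, float]


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ix = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    iy = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = ix * iy
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def detection_scores(preds: List[Box], gts: List[Box], thresh: float = 0.5):
    """Greedy one-to-one matching; returns (precision, recall, f_measure)."""
    matched_gt = set()
    true_pos = 0
    for p in preds:
        # Find the best still-unmatched ground-truth box for this prediction.
        best_j, best_iou = -1, 0.0
        for j, g in enumerate(gts):
            if j in matched_gt:
                continue
            v = iou(p, g)
            if v > best_iou:
                best_j, best_iou = j, v
        # Count a true positive only if the overlap clears the threshold.
        if best_j >= 0 and best_iou >= thresh:
            matched_gt.add(best_j)
            true_pos += 1
    precision = true_pos / len(preds) if preds else 0.0
    recall = true_pos / len(gts) if gts else 0.0
    denom = precision + recall
    f_measure = 2 * precision * recall / denom if denom > 0 else 0.0
    return precision, recall, f_measure


if __name__ == "__main__":
    gts = [(0, 0, 10, 10), (20, 20, 30, 30)]
    preds = [(1, 1, 10, 10), (50, 50, 60, 60), (21, 20, 30, 30)]
    print(detection_scores(preds, gts))
```

A high precision alone says nothing about missed text, which is why entries that report only precision are hard to compare with entries that report f-measure.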
Benchmarks & SOTA
ICDAR 2015
ICDAR 2015 Incidental Scene Text
1,000 training and 500 test images captured with wearable cameras. One of the most widely used benchmarks for scene text detection.
State of the Art
TextFuseNet (ResNeXt-101)
93.96
precision
Total-Text
Total-Text
Curved text benchmark. 1,555 images with polygon annotations.
State of the Art
FAST-T-448
152.8
fps
MSRA-TD500
Dataset from Papers With Code
State of the Art
FAST-T-512
137.2
fps
ICDAR 2013
Dataset from Papers With Code
State of the Art
CRAFT
97.4
precision
ICDAR 2017 MLT
Dataset from Papers With Code
State of the Art
PMTD*
84.42
precision
COCO-Text
Dataset from Papers With Code
State of the Art
CLIP4STR-L
81.9
1:1 accuracy
IC19-ArT
Dataset from Papers With Code
State of the Art
CLIP4STR-L (DataComp-1B)
86.4
accuracy
IC19-ReCTS
Dataset from Papers With Code
State of the Art
BDN
93.36
f-measure
CTW1500
Curved Text in the Wild 1500
1,500 images with curved text annotations. Focus on arbitrary-shaped text.
No results tracked yet
ICDAR 2019 ArT
ICDAR 2019 Arbitrary-Shaped Text
Text in arbitrary shapes including curved and rotated text. 10,166 images total.
No results tracked yet
Related Tasks
General OCR Capabilities
Comprehensive benchmarks covering multiple aspects of OCR performance.
Polish OCR
OCR for Polish language including historical documents, gothic fonts, and diacritic recognition.
Image Classification
Categorizing images into predefined classes (ImageNet, CIFAR).
Object Detection
Locating and classifying objects in images (COCO, Pascal VOC).