Image Classification
Image classification is the task that launched modern deep learning: AlexNet's 2012 ImageNet win cut top-5 error to about 15%, against roughly 26% for the runner-up, and triggered the neural network renaissance. The progression from VGGNet to ResNet to Vision Transformers traces the intellectual history of the field itself. Today's frontier models such as EVA-02 and SigLIP reach roughly 90% top-1 accuracy on ImageNet, with the best tracked entry at 91.0%, but the real action has shifted to efficiency (MobileNet, EfficientNet) and robustness under distribution shift. ImageNet classification is still the default benchmark for new architectures and the foundation that most other vision tasks build on.
ImageNet-1K
1.28M training images, 50K validation images across 1,000 object classes. The standard benchmark for image classification since 2012.
Top 10
Leading models on ImageNet-1K.
| Rank | Model | Top-1 accuracy (%) | Year | Source |
|---|---|---|---|---|
| 1 | coca-finetuned | 91.0 | 2025 | paper |
| 2 | vit-g-14 | 90.5 | 2025 | paper |
| 3 | EVA-02-L | 90.1 | 2026 | paper |
| 4 | EVA-Giant | 89.8 | 2026 | paper |
| 5 | InternImage-H | 89.6 | 2026 | paper |
| 6 | SigLIP-SO400M | 89.4 | 2026 | paper |
| 7 | convnext-v2-huge | 88.9 | 2025 | paper |
| 8 | ViT-H/14 CLIP (LAION-2B) | 88.6 | 2026 | paper |
| 9 | ConvNeXt-XXLarge (CLIP LAION) | 88.6 | 2026 | paper |
| 10 | vit-h-14 | 88.5 | 2025 | paper |
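The metric in the table, top-1 accuracy, is the percentage of validation images for which the model's highest-scoring class matches the ground-truth label. As a minimal sketch of how such a number could be computed, assuming per-class scores are already available as plain Python lists (the function name `top_k_accuracy` and the toy scores are hypothetical, not taken from any leaderboard tooling):

```python
from typing import Sequence


def top_k_accuracy(
    scores: Sequence[Sequence[float]], labels: Sequence[int], k: int = 1
) -> float:
    """Percentage of samples whose true label is among the k highest-scoring classes."""
    if len(scores) != len(labels):
        raise ValueError("scores and labels must have the same length")
    if not labels:
        raise ValueError("need at least one sample")
    hits = 0
    for row, label in zip(scores, labels):
        # Indices of the k classes with the highest scores for this image.
        top_k = sorted(range(len(row)), key=lambda c: row[c], reverse=True)[:k]
        hits += label in top_k
    return 100.0 * hits / len(labels)


# Toy data: 4 images, 3 classes (made-up scores, not real model output).
scores = [
    [0.10, 0.70, 0.20],  # highest score: class 1
    [0.80, 0.10, 0.10],  # highest score: class 0
    [0.35, 0.25, 0.40],  # highest score: class 2
    [0.20, 0.50, 0.30],  # highest score: class 1
]
labels = [1, 0, 0, 2]

print(top_k_accuracy(scores, labels, k=1))  # 50.0
print(top_k_accuracy(scores, labels, k=2))  # 100.0
```

On ImageNet-1K the same calculation would run over the 50K validation images and 1,000 classes. Evaluation details such as image resolution and cropping are not specified here and can shift reported numbers slightly.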
All datasets
4 datasets tracked for this task.