Model card
InternImage-H.
Shanghai AI Lab · open-source · Deformable Convolution · 1 current SOTA
Large-scale vision foundation model built on deformable convolutions. The Huge variant (about 1B parameters) reports 65.4 box mAP on COCO test-dev with a DINO detection head and 62.9 mIoU on ADE20K. CVPR 2023. arXiv:2211.05778.
§ 01 · Benchmarks
Every benchmark InternImage-H has a recorded score for.
| # | Benchmark | Area · Task | Metric | Value | Rank | Date | Source |
|---|---|---|---|---|---|---|---|
| 01 | ADE20K | Computer Vision · Semantic Segmentation | mIoU | 62.9% | #1 | — | source ↗ |
| 02 | LVIS v1.0 | Computer Vision · Object Detection | mask-ap | 65.4% | #3 | 2022-11-10 | source ↗ |
| 03 | COCO | Computer Vision · Object Detection | mAP | 65.4% | #3 | — | source ↗ |
The Rank column shows this model's position among all models scored on the same benchmark and metric; #1 marks the current SOTA. Rows are sorted by rank, then by newest result.
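The ordering rule can be illustrated with a minimal Python sketch. It is an assumption about how such a sort might be written, not the site's actual code: the `Row` type, the `sort_key` helper, and the choice to place undated rows after dated ones within the same rank are all hypothetical, although that last choice is consistent with the table above.

```python
from datetime import date
from typing import NamedTuple, Optional


class Row(NamedTuple):
    """Hypothetical record for one benchmark row; field names are illustrative."""
    benchmark: str
    metric: str
    value: float
    rank: int
    recorded: Optional[date]  # None where the table shows a dash


# Values copied from the table above, deliberately out of order.
rows = [
    Row("COCO", "mAP", 65.4, 3, None),
    Row("LVIS v1.0", "mask-ap", 65.4, 3, date(2022, 11, 10)),
    Row("ADE20K", "mIoU", 62.9, 1, None),
]


def sort_key(row: Row):
    # Lower rank first; within a rank, newer dates first.
    # Assumption: rows without a date sort after dated rows.
    undated = row.recorded is None
    newest_first = -row.recorded.toordinal() if row.recorded else 0
    return (row.rank, undated, newest_first)


ordered = sorted(rows, key=sort_key)
for position, row in enumerate(ordered, start=1):
    print(f"{position:02d} {row.benchmark} #{row.rank}")
```

Using a tuple key keeps the two sort criteria in one pass and makes the tie-break for missing dates explicit rather than incidental.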
§ 02 · Strengths by area
Where InternImage-H actually performs.
§ 03 · Papers
1 paper with results for InternImage-H.
- 2022-11-10 · Computer Vision · 1 result
InternImage: Exploring Large-Scale Vision Foundation Models with Deformable Convolutions
§ 04 · Related models
Other Shanghai AI Lab models scored on Codesota.
§ 05 · Sources & freshness
Where these numbers come from.
- arxiv-paper: 2 results
- arxiv: 1 result
1 of 3 rows marked verified.