Model card · Mask2Former (Swin-L)

Mask2Former (Swin-L)

Meta AI / UIUC · open-source · Transformer · 3 results · 2 benchmarks

Universal image segmentation model covering panoptic, instance, and semantic segmentation. Achieves 56.1 mask AP on LVIS v1.0 minival. Its key component is masked attention in the Transformer decoder, which restricts each query's cross-attention to its predicted foreground region instead of the full feature map. CVPR 2022. arXiv:2112.01527.
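The masked-attention idea can be sketched as follows. This is an illustrative single-head NumPy version under stated assumptions, not the paper's implementation (which uses multi-head attention over multi-scale features); the names `masked_attention` and `fg_mask` are chosen here for clarity:

```python
import numpy as np

def masked_attention(Q, K, V, fg_mask):
    """Cross-attention restricted to each query's predicted foreground.

    Q:       (num_queries, d)          object-query features
    K, V:    (num_pixels, d)           flattened image features
    fg_mask: (num_queries, num_pixels) bool mask predicted by the
             previous decoder layer (True = foreground)
    """
    d = Q.shape[-1]
    logits = Q @ K.T / np.sqrt(d)            # (num_queries, num_pixels)
    # Large negative bias outside the predicted mask acts as -inf
    # under softmax, so attention stays inside the foreground region.
    masked = np.where(fg_mask, logits, -1e9)
    # Guard (an assumption here): a query whose predicted mask is empty
    # falls back to full attention so the softmax never degenerates.
    empty = ~fg_mask.any(axis=1, keepdims=True)
    masked = np.where(empty, logits, masked)
    w = np.exp(masked - masked.max(axis=1, keepdims=True))
    w /= w.sum(axis=1, keepdims=True)
    return w @ V                             # (num_queries, d)
```

In the paper this replaces standard cross-attention in every decoder layer, with the mask prediction refined layer by layer.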

§ 01 · Benchmarks

Every benchmark Mask2Former (Swin-L) has a recorded score for.

| # | Benchmark | Area · Task | Metric | Value | Rank | Date | Source |
|---|-----------|-------------|--------|-------|------|------|--------|
| 01 | LVIS v1.0 | Computer Vision · Object Detection | mask-ap-rare | 53.5% | #3/3 | 2021-12-02 | source ↗ |
| 02 | ADE20K | Computer Vision · Semantic Segmentation | mIoU | 57.3% | #4/6 | — | source ↗ |
| 03 | LVIS v1.0 | Computer Vision · Object Detection | mask-ap | 56.1% | #6/9 | 2021-12-02 | source ↗ |

The Rank column shows this model's position among all models scored on the same benchmark + metric (total competitors after the slash); #1 in red marks the current SOTA. Rows are sorted by rank, then by newest result.
§ 02 · Strengths by area

Where Mask2Former (Swin-L) actually performs.

Computer Vision — 2 benchmarks · avg rank #4.3
§ 03 · Papers

1 paper with results for Mask2Former (Swin-L).

  1. 2021-12-02 · Computer Vision · 2 results
     Masked-attention Mask Transformer for Universal Image Segmentation

§ 04 · Sources & freshness

Where these numbers come from.

- arxiv: 2 results
- arxiv-paper: 1 result
2 of 3 rows marked verified.