Model card

APE-Large

Tsinghua / MEGVII · open-source · Unknown params · Aligned vision encoder + region-text alignment with EVA-02 ViT-L backbone

Aligning and Prompting Everything All at Once for Universal Visual Perception. Achieves 66.4 mask AP on LVIS v1.0 minival with open-vocabulary capabilities. CVPR 2024. arXiv:2312.02153.

§ 01 · Benchmarks

Every benchmark APE-Large has a recorded score for.

| # | Benchmark | Area · Task | Metric | Value | Rank | Date | Source |
|---|-----------|-------------|--------|-------|------|------|--------|
| 01 | LVIS v1.0 | Computer Vision · Object Detection | mask-ap-rare | 65.4% | #1/3 | 2023-12-04 | source ↗ |
| 02 | LVIS v1.0 | Computer Vision · Object Detection | box-ap | 70.3% | #2/4 | 2023-12-04 | source ↗ |
| 03 | LVIS v1.0 | Computer Vision · Object Detection | mask-ap | 66.4% | #2/9 | 2023-12-04 | source ↗ |
The Rank column shows this model's position among all models scored on the same benchmark and metric (total model count after the slash). #1 in red marks the current SOTA. Sorted by rank, then by newest result.

§ 02 · Strengths by area

Where APE-Large performs best.

Computer Vision · 1 benchmark · avg rank #1.7
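
The average rank shown above is simply the mean of the three LVIS ranks in the benchmarks table (1, 2, 2), rounded to one decimal. A minimal sketch of that arithmetic (illustrative only, not the site's actual code):

```python
# Ranks APE-Large achieved on LVIS v1.0 (from the benchmarks table):
# mask-ap-rare -> #1, box-ap -> #2, mask-ap -> #2
ranks = [1, 2, 2]

# Average rank, rounded to one decimal place as displayed on the card.
avg_rank = round(sum(ranks) / len(ranks), 1)
print(f"avg rank #{avg_rank}")  # → avg rank #1.7
```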
§ 03 · Papers

1 paper with results for APE-Large.

  1. 2023-12-04 · Computer Vision · 3 results

    APE: Aligning and Prompting Everything All at Once for Universal Visual Perception

§ 05 · Sources & freshness

Where these numbers come from.

arXiv · 3 results · 3 of 3 rows marked verified.