Codesota · Models · RoDLA · Chen, Zhang et al. · 1 result · 1 benchmark
Model card

RoDLA.

Chen, Zhang et al. · open-source · Unknown params · DINO-based detector with InternImage backbone + channel attention blocks

RoDLA: Benchmarking the Robustness of Document Layout Analysis Models. A DINO-based architecture with an InternImage backbone (ImageNet-22K pretrained) that uses channel attention and average pooling in the encoder to produce perturbation-resistant features. 96.0 mAP on clean PubLayNet-val. CVPR 2024. arXiv 2403.14442.
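To make "channel attention and average pooling" concrete, here is a minimal pure-Python sketch of a generic squeeze-and-excitation-style channel attention step. It is an illustration under stated assumptions, not RoDLA's actual block: the function name `channel_attention`, the single linear layer, and the toy weights are all hypothetical, and the real encoder may place and parameterize these operations differently.

```python
import math


def channel_attention(feature_map, weights, biases):
    """Rescale each channel by a gate computed from its global average.

    feature_map: list of C channels, each a list of H rows of W floats.
    weights: C x C matrix (list of lists) mixing the pooled channel averages.
    biases: list of C floats.
    All names and shapes here are illustrative assumptions.
    """
    # Squeeze: global average pooling reduces each channel to one number.
    pooled = [
        sum(sum(row) for row in channel) / (len(channel) * len(channel[0]))
        for channel in feature_map
    ]
    # Excite: a linear layer plus a sigmoid gives one gate in (0, 1) per channel.
    gates = []
    for w_row, b in zip(weights, biases):
        z = sum(w * p for w, p in zip(w_row, pooled)) + b
        gates.append(1.0 / (1.0 + math.exp(-z)))
    # Scale: multiply every value in a channel by that channel's gate.
    scaled = [
        [[gate * v for v in row] for row in channel]
        for channel, gate in zip(feature_map, gates)
    ]
    return scaled, gates


# Toy input: 2 channels of size 2 x 2, with identity weights and zero biases.
features = [
    [[1.0, 1.0], [1.0, 1.0]],  # channel 0, mean 1.0
    [[2.0, 4.0], [6.0, 8.0]],  # channel 1, mean 5.0
]
identity = [[1.0, 0.0], [0.0, 1.0]]
scaled, gates = channel_attention(features, identity, [0.0, 0.0])
```

With identity weights, each gate is simply the sigmoid of that channel's mean, so the channel with the larger average response is attenuated less. Because the gate depends on a pooled statistic rather than on any single pixel, local corruption of the input shifts it only slightly, which is the intuition usually given for this kind of block.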

§ 01 · Benchmarks

Every benchmark RoDLA has a recorded score for.

| # | Benchmark | Area · Task | Metric | Value | Rank | Date | Source |
|---|---|---|---|---|---|---|---|
| 01 | publaynet-val | Computer Vision · Document Layout Analysis | Overall | 1.0% | #2/2 | | source ↗ |
The Rank column shows this model's position among all models scored on the same benchmark and metric (competitors after the slash). #1 in red marks the current SOTA. Rows are sorted by rank, then by newest result.
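The rank label described above can be sketched in a few lines of Python. This is a rough illustration, not the site's actual implementation: the `Result` record, the `rank_label` helper, and the sample data are hypothetical, and it assumes that the number after the slash is the count of models scored on that benchmark and metric, that higher values are better by default, and that ties share the better rank.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Result:
    """One scored row (hypothetical schema, for illustration only)."""
    model: str
    benchmark: str
    metric: str
    value: float


def rank_label(results, model, benchmark, metric, higher_is_better=True):
    """Return a label such as '#2/2' for one model on one benchmark + metric."""
    # Only rows with the same benchmark and metric compete with each other.
    pool = [r for r in results if r.benchmark == benchmark and r.metric == metric]
    target = next(r for r in pool if r.model == model)
    # Rank is one plus the number of strictly better results in the pool.
    if higher_is_better:
        better = sum(1 for r in pool if r.value > target.value)
    else:
        better = sum(1 for r in pool if r.value < target.value)
    return f"#{better + 1}/{len(pool)}"


# Made-up data: two models on one benchmark, one of them also on another.
results = [
    Result("model-a", "demo-bench", "Overall", 0.90),
    Result("model-b", "demo-bench", "Overall", 0.80),
    Result("model-b", "other-bench", "Overall", 0.99),
]
label = rank_label(results, "model-b", "demo-bench", "Overall")
print(label)  # #2/2
```

Filtering on both benchmark and metric before counting keeps a score on one benchmark from affecting the rank on another.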
§ 02 · Strengths by area

Where RoDLA actually performs.

Computer Vision · 1 benchmark · avg rank #2.0

§ 05 · Sources & freshness

Where these numbers come from.

cvpr-2024 · 1 result
0 of 1 rows marked verified.