SoViT-400m/14.

Google DeepMind · open-source · 400M params · Vision Transformer (Shape-Optimized)

Shape-optimized ViT derived from compute-optimal scaling laws. 400M parameters, patch size 14. Pre-trained on WebLI (contrastive + reconstruction head), fine-tuned on ImageNet-1K at 224px. Published at NeurIPS 2023. Paper: arXiv:2305.13035.
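The patch size of 14 and the 224px resolution together determine how many patch tokens the encoder sees per image. A minimal sketch of that arithmetic, assuming square inputs and non-overlapping patches, and not counting any extra tokens the model may add (the helper name `patch_grid` is hypothetical):

```python
def patch_grid(image_size: int, patch_size: int) -> tuple[int, int]:
    """Return (patches per side, total patches) for a square image.

    Assumes non-overlapping square patches that tile the image exactly.
    """
    if image_size % patch_size != 0:
        raise ValueError("image_size must be divisible by patch_size")
    side = image_size // patch_size
    return side, side * side


# 224px input with patch size 14, as stated in the description above.
side, num_patches = patch_grid(224, 14)
print(side, num_patches)  # 16 256
```

Under these assumptions a 224px image becomes a 16 × 16 grid, i.e. 256 patch tokens.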

§ 01 · Benchmarks

Every benchmark SoViT-400m/14 has a recorded score for.

#	Benchmark	Area · Task	Metric	Value	Rank	Date	Source
01	ImageNet-1K	Computer Vision · Image Classification	top-1-accuracy	90.3%	#3/20	—	source ↗

The Rank column shows this model's position among all models scored on the same benchmark and metric, with the number of competitors after the slash. A #1 in red marks the current SOTA. Rows are sorted by rank, then by newest result.
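The ordering rule in the note above (rank first, then newest result) can be sketched as a two-level sort key. This is only an illustration: the `Row` type, the benchmark names, and the dates are hypothetical placeholders, and placing rows with a missing date (shown as "—" in the table) last within their rank is an assumption, not something the page states.

```python
from datetime import date
from typing import NamedTuple, Optional


class Row(NamedTuple):
    benchmark: str                # hypothetical placeholder name
    rank: int                     # position on the benchmark (1 = best)
    result_date: Optional[date]   # None when the table shows "—"


def sort_rows(rows: list[Row]) -> list[Row]:
    """Sort by rank ascending, then newest result first.

    Assumption: rows without a date go last within the same rank.
    """
    def key(row: Row) -> tuple[int, bool, int]:
        missing = row.result_date is None
        # Negate the ordinal so that later dates sort earlier.
        ordinal = 0 if missing else -row.result_date.toordinal()
        return (row.rank, missing, ordinal)

    return sorted(rows, key=key)


# Hypothetical rows, for illustration only.
rows = [
    Row("bench-a", 3, None),
    Row("bench-b", 1, date(2023, 1, 1)),
    Row("bench-c", 3, date(2024, 6, 1)),
    Row("bench-d", 3, date(2023, 6, 1)),
]
ordered = [row.benchmark for row in sort_rows(rows)]
print(ordered)  # ['bench-b', 'bench-c', 'bench-d', 'bench-a']
```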

§ 02 · Strengths by area