
Robotics

End-to-end robotics — learning perception, planning, and control in a single model — entered a new era with vision-language-action (VLA) models. Google's RT-2 (2023) showed that a web-pretrained VLM could directly output robot actions, and the open-source Open X-Embodiment dataset (2023) unified data from 22 robot types across 21 institutions. The key tension is generalization: lab demos on specific robots are plentiful, but a single policy that transfers across embodiments, tasks, and environments remains the holy grail, with π₀ (Physical Intelligence, 2024) and Google's RT-X pushing this frontier.

Datasets tracked: 2
Results tracked: 3
Canonical metric: success-rate

Canonical Benchmark

RLBench

Large-scale robot learning benchmark with 100 manipulation tasks

Primary metric: success-rate
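Success rate here is simply the fraction of evaluation episodes in which the policy completes the task, usually reported as a percentage. A minimal sketch of the computation (the `success_rate` helper and the episode outcomes are illustrative, not part of RLBench's API):

```python
def success_rate(outcomes):
    """Return the percentage of evaluation episodes that succeeded.

    outcomes: iterable of booleans, one per episode
    (True = task completed within the episode's step budget).
    """
    outcomes = list(outcomes)
    if not outcomes:
        raise ValueError("need at least one evaluation episode")
    return 100.0 * sum(outcomes) / len(outcomes)

# e.g. 25 evaluation episodes, 20 of them successful
print(success_rate([True] * 20 + [False] * 5))  # -> 80.0
```

Because a single binary outcome per episode is noisy, leaderboard numbers are typically averaged over many episodes per task (and over all tasks in the suite).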

Top 10

Leading models on RLBench.

Rank  Model   Success rate (%)  Year  Source
1     RVT-2   81.4              2024  paper
2     RVT     62.9              2023  paper
3     PerAct  43.4              2022  paper

All datasets

2 datasets tracked for this task.

Related tasks

Other tasks in Robotics.

Run Inference

Looking to run a model? Hugging Face hosts inference for this task type.
