Reinforcement Learning

Training agents to make decisions? Benchmark your policies on game playing, continuous control, and offline learning tasks.

3 tasks, 2 datasets

Tasks in Reinforcement Learning

Playing Atari video games (Atari 2600 benchmark).

Control tasks with continuous action spaces (MuJoCo).

Offline learning: training policies from fixed datasets without environment interaction.
