Agents & Tool Use
Tool calling, web and desktop agents, browser automation, long-horizon autonomy, multi-agent coordination, and agent safety.
Tasks in Agents & Tool Use
Tool Calling
Choosing and invoking tools or functions correctly.
Function Calling
Producing valid structured function calls.
Web Agents
Completing tasks in web browsers and on websites.
Desktop Agents
Operating desktop applications and environments.
Browser Automation
Navigating and acting in browser interfaces.
Computer-Use Agents
Using GUIs, files, and applications to complete tasks.
Research Agents
Planning and executing research workflows.
Customer-Service Agents
Resolving support tasks with tools and context.
Long-Horizon Autonomy
Completing tasks requiring many steps and delayed feedback.
Software-Engineering Agents
Using tools to modify and validate software projects.
Multi-Agent Coordination
Coordinating multiple agents or roles.
Agent Safety / Prompt Injection
Evaluating agent robustness against unsafe or adversarial instructions.