arXiv:2606.03829
BigFinanceBench: A Workflow-Grounded Benchmark for Financial-Research Agents
Authors pending
Abstract
928 expert-authored financial tasks with point-weighted rubrics; best agent scores only 58.8%.
Tasks
Results
No benchmark results recorded yet.
CodeSOTA extraction
Benchmark evidence
- Extract BigFinanceBench rubric scores per workflow step to localize where agents fail (best system 58.8%).
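The per-step extraction described above could be sketched roughly as follows. This is only an illustration under stated assumptions: the page does not specify the rubric schema, so the `RubricItem` fields, the step labels, and the sample point values are all hypothetical, and the scoring rule (earned points divided by possible points) is an assumed reading of "point-weighted rubrics", not the paper's confirmed formula.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class RubricItem:
    step: str      # hypothetical workflow-step label
    points: float  # weight assigned to this rubric criterion
    met: bool      # whether the agent's output satisfied the criterion


def score_by_step(items):
    """Return {step: earned points / possible points} for each workflow step."""
    earned = defaultdict(float)
    possible = defaultdict(float)
    for item in items:
        possible[item.step] += item.points
        if item.met:
            earned[item.step] += item.points
    # Skip steps with zero possible points to avoid dividing by zero.
    return {s: earned[s] / possible[s] for s in possible if possible[s] > 0}


def overall_score(items):
    """Point-weighted score across all rubric items (0.0 if there are no points)."""
    total = sum(item.points for item in items)
    if total == 0:
        return 0.0
    return sum(item.points for item in items if item.met) / total


# Made-up rubric items for a single task; not data from the paper.
items = [
    RubricItem("data_retrieval", 3, True),
    RubricItem("data_retrieval", 1, False),
    RubricItem("calculation", 4, False),
    RubricItem("calculation", 2, True),
]

per_step = score_by_step(items)
weakest = min(per_step, key=per_step.get)  # step with the lowest score
```

Grouping by step before dividing keeps a heavily weighted criterion from being hidden inside the overall score, which is the point of localizing failures: in the made-up data above, the overall score is 0.5 while the `calculation` step alone scores lower than `data_retrieval`.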