arXiv:2604.23099
ProEval: Proactive Failure Discovery and Efficient Performance Estimation for Generative AI Evaluation
Authors pending
Abstract
ProEval is a transfer-learning framework built on Gaussian Processes. It estimates performance within 1% of ground truth while using 8-65x fewer samples.
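To make the sample-efficiency idea concrete, here is a minimal sketch of Gaussian Process regression used to estimate an average score from a small evaluated subset. It is not ProEval's method: the transfer-learning component is omitted, and the 1-D item feature, the synthetic score curve, the kernel settings, and every function name (rbf_kernel, gp_posterior_mean, estimate_performance) are assumptions made for illustration only.

```python
import numpy as np


def rbf_kernel(a, b, length_scale=0.2, variance=1.0):
    """Squared-exponential kernel between two 1-D arrays of inputs."""
    sq_dist = (a[:, None] - b[None, :]) ** 2
    return variance * np.exp(-0.5 * sq_dist / length_scale**2)


def gp_posterior_mean(x_train, y_train, x_test, noise=1e-2):
    """GP posterior mean, using the training-sample mean as a constant prior mean."""
    y_mean = y_train.mean()
    k_tt = rbf_kernel(x_train, x_train) + noise * np.eye(len(x_train))
    k_st = rbf_kernel(x_test, x_train)
    alpha = np.linalg.solve(k_tt, y_train - y_mean)
    return y_mean + k_st @ alpha


def estimate_performance(x_all, x_labeled, y_labeled):
    """Estimate the mean score over all items from an evaluated subset."""
    return float(gp_posterior_mean(x_labeled, y_labeled, x_all).mean())


# Synthetic stand-in for per-item scores (hypothetical data, not from the paper).
x_all = np.linspace(0.0, 1.0, 500)
true_scores = 0.5 + 0.3 * np.sin(2 * np.pi * x_all)

# Evaluate only every 20th item: 25 of 500 items, i.e. 20x fewer samples.
idx = np.arange(0, len(x_all), 20)
estimate = estimate_performance(x_all, x_all[idx], true_scores[idx])
ground_truth = float(true_scores.mean())
print(f"estimate={estimate:.4f} ground_truth={ground_truth:.4f}")
```

The sketch only works because the synthetic scores vary smoothly with the item feature. With real evaluation data, the choice of features and kernel would largely determine how many samples are needed.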
Tasks
Results
No benchmark results recorded yet.