What I Read: aggregation

Posted on 2026-06-24 :: Tags: large language model, natural language processing, evaluation, metric, rank, optimization, preference optimization, utility, benchmark

https://mlbenchmarks.org/12-problem-aggregation.html
The problem of aggregation
Moritz Hardt
"Therefore, multi-task benchmarks are analogous to voting systems where tasks are voters and models are candidates."
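
To make the voting analogy concrete for myself, here is a minimal sketch that treats each task as a voter ranking the models. It uses a Borda count, which is only one of many possible voting rules and is my own choice for illustration, not necessarily the rule discussed in the chapter. The task names, model names, and scores are all made up. The sketch compares the Borda ordering with plain averaging of raw scores.

```python
from typing import Dict, List

# Hypothetical scores[task][model]; higher is better. Invented for illustration.
SCORES: Dict[str, Dict[str, float]] = {
    "task_1": {"model_a": 0.90, "model_b": 0.50, "model_c": 0.60},
    "task_2": {"model_a": 0.40, "model_b": 0.45, "model_c": 0.42},
    "task_3": {"model_a": 0.30, "model_b": 0.35, "model_c": 0.33},
}


def borda_ranking(scores: Dict[str, Dict[str, float]]) -> List[str]:
    """Each task 'votes' by ranking the models; return models best first."""
    points: Dict[str, int] = {}
    for task_scores in scores.values():
        # Sort worst to best, so a model's index is the number of models it beats.
        # Equal scores within a task would be ordered by dict insertion order
        # (sorted is stable); this toy data has no such ties.
        ordered = sorted(task_scores, key=lambda m: task_scores[m])
        for position, model in enumerate(ordered):
            points[model] = points.get(model, 0) + position
    # Most points first; break ties in total points alphabetically.
    return sorted(points, key=lambda m: (-points[m], m))


def mean_ranking(scores: Dict[str, Dict[str, float]]) -> List[str]:
    """Average the raw scores across tasks; return models best first."""
    models = list(next(iter(scores.values())))
    means = {
        m: sum(task_scores[m] for task_scores in scores.values()) / len(scores)
        for m in models
    }
    return sorted(means, key=lambda m: (-means[m], m))


if __name__ == "__main__":
    print("Borda:", borda_ranking(SCORES))
    print("Mean: ", mean_ranking(SCORES))
```

On this toy data the two rules produce opposite orderings: model_a wins on the mean because of one large margin on task_1, while model_b wins the Borda count because it comes first on two of the three tasks. That is the kind of disagreement the voting framing brings into view.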