7 Articles


New study accuses LM Arena of gaming its popular AI benchmark
The popular AI vibe test may not be as fair as it seems.
Cohere Labs head calls “unreliable” AI leaderboard rankings a “crisis” in the field
The head of Cohere’s research division is concerned that alleged unreliability in the rankings of a popular chatbot leaderboard amounts to a “crisis” in artificial intelligence (AI) development. A new study co-authored by Sara Hooker, head of Cohere Labs, along with researchers at Cohere and leading universities, claims that large AI companies have been “gaming” the crowd-sourced chatbot ranking platform LM Arena to boost the ranking of their l…
Popular AI benchmark LMArena allegedly favors large providers, study claims
Researchers say the ranking system favors major providers like OpenAI, Google, and Meta. LMArena disputes the claims. (Source: THE DECODER)
Study Accuses LM Arena of Helping Top AI Labs Game Its Benchmark
A new paper from AI lab Cohere, Stanford, MIT, and Ai2 accuses LM Arena, the organization behind the popular crowdsourced AI benchmark Chatbot Arena, of helping a select group of AI companies achieve better leaderboard scores at the expense of rivals. According...
Coverage Details
Bias Distribution
- 100% of the sources are Center