Benchmark Model Studies

AI agent benchmarks are misleading, study warns

AI agents are becoming a promising new research direction with potential applications in the real world. These agents use foundation models such as large language models (LLMs) and vision language ...

Live Science

AI benchmarking platform is helping top companies rig their model performances, study claims

LMArena, a popular benchmark for large language models, has been accused of giving preferential treatment to AIs made by big tech firms, potentially enabling them to game their results. When you ...

Computerworld

Leaderboard illusion: How big tech skewed AI rankings on Chatbot Arena

Meta, Google, and OpenAI allegedly exploited undisclosed private testing on Chatbot Arena to secure top rankings, raising concerns about fairness and transparency in AI model benchmarking. A handful ...

RFID Journal

EECC Benchmark Study Finds UHF Tag Performance Better Than Ever

The European EPC Competence Center (EECC), a Germany-based provider of RFID services, released this week the newest edition of its annual UHF Tag Performance Survey (UTPS). In the 2016 edition, EECC ...
