Community Blog & Articles

Community Articles
audio, speech, leaderboard

Open ASR Leaderboard: Trends and Insights with New Multilingual & Long-Form Tracks

19
November 21, 2025
leaderboard, evaluation, nlp

Arabic Leaderboards: Introducing Arabic Instruction Following, Updating AraGen, and More

20
April 8, 2025
math-verify, open-llm-leaderboard, leaderboard

Fixing Open LLM Leaderboard with Math-Verify

30
February 14, 2025
nlp, research, leaderboard

The Open Arabic LLM Leaderboard 2

36
February 10, 2025
open-llm-leaderboard, leaderboard, energy_efficiency

CO₂ Emissions and Models Performance: Insights from the Open LLM Leaderboard

21
January 9, 2025
leaderboard, research, collaboration

Evaluating Audio Reasoning with Big Bench Audio

26
December 20, 2024
leaderboard, evaluation, nlp

Rethinking LLM Evaluation with 3C3H: AraGen Benchmark and Leaderboard

38
December 4, 2024
community, research, nlp

Letting Large Models Debate: The First Multilingual LLM Debate Competition

33
November 20, 2024
community, research, nlp

Introducing the Open Leaderboard for Japanese LLMs!

39
November 20, 2024
leaderboard, arena, collaboration

Judge Arena: Benchmarking LLMs as Evaluators

60
November 19, 2024
leaderboard, collaboration, community

Introducing the Open FinLLM Leaderboard

79
October 4, 2024
nlp, research, leaderboard

🇨🇿 BenCzechMark - Can your LLM Understand Czech?

23
October 1, 2024
ai4math, nlp, community

How NuminaMath Won the 1st AIMO Progress Prize

123
July 11, 2024
agents, smolagents, nlp

Our Transformers Code Agent beats the GAIA benchmark 🏅

98
July 1, 2024