Apertus: Democratizing Open and Compliant LLMs for Global Language Environments Paper • 2509.14233 • Published Sep 17 • 13
Adding LLMs to the psycholinguistic norming toolbox: A practical guide to getting the most out of human ratings Paper • 2509.14405 • Published Sep 17 • 2
Psycholinguistic Word Features: a New Approach for the Evaluation of LLMs Alignment with Humans Paper • 2506.22439 • Published May 29 • 3
La Leaderboard: A Large Language Model Leaderboard for Spanish Varieties and Languages of Spain and Latin America Paper • 2507.00999 • Published Jul 1 • 1
ConLID: Supervised Contrastive Learning for Low-Resource Language Identification Paper • 2506.15304 • Published Jun 18 • 1
FineWeb2: One Pipeline to Scale Them All -- Adapting Pre-Training Data Processing to Every Language Paper • 2506.20920 • Published Jun 26 • 75
Kaleidoscope: In-language Exams for Massively Multilingual Vision Evaluation Paper • 2504.07072 • Published Apr 9 • 9
It's the same but not the same: Do LLMs distinguish Spanish varieties? Paper • 2504.20049 • Published Apr 8
Global MMLU: Understanding and Addressing Cultural and Linguistic Biases in Multilingual Evaluation Paper • 2412.03304 • Published Dec 4, 2024 • 21
The LLM Language Network: A Neuroscientific Approach for Identifying Causally Task-Relevant Units Paper • 2411.02280 • Published Nov 4, 2024 • 1
DRIVINGVQA: Analyzing Visual Chain-of-Thought Reasoning of Vision Language Models in Real-World Scenarios with Driving Theory Tests Paper • 2501.04671 • Published Jan 8
PICLe: Pseudo-Annotations for In-Context Learning in Low-Resource Named Entity Detection Paper • 2412.11923 • Published Dec 16, 2024
MEDITRON-70B: Scaling Medical Pretraining for Large Language Models Paper • 2311.16079 • Published Nov 27, 2023 • 19
CRAB: Assessing the Strength of Causal Relationships Between Real-world Events Paper • 2311.04284 • Published Nov 7, 2023
COMET: Commonsense Transformers for Automatic Knowledge Graph Construction Paper • 1906.05317 • Published Jun 12, 2019
Discovering Knowledge-Critical Subnetworks in Pretrained Language Models Paper • 2310.03084 • Published Oct 4, 2023
RECKONING: Reasoning through Dynamic Knowledge Encoding Paper • 2305.06349 • Published May 10, 2023 • 1