Collections
Collections including paper arxiv:2502.02737
Collection 1:
- FineWeb2: One Pipeline to Scale Them All -- Adapting Pre-Training Data Processing to Every Language
  Paper • 2506.20920 • Published • 75
- SmolVLM: Redefining small and efficient multimodal models
  Paper • 2504.05299 • Published • 200
- The BigScience ROOTS Corpus: A 1.6TB Composite Multilingual Dataset
  Paper • 2303.03915 • Published • 7
- SmolLM2: When Smol Goes Big -- Data-Centric Training of a Small Language Model
  Paper • 2502.02737 • Published • 249
Collection 2:
- Reinforcement Pre-Training
  Paper • 2506.08007 • Published • 262
- A Survey on Latent Reasoning
  Paper • 2507.06203 • Published • 93
- Language Models are Few-Shot Learners
  Paper • 2005.14165 • Published • 18
- Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer
  Paper • 1910.10683 • Published • 15
Collection 3:
- FineWeb2: One Pipeline to Scale Them All -- Adapting Pre-Training Data Processing to Every Language
  Paper • 2506.20920 • Published • 75
- SmolVLM: Redefining small and efficient multimodal models
  Paper • 2504.05299 • Published • 200
- YourBench: Easy Custom Evaluation Sets for Everyone
  Paper • 2504.01833 • Published • 22
- SmolLM2: When Smol Goes Big -- Data-Centric Training of a Small Language Model
  Paper • 2502.02737 • Published • 249
Collection 4:
- GAIA: a benchmark for General AI Assistants
  Paper • 2311.12983 • Published • 241
- Zephyr: Direct Distillation of LM Alignment
  Paper • 2310.16944 • Published • 122
- SmolLM2: When Smol Goes Big -- Data-Centric Training of a Small Language Model
  Paper • 2502.02737 • Published • 249
- Global MMLU: Understanding and Addressing Cultural and Linguistic Biases in Multilingual Evaluation
  Paper • 2412.03304 • Published • 21