DFlash Collection Block Diffusion for Flash Speculative Decoding • 3 items • Updated 12 days ago • 12
Cerebras REAP Collection Sparse MoE models compressed using the REAP (Router-weighted Expert Activation Pruning) method • 26 items • Updated 7 days ago • 99
VTP Collection Towards Scalable Pre-training of Visual Tokenizers for Generation • 4 items • Updated Dec 16, 2025 • 42
Teacher Logits Collection Logits captured from large models to act as the teacher for distillation • 3 items • Updated Dec 15, 2025 • 8
Ministral 3 Collection Mistral Ministral 3: new multimodal models in Base, Instruct, and Reasoning variants, available in 3B, 8B, and 14B sizes. • 36 items • Updated about 4 hours ago • 29
Ministral 3 Collection A collection of edge models, with Base, Instruct and Reasoning variants, in 3 different sizes: 3B, 8B and 14B. All with vision capabilities. • 9 items • Updated Dec 2, 2025 • 149
Trinity Collection Collection of Arcee AI models in the Trinity family • 8 items • Updated Dec 11, 2025 • 25
Olmo 3 Pre-training Collection All artifacts related to Olmo 3 pre-training • 10 items • Updated Dec 23, 2025 • 33
BERT Hash Nano Models Collection Set of BERT models with a modified embeddings layer • 8 items • Updated 5 days ago • 9
TOUCAN: Synthesizing 1.5M Tool-Agentic Data from Real-World MCP Environments Paper • 2510.01179 • Published Oct 1, 2025 • 26
💧 LFM2 Collection LFM2 is a new generation of hybrid models, designed for on-device deployment. • 27 items • Updated 1 day ago • 137