FastVLM: Efficient Vision Encoding for Vision Language Models Paper • 2412.13303 • Published Dec 17, 2024 • 72
AgentScope 1.0: A Developer-Centric Framework for Building Agentic Applications Paper • 2508.16279 • Published Aug 22 • 53
OmniWorld: A Multi-Domain and Multi-Modal Dataset for 4D World Modeling Paper • 2509.12201 • Published Sep 15 • 104
HuMo: Human-Centric Video Generation via Collaborative Multi-Modal Conditioning Paper • 2509.08519 • Published Sep 10 • 128
ST-Raptor: LLM-Powered Semi-Structured Table Question Answering Paper • 2508.18190 • Published Aug 25 • 6
A Survey of Reinforcement Learning for Large Reasoning Models Paper • 2509.08827 • Published Sep 10 • 189
AgentFly: Fine-tuning LLM Agents without Fine-tuning LLMs Paper • 2508.16153 • Published Aug 22 • 158
ReSum: Unlocking Long-Horizon Search Intelligence via Context Summarization Paper • 2509.13313 • Published Sep 16 • 79
Kimi Linear: An Expressive, Efficient Attention Architecture Paper • 2510.26692 • Published Oct 30 • 114
Back to Basics: Let Denoising Generative Models Denoise Paper • 2511.13720 • Published 19 days ago • 63
Safety Tax: Safety Alignment Makes Your Large Reasoning Models Less Reasonable Paper • 2503.00555 • Published Mar 1
Scaling Rectified Flow Transformers for High-Resolution Image Synthesis Paper • 2403.03206 • Published Mar 5, 2024 • 71
Z-Image: An Efficient Image Generation Foundation Model with Single-Stream Diffusion Transformer Paper • 2511.22699 • Published 9 days ago • 144
Decoupled DMD: CFG Augmentation as the Spear, Distribution Matching as the Shield Paper • 2511.22677 • Published 9 days ago • 18