Rethinking Entropy Regularization in Large Reasoning Models • Paper 2509.25133 • Published Sep 29, 2025
Hybrid Reinforcement: When Reward Is Sparse, It's Better to Be Dense • Paper 2510.07242 • Published Oct 8, 2025