ARGenSeg: Image Segmentation with Autoregressive Image Generation Model Paper • 2510.20803 • Published Oct 23 • 9
HieraTok: Multi-Scale Visual Tokenizer Improves Image Reconstruction and Generation Paper • 2509.23736 • Published Sep 28 • 1
Ming-Flash-Omni: A Sparse, Unified Architecture for Multimodal Perception and Generation Paper • 2510.24821 • Published Oct 28 • 37
Every Attention Matters: An Efficient Hybrid Architecture for Long-Context Reasoning Paper • 2510.19338 • Published Oct 22 • 114
Every Step Evolves: Scaling Reinforcement Learning for Trillion-Scale Thinking Model Paper • 2510.18855 • Published Oct 21 • 71
Ming-Lite-Uni: Advancements in Unified Architecture for Natural Multimodal Interaction Paper • 2505.02471 • Published May 5 • 15
Learning Implicit Entity-object Relations by Bidirectional Generative Alignment for Multimodal NER Paper • 2308.02570 • Published Aug 3, 2023
Skip-Vision: Efficient and Scalable Acceleration of Vision-Language Models via Adaptive Token Skipping Paper • 2503.21817 • Published Mar 26 • 1
Ming-Omni: A Unified Multimodal Model for Perception and Generation Paper • 2506.09344 • Published Jun 11 • 28
M2-Reasoning: Empowering MLLMs with Unified General and Spatial Reasoning Paper • 2507.08306 • Published Jul 11
GUI-Shepherd: Reliable Process Reward and Verification for Long-Sequence GUI Tasks Paper • 2509.23738 • Published Sep 28 • 1
Ming-UniVision: Joint Image Understanding and Generation with a Unified Continuous Tokenizer Paper • 2510.06590 • Published Oct 8 • 72