Depth Anything 3 Space • Running on Zero • Featured • 303 • Generate depth maps from images using GPU acceleration
Improving Token-Based World Models with Parallel Observation Prediction Paper • 2402.05643 • Published Feb 8, 2024 • 1
DeeR-VLA: Dynamic Inference of Multimodal Large Language Models for Efficient Robot Execution Paper • 2411.02359 • Published Nov 4, 2024 • 13
Classification Done Right for Vision-Language Pre-Training Paper • 2411.03313 • Published Nov 5, 2024
Towards Generalist Robot Policies: What Matters in Building Vision-Language-Action Models Paper • 2412.14058 • Published Dec 18, 2024 • 1
Image Understanding Makes for A Good Tokenizer for Image Generation Paper • 2411.04406 • Published Nov 7, 2024
$\text{M}^{\text{3}}$: A Modular World Model over Streams of Tokens Paper • 2502.11537 • Published Feb 17
Improving and Benchmarking Offline Reinforcement Learning Algorithms Paper • 2306.00972 • Published Jun 1, 2023
Decoupling Representation and Classifier for Long-Tailed Recognition Paper • 1910.09217 • Published Oct 21, 2019
Trace Anything: Representing Any Video in 4D via Trajectory Fields Paper • 2510.13802 • Published Oct 15 • 30
Depth Anything 3: Recovering the Visual Space from Any Views Paper • 2511.10647 • Published 23 days ago • 92
Manipulation as in Simulation: Enabling Accurate Geometry Perception in Robots Paper • 2509.02530 • Published Sep 2 • 10