ARE: Scaling Up Agent Environments and Evaluations
Paper • 2509.17158 • Published • 35
3D Mesh Generation via Compositional Latent Diffusion
Dense Grounded Understanding of Images and Videos
Generate captions for images
Detect, segment, classify objects in images and videos