Text-guided Sparse Voxel Pruning for Efficient 3D Visual Grounding
Paper: arXiv:2502.10392
This repo contains the models for the paper *Text-guided Sparse Voxel Pruning for Efficient 3D Visual Grounding*. Code is available at: https://github.com/GWxuan/TSP3D
Wenxuan Guo*, Xiuwei Xu*, Ziwei Wang, Jianjiang Feng†, Jie Zhou, Jiwen Lu
* Equal contribution † Corresponding author
In this work, we propose an efficient multi-level convolution architecture for 3D visual grounding. TSP3D surpasses previous approaches in both accuracy and inference speed.
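The text-guided pruning idea can be illustrated with a minimal, self-contained sketch: occupied voxel features are scored against the text embedding of the query, and only the most query-relevant voxels are kept for later stages. The function name, feature shapes, and keep ratio below are illustrative assumptions, not the repository's actual API.

```python
import torch

def text_guided_prune(voxel_feats, text_feat, keep_ratio=0.5):
    """Conceptual sketch of text-guided voxel pruning (not the authors' code).

    voxel_feats: (N, C) features of N occupied sparse voxels.
    text_feat:   (C,)  pooled text embedding of the grounding query.
    Returns the indices of the voxels that are kept.
    """
    # Score each voxel by its similarity to the text embedding.
    scores = voxel_feats @ text_feat                      # (N,)
    k = max(1, int(keep_ratio * voxel_feats.shape[0]))
    # Keep the top-k most text-relevant voxels, discard the rest.
    keep_idx = scores.topk(k).indices
    return keep_idx

# Toy usage with random features.
voxels = torch.randn(1024, 128)
text = torch.randn(128)
kept = text_guided_prune(voxels, text, keep_ratio=0.25)
print(kept.shape)  # torch.Size([256])
```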
We provide the checkpoints for quick reproduction of the results reported in the paper.
| Benchmark | Pipeline | Acc@0.25 | Acc@0.5 | Inference Speed (FPS) | Downloads |
|---|---|---|---|---|---|
| ScanRefer | Single-stage | 56.45 | 46.71 | 12.43 | model |
| Benchmark | Pipeline | Acc@0.25 | Acc@0.5 | Downloads |
|---|---|---|---|---|
| Nr3d | Single-stage | 48.7 | 37.0 | model |
| Sr3d | Single-stage | 57.1 | 44.1 | model |
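For reference, a downloaded checkpoint can typically be inspected with plain PyTorch before plugging it into the evaluation scripts. The file name below is a placeholder, and the exact key layout depends on how the TSP3D training code saves its state.

```python
import torch

# Placeholder path; substitute the checkpoint downloaded from the table above.
ckpt = torch.load("tsp3d_scanrefer.pth", map_location="cpu")

# Checkpoints are commonly dicts holding the model state and training metadata;
# printing the keys shows which entry to pass to load_state_dict.
if isinstance(ckpt, dict):
    print(list(ckpt.keys()))
```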
Comparison of 3DVG methods on the ScanRefer dataset: