Spaces: Running on Zero
Commit 533f466 · Parent: 10b6682 · update

app.py CHANGED
@@ -68,20 +68,35 @@ WIDTH = 384

 MARKDOWN = \
 """
-
+<div align='center'>
+<h1> This&That: Language-Gesture Controlled Video Generation for Robot Planning </h1> \
+<h2 style='font-weight: 450; font-size: 1rem; margin: 0rem'>\
+<a href='https://kiteretsu77.github.io/boyang.github.io/'>Boyang Wang</a>, \
+<a href='https://www.linkedin.com/in/niksridhar/'>Nikhil Sridhar</a>, \
+<a href='https://cfeng16.github.io/'>Chao Feng</a>, \
+<a href='https://mvandermerwe.github.io/'>Mark Van der Merwe</a>, \
+<a href='https://fishbotics.com/'>Adam Fishman</a>, \
+<a href='https://www.mmintlab.com/people/nima-fazeli/'>Nima Fazeli</a>, \
+<a href='https://jjparkcv.github.io/'>Jeong Joon Park</a> \
+</h2> \

-
-
+<a style='font-size:18px;color: #000000' href='https://github.com/Kiteretsu77/This_and_That_VDM'> [Github] </a> \
+<a style='font-size:18px;color: #000000' href='http://arxiv.org/abs/2407.05530'> [ArXiv] </a> \
+<a style='font-size:18px;color: #000000' href='https://cfeng16.github.io/this-and-that/'> [Project Page] </a> </div> \
+</div>
+
+This&That is a language-gesture-image-conditioned video generation model for robot planning in robotics scenarios (based on the Bridge dataset for this demo).

 This demo covers the Video Diffusion Model part.
-Only GestureNet is provided in this Gradio Demo, you can check the full test code for all pretrained weight available.
+Only GestureNet is provided in this Gradio demo, but you can check the full test code for all available pretrained weights.

-### Note: The index we put the gesture point by default here is [4, 10] for two gesture points or [4] for one gesture point.
-### Note: The
+### Note: By default, gesture points are placed at frame indices [4, 10] (the 5th and 11th frames) for two gesture points, or [4] (the 5th frame) for one.
+### Note: Only 256x384 resolution is currently supported.
 ### Note: Click "Clear All" to restart everything; click "Undo Point" to cancel the last point you placed.
 ### Note: The first run may take a while; clicking "Clear All" before each run is the safest choice.

 If **This&That** is helpful, please help star the [GitHub Repo](https://github.com/Kiteretsu77/This_and_That_VDM). Thanks!
+
 """

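For context, here is a minimal sketch of how a `MARKDOWN` string like this is typically wired into a Gradio Blocks app, with the 256x384 input constraint from the notes enforced during preprocessing. Only `WIDTH = 384` (from the hunk header) and the 256x384 note come from the diff itself; `HEIGHT`, `preprocess`, and the component layout are assumptions for illustration, not the demo's actual code.

```python
import gradio as gr
from PIL import Image

HEIGHT = 256  # assumed from the "256x384" note in the commit
WIDTH = 384   # matches the WIDTH = 384 context in the hunk header

MARKDOWN = "..."  # the header string committed above

def preprocess(image: Image.Image) -> Image.Image:
    # Hypothetical helper: PIL's resize takes (width, height), so force
    # inputs to the one supported resolution before they reach the model.
    return image.resize((WIDTH, HEIGHT))

with gr.Blocks() as demo:
    gr.Markdown(MARKDOWN)  # renders the HTML/markdown header at the top
    image_in = gr.Image(type="pil", label="Condition image")
    video_out = gr.Video(label="Generated video")
    # ... gesture-point selection and the GestureNet inference call
    # would be wired in here in the real app.py.

if __name__ == "__main__":
    demo.launch()
```

Since `gr.Markdown` renders raw HTML, the `<div align='center'>` header with inline-styled `<a>` links added in this commit displays as a centered title block above the demo controls.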