Models
Datasets
Spaces
Docs
Enterprise
Pricing
Log In
Sign Up

Collections

Discover the best community collections!

Collections including paper arxiv:2505.23660

Image Generation

OneIG-Bench: Omni-dimensional Nuanced Evaluation for Image Generation

Paper • 2506.07977 • Published Jun 9 • 41
Rethinking Cross-Modal Interaction in Multimodal Diffusion Transformers

Paper • 2506.07986 • Published Jun 9 • 19
STARFlow: Scaling Latent Normalizing Flows for High-resolution Image Synthesis

Paper • 2506.06276 • Published Jun 6 • 26
Aligning Latent Spaces with Flow Priors

Paper • 2506.05240 • Published Jun 5 • 27

Large Language Diffusion Models

Paper • 2502.09992 • Published Feb 14 • 123
Block Diffusion: Interpolating Between Autoregressive and Diffusion Language Models

Paper • 2503.09573 • Published Mar 12 • 74
MMaDA: Multimodal Large Diffusion Language Models

Paper • 2505.15809 • Published May 21 • 97
Diffusion vs. Autoregressive Language Models: A Text Embedding Perspective

Paper • 2505.15045 • Published May 21 • 54

about 24 hours ago

Block Diffusion: Interpolating Between Autoregressive and Diffusion Language Models

Paper • 2503.09573 • Published Mar 12 • 74
Diffusion vs. Autoregressive Language Models: A Text Embedding Perspective

Paper • 2505.15045 • Published May 21 • 54
Dimple: Discrete Diffusion Multimodal Large Language Model with Parallel Decoding

Paper • 2505.16990 • Published May 22 • 22
D-AR: Diffusion via Autoregressive Models

Paper • 2505.23660 • Published May 29 • 34

Compose and Conquer: Diffusion-Based 3D Depth Aware Composable Image Synthesis

Paper • 2401.09048 • Published Jan 17, 2024 • 10
Improving fine-grained understanding in image-text pre-training

Paper • 2401.09865 • Published Jan 18, 2024 • 18
Depth Anything: Unleashing the Power of Large-Scale Unlabeled Data

Paper • 2401.10891 • Published Jan 19, 2024 • 62
Scaling Up to Excellence: Practicing Model Scaling for Photo-Realistic Image Restoration In the Wild

Paper • 2401.13627 • Published Jan 24, 2024 • 77

Interesting Papers

ReZero: Enhancing LLM search ability by trying one-more-time

Paper • 2504.11001 • Published Apr 15 • 16
FonTS: Text Rendering with Typography and Style Controls

Paper • 2412.00136 • Published Nov 28, 2024 • 1
GenEx: Generating an Explorable World

Paper • 2412.09624 • Published Dec 12, 2024 • 97
Smarter, Better, Faster, Longer: A Modern Bidirectional Encoder for Fast, Memory Efficient, and Long Context Finetuning and Inference

Paper • 2412.13663 • Published Dec 18, 2024 • 158

ByteDance/InfiniteYou

Text-to-Image • Updated Jul 25 • 1.51k • 636
Comfy-Org/stable-diffusion-v1-5-archive

Updated Oct 13 • 184k • 76
xey/sldr_flux_nsfw_v2-studio

Text-to-Image • Updated Jan 14 • 10.4k • • 290
strangerzonehf/Ghibli-Flux-Cartoon-LoRA

Text-to-Image • Updated Apr 1 • 609 • • 36

MLLM-as-a-Judge for Image Safety without Human Labeling

Paper • 2501.00192 • Published Dec 31, 2024 • 31
2.5 Years in Class: A Multimodal Textbook for Vision-Language Pretraining

Paper • 2501.00958 • Published Jan 1 • 107
Xmodel-2 Technical Report

Paper • 2412.19638 • Published Dec 27, 2024 • 26
HuatuoGPT-o1, Towards Medical Complex Reasoning with LLMs

Paper • 2412.18925 • Published Dec 25, 2024 • 104

30\ Interesting.what is this ? how it works?

Paused

Featured

308

TikSlop

🍿

308

The 100% Latent Video App
sshh12/Mistral-7B-LoRA-AudioCLAP

Updated Dec 13, 2023 • 5 • 4
microsoft/phi-1_5

Text Generation • 1B • Updated 15 days ago • 83.3k • 1.35k
stabilityai/stablecode-instruct-alpha-3b

Text Generation • 3B • Updated Aug 8, 2023 • 51 • 302

Image Generation

OneIG-Bench: Omni-dimensional Nuanced Evaluation for Image Generation

Paper • 2506.07977 • Published Jun 9 • 41
Rethinking Cross-Modal Interaction in Multimodal Diffusion Transformers

Paper • 2506.07986 • Published Jun 9 • 19
STARFlow: Scaling Latent Normalizing Flows for High-resolution Image Synthesis

Paper • 2506.06276 • Published Jun 6 • 26
Aligning Latent Spaces with Flow Priors

Paper • 2506.05240 • Published Jun 5 • 27

Interesting Papers

ReZero: Enhancing LLM search ability by trying one-more-time

Paper • 2504.11001 • Published Apr 15 • 16
FonTS: Text Rendering with Typography and Style Controls

Paper • 2412.00136 • Published Nov 28, 2024 • 1
GenEx: Generating an Explorable World

Paper • 2412.09624 • Published Dec 12, 2024 • 97
Smarter, Better, Faster, Longer: A Modern Bidirectional Encoder for Fast, Memory Efficient, and Long Context Finetuning and Inference

Paper • 2412.13663 • Published Dec 18, 2024 • 158

Large Language Diffusion Models

Paper • 2502.09992 • Published Feb 14 • 123
Block Diffusion: Interpolating Between Autoregressive and Diffusion Language Models

Paper • 2503.09573 • Published Mar 12 • 74
MMaDA: Multimodal Large Diffusion Language Models

Paper • 2505.15809 • Published May 21 • 97
Diffusion vs. Autoregressive Language Models: A Text Embedding Perspective

Paper • 2505.15045 • Published May 21 • 54

ByteDance/InfiniteYou

Text-to-Image • Updated Jul 25 • 1.51k • 636
Comfy-Org/stable-diffusion-v1-5-archive

Updated Oct 13 • 184k • 76
xey/sldr_flux_nsfw_v2-studio

Text-to-Image • Updated Jan 14 • 10.4k • • 290
strangerzonehf/Ghibli-Flux-Cartoon-LoRA

Text-to-Image • Updated Apr 1 • 609 • • 36

about 24 hours ago

Block Diffusion: Interpolating Between Autoregressive and Diffusion Language Models

Paper • 2503.09573 • Published Mar 12 • 74
Diffusion vs. Autoregressive Language Models: A Text Embedding Perspective

Paper • 2505.15045 • Published May 21 • 54
Dimple: Discrete Diffusion Multimodal Large Language Model with Parallel Decoding

Paper • 2505.16990 • Published May 22 • 22
D-AR: Diffusion via Autoregressive Models

Paper • 2505.23660 • Published May 29 • 34

MLLM-as-a-Judge for Image Safety without Human Labeling

Paper • 2501.00192 • Published Dec 31, 2024 • 31
2.5 Years in Class: A Multimodal Textbook for Vision-Language Pretraining

Paper • 2501.00958 • Published Jan 1 • 107
Xmodel-2 Technical Report

Paper • 2412.19638 • Published Dec 27, 2024 • 26
HuatuoGPT-o1, Towards Medical Complex Reasoning with LLMs

Paper • 2412.18925 • Published Dec 25, 2024 • 104

Compose and Conquer: Diffusion-Based 3D Depth Aware Composable Image Synthesis

Paper • 2401.09048 • Published Jan 17, 2024 • 10
Improving fine-grained understanding in image-text pre-training

Paper • 2401.09865 • Published Jan 18, 2024 • 18
Depth Anything: Unleashing the Power of Large-Scale Unlabeled Data

Paper • 2401.10891 • Published Jan 19, 2024 • 62
Scaling Up to Excellence: Practicing Model Scaling for Photo-Realistic Image Restoration In the Wild

Paper • 2401.13627 • Published Jan 24, 2024 • 77

30\ Interesting.what is this ? how it works?

Paused

Featured

308

TikSlop

🍿

308

The 100% Latent Video App
sshh12/Mistral-7B-LoRA-AudioCLAP

Updated Dec 13, 2023 • 5 • 4
microsoft/phi-1_5

Text Generation • 1B • Updated 15 days ago • 83.3k • 1.35k
stabilityai/stablecode-instruct-alpha-3b

Text Generation • 3B • Updated Aug 8, 2023 • 51 • 302

Company

TOS Privacy About Jobs

Website

Models Datasets Spaces Pricing Docs