Models
Datasets
Spaces
Docs
Enterprise
Pricing
Log In
Sign Up

Collections

Discover the best community collections!

Collections including paper arxiv:2501.12948

EVA-CLIP-18B: Scaling CLIP to 18 Billion Parameters

Paper • 2402.04252 • Published Feb 6, 2024 • 29
Vision Superalignment: Weak-to-Strong Generalization for Vision Foundation Models

Paper • 2402.03749 • Published Feb 6, 2024 • 14
ScreenAI: A Vision-Language Model for UI and Infographics Understanding

Paper • 2402.04615 • Published Feb 7, 2024 • 44
EfficientViT-SAM: Accelerated Segment Anything Model Without Performance Loss

Paper • 2402.05008 • Published Feb 7, 2024 • 23

distil-whisper/distil-large-v2

Automatic Speech Recognition • 0.8B • Updated Mar 6 • 7.61k • 511
Differential Transformer

Paper • 2410.05258 • Published Oct 7, 2024 • 179
Qwen/QwQ-32B-Preview

Text Generation • 33B • Updated Jan 12 • 83.1k • • 1.74k
Xkev/Llama-3.2V-11B-cot

Image-Text-to-Text • 11B • Updated 24 days ago • 360 • 158

Runtime error

562

OpenAI TTS New

📊

562
DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning

Paper • 2501.12948 • Published Jan 22 • 429
DeepSeek-Coder: When the Large Language Model Meets Programming -- The Rise of Code Intelligence

Paper • 2401.14196 • Published Jan 25, 2024 • 68
zed-industries/zeta

8B • Updated Oct 8 • 34.7k • 371

interesting stuff

Chain-of-Verification Reduces Hallucination in Large Language Models

Paper • 2309.11495 • Published Sep 20, 2023 • 39
Adapting Large Language Models via Reading Comprehension

Paper • 2309.09530 • Published Sep 18, 2023 • 81
CulturaX: A Cleaned, Enormous, and Multilingual Dataset for Large Language Models in 167 Languages

Paper • 2309.09400 • Published Sep 17, 2023 • 85
Language Modeling Is Compression

Paper • 2309.10668 • Published Sep 19, 2023 • 83

Grandmaster-Level Chess Without Search

Paper • 2402.04494 • Published Feb 7, 2024 • 69
Can Mamba Learn How to Learn? A Comparative Study on In-Context Learning Tasks

Paper • 2402.04248 • Published Feb 6, 2024 • 32
Self-Play Preference Optimization for Language Model Alignment

Paper • 2405.00675 • Published May 1, 2024 • 27
Direct Nash Optimization: Teaching Language Models to Self-Improve with General Preferences

Paper • 2404.03715 • Published Apr 4, 2024 • 62

content generation

Running

10.8k

AI Comic Factory

👩

10.8k

Create your own AI comic with a single prompt
SDXL: Improving Latent Diffusion Models for High-Resolution Image Synthesis

Paper • 2307.01952 • Published Jul 4, 2023 • 90
Running

Featured

922

QwQ-32B-Preview

🔍

922

QwQ-32B-Preview
DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning

Paper • 2501.12948 • Published Jan 22 • 429

Text-to-3D using Gaussian Splatting

Paper • 2309.16585 • Published Sep 28, 2023 • 30
FP8-LM: Training FP8 Large Language Models

Paper • 2310.18313 • Published Oct 27, 2023 • 33
Zephyr: Direct Distillation of LM Alignment

Paper • 2310.16944 • Published Oct 25, 2023 • 122
Beyond Human Data: Scaling Self-Training for Problem-Solving with Language Models

Paper • 2312.06585 • Published Dec 11, 2023 • 29

EVA-CLIP-18B: Scaling CLIP to 18 Billion Parameters

Paper • 2402.04252 • Published Feb 6, 2024 • 29
Vision Superalignment: Weak-to-Strong Generalization for Vision Foundation Models

Paper • 2402.03749 • Published Feb 6, 2024 • 14
ScreenAI: A Vision-Language Model for UI and Infographics Understanding

Paper • 2402.04615 • Published Feb 7, 2024 • 44
EfficientViT-SAM: Accelerated Segment Anything Model Without Performance Loss

Paper • 2402.05008 • Published Feb 7, 2024 • 23

Grandmaster-Level Chess Without Search

Paper • 2402.04494 • Published Feb 7, 2024 • 69
Can Mamba Learn How to Learn? A Comparative Study on In-Context Learning Tasks

Paper • 2402.04248 • Published Feb 6, 2024 • 32
Self-Play Preference Optimization for Language Model Alignment

Paper • 2405.00675 • Published May 1, 2024 • 27
Direct Nash Optimization: Teaching Language Models to Self-Improve with General Preferences

Paper • 2404.03715 • Published Apr 4, 2024 • 62

distil-whisper/distil-large-v2

Automatic Speech Recognition • 0.8B • Updated Mar 6 • 7.61k • 511
Differential Transformer

Paper • 2410.05258 • Published Oct 7, 2024 • 179
Qwen/QwQ-32B-Preview

Text Generation • 33B • Updated Jan 12 • 83.1k • • 1.74k
Xkev/Llama-3.2V-11B-cot

Image-Text-to-Text • 11B • Updated 24 days ago • 360 • 158

content generation

Running

10.8k

AI Comic Factory

👩

10.8k

Create your own AI comic with a single prompt
SDXL: Improving Latent Diffusion Models for High-Resolution Image Synthesis

Paper • 2307.01952 • Published Jul 4, 2023 • 90
Running

Featured

922

QwQ-32B-Preview

🔍

922

QwQ-32B-Preview
DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning

Paper • 2501.12948 • Published Jan 22 • 429

Runtime error

562

OpenAI TTS New

📊

562
DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning

Paper • 2501.12948 • Published Jan 22 • 429
DeepSeek-Coder: When the Large Language Model Meets Programming -- The Rise of Code Intelligence

Paper • 2401.14196 • Published Jan 25, 2024 • 68
zed-industries/zeta

8B • Updated Oct 8 • 34.7k • 371

Text-to-3D using Gaussian Splatting

Paper • 2309.16585 • Published Sep 28, 2023 • 30
FP8-LM: Training FP8 Large Language Models

Paper • 2310.18313 • Published Oct 27, 2023 • 33
Zephyr: Direct Distillation of LM Alignment

Paper • 2310.16944 • Published Oct 25, 2023 • 122
Beyond Human Data: Scaling Self-Training for Problem-Solving with Language Models

Paper • 2312.06585 • Published Dec 11, 2023 • 29

interesting stuff

Chain-of-Verification Reduces Hallucination in Large Language Models

Paper • 2309.11495 • Published Sep 20, 2023 • 39
Adapting Large Language Models via Reading Comprehension

Paper • 2309.09530 • Published Sep 18, 2023 • 81
CulturaX: A Cleaned, Enormous, and Multilingual Dataset for Large Language Models in 167 Languages

Paper • 2309.09400 • Published Sep 17, 2023 • 85
Language Modeling Is Compression

Paper • 2309.10668 • Published Sep 19, 2023 • 83

Previous
1
...
10
11
12
Next

Company

TOS Privacy About Jobs

Website

Models Datasets Spaces Pricing Docs