Naman Anand

naman5a

AI & ML interests

RAG, LLMs

Recent Activity

upvoted an article 6 days ago

Article

The Optimal Architecture for Small Language Models

8 days ago

upvoted a paper 17 days ago

T-pro 2.0: An Efficient Russian Hybrid-Reasoning Model and Playground

Paper • 2512.10430 • Published 23 days ago • 113

upvoted an article 22 days ago

Article

Automatic Prompt Optimization with DSPy and Cross Encoders

Aug 2, 2025

upvoted an article 28 days ago

Article

We Got Claude to Fine-Tune an Open Source LLM

30 days ago • 554

commented on Continuous batching from first principles about 1 month ago

Love this article :) @ArthurZ

upvoted 3 articles about 1 month ago

Article

Continuous batching from first principles

Nov 25, 2025 • 291

Article

No GPU left behind: Unlocking Efficiency with Co-located vLLM in TRL

Jun 3, 2025

Article

20x Faster TRL Fine-tuning with RapidFire AI

Nov 21, 2025

upvoted a collection 4 months ago

InternVL3.5

Collection

This collection includes all released checkpoints of InternVL3.5, covering different training stages (e.g., Pretraining, SFT, MPO, Cascade RL). • 54 items • Updated Sep 28, 2025 • 104

commented on a paper 5 months ago

Reinforcement Pre-Training

Paper • 2506.08007 • Published Jun 9, 2025 • 263

upvoted a paper 5 months ago

Reinforcement Pre-Training

Paper • 2506.08007 • Published Jun 9, 2025 • 263

liked a model 5 months ago

openai/gpt-oss-20b

Text Generation • 22B • Updated Aug 26, 2025 • 6.65M • 4.15k

upvoted 4 articles 6 months ago

Article

How to train a new language model from scratch using Transformers and Tokenizers

Feb 14, 2020

Article

Introducing HELMET: Holistically Evaluating Long-context Language Models

Apr 16, 2025

Article

π0 and π0-FAST: Vision-Language-Action Models for General Robot Control

Feb 4, 2025 • 186

Article

Finally, a Replacement for BERT: Introducing ModernBERT

Dec 19, 2024 • 715

upvoted a paper 7 months ago

SWE-rebench: An Automated Pipeline for Task Collection and Decontaminated Evaluation of Software Engineering Agents

Paper • 2505.20411 • Published May 26, 2025 • 92

liked 2 models 8 months ago

nvidia/parakeet-tdt-0.6b-v2

Automatic Speech Recognition • Updated Nov 27, 2025 • 614k • 1.39k

docling-project/SmolDocling-256M-preview

Image-Text-to-Text • 0.3B • Updated Sep 17, 2025 • 55.7k • 1.6k

upvoted a collection 8 months ago

GLM-4-0414

Collection

GLM-4-0414 series model • 8 items • Updated Jun 30, 2025 • 133
