Max Belitsky

mbelitsky

AI & ML interests

None yet

Recent Activity

commented on a paper about 2 months ago

What Layers When: Learning to Skip Compute in LLMs with Residual Gates

upvoted a paper about 2 months ago

What Layers When: Learning to Skip Compute in LLMs with Residual Gates

authored a paper 5 months ago

KV Cache Steering for Inducing Reasoning in Small Language Models

Organizations

None yet

commented on a paper about 2 months ago

What Layers When: Learning to Skip Compute in LLMs with Residual Gates

Paper • 2510.13876 • Published Oct 13 • 11

upvoted a paper about 2 months ago

What Layers When: Learning to Skip Compute in LLMs with Residual Gates

Paper • 2510.13876 • Published Oct 13 • 11

authored a paper 5 months ago

KV Cache Steering for Inducing Reasoning in Small Language Models

Paper • 2507.08799 • Published Jul 11 • 40

commented on a paper 5 months ago

KV Cache Steering for Inducing Reasoning in Small Language Models

Paper • 2507.08799 • Published Jul 11 • 40

upvoted a paper 5 months ago

KV Cache Steering for Inducing Reasoning in Small Language Models

Paper • 2507.08799 • Published Jul 11 • 40

liked 4 datasets 7 months ago

upvoted a paper 7 months ago

Unilogit: Robust Machine Unlearning for LLMs Using Uniform-Target Self-Distillation

Paper • 2505.06027 • Published May 9 • 18

liked a Space 9 months ago

The Ultra-Scale Playbook

🌌

3.55k

The ultimate guide to training LLM on large GPU Clusters

liked 3 datasets 10 months ago

truthfulqa/truthful_qa

Viewer • Updated Jan 4, 2024 • 1.63k • 62.3k • 268

abacusai/MetaMathFewshot

Viewer • Updated Jan 17, 2024 • 395k • 157 • 27

AwesomeEmerald/OpenSpatialLogic

Viewer • Updated Apr 4, 2024 • 36 • 34 • 8

updated a dataset over 1 year ago

mbelitsky/wikipedia_subset

Viewer • Updated May 14, 2024 • 1.04M • 71

liked a dataset over 2 years ago

Cohere/wikipedia-22-12-simple-embeddings

Viewer • Updated Mar 22, 2023 • 486k • 395 • 57
