Dang Kai

dangkai-nk

AI & ML interests

None yet

Recent Activity

upvoted a paper 8 days ago

Stabilizing Reinforcement Learning with LLMs: Formulation and Practices

Paper • 2512.01374 • Published 9 days ago • 83

upvoted a paper 6 months ago

Beyond the 80/20 Rule: High-Entropy Minority Tokens Drive Effective Reinforcement Learning for LLM Reasoning

Paper • 2506.01939 • Published Jun 2 • 187

upvoted 2 papers 7 months ago

WorldPM: Scaling Human Preference Modeling

Paper • 2505.10527 • Published May 15 • 34

Qwen3 Technical Report

Paper • 2505.09388 • Published May 14 • 317

liked 4 datasets 9 months ago

ByteDance/ComTQA

Viewer • Updated Oct 16, 2024 • 9.07k • 90 • 19

arc-agi-community/arc-agi-2

Viewer • Updated Apr 2 • 1.12k • 128 • 11

lmms-lab/Omni_Bench

Viewer • Updated Mar 27 • 1.14k • 24 • 1

PhoenixZ/MM-AlignBench

Updated Mar 1 • 27 • 4

liked 12 datasets 10 months ago

czh-up/CoMT

Viewer • Updated Feb 10 • 3.85k • 968 • 9

jonathan-roberts1/zerobench

Viewer • Updated Mar 6 • 434 • 893 • 28

USC-GVL/PhysBench

Updated Mar 5 • 377 • 15

jan-hq/Maze-Reasoning

Viewer • Updated Feb 6 • 100k • 115 • 20

TIGER-Lab/AceCode-87K

Viewer • Updated Feb 8 • 87.1k • 1.36k • 47

simplescaling/s1K

Viewer • Updated Feb 11 • 1k • 957 • 230

allenai/RLVR-IFeval

Viewer • Updated Nov 21, 2024 • 15k • 891 • 25

OpenCoder-LLM/opc-sft-stage2

Viewer • Updated Nov 24, 2024 • 436k • 1.66k • 97

likaixin/APPS-verified

Viewer • Updated Nov 17, 2024 • 4.21k • 22 • 5

likaixin/TACO-verified

Viewer • Updated Apr 17 • 12.9k • 613 • 18

GTSinger/GTSinger

Viewer • Updated Feb 9 • 69 • 4.82k • 35

cais/hle

Viewer • Updated Sep 10 • 2.5k • 22k • 548