Wei He

hewei2001

https://hwcoder.top/about

AI & ML interests

LLM

Recent Activity

liked a dataset 26 days ago

meituan-longcat/VitaBench

Updated Nov 5 • 622 • 12

upvoted a paper 27 days ago

Thinking with Video: Video Generation as a Promising Multimodal Reasoning Paradigm

Paper • 2511.04570 • Published about 1 month ago • 208

upvoted a paper about 1 month ago

BAPO: Stabilizing Off-Policy Reinforcement Learning for LLMs via Balanced Policy Optimization with Adaptive Clipping

Paper • 2510.18927 • Published Oct 21 • 83

authored a paper about 2 months ago

R-Horizon: How Far Can Your Large Reasoning Model Really Go in Breadth and Depth?

Paper • 2510.08189 • Published Oct 9 • 26

upvoted a paper about 2 months ago

R-Horizon: How Far Can Your Large Reasoning Model Really Go in Breadth and Depth?

Paper • 2510.08189 • Published Oct 9 • 26

authored 5 papers 2 months ago

Better Process Supervision with Bi-directional Rewarding Signals

Paper • 2503.04618 • Published Mar 6

LongCat-Flash Technical Report

Paper • 2509.01322 • Published Sep 1 • 6

AgentGym-RL: Training LLM Agents for Long-Horizon Decision Making through Multi-Turn Reinforcement Learning

Paper • 2509.08755 • Published Sep 10 • 56

LongCat-Flash-Thinking Technical Report

Paper • 2509.18883 • Published Sep 23 • 4

VitaBench: Benchmarking LLM Agents with Versatile Interactive Tasks in Real-world Applications

Paper • 2509.26490 • Published Sep 30 • 19

New activity in meituan-longcat/VitaBench 2 months ago

Update README.md

#4 opened 2 months ago by

Update README.md

#3 opened 2 months ago by

upvoted a paper 2 months ago

VitaBench: Benchmarking LLM Agents with Versatile Interactive Tasks in Real-world Applications

Paper • 2509.26490 • Published Sep 30 • 19

commented on a paper 2 months ago

VitaBench: Benchmarking LLM Agents with Versatile Interactive Tasks in Real-world Applications

Paper • 2509.26490 • Published Sep 30 • 19

New activity in meituan-longcat/VitaBench 2 months ago

init dataset

#1 opened 2 months ago by

init dataset

#2 opened 2 months ago by

authored 4 papers 3 months ago

Enhancing LLM Reasoning via Critique Models with Test-Time and Training-Time Supervision

Paper • 2411.16579 • Published Nov 25, 2024 • 3

Mitigating Tail Narrowing in LLM Self-Improvement via Socratic-Guided Sampling

Paper • 2411.00750 • Published Nov 1, 2024 • 1

Training Large Language Models for Reasoning through Reverse Curriculum Reinforcement Learning

Paper • 2402.05808 • Published Feb 8, 2024

Enhancing LLM Reasoning with Iterative DPO: A Comprehensive Empirical Investigation

Paper • 2503.12854 • Published Mar 17