Hugging Face
Tao Gui
guixiaotao
AI & ML interests
None yet
Recent Activity
upvoted a paper about 1 month ago
BAPO: Stabilizing Off-Policy Reinforcement Learning for LLMs via Balanced Policy Optimization with Adaptive Clipping
upvoted a paper 5 months ago
Pre-Trained Policy Discriminators are General Reward Models
authored a paper 11 months ago
ToolHop: A Query-Driven Benchmark for Evaluating Large Language Models in Multi-Hop Tool Use
Organizations
None yet
guixiaotao's activity
liked a dataset 12 months ago
hewei2001/ReachQA
Viewer • Updated Sep 5 • 22k • 278 • 8