Hugging Face
Runpeng Dai
Leo-Dai
1 follower · 3 following
AI & ML interests
None yet
Recent Activity
authored a paper 2 days ago: Small RL Controller, Large Language Model: RL-Guided Adaptive Sampling for Test-Time Scaling
upvoted a paper 3 days ago: Small RL Controller, Large Language Model: RL-Guided Adaptive Sampling for Test-Time Scaling
authored a paper 25 days ago: DeltaRubric: Generative Multimodal Reward Modeling via Joint Planning and Verification
Organizations
Leo-Dai's models (17)
Leo-Dai/PPO_BL_250_critic • 4B • Updated Aug 15, 2025 • 1
Leo-Dai/PPO_BL_200_critic • Updated Aug 15, 2025 • 5
Leo-Dai/PPO_BL_300_actor • Updated Aug 15, 2025
Leo-Dai/PPO_BL_250_actor • Updated Aug 15, 2025
Leo-Dai/PPO_BL_300_critic • Updated Aug 15, 2025
Leo-Dai/GRPO_BL_40 • 4B • Updated Aug 15, 2025 • 1
Leo-Dai/GRPO_BL_30 • 4B • Updated Aug 15, 2025 • 1
Leo-Dai/GRPO_BL_20 • 4B • Updated Aug 15, 2025
Leo-Dai/GRPO_BL_400 • 4B • Updated Aug 15, 2025
Leo-Dai/GRPO_BL_10 • 4B • Updated Aug 15, 2025
Leo-Dai/GRPO_BL_350 • 4B • Updated Aug 15, 2025
Leo-Dai/GRPO_BL_200 • 4B • Updated Aug 13, 2025
Leo-Dai/GRPO_BL_150 • 4B • Updated Aug 13, 2025 • 1
Leo-Dai/GRPO_BL_100 • 4B • Updated Aug 13, 2025 • 1
Leo-Dai/GRPO_BL_300 • 4B • Updated Aug 13, 2025 • 1
Leo-Dai/GRPO_BL_250 • 4B • Updated Aug 13, 2025 • 1
Leo-Dai/GRPO_BL_50 • 4B • Updated Aug 13, 2025 • 1