Runpeng Dai

Leo-Dai

Recent Activity

authored a paper 2 days ago

Small RL Controller, Large Language Model: RL-Guided Adaptive Sampling for Test-Time Scaling

upvoted a paper 3 days ago

Small RL Controller, Large Language Model: RL-Guided Adaptive Sampling for Test-Time Scaling

authored a paper 25 days ago

DeltaRubric: Generative Multimodal Reward Modeling via Joint Planning and Verification

Leo-Dai's models (17)

Leo-Dai/PPO_BL_250_critic

4B • Updated Aug 15, 2025 • 1

Leo-Dai/PPO_BL_200_critic

Updated Aug 15, 2025 • 5

Leo-Dai/PPO_BL_300_actor

Updated Aug 15, 2025

Leo-Dai/PPO_BL_250_actor

Updated Aug 15, 2025

Leo-Dai/PPO_BL_300_critic

Updated Aug 15, 2025

Leo-Dai/GRPO_BL_40

4B • Updated Aug 15, 2025 • 1

Leo-Dai/GRPO_BL_30

4B • Updated Aug 15, 2025 • 1

Leo-Dai/GRPO_BL_20

4B • Updated Aug 15, 2025

Leo-Dai/GRPO_BL_400

4B • Updated Aug 15, 2025

Leo-Dai/GRPO_BL_10

4B • Updated Aug 15, 2025

Leo-Dai/GRPO_BL_350

4B • Updated Aug 15, 2025

Leo-Dai/GRPO_BL_200

4B • Updated Aug 13, 2025

Leo-Dai/GRPO_BL_150

4B • Updated Aug 13, 2025 • 1

Leo-Dai/GRPO_BL_100

4B • Updated Aug 13, 2025 • 1

Leo-Dai/GRPO_BL_300

4B • Updated Aug 13, 2025 • 1

Leo-Dai/GRPO_BL_250

4B • Updated Aug 13, 2025 • 1

Leo-Dai/GRPO_BL_50

4B • Updated Aug 13, 2025 • 1