RyanYr/pg_sais-dapo_shuffled-offline-grpo_qwen2.5-math-1.5B_kl_bl0_matheval Updated 13 days ago • 102
RyanYr/pg_sais-dapo_shuffled-offline-grpo_qwen2.5-math-1.5B_kl_bl0_matheval Updated 13 days ago • 102
RyanYr/pg-dapo_shuffled-01_offline-grpo_qwen2.5-math-1.5B_piref_kl_matheval Updated 13 days ago • 118
RyanYr/pg-dapo_shuffled-01_offline-grpo_qwen2.5-math-1.5B_piref_kl_matheval Updated 13 days ago • 118
RyanYr/pg-dapo_shuffled-10_offline-grpo_qwen2.5-math-1.5B_piref_nokl_matheval Viewer • Updated 14 days ago • 1.55k • 18
RyanYr/pg-dapo_shuffled-10_offline-grpo_qwen2.5-math-1.5B_piref_nokl_matheval Viewer • Updated 14 days ago • 1.55k • 18