Tan

RiccardTo

AI & ML interests

None yet

Organizations

None yet

Recent Activity

upvoted a paper 8 days ago

DelTA: Discriminative Token Credit Assignment for Reinforcement Learning from Verifiable Rewards

Paper • 2605.21467 • Published 10 days ago • 204

upvoted 2 papers 5 months ago

Coupling Experts and Routers in Mixture-of-Experts via an Auxiliary Loss

Paper • 2512.23447 • Published Dec 29, 2025 • 99

Bottom-up Policy Optimization: Your Language Model Policy Secretly Contains Internal Policies

Paper • 2512.19673 • Published Dec 22, 2025 • 66

upvoted a paper 12 months ago

The Climb Carves Wisdom Deeper Than the Summit: On the Noisy Rewards in Learning to Reason

Paper • 2505.22653 • Published May 28, 2025 • 43

upvoted a paper over 1 year ago

Autonomy-of-Experts Models

Paper • 2501.13074 • Published Jan 22, 2025 • 44

liked a model about 2 years ago

Ori/llama-2-13b-peft-strategyqa-with-ret-at-1

Updated Sep 22, 2023 • 2 • 1