HRBench: Benchmarking and Understanding Thinking-Mode Switch Strategies in Hybrid-Reasoning LLMs Paper • 2605.28398 • Published 7 days ago • 15
Perception or Prejudice: Can MLLMs Go Beyond First Impressions of Personality? Paper • 2605.22109 • Published 13 days ago • 169
Anti-Self-Distillation for Reasoning RL via Pointwise Mutual Information Paper • 2605.11609 • Published 22 days ago • 195
AutoResearchClaw: Self-Reinforcing Autonomous Research with Human-AI Collaboration Paper • 2605.20025 • Published 15 days ago • 185
CiteVQA: Benchmarking Evidence Attribution for Trustworthy Document Intelligence Paper • 2605.12882 • Published 21 days ago • 269
LiVeAction: a Lightweight, Versatile, and Asymmetric Neural Codec Design for Real-time Operation Paper • 2605.06628 • Published 27 days ago • 6
abhayesian/ryan-greenblatt-simulator-segment18-wrong-author-buck-scaffold Viewer • Updated 26 days ago • 1 • 76 • 1
TCOD: Exploring Temporal Curriculum in On-Policy Distillation for Multi-turn Autonomous Agents Paper • 2604.24005 • Published Apr 27 • 8