Emergent Hierarchical Reasoning in LLMs through Reinforcement Learning Paper • 2509.03646 • Published Sep 3, 2025 • 33
Reverse-Engineered Reasoning for Open-Ended Generation Paper • 2509.06160 • Published Sep 7, 2025 • 151
VideoScore2: Think before You Score in Generative Video Evaluation Paper • 2509.22799 • Published Sep 26, 2025 • 26
Dr. Bench: A Multidimensional Evaluation for Deep Research Agents, from Answers to Reports Paper • 2510.02190 • Published Jan 29 • 19
Infinity Parser: Layout Aware Reinforcement Learning for Scanned Document Parsing Paper • 2510.15349 • Published Oct 17, 2025
From Illusion to Intention: Visual Rationale Learning for Vision-Language Reasoning Paper • 2511.23031 • Published Nov 28, 2025 • 1
EvoCUA: Evolving Computer Use Agents via Learning from Scalable Synthetic Experience Paper • 2601.15876 • Published Jan 22 • 92
Understanding by Reconstruction: Reversing the Software Development Process for LLM Pretraining Paper • 2603.11103 • Published Mar 11 • 9
SWE-QA-Pro: A Representative Benchmark and Scalable Training Recipe for Repository-Level Code Understanding Paper • 2603.16124 • Published Mar 17 • 3
LongCat-Next: Lexicalizing Modalities as Discrete Tokens Paper • 2603.27538 • Published 21 days ago • 143
RationalRewards: Reasoning Rewards Scale Visual Generation Both Training and Test Time Paper • 2604.11626 • Published 7 days ago • 100
EvolveCoder: Evolving Test Cases via Adversarial Verification for Code Reinforcement Learning Paper • 2603.12698 • Published Mar 13 • 1
Nemotron-Cascade 2: Post-Training LLMs with Cascade RL and Multi-Domain On-Policy Distillation Paper • 2603.19220 • Published about 1 month ago • 66
OpenResearcher: A Fully Open Pipeline for Long-Horizon Deep Research Trajectory Synthesis Paper • 2603.20278 • Published Mar 17 • 94