Anti-Self-Distillation for Reasoning RL via Pointwise Mutual Information Paper • 2605.11609 • Published 12 days ago • 189
ReactiveGWM: Steering NPC in Reactive Game World Models Paper • 2605.15256 • Published 10 days ago • 28
OpenSearch-VL: An Open Recipe for Frontier Multimodal Search Agents Paper • 2605.05185 • Published 18 days ago • 99
From Context to Skills: Can Language Models Learn from Context Skillfully? Paper • 2604.27660 • Published 21 days ago • 162
Phantom: Physics-Infused Video Generation via Joint Modeling of Visual and Latent Physical Dynamics Paper • 2604.08503 • Published Apr 9 • 7
Rethinking Generalization in Reasoning SFT: A Conditional Analysis on Optimization, Data, and Model Capability Paper • 2604.06628 • Published Apr 8 • 325
HY-Embodied-0.5: Embodied Foundation Models for Real-World Agents Paper • 2604.07430 • Published Apr 8 • 187
An Efficient Heterogeneous Co-Design for Fine-Tuning on a Single GPU Paper • 2603.16428 • Published Mar 17 • 51
MPDiT: Multi-Patch Global-to-Local Transformer Architecture For Efficient Flow Matching and Diffusion Model Paper • 2603.26357 • Published Mar 27 • 4
Adam's Law: Textual Frequency Law on Large Language Models Paper • 2604.02176 • Published Apr 2 • 503
OptiMer: Optimal Distribution Vector Merging Is Better than Data Mixing for Continual Pre-Training Paper • 2603.28858 • Published Mar 30 • 9
FIPO: Eliciting Deep Reasoning with Future-KL Influenced Policy Optimization Paper • 2603.19835 • Published Mar 20 • 351