LPM 1.0: Video-based Character Performance Model Paper • 2604.07823 • Published 7 days ago • 68
FIT: A Large-Scale Dataset for Fit-Aware Virtual Try-On Paper • 2604.08526 • Published 7 days ago • 20
When Numbers Speak: Aligning Textual Numerals and Visual Instances in Text-to-Video Diffusion Models Paper • 2604.08546 • Published 7 days ago • 112
RewardFlow: Generate Images by Optimizing What You Reward Paper • 2604.08536 • Published 7 days ago • 5
KnowU-Bench: Towards Interactive, Proactive, and Personalized Mobile Agent Evaluation Paper • 2604.08455 • Published 7 days ago • 44
INSPATIO-WORLD: A Real-Time 4D World Simulator via Spatiotemporal Autoregressive Modeling Paper • 2604.07209 • Published 8 days ago • 35
Personalized RewardBench: Evaluating Reward Models with Human Aligned Personalization Paper • 2604.07343 • Published 8 days ago • 13
MARS: Enabling Autoregressive Models Multi-Token Generation Paper • 2604.07023 • Published 8 days ago • 38
How Well Do Agentic Skills Work in the Wild: Benchmarking LLM Skill Usage in Realistic Settings Paper • 2604.04323 • Published 10 days ago • 40
Video-MME-v2: Towards the Next Stage in Benchmarks for Comprehensive Video Understanding Paper • 2604.05015 • Published 10 days ago • 232
Swift-SVD: Theoretical Optimality Meets Practical Efficiency in Low-Rank LLM Compression Paper • 2604.01609 • Published 14 days ago • 11
Beyond Accuracy: Unveiling Inefficiency Patterns in Tool-Integrated Reasoning Paper • 2604.05404 • Published 9 days ago • 41
VLMs Need Words: Vision Language Models Ignore Visual Detail In Favor of Semantic Anchors Paper • 2604.02486 • Published 14 days ago • 9
Unfolding Robotics: Open-Source Shirt Folding from Data to Deployment Space • Running • 65 • Explore the open-source guide to robot shirt folding
LightThinker++: From Reasoning Compression to Memory Management Paper • 2604.03679 • Published 12 days ago • 34
TriAttention: Efficient Long Reasoning with Trigonometric KV Compression Paper • 2604.04921 • Published 10 days ago • 107
MinerU2.5-Pro: Pushing the Limits of Data-Centric Document Parsing at Scale Paper • 2604.04771 • Published 10 days ago • 119
OpenWorldLib: A Unified Codebase and Definition of Advanced World Models Paper • 2604.04707 • Published 10 days ago • 200