Speed by Simplicity: A Single-Stream Architecture for Fast Audio-Video Generative Foundation Model Paper • 2603.21986 • Published Mar 23 • 124
Matrix-Game 3.0: Real-Time and Streaming Interactive World Model with Long-Horizon Memory Paper • 2604.08995 • Published 14 days ago • 48
HY-World 2.0: A Multi-Modal World Model for Reconstructing, Generating, and Simulating 3D Worlds Paper • 2604.14268 • Published 9 days ago • 109
CoInteract: Physically-Consistent Human-Object Interaction Video Synthesis via Spatially-Structured Co-Generation Paper • 2604.19636 • Published 3 days ago • 79
WildDet3D: Scaling Promptable 3D Detection in the Wild Paper • 2604.08626 • Published 15 days ago • 240
ELT: Elastic Looped Transformers for Visual Generation Paper • 2604.09168 • Published 14 days ago • 19
Article The First Healthcare Robotics Dataset and Foundational Physical AI Models for Healthcare Robotics Mar 16 • 29