EarlyTom: Early Token Compression Completes Fast Video Understanding Paper • 2605.30010 • Published 3 days ago • 23
Q-ARVD: Quantizing Autoregressive Video Diffusion Models Paper • 2605.21072 • Published 11 days ago • 21
Mix-Quant: Quantized Prefilling, Precise Decoding for Agentic LLMs Paper • 2605.20315 • Published 12 days ago • 28
On-Policy Self-Evolution via Failure Trajectories for Agentic Safety Alignment Paper • 2605.11882 • Published 19 days ago • 16
PASA: A Principled Embedding-Space Watermarking Approach for LLM-Generated Text under Semantic-Invariant Attacks Paper • 2605.10977 • Published 22 days ago • 10
Visual Generation in the New Era: An Evolution from Atomic Mapping to Agentic World Modeling Paper • 2604.28185 • Published about 1 month ago • 90
Vision-Language-Action Safety: Threats, Challenges, Evaluations, and Mechanisms Paper • 2604.23775 • Published Apr 26 • 45
Gated Condition Injection without Multimodal Attention: Towards Controllable Linear-Attention Transformers Paper • 2603.27666 • Published Mar 29 • 18
AutoMIA: Improved Baselines for Membership Inference Attack via Agentic Self-Exploration Paper • 2604.01014 • Published Apr 1 • 11
Can MLLMs Guide Me Home? A Benchmark Study on Fine-Grained Visual Reasoning from Transit Maps Paper • 2505.18675 • Published May 24, 2025 • 28 • 4
RewardMap: Tackling Sparse Rewards in Fine-grained Visual Reasoning via Multi-Stage Reinforcement Learning Paper • 2510.02240 • Published Oct 2, 2025 • 18 • 3