FORGE:Fine-grained Multimodal Evaluation for Manufacturing Scenarios Paper • 2604.07413 • Published 7 days ago • 86 • 5
FORGE:Fine-grained Multimodal Evaluation for Manufacturing Scenarios Paper • 2604.07413 • Published 7 days ago • 86
FORGE:Fine-grained Multimodal Evaluation for Manufacturing Scenarios Paper • 2604.07413 • Published 7 days ago • 86
FORGE:Fine-grained Multimodal Evaluation for Manufacturing Scenarios Paper • 2604.07413 • Published 7 days ago • 86 • 5
FORGE:Fine-grained Multimodal Evaluation for Manufacturing Scenarios Paper • 2604.07413 • Published 7 days ago • 86
ThinkTwice: Jointly Optimizing Large Language Models for Reasoning and Self-Refinement Paper • 2604.01591 • Published 13 days ago • 40
CORAL: Towards Autonomous Multi-Agent Evolution for Open-Ended Discovery Paper • 2604.01658 • Published 13 days ago • 54
Terminal Agents Suffice for Enterprise Automation Paper • 2604.00073 • Published 14 days ago • 93
CUA-Suite: Massive Human-annotated Video Demonstrations for Computer-Use Agents Paper • 2603.24440 • Published 20 days ago • 96
CUA-Suite: Massive Human-annotated Video Demonstrations for Computer-Use Agents Paper • 2603.24440 • Published 20 days ago • 96
CUA-Suite: Massive Human-annotated Video Demonstrations for Computer-Use Agents Paper • 2603.24440 • Published 20 days ago • 96 • 5
EnterpriseOps-Gym: Environments and Evaluations for Stateful Agentic Planning and Tool Use in Enterprise Settings Paper • 2603.13594 • Published Mar 13 • 148