Foundation Protocol: A Coordination Layer for Agentic Society Paper • 2605.23218 • Published 10 days ago • 77
Perception or Prejudice: Can MLLMs Go Beyond First Impressions of Personality? Paper • 2605.22109 • Published 11 days ago • 169
Training Time Calculator 🚀 • Running • Agents • 13 • Calculates the amount of time it will take to train a model
Code-Switching Information Retrieval: Benchmarks, Analysis, and the Limits of Current Retrievers Paper • 2604.17632 • Published Apr 19 • 11
AJ-Bench: Benchmarking Agent-as-a-Judge for Environment-Aware Evaluation Paper • 2604.18240 • Published Apr 20 • 16
Chat2Workflow: A Benchmark for Generating Executable Visual Workflows with Natural Language Paper • 2604.19667 • Published Apr 21 • 22
TEMPO: Scaling Test-time Training for Large Reasoning Models Paper • 2604.19295 • Published Apr 21 • 35
SkillFlow: Benchmarking Lifelong Skill Discovery and Evolution for Autonomous Agents Paper • 2604.17308 • Published Apr 19 • 22
When Can LLMs Learn to Reason with Weak Supervision? Paper • 2604.18574 • Published Apr 20 • 25
ML Intern 🤖 • Running on CPU Upgrade • Featured • 388 • Get AI‑powered help with machine‑learning questions