OccuBench: Evaluating AI Agents on Real-World Professional Tasks via Language World Models Paper • 2604.10866 • Published Apr 13 • 66
FIPO: Eliciting Deep Reasoning with Future-KL Influenced Policy Optimization Paper • 2603.19835 • Published Mar 20 • 351
Grad2Reward: From Sparse Judgment to Dense Rewards for Improving Open-Ended LLM Reasoning Paper • 2602.01791 • Published Feb 2 • 2
Article Waypoint-1.5: Higher-Fidelity Interactive Worlds for Everyday GPUs +3 lapp0, LouisCastricato, ScottieFox, shahbuland, xAesthetics • Apr 9 • 29
Article How we OCR'ed 30,000 papers using Codex, open OCR models and Jobs nielsr • Apr 7 • 62