Perry the Platypus PRO

AgPerry

AI & ML interests

None yet

Recent Activity

updated a dataset 4 days ago

TIGER-Lab/ClawBench

upvoted 2 papers 1 day ago

ScientistOne: Towards Human-Level Autonomous Research via Chain-of-Evidence

Paper • 2605.26340 • Published 20 days ago • 36

Where, What, Why, and Importance: Structured Defect Grounding for Text-to-Image Feedback

Paper • 2606.06113 • Published 10 days ago • 13

upvoted a paper 11 days ago

MIRA: Mid-training Rubric Anchoring for Source-Aware Data Selection

Paper • 2605.30288 • Published 16 days ago • 22

upvoted a paper 28 days ago

RewardHarness: Self-Evolving Agentic Post-Training

Paper • 2605.08703 • Published May 9 • 10

upvoted 4 collections about 1 month ago

eval-papers-collection

8 items • Updated Apr 13 • 1

Reading list

5 items • Updated May 10 • 1

Papers

4 items • Updated Apr 28 • 1

ClawBench — Browser Agent Benchmark Suite

Benchmark dataset (V1+V2), live leaderboard Space, and full V1 execution traces — everything you need to run, regrade, or compare on ClawBench. • 5 items • Updated May 12 • 1

upvoted 3 papers about 1 month ago

Dr. Bench: A Multidimensional Evaluation for Deep Research Agents, from Answers to Reports

Paper • 2510.02190 • Published Jan 29 • 20

Beyond Semantic Similarity: Rethinking Retrieval for Agentic Search via Direct Corpus Interaction

Paper • 2605.05242 • Published May 3 • 123

Visual Generation in the New Era: An Evolution from Atomic Mapping to Agentic World Modeling

Paper • 2604.28185 • Published Apr 30 • 90

upvoted 2 papers about 2 months ago

Tuna-2: Pixel Embeddings Beat Vision Encoders for Multimodal Understanding and Generation

Paper • 2604.24763 • Published Apr 27 • 71

VisualWebInstruct: Scaling up Multimodal Instruction Data through Web Search

Paper • 2503.10582 • Published Mar 13, 2025 • 25

upvoted 5 collections about 2 months ago

Vision

40 items • Updated 4 days ago • 2

Saved

5 items • Updated Apr 10 • 1

Paper

133 items • Updated Apr 23 • 2

Video understanding

57 items • Updated 3 days ago • 5

tanosi

3 items • Updated Apr 13 • 1

upvoted a collection 2 months ago

To read

227 items • Updated 2 days ago • 5

upvoted a paper 2 months ago

ClawBench: Can AI Agents Complete Everyday Online Tasks?

Paper • 2604.08523 • Published Apr 9 • 263