Зайцев Кирилл

levismithru

AI & ML interests

None yet

Recent Activity

liked a dataset 3 days ago

ubetu/self-oss-instruct-sft

liked a dataset 4 days ago

CodePit/OnchainPlanBench-Seed

Organizations

None yet

upvoted a paper 2 days ago

When Does Multi-Agent RL Improve LLM Workflows? Workflow, Scale, and Policy-Sharing Tradeoffs

Paper • 2605.24202 • Published 16 days ago • 17

upvoted a paper 4 days ago

Crafter: A Multi-Agent Harness for Editable Scientific Figure Generation from Diverse Inputs

Paper • 2605.30611 • Published 10 days ago • 190

upvoted a paper 6 days ago

Claw-Anything: Benchmarking Always-On Personal Assistants with Broader Access to User's Digital World

Paper • 2605.26086 • Published 13 days ago • 23

upvoted a paper 14 days ago

Delta Attention Residuals

Paper • 2605.18855 • Published 25 days ago • 8

upvoted a paper 19 days ago

Can Muon Fine-tune Adam-Pretrained Models?

Paper • 2605.10468 • Published 27 days ago • 6

upvoted a paper 23 days ago

Revisiting DAgger in the Era of LLM-Agents

Paper • 2605.12913 • Published 25 days ago • 6

upvoted a paper 26 days ago

Mean Mode Screaming: Mean–Variance Split Residuals for 1000-Layer Diffusion Transformers

Paper • 2605.06169 • Published about 1 month ago • 233

upvoted 4 papers about 2 months ago

Act Wisely: Cultivating Meta-Cognitive Tool Use in Agentic Multimodal Models

Paper • 2604.08545 • Published Apr 9 • 41

Adam's Law: Textual Frequency Law on Large Language Models

Paper • 2604.02176 • Published Apr 2 • 506

Training a Student Expert via Semi-Supervised Foundation Model Distillation

Paper • 2604.03841 • Published Apr 4 • 11

Video Models Reason Early: Exploiting Plan Commitment for Maze Solving

Paper • 2603.30043 • Published Mar 31 • 14

upvoted 3 papers 2 months ago

ClawKeeper: Comprehensive Safety Protection for OpenClaw Agents Through Skills, Plugins, and Watchers

Paper • 2603.24414 • Published Mar 25 • 183

FIPO: Eliciting Deep Reasoning with Future-KL Influenced Policy Optimization

Paper • 2603.19835 • Published Mar 20 • 352

Astrolabe: Steering Forward-Process Reinforcement Learning for Distilled Autoregressive Video Models

Paper • 2603.17051 • Published Mar 17 • 109

upvoted a paper 3 months ago

SocialOmni: Benchmarking Audio-Visual Social Interactivity in Omni Models

Paper • 2603.16859 • Published Mar 17 • 249