HUMAN-WRITTEN & LEGALLY-SOURCED* Collection Datasets written by humans and/or reverse-engineered from text with deterministic algorithms. No illegal scraping or unethical synthesis *...mostly. • 163 items • Updated about 3 hours ago • 2
Test-Time Scaling Makes Overtraining Compute-Optimal Paper • 2604.01411 • Published 9 days ago • 22
Token Warping Helps MLLMs Look from Nearby Viewpoints Paper • 2604.02870 • Published 7 days ago • 28
Less Detail, Better Answers: Degradation-Driven Prompting for VQA Paper • 2604.04838 • Published 4 days ago • 11
WTF GENIUS PAPERS Collection Papers that made me appreciate my major and my life a little more. obs=Observation, innov=Innovation. Most papers are about improving tiny models. • 87 items • Updated about 5 hours ago • 12
TriAttention: Efficient Long Reasoning with Trigonometric KV Compression Paper • 2604.04921 • Published 4 days ago • 89
Adam's Law: Textual Frequency Law on Large Language Models Paper • 2604.02176 • Published 8 days ago • 161
REAM: Merging Improves Pruning of Experts in LLMs Paper • 2604.04356 • Published 4 days ago • 3 • 3
DARE: Diffusion Large Language Models Alignment and Reinforcement Executor Paper • 2604.04215 • Published 5 days ago • 18
MARS: Enabling Autoregressive Models Multi-Token Generation Paper • 2604.07023 • Published 2 days ago • 21
Embarrassingly Simple Self-Distillation Improves Code Generation Paper • 2604.01193 • Published 8 days ago • 33 • 5
IMU-1: Sample-Efficient Pre-training of Small Language Models Paper • 2602.02522 • Published Jan 25 • 7
COSMOS: Predictable and Cost-Effective Adaptation of LLMs Paper • 2505.01449 • Published Apr 30, 2025 • 4