Strong Teacher Not Needed? On Distillation in LLM Pretraining Paper • 2605.23857 • Published 23 days ago
i1: A Simple and Fully Open Recipe for Strong Text-to-Image Models Paper • 2606.11289 • Published 5 days ago
Every Language Counts: Learn and Unlearn in Multilingual LLMs Paper • 2406.13748 • Published Jun 19, 2024
Insights into LLM Long-Context Failures: When Transformers Know but Don't Tell Paper • 2406.14673 • Published Jun 20, 2024
It Takes Two: On the Seamlessness between Reward and Policy Model in RLHF Paper • 2406.07971 • Published Jun 12, 2024