I like how the explanation for GRPO is just a giant formula with no explanation of what it does
Ivan Nikishev
dpe1
AI & ML interests
he he he
Recent Activity
commented on an article 5 days ago
DeepSeek-R1 Dissection: Understanding PPO & GRPO Without Any Prior Reinforcement Learning Knowledge new activity 5 days ago
HuggingFaceTB/nanowhale-100m: Nice new activity 17 days ago
arnir0/Tiny-LLM: tiny-llm