Qwen/Qwen3.5-397B-A17B Image-Text-to-Text • 403B • Updated about 1 month ago • 782k • 1.44k
RedHatAI/Meta-Llama-3.1-8B-Instruct-FP8-dynamic Text Generation • 8B • Updated 26 days ago • 31.6k • 9
Article Building Tensors from Scratch in Rust (Part 1.2): View Operations • Jun 18, 2025 • 4
Running • Scaling test-time compute • 595 • Run advanced search strategies to boost LLM problem solving
deepseek-ai/DeepSeek-R1-0528-Qwen3-8B Text Generation • 8B • Updated May 29, 2025 • 286k • 1.05k
Search-R1 Collection Preliminary checkpoints with outcome-only RL. • 15 items • Updated Aug 12, 2025 • 17
Skywork/Skywork-Reward-Llama-3.1-8B-v0.2 Text Classification • 8B • Updated Oct 25, 2024 • 86.8k • 42
Native Sparse Attention: Hardware-Aligned and Natively Trainable Sparse Attention Paper • 2502.11089 • Published Feb 16, 2025 • 170
meta-llama/Llama-3.3-70B-Instruct Text Generation • 71B • Updated Dec 21, 2024 • 449k • 2.7k