gemma-4-31B-Claude-4.6-Opus-thinking-distilled-s7-multimodal
This release configures and lists the fine-tune correctly for multimodality.
A full-parameter fine-tune of Gemma 4 31B on ~12,000 Claude Opus 4.6 reasoning traces, developed in-house.
Highlights
- ~90% token accuracy after 4 epochs
- Full parameter SFT, not LoRA
- 12,000 pure Claude Opus 4.6 traces — consistent reasoning style, no mixed-model data
- Native Gemma 4 thinking format — uses the standard built-in thinking tokens (see the inference sketch below)
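
The card relies on the checkpoint's built-in chat template for the thinking format. A minimal text-only inference sketch with Hugging Face transformers follows; it assumes the standard Gemma chat template ships with the repo and that `AutoModelForCausalLM` resolves the architecture, neither of which is confirmed by this card.

```python
# Minimal text-only inference sketch. Assumptions: the standard Gemma chat
# template ships with this repo, and AutoModelForCausalLM resolves the
# architecture; neither is confirmed by this card.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "shreyan35/gemma-4-31B-Claude-4.6-Opus-thinking-distilled-s7"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # ~65 GB in bf16, per the hardware table below
    device_map="auto",
)

messages = [{"role": "user", "content": "Prove that sqrt(2) is irrational."}]
input_ids = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,  # the template opens the assistant turn,
    return_tensors="pt",         # including any built-in thinking tokens
).to(model.device)

output = model.generate(input_ids, max_new_tokens=2048)
# skip_special_tokens=False keeps the thinking tokens visible for inspection
print(tokenizer.decode(output[0][input_ids.shape[-1]:], skip_special_tokens=False))
```

Decoding with `skip_special_tokens=False` leaves the thinking span intact so the reasoning trace can be inspected or stripped downstream.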
Benchmark Results
Reasoning & Knowledge
| Benchmark | S7 Score |
|---|---|
| MMLU Pro | 90.3% |
| GPQA Diamond | 89.4% |
| BigBench Extra Hard | 78.9% |
| MMMLU (Multilingual) | 93.7% |
| HLE (no tools) | 20.7% |
| HLE (with search) | 28.1% |
Mathematics & Coding
| Benchmark | S7 Score |
|---|---|
| AIME 2026 (no tools) | 94.6% |
| LiveCodeBench v6 | 84.8% |
| Codeforces Elo | 2279 |
| HumanEval | 96.7% |
| MBPP Plus | 94.0% |
Multimodal (Vision & Medical)
| Benchmark | S7 Score |
|---|---|
| MMMU Pro | 81.5% |
| MATH-Vision | 90.7% |
| MedXPertQA MM | 65.0% |
Agentic & Long Context
| Benchmark | S7 Score |
|---|---|
| τ²-bench (Average) | 81.5% |
| τ²-bench (Retail) | 91.6% |
| MRCR v2 (8-needle, 128K) | 70.4% |
Overall improvement over the base model: ~6%
Model Specifications
- Parameters: 30.7B (Dense)
- Architecture: 60 Layers
- Context Window: 256K tokens
- Vocabulary Size: 262,144
- Native Modalities: Text, Image, Video (frame sequences); see the multimodal sketch below
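
Since the card lists image and video (as frame sequences) among the native modalities, here is a hedged multimodal sketch using transformers' generic `AutoProcessor` / `AutoModelForImageTextToText` entry points. The image URL is a placeholder, and whether video frames are passed the same way is an assumption, not something this card states.

```python
# Hedged multimodal sketch using transformers' generic image-text entry
# points. The image URL is a placeholder; whether video frames are passed
# the same way is an assumption, not something this card states.
import torch
from transformers import AutoModelForImageTextToText, AutoProcessor

model_id = "shreyan35/gemma-4-31B-Claude-4.6-Opus-thinking-distilled-s7"

processor = AutoProcessor.from_pretrained(model_id)
model = AutoModelForImageTextToText.from_pretrained(
    model_id, torch_dtype=torch.bfloat16, device_map="auto"
)

messages = [{
    "role": "user",
    "content": [
        {"type": "image", "url": "https://example.com/scan.png"},  # placeholder
        {"type": "text", "text": "Describe any abnormality in this image."},
    ],
}]
inputs = processor.apply_chat_template(
    messages,
    add_generation_prompt=True,
    tokenize=True,
    return_dict=True,
    return_tensors="pt",
).to(model.device)

output = model.generate(**inputs, max_new_tokens=512)
print(processor.decode(output[0][inputs["input_ids"].shape[-1]:],
                       skip_special_tokens=True))
```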
Training Data (~12,000 samples)
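
The card does not include a training script. Below is an illustrative full-parameter SFT setup with TRL that matches the stated recipe (full-parameter, no LoRA, 4 epochs); the dataset file name, chat-column layout, and learning rate are hypothetical placeholders.

```python
# Illustrative full-parameter SFT wiring with TRL (no LoRA), matching the
# "4 epochs" note above. The dataset file and its "messages" chat column
# are hypothetical placeholders; this card does not publish the data.
from datasets import load_dataset
from trl import SFTConfig, SFTTrainer

dataset = load_dataset("json", data_files="opus_traces.jsonl", split="train")

config = SFTConfig(
    output_dir="gemma4-31b-opus-sft",
    num_train_epochs=4,              # per the Highlights section
    per_device_train_batch_size=1,
    gradient_accumulation_steps=16,
    learning_rate=1e-5,              # illustrative; not stated in the card
    bf16=True,
    gradient_checkpointing=True,     # practically required at 31B full-parameter
)

trainer = SFTTrainer(
    model="google/gemma-4-31B-it",   # base model listed at the bottom of the card
    args=config,
    train_dataset=dataset,           # TRL applies the chat template to "messages"
)
trainer.train()
```

A real full-parameter run at 31B would additionally need FSDP or DeepSpeed sharding across many GPUs; this sketch only shows the trainer wiring.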
Hardware Requirements
| Format | VRAM | Example hardware |
|---|---|---|
| bf16 | ~65GB | 1x A100/H100 80GB |
| Q8 | ~35GB | 2x RTX 4090 |
| Q4_K_M | ~20GB | RTX 4090 |
| Q3_K_M | ~15GB | RTX 4080 |
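
The Q8 / Q4_K_M / Q3_K_M rows are llama.cpp-style GGUF quantizations. A minimal sketch with the llama-cpp-python bindings, assuming a GGUF export of this checkpoint exists (the file name below is a placeholder; this card does not link one):

```python
# Running a hypothetical Q4_K_M GGUF export on a single 24 GB GPU via the
# llama-cpp-python bindings. The .gguf file name is a placeholder; this
# card does not link a GGUF conversion.
from llama_cpp import Llama

llm = Llama(
    model_path="gemma-4-31B-s7.Q4_K_M.gguf",  # placeholder file name
    n_gpu_layers=-1,  # offload every layer (~20 GB, per the table above)
    n_ctx=32768,      # comfortably below the 256K architectural ceiling
)

result = llm.create_chat_completion(
    messages=[{"role": "user", "content": "Summarize the CAP theorem."}],
    max_tokens=512,
)
print(result["choices"][0]["message"]["content"])
```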
License
MIT
Base model
- google/gemma-4-31B-it