Safetensors
qwen3_5_moe
reasoning
chain-of-thought
distillation
qwen3.5
unsloth

Qwen3.5-35B-A3B-Opus-Reasoning-Distilled-Uncensored

Model Description

This model is a LoRA fine-tune of Li101/Qwen3.5-35B-A3B-Uncensored-Aggressive-safetensors, trained on distilled reasoning traces to strengthen its chain-of-thought reasoning.

Training Details

  • Base Model: Li101/Qwen3.5-35B-A3B-Uncensored-Aggressive-safetensors
  • Method: bf16 LoRA, with loss computed on responses only (train_on_responses_only)
  • LoRA Rank: 16
  • Epochs: 2
  • Max Sequence Length: 4096
  • Learning Rate: 2e-5
  • Framework: Unsloth + TRL
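
The response-only method listed above means the loss is computed on assistant tokens only, not on the prompt. The sketch below illustrates that idea in plain Python; it is not Unsloth's implementation. Prompt positions receive the label -100, which is the default ignore index of PyTorch's cross-entropy loss. The function name `mask_prompt_labels` and the token IDs are hypothetical, chosen for illustration.

```python
IGNORE_INDEX = -100  # default ignore_index of torch.nn.CrossEntropyLoss


def mask_prompt_labels(input_ids, response_start):
    """Return labels in which every token before response_start is ignored."""
    if not 0 <= response_start <= len(input_ids):
        raise ValueError("response_start must lie within the sequence")
    # Prompt tokens contribute no loss; response tokens keep their own IDs.
    return [IGNORE_INDEX] * response_start + list(input_ids[response_start:])


# Hypothetical token IDs: three prompt tokens followed by two response tokens.
example_ids = [101, 102, 103, 201, 202]
example_labels = mask_prompt_labels(example_ids, response_start=3)
```

In a real pipeline the response boundary would be located from the chat template's assistant marker rather than passed in by hand.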

Datasets

  • nohurry/Opus-4.6-Reasoning-3000x-filtered
  • Jackrong/Qwen3.5-reasoning-700x
  • Roman1111111/claude-opus-4.6-10000x

Format

The model uses <think>...</think> tags for chain-of-thought reasoning.
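
Downstream code usually needs to separate the reasoning from the final answer. A minimal sketch, assuming a completion contains at most one complete <think>...</think> block; the helper name `split_reasoning` and the sample string are hypothetical.

```python
import re

# Non-greedy match so only the first complete think block is captured;
# DOTALL lets the reasoning span multiple lines.
THINK_RE = re.compile(r"<think>(.*?)</think>", re.DOTALL)


def split_reasoning(text):
    """Return (reasoning, answer) for a completion with an optional think block."""
    match = THINK_RE.search(text)
    if match is None:
        return "", text.strip()
    reasoning = match.group(1).strip()
    # Everything outside the think block is treated as the answer.
    answer = (text[:match.start()] + text[match.end():]).strip()
    return reasoning, answer


sample = "<think>2 + 2 is 4.</think>\nThe answer is 4."
reasoning, answer = split_reasoning(sample)
```

If the chat template places the opening tag in the prompt, the completion may contain only the closing tag, a case this sketch does not handle.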

Model Files

  • Format: Safetensors
  • Model size: 36B params
  • Tensor types: BF16, F32