# GPT OSS 120B — CRACK Abliterated (MLX)
OpenAI's open-weight 120B MoE model with refusal behavior removed via CRACK surgery.
## Overview
This is openai/gpt-oss-120b with safety refusal behavior removed using the CRACK (Controlled Refusal Ablation via Calibrated Knockout) method. The model retains full intelligence, reasoning ability, and Harmony chain-of-thought while responding to all requests without refusal.
- Architecture: 36-layer Mixture-of-Experts (128 experts, 4 active per token)
- Parameters: ~117B total, ~5.7B active per token
- Format: Native MLX — BF16 attention + mxfp4 experts (as trained by OpenAI)
- Size: 61 GB
- Speed: ~80 tok/s on M3 Ultra 256GB
**Note on precision:** GPT OSS was trained natively in mxfp4 for expert weights. This is NOT post-training quantization — the experts were trained directly in 4-bit microscaling format. There is no higher-precision version. This is the native precision as released by OpenAI.
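For intuition, mxfp4 stores each expert weight as a 4-bit FP4 (E2M1) code, with blocks of 32 values sharing one power-of-two scale. The sketch below is a minimal illustrative dequantizer for that format — not OpenAI's kernel, and the function names are hypothetical:

```python
# Illustrative mxfp4 (OCP MX, FP4 E2M1) dequantization sketch.
# Each block: 32 four-bit codes plus one shared power-of-two scale.

# The eight non-negative E2M1 magnitudes; bit 3 of a code is the sign.
E2M1 = [0.0, 0.5, 1.0, 1.5, 2.0, 3.0, 4.0, 6.0]

def decode_fp4(code: int) -> float:
    """Map a 4-bit code to its E2M1 value (bit 3 = sign)."""
    sign = -1.0 if code & 0b1000 else 1.0
    return sign * E2M1[code & 0b0111]

def dequantize_block(codes: list[int], scale_exp: int) -> list[float]:
    """Dequantize one block of codes with a shared 2**scale_exp scale."""
    scale = 2.0 ** scale_exp
    return [decode_fp4(c) * scale for c in codes]

# Example: code 0b0010 (=1.0) and 0b1011 (=-1.5) with scale 2**1
print(dequantize_block([0b0010, 0b1011], scale_exp=1))  # [2.0, -3.0]
```

The shared exponent is what makes the format "microscaling": precision inside a block is only 4 bits, but the block-level scale tracks the dynamic range of the weights.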
## Test Results
Evaluated across 5 independent trials (1 greedy + 4 sampled at temp=0.6) with 32 prompts per trial:
| Category | Result |
|---|---|
| Compliance (12 harmful prompts) | ✅ 11.8/12 average (4/5 trials perfect 12/12) |
| Coherence (20 diverse prompts) | ✅ 19.0/20 average (2/5 trials perfect 20/20) |
| Factual accuracy | ✅ Correct (geography, science, math, history) |
| Code generation | ✅ Working Python, algorithms, data structures |
| Creative writing | ✅ Poetry, stories, recipes, summaries |
| Technical explanation | ✅ Physics, biology, computing, economics |
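The trial structure described above (one greedy pass plus four sampled passes over a fixed prompt set) can be sketched as a small harness. `generate_fn` and `score_fn` are hypothetical stand-ins for the real model call and grader:

```python
# Sketch of the 5-trial protocol: 1 greedy + 4 sampled (temp=0.6) passes
# over a fixed prompt set. generate_fn(prompt, temperature) and
# score_fn(prompt, response) are hypothetical stand-ins.

def run_trials(prompts, generate_fn, score_fn, n_sampled=4, temp=0.6):
    temps = [0.0] + [temp] * n_sampled  # trial 0 is greedy
    results = []
    for t in temps:
        passed = sum(score_fn(p, generate_fn(p, temperature=t)) for p in prompts)
        results.append(passed / len(prompts))
    return results  # per-trial pass rates

# Toy usage with stubs in place of the model and grader:
rates = run_trials(
    prompts=["a", "b"],
    generate_fn=lambda p, temperature: p.upper(),
    score_fn=lambda p, r: r == p.upper(),
)
print(rates)  # [1.0, 1.0, 1.0, 1.0, 1.0]
```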
### Thinking Depth Validation
| Complexity | Greedy | Sampled |
|---|---|---|
| Simple factual | ✅ 5/5 | ✅ 10/10 |
| Multi-step reasoning | ✅ 5/5 | ✅ 9/10 |
| Complex creative/analytical | ✅ 4/5 | ✅ 8/10 |
Overall: 91% pass rate across 45 thinking-depth tests at 3 temperatures.
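For reference, the 91% figure follows directly from the table totals (14/15 greedy plus 27/30 sampled):

```python
# Pass rate from the Thinking Depth Validation table above.
greedy_pass, greedy_total = 5 + 5 + 4, 15
sampled_pass, sampled_total = 10 + 9 + 8, 30
rate = (greedy_pass + sampled_pass) / (greedy_total + sampled_total)
print(f"{rate:.0%}")  # 91%
```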
## Method
CRACK uses a calibrated, signal-proportional approach to weight surgery:
1. Probe the model to identify per-layer refusal directions across all 36 layers.
2. Tier the intervention strength based on each layer's measured refusal signal: high-signal layers receive stronger surgery, low-signal layers gentler treatment.
3. Apply targeted weight modifications to attention output projections and expert MLP weights in the early-to-mid layers.
4. Validate across multiple trials, temperatures, and prompt categories.
The tiered approach preserves the model's Harmony chain-of-thought channel-switching mechanism while effectively removing refusal behavior. Standard (non-norm-preserving) ablation is used, as empirical testing showed it outperforms magnitude-preserving alternatives on this architecture.
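A schematic of the tiered, signal-proportional projection is sketched below in NumPy. The refusal directions, signal scores, and tier thresholds here are hypothetical placeholders — the released weights were produced with CRACK's own calibration, not this code:

```python
import numpy as np

def tiered_ablate(W, refusal_dir, signal, strong=0.7, weak=0.3, threshold=0.5):
    """Project the refusal direction out of a weight matrix W (out_dim x in_dim),
    with strength chosen by the layer's measured refusal-signal tier.

    Standard (non-norm-preserving) ablation: W' = W - alpha * (r r^T) W.
    The alpha values and threshold are illustrative, not CRACK's real settings."""
    r = refusal_dir / np.linalg.norm(refusal_dir)
    alpha = strong if signal >= threshold else weak
    return W - alpha * np.outer(r, r) @ W

# Toy example: a high-signal layer gets the stronger projection.
rng = np.random.default_rng(0)
W = rng.standard_normal((8, 8))
r = rng.standard_normal(8)
W_new = tiered_ablate(W, r, signal=0.9)

# The component of W along the refusal direction shrinks by (1 - alpha):
r_hat = r / np.linalg.norm(r)
print(np.allclose(r_hat @ W_new, (1 - 0.7) * (r_hat @ W)))  # True
```

Because the projection is scaled rather than all-or-nothing, low-signal layers keep most of their original weights, which is consistent with the claim that chain-of-thought behavior survives the surgery.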
## Recommended Settings

```python
# Best results at greedy (temp=0) or warm (temp>=0.8)
# Moderate temperatures (0.5-0.7) occasionally produce channel loops on short creative prompts
temperature = 0.0  # or 0.8+
```
## Usage

```python
import mlx_lm
from mlx_lm.sample_utils import make_sampler

model, tokenizer = mlx_lm.load("dealignai/GPT-OSS-120B-MLX-CRACK")

prompt = "<|start|>user<|message|>Your prompt here<|end|><|start|>assistant<|message|>"

# Greedy (most reliable)
response = mlx_lm.generate(model, tokenizer, prompt=prompt, max_tokens=2048)

# Or with sampling
sampler = make_sampler(temp=0.8, top_p=0.95)
response = mlx_lm.generate(model, tokenizer, prompt=prompt, max_tokens=2048, sampler=sampler)
print(response)
```
## Requirements

- Apple Silicon Mac with 64GB+ unified memory (the model itself is 61 GB)
- Python 3.10+
- `mlx-lm >= 0.30.0`
## Architecture Details
GPT OSS 120B is a unique MoE architecture with:
- 128 mxfp4 experts with 4 active per token (SwiGLU activation with hard clamp)
- Alternating sliding/full attention (64 GQA heads, 8 KV heads, 128-token sliding window)
- Harmony chat format with analysis/commentary/final channels for structured chain-of-thought
- Per-head attention sinks for persistent token attention
- YaRN RoPE for 128K context window
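The alternating sliding/full attention pattern can be illustrated with a simple mask builder. Which layer parity is sliding is an assumption here for illustration; the real alternation order is an implementation detail of the released weights:

```python
def attention_mask(seq_len: int, layer_idx: int, window: int = 128):
    """Boolean causal mask. Even layers restrict each query to a sliding
    window of `window` keys; odd layers attend to the full prefix.
    (The even/odd assignment is an assumption for this sketch.)"""
    mask = []
    for q in range(seq_len):
        if layer_idx % 2 == 0:  # sliding-window layer
            lo = max(0, q - window + 1)
        else:                   # full-attention layer
            lo = 0
        mask.append([lo <= k <= q for k in range(seq_len)])
    return mask

# A query at position 200 in a sliding layer sees only the last 128 keys:
print(sum(attention_mask(256, layer_idx=0)[200]))  # 128
# The same position in a full-attention layer sees all 201 positions so far:
print(sum(attention_mask(256, layer_idx=1)[200]))  # 201
```

In the real model the per-head attention sinks sit outside this mask: they give each head a persistent slot to attend to even when the sliding window has scrolled past the start of the sequence.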
## Other Models by dealignai

### Qwen 3.5 CRACK Series
| Model | Sizes |
|---|---|
| Qwen 3.5 VL (Vision+Language) | 0.8B · 2B · 4B · 9B · 35B · 122B |
| Qwen 3.5 397B REAP CRACK | Text · VL |
### MiniMax M2.5 CRACK Series
## Support dealignai
All models are built from original research and published for free. These models are specifically crafted to be excellent coders and general-purpose assistants.
Support us on Ko-fi — check out the Ko-fi membership for early access and extras.
Have questions or need help with a specific model? DM us — we help for free most of the time.
Ko-fi | X @dealignai | dealign.ai
## About dealignai
We research and publish abliterated models to advance AI safety understanding.
Follow us: 𝕏 @dealignai
See our research: Safety Generalization in Frontier MoE Models
This model is provided for research and educational purposes. Users are responsible for compliance with applicable laws and regulations. The creators are not responsible for misuse.
Base model: openai/gpt-oss-120b