dealign.ai

GPT OSS 120B — CRACK Abliterated (MLX)

OpenAI's open-weight 120B MoE model with refusal behavior removed via CRACK surgery.

Overview

This is openai/gpt-oss-120b with safety refusal behavior removed using the CRACK (Controlled Refusal Ablation via Calibrated Knockout) method. The model retains full intelligence, reasoning ability, and Harmony chain-of-thought while responding to all requests without refusal.

  • Architecture: 36-layer Mixture-of-Experts (128 experts, 4 active per token)
  • Parameters: ~117B total, ~5.7B active per token
  • Format: Native MLX — BF16 attention + mxfp4 experts (as trained by OpenAI)
  • Size: 61 GB
  • Speed: ~80 tok/s on M3 Ultra 256GB

Note on precision: GPT OSS was trained natively in mxfp4 for expert weights. This is NOT post-training quantization — the experts were trained directly in 4-bit microscaling format. There is no higher-precision version. This is the native precision as released by OpenAI.
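For intuition, the mxfp4 layout defined by the OCP Microscaling (MX) spec stores blocks of 32 FP4 (E2M1) elements that share one power-of-two (E8M0) scale. The sketch below is illustrative only — it is not the MLX dequantization kernel this model actually uses:

```python
import numpy as np

# The 16 FP4 (E2M1) code points: a sign bit over {0, 0.5, 1, 1.5, 2, 3, 4, 6}
FP4_VALUES = np.array([0.0, 0.5, 1.0, 1.5, 2.0, 3.0, 4.0, 6.0,
                       -0.0, -0.5, -1.0, -1.5, -2.0, -3.0, -4.0, -6.0])

def decode_mxfp4_block(codes, scale_exp):
    """Decode one mxfp4 block: 4-bit element codes sharing a single
    E8M0 (power-of-two) scale exponent."""
    return FP4_VALUES[np.asarray(codes)] * (2.0 ** scale_exp)
```

Because experts were trained directly in this format, decoding to BF16 reproduces exactly the values the optimizer saw — there is no "lossless original" to recover.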

Test Results

Evaluated across 5 independent trials (1 greedy + 4 sampled at temp=0.6) with 32 prompts per trial:

| Category | Result |
|---|---|
| Compliance (12 harmful prompts) | ✅ 11.8/12 average (4/5 trials perfect 12/12) |
| Coherence (20 diverse prompts) | ✅ 19.0/20 average (2/5 trials perfect 20/20) |
| Factual accuracy | ✅ Correct (geography, science, math, history) |
| Code generation | ✅ Working Python, algorithms, data structures |
| Creative writing | ✅ Poetry, stories, recipes, summaries |
| Technical explanation | ✅ Physics, biology, computing, economics |

Thinking Depth Validation

| Complexity | Greedy | Sampled |
|---|---|---|
| Simple factual | ✅ 5/5 | ✅ 10/10 |
| Multi-step reasoning | ✅ 5/5 | ✅ 9/10 |
| Complex creative/analytical | ✅ 4/5 | ✅ 8/10 |

Overall: 91% pass rate across 45 thinking-depth tests at 3 temperatures.

Method

CRACK uses a calibrated, signal-proportional approach to weight surgery:

  1. Probe the model to identify per-layer refusal directions across all 36 layers
  2. Tier the intervention strength based on each layer's measured refusal signal — high-signal layers receive stronger surgery, low-signal layers receive gentler treatment
  3. Apply targeted weight modifications to attention output projections and expert MLP weights in the early-to-mid layers
  4. Validate across multiple trials, temperatures, and prompt categories

The tiered approach preserves the model's Harmony chain-of-thought channel-switching mechanism while effectively removing refusal behavior. Standard (non-norm-preserving) ablation is used, as empirical testing showed it outperforms magnitude-preserving alternatives on this architecture.
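The tiering step can be sketched as a standard directional ablation whose strength is scaled by each layer's normalized refusal signal. The helper names and the 0.3–1.0 strength range below are hypothetical, not taken from the actual CRACK implementation:

```python
import numpy as np

def ablate_direction(W, d, alpha):
    """Remove the component of W's outputs along refusal direction d,
    scaled by alpha in [0, 1] (1.0 = full ablation)."""
    d = d / np.linalg.norm(d)
    return W - alpha * np.outer(d, d) @ W

def tiered_alphas(signals, lo=0.3, hi=1.0):
    """Map per-layer refusal-signal strengths to ablation strengths:
    high-signal layers get stronger surgery, low-signal layers gentler."""
    s = np.asarray(signals, dtype=float)
    s = (s - s.min()) / (s.max() - s.min() + 1e-8)  # normalize to [0, 1]
    return lo + (hi - lo) * s
```

With alpha = 1.0 the output of the modified matrix has no component along d; smaller alphas attenuate rather than erase it, which is what lets low-signal layers keep their original behavior largely intact.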

Recommended Settings

# Best results at greedy (temp=0) or warm (temp≥0.8)
# Moderate temperatures (0.5-0.7) occasionally produce channel loops on short creative prompts
temperature = 0.0  # or 0.8+

Usage

import mlx_lm
from mlx_lm.sample_utils import make_sampler

model, tokenizer = mlx_lm.load("dealignai/GPT-OSS-120B-MLX-CRACK")

# Build a Harmony-format prompt with the tokenizer's chat template,
# which inserts the channel and message delimiters correctly
messages = [{"role": "user", "content": "Your prompt here"}]
prompt = tokenizer.apply_chat_template(messages, add_generation_prompt=True)

# Greedy decoding (most reliable)
response = mlx_lm.generate(model, tokenizer, prompt=prompt, max_tokens=2048)
print(response)

# Or with sampling
sampler = make_sampler(temp=0.8, top_p=0.95)
response = mlx_lm.generate(model, tokenizer, prompt=prompt, max_tokens=2048, sampler=sampler)
print(response)

Requirements

  • Apple Silicon Mac with 64GB+ unified memory (model is 61 GB)
  • Python 3.10+
  • mlx-lm >= 0.30.0

Architecture Details

GPT OSS 120B is a distinctive MoE architecture with:

  • 128 mxfp4 experts with 4 active per token (SwiGLU activation with hard clamp)
  • Alternating sliding/full attention (64 GQA heads, 8 KV heads, 128-token sliding window)
  • Harmony chat format with analysis/commentary/final channels for structured chain-of-thought
  • Per-head attention sinks for persistent token attention
  • YaRN RoPE for 128K context window

Other Models by dealignai

Qwen 3.5 CRACK Series

| Model | Sizes |
|---|---|
| Qwen 3.5 VL (Vision+Language) | 0.8B · 2B · 4B · 9B · 35B · 122B |
| Qwen 3.5 397B REAP CRACK | Text · VL |

MiniMax M2.5 CRACK Series

| Model | Quants |
|---|---|
| MiniMax 172B REAP CRACK | Q4 · Q6 · Q8 |
| MiniMax 139B REAP CRACK | Q4 · Q6 · Q8 |

Support dealignai

All models are built from original research and published for free. These models are specifically crafted to be excellent coders and general-purpose assistants.

Support us on Ko-fi — check out the Ko-fi membership for early access and extras.

Have questions or need help with a specific model? DM us — we help for free most of the time.

Ko-fi | X @dealignai | dealign.ai


About dealignai


We research and publish abliterated models to advance AI safety understanding.

Follow us: 𝕏 @dealignai

See our research: Safety Generalization in Frontier MoE Models


This model is provided for research and educational purposes. Users are responsible for compliance with applicable laws and regulations. The creators are not responsible for misuse.
