dealign.ai

GPT OSS 120B — CRACK Abliterated (MLX)

OpenAI's open-weight 120B MoE model with refusal behavior removed via CRACK surgery.

Overview

This is openai/gpt-oss-120b with safety refusal behavior removed using the CRACK (Controlled Refusal Ablation via Calibrated Knockout) method. The model retains full intelligence, reasoning ability, and Harmony chain-of-thought while responding to all requests without refusal.

  • Architecture: 36-layer Mixture-of-Experts (128 experts, 4 active per token)
  • Parameters: ~117B total, ~5.7B active per token
  • Format: Native MLX — BF16 attention + mxfp4 experts (as trained by OpenAI)
  • Size: 61 GB
  • Speed: ~80 tok/s on M3 Ultra 256GB

Note on precision: GPT OSS was trained natively in mxfp4 for expert weights. This is NOT post-training quantization — the experts were trained directly in 4-bit microscaling format. There is no higher-precision version. This is the native precision as released by OpenAI.
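For intuition, the mxfp4 layout defined by the OCP Microscaling (MX) spec stores blocks of 32 FP4 (E2M1) elements that share one power-of-two (E8M0) scale. The sketch below is illustrative only — it is not the MLX dequantization kernel this model actually uses:

```python
import numpy as np

# The 16 FP4 (E2M1) code points: a sign bit over {0, 0.5, 1, 1.5, 2, 3, 4, 6}
FP4_VALUES = np.array([0.0, 0.5, 1.0, 1.5, 2.0, 3.0, 4.0, 6.0,
                       -0.0, -0.5, -1.0, -1.5, -2.0, -3.0, -4.0, -6.0])

def decode_mxfp4_block(codes, scale_exp):
    """Decode one mxfp4 block: 4-bit element codes sharing a single
    E8M0 (power-of-two) scale exponent."""
    return FP4_VALUES[np.asarray(codes)] * (2.0 ** scale_exp)
```

Because experts were trained directly in this format, decoding to BF16 reproduces exactly the values the optimizer saw — there is no "lossless original" to recover.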

Test Results

Evaluated across 5 independent trials (1 greedy + 4 sampled at temp=0.6) with 32 prompts per trial:

| Category | Result |
|---|---|
| Compliance (12 harmful prompts) | ✅ 11.8/12 average (4/5 trials perfect 12/12) |
| Coherence (20 diverse prompts) | ✅ 19.0/20 average (2/5 trials perfect 20/20) |
| Factual accuracy | ✅ Correct (geography, science, math, history) |
| Code generation | ✅ Working Python, algorithms, data structures |
| Creative writing | ✅ Poetry, stories, recipes, summaries |
| Technical explanation | ✅ Physics, biology, computing, economics |

Thinking Depth Validation

| Complexity | Greedy | Sampled |
|---|---|---|
| Simple factual | ✅ 5/5 | ✅ 10/10 |
| Multi-step reasoning | ✅ 5/5 | ✅ 9/10 |
| Complex creative/analytical | ✅ 4/5 | ✅ 8/10 |

Overall: 91% pass rate across 45 thinking-depth tests at 3 temperatures.

Method

CRACK uses a calibrated, signal-proportional approach to weight surgery:

  1. Probe the model to identify per-layer refusal directions across all 36 layers
  2. Tier the intervention strength based on each layer's measured refusal signal — high-signal layers receive stronger surgery, low-signal layers receive gentler treatment
  3. Apply targeted weight modifications to attention output projections and expert MLP weights in the early-to-mid layers
  4. Validate across multiple trials, temperatures, and prompt categories

The tiered approach preserves the model's Harmony chain-of-thought channel-switching mechanism while effectively removing refusal behavior. Standard (non-norm-preserving) ablation is used, as empirical testing showed it outperforms magnitude-preserving alternatives on this architecture.
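The tiering step can be sketched as a standard directional ablation whose strength is scaled by each layer's normalized refusal signal. The helper names and the 0.3–1.0 strength range below are hypothetical, not taken from the actual CRACK implementation:

```python
import numpy as np

def ablate_direction(W, d, alpha):
    """Remove the component of W's outputs along refusal direction d,
    scaled by alpha in [0, 1] (1.0 = full ablation)."""
    d = d / np.linalg.norm(d)
    return W - alpha * np.outer(d, d) @ W

def tiered_alphas(signals, lo=0.3, hi=1.0):
    """Map per-layer refusal-signal strengths to ablation strengths:
    high-signal layers get stronger surgery, low-signal layers gentler."""
    s = np.asarray(signals, dtype=float)
    s = (s - s.min()) / (s.max() - s.min() + 1e-8)  # normalize to [0, 1]
    return lo + (hi - lo) * s
```

With alpha = 1.0 the output of the modified matrix has no component along d; smaller alphas attenuate rather than erase it, which is what lets low-signal layers keep their original behavior largely intact.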

Recommended Settings

# Best results at greedy (temp=0) or warm (temp≥0.8)
# Moderate temperatures (0.5-0.7) occasionally produce channel loops on short creative prompts
temperature = 0.0  # or 0.8+

Usage

import mlx_lm
from mlx_lm.sample_utils import make_sampler

model, tokenizer = mlx_lm.load("dealignai/GPT-OSS-120B-MLX-CRACK")

# Build a Harmony-format prompt with the tokenizer's chat template,
# which inserts the channel and message delimiters correctly
messages = [{"role": "user", "content": "Your prompt here"}]
prompt = tokenizer.apply_chat_template(messages, add_generation_prompt=True)

# Greedy decoding (most reliable)
response = mlx_lm.generate(model, tokenizer, prompt=prompt, max_tokens=2048)
print(response)

# Or with sampling
sampler = make_sampler(temp=0.8, top_p=0.95)
response = mlx_lm.generate(model, tokenizer, prompt=prompt, max_tokens=2048, sampler=sampler)
print(response)

Requirements

  • Apple Silicon Mac with 64GB+ unified memory (model is 61 GB)
  • Python 3.10+
  • mlx-lm >= 0.30.0

Architecture Details

GPT OSS 120B is a distinctive MoE architecture with:

  • 128 mxfp4 experts with 4 active per token (SwiGLU activation with hard clamp)
  • Alternating sliding/full attention (64 GQA heads, 8 KV heads, 128-token sliding window)
  • Harmony chat format with analysis/commentary/final channels for structured chain-of-thought
  • Per-head attention sinks for persistent token attention
  • YaRN RoPE for 128K context window

Other Models by dealignai

Qwen 3.5 CRACK Series

| Model | Sizes |
|---|---|
| Qwen 3.5 VL (Vision+Language) | 0.8B · 2B · 4B · 9B · 35B · 122B |
| Qwen 3.5 397B REAP CRACK | Text · VL |

MiniMax M2.5 CRACK Series

| Model | Quants |
|---|---|
| MiniMax 172B REAP CRACK | Q4 · Q6 · Q8 |
| MiniMax 139B REAP CRACK | Q4 · Q6 · Q8 |

Support dealignai

All models are built from original research and published for free. These models are specifically crafted to be excellent coders and general-purpose assistants.

Support us on Ko-fi — check out the Ko-fi membership for early access and extras.

Have questions or need help with a specific model? DM us — we help for free most of the time.

Ko-fi | X @dealignai | dealign.ai


About dealignai


We research and publish abliterated models to advance AI safety understanding.

Follow us: 𝕏 @dealignai

See our research: Safety Generalization in Frontier MoE Models


This model is provided for research and educational purposes. Users are responsible for compliance with applicable laws and regulations. The creators are not responsible for misuse.
