Question 1

What is General-Instinct/InstinctRazor-Qwen3.5-122B-A10B-GGUF?

Accepted Answer

---
license: apache-2.0
base_model:
- Qwen/Qwen3.5-122B-A10B
tags:
- gguf
- llama.cpp
- mixture-of-experts
- quantized
- iq3_xxs
- instinctrazor
pipeline_tag: text-generation
---

# InstinctRazor — Qwen3.5-122B-A10B · IQ3_XXS GGUF

A 122B hybrid Gated-DeltaNet MoE (256 experts, 8 active), packed to **48 GiB** so it runs on **one 80 GB GPU** (or a small card + CPU offload). Quantized **from the original BF16** with an importance matrix (math + code + general calibration), via [llama.cpp](https://github.com/ggml-org/llama.cpp).

Framework, recipe, and full reproduction: **https://github.com/General-Instinct/InstinctRazor**

## Speed (llama.cpp, this artifact)

- **1× H100-80GB**, all layers on GPU: **115.9 tok/s** decode (prefill ≈2541 tok/s).
- **Small card + CPU expert-offload** (`--n-cpu-moe 48`, peak ≈7.6 GiB VRAM): **45.7 tok/s** decode; runs on an 8 GB GPU + ≈48 GiB system RAM.

## Run …

Question 2

What license applies to General-Instinct/InstinctRazor-Qwen3.5-122B-A10B-GGUF?

Accepted Answer

License: apache-2.0. Verify terms on Hugging Face before commercial use.

Question 3

How do I run General-Instinct/InstinctRazor-Qwen3.5-122B-A10B-GGUF locally?

Accepted Answer

Download a GGUF file from this page and load it in guIDE or llama.cpp. Pipeline task: text-generation.
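
As a rough illustration of the two steps above, the sketch below builds a direct-download URL and a `llama-server` command line for the two setups the README benchmarks (all layers on GPU, and CPU expert offload with `--n-cpu-moe 48`). The `main` branch name, the `-ngl 99` "offload everything" idiom, and the 8192-token context size are assumptions for illustration, not values taken from this page; the function names are hypothetical.

```python
import shlex

REPO_ID = "General-Instinct/InstinctRazor-Qwen3.5-122B-A10B-GGUF"
MODEL_FILE = "InstinctRazor-Qwen3.5-122B-A10B-IQ3_XXS.gguf"


def download_url(repo_id: str, filename: str, revision: str = "main") -> str:
    """Hugging Face direct-download URL; the 'main' revision is an assumption."""
    return f"https://huggingface.co/{repo_id}/resolve/{revision}/{filename}"


def llama_server_args(model_path: str, n_cpu_moe: int = 0, ctx: int = 8192) -> list[str]:
    """Build a llama-server argument list.

    n_cpu_moe=0 leaves every layer on the GPU. A positive value keeps the MoE
    expert weights of that many layers in system RAM (--n-cpu-moe).
    The context size (-c) default here is an illustrative assumption.
    """
    args = ["llama-server", "-m", model_path, "-ngl", "99", "-c", str(ctx)]
    if n_cpu_moe > 0:
        args += ["--n-cpu-moe", str(n_cpu_moe)]
    return args


if __name__ == "__main__":
    print(download_url(REPO_ID, MODEL_FILE))
    # Single large GPU: everything on the device.
    print(shlex.join(llama_server_args(MODEL_FILE)))
    # Small GPU: expert weights offloaded to CPU, as in the README benchmark.
    print(shlex.join(llama_server_args(MODEL_FILE, n_cpu_moe=48)))
```

The printed commands can be pasted into a shell once the file is downloaded; check `llama-server --help` in your llama.cpp build to confirm the flags it supports.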

Question 4

How much VRAM or disk space does General-Instinct/InstinctRazor-Qwen3.5-122B-A10B-GGUF need?

Accepted Answer

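The main model file (IQ3_XXS) is 48.05 GB on disk. Per the model README, it runs with all layers on one 80 GB GPU, or on an 8 GB GPU (peak ≈7.6 GiB VRAM) plus ≈48 GiB of system RAM when the experts are offloaded to the CPU (`--n-cpu-moe 48`). The 870.0 MB file is the mmproj file (F16), which llama.cpp loads as a multimodal projector alongside the main model; it is not a standalone model.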

File	Type	Quantization	Size	Link
InstinctRazor-Qwen3.5-122B-A10B-IQ3_XXS.gguf	GGUF	IQ3_XXS	48.05 GB	Download
InstinctRazor-Qwen3.5-122B-A10B-mmproj-f16.gguf	GGUF	F16	870.0 MB	Download
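
The README quotes sizes in GiB while the file table uses GB, so a small conversion helps when checking whether the files fit on a given machine. The sketch below assumes the table's GB are decimal gigabytes (10^9 bytes), which the page does not state; the headroom parameter for KV cache and buffers is a placeholder you would tune, not a measured value.

```python
GIB = 2**30  # bytes per GiB

# Sizes as listed in the file table, assumed to be decimal GB.
FILES_GB = {
    "InstinctRazor-Qwen3.5-122B-A10B-IQ3_XXS.gguf": 48.05,
    "InstinctRazor-Qwen3.5-122B-A10B-mmproj-f16.gguf": 0.87,
}


def gb_to_gib(gb: float) -> float:
    """Convert decimal gigabytes to binary gibibytes."""
    return gb * 1e9 / GIB


def fits(required_gib: float, available_gib: float, headroom_gib: float = 0.0) -> bool:
    """True if the requirement plus a safety margin fits in the available memory."""
    return required_gib + headroom_gib <= available_gib


if __name__ == "__main__":
    total_gb = sum(FILES_GB.values())
    print(f"total download: {total_gb:.2f} GB = {gb_to_gib(total_gb):.2f} GiB")
    weights_gib = gb_to_gib(FILES_GB["InstinctRazor-Qwen3.5-122B-A10B-IQ3_XXS.gguf"])
    # Hypothetical 4 GiB headroom for KV cache and scratch buffers.
    print("fits in 80 GiB:", fits(weights_gib, 80, headroom_gib=4))
    print("fits in 8 GiB: ", fits(weights_gib, 8, headroom_gib=4))
```

The second check fails by design: on an 8 GB card the weights only fit when most expert tensors stay in system RAM, which is the offload setup the README describes.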

General-Instinct/InstinctRazor-Qwen3.5-122B-A10B-GGUF overview

Model ID	General-Instinct/InstinctRazor-Qwen3.5-122B-A10B-GGUF
Author	General-Instinct
Pipeline	text-generation
License	apache-2.0
Base model	Qwen/Qwen3.5-122B-A10B
Last modified	2026-06-06T21:41:22.000Z