zTrojan/Qwen3.5-122B-A10B-REAP20-APEX-GGUF overview
Runs locally from ~34.63 GB disk (32 GB+ VRAM class GPUs with llama.cpp / guIDE).
Model Details
| Field | Value |
|---|---|
| Model ID | zTrojan/Qwen3.5-122B-A10B-REAP20-APEX-GGUF |
| Author | zTrojan |
| Pipeline | text-generation |
| License | other |
| Base model | Qwen/Qwen3.5-122B-A10B |
| Last modified | 2026-06-12T06:55:21.000Z |
Model README
---
license: other
base_model: Qwen/Qwen3.5-122B-A10B
language:
- en
- uk
library_name: llama.cpp
pipeline_tag: text-generation
tags:
- gguf
- qwen
- qwen3.5
- moe
- reap
- reap20
- apex
- apex-mini
- apex-quant
- llama.cpp
- quantized
---
Qwen3.5-122B-A10B REAP20 APEX GGUF
APEX GGUF builds of Qwen/Qwen3.5-122B-A10B after physical REAP20 expert pruning.
This repository hosts APEX quantizations of that pruned model. "Physical" means the pruned experts are removed from the weights, leaving 205 of the original 256 experts in each MoE layer.
Available quants
| File | Description |
|---|---|
| Qwen3.5-122B-A10B-REAP20-APEX-Mini.gguf | Production APEX Mini quant with imatrix |
Source model
Base model: Qwen/Qwen3.5-122B-A10B
REAP pruning
Method: physical REAP expert pruning
Compression ratio: 20% of experts pruned (51 of 256 per MoE layer, about 19.9%)
Original experts per MoE layer: 256
Retained experts per MoE layer: 205
Layers: 48
Experts per token: 8
Observation: 32 batches
Distance: cosine
Seed: 42
Calibration source: custom calibration-v2
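The expert counts above can be cross-checked with a few lines of shell arithmetic. This is only a sketch: it uses the figures from this README and assumes that all 48 layers are MoE layers with the same expert count, which the README does not state explicitly.

```shell
#!/usr/bin/env bash
# Expert-count arithmetic for the REAP20 figures listed above.
original=256     # original experts per MoE layer (from this README)
retained=205     # retained experts per MoE layer (from this README)
layers=48        # layer count (from this README); ASSUMPTION: every layer is a MoE layer

pruned=$(( original - retained ))

# Pruned share in tenths of a percent, rounded to nearest using integer math.
pruned_tenths=$(( (pruned * 1000 + original / 2) / original ))

total_removed=$(( pruned * layers ))
total_retained=$(( retained * layers ))

echo "pruned per layer: ${pruned} of ${original} ($(( pruned_tenths / 10 )).$(( pruned_tenths % 10 ))%)"
echo "experts removed across ${layers} layers: ${total_removed}"
echo "experts retained across ${layers} layers: ${total_retained}"
```

The pruned share comes out just under the nominal 20% because the retained count has to be a whole number of experts.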
APEX Mini quantization
APEX Mini was generated with llama.cpp's llama-quantize, using a tensor-type map and an imatrix.
Tensor config: configs/qwen35_122b_mini.txt
Base fallback type: Q3_K_M
Imatrix: production REAP20 imatrix
Imatrix context: -c 4096
Imatrix chunks: --chunks 128
Imatrix calibration: 64 MB shuffled calibration-v2
Imatrix model source: REAP20 BF16 GGUF
The Mini profile uses Q3_K for edge experts and IQ2_S for middle routed experts, so an imatrix is required (IQ2_S is a very low-bit type that depends on importance weights).
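The two steps described above (imatrix generation, then quantization with per-tensor overrides) might look roughly like the dry-run sketch below. It only prints the commands. The file names `reap20-bf16.gguf`, `calibration-v2.txt`, and `reap20.imatrix` are hypothetical placeholders, and the map-file format (one `pattern=type` entry per line, `#` for comments) is an assumption, since the README does not show the contents of `configs/qwen35_122b_mini.txt`. The helper `tensor_type_args` is likewise illustrative and simply expands the map into repeated `--tensor-type` options for `llama-quantize`.

```shell
#!/usr/bin/env bash
# Dry-run sketch: prints the two commands instead of running them.
# All input/output file names below are hypothetical placeholders.
set -euo pipefail

BF16_GGUF="reap20-bf16.gguf"       # hypothetical: REAP20 BF16 GGUF
CALIB_TXT="calibration-v2.txt"     # hypothetical: shuffled calibration text
IMATRIX="reap20.imatrix"           # hypothetical: imatrix output path
MAP_FILE="${MAP_FILE:-configs/qwen35_122b_mini.txt}"
OUT_GGUF="Qwen3.5-122B-A10B-REAP20-APEX-Mini.gguf"

# Turn a map file into repeated --tensor-type arguments.
# ASSUMPTION: one "pattern=type" entry per line, "#" starts a comment.
tensor_type_args() {
  local line
  while IFS= read -r line || [ -n "$line" ]; do
    line="${line%%#*}"               # drop comments
    line="${line//[[:space:]]/}"     # drop all whitespace
    if [ -n "$line" ]; then
      printf -- '--tensor-type %s ' "$line"
    fi
  done < "$1"
  return 0
}

# Step 1: importance matrix from the BF16 model (-c and --chunks as listed above).
echo "llama-imatrix -m $BF16_GGUF -f $CALIB_TXT -o $IMATRIX -c 4096 --chunks 128"

# Step 2: quantize with the imatrix, per-tensor overrides, and the Q3_K_M fallback.
if [ -f "$MAP_FILE" ]; then
  echo "llama-quantize --imatrix $IMATRIX $(tensor_type_args "$MAP_FILE")$BF16_GGUF $OUT_GGUF Q3_K_M"
else
  echo "map file not found: $MAP_FILE (skipping quantize command)"
fi
```

Printing rather than executing keeps the sketch safe to run without the model files; removing the `echo` wrappers would run the real tools, provided the placeholder paths are replaced with actual ones.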
Example: llama.cpp
# The prompt is in Ukrainian: "Hi. Write one short paragraph in Ukrainian."
llama-cli \
-m Qwen3.5-122B-A10B-REAP20-APEX-Mini.gguf \
-p "<|im_start|>user
Привіт. Напиши один короткий параграф українською. /no_think
<|im_end|>
<|im_start|>assistant
" \
-n 128 \
-c 4096 \
-ngl 40 \
--temp 0.6 \
--top-p 0.95
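The example offloads 40 layers with `-ngl 40`. A very rough way to see how that splits the weights between GPU and CPU is sketched below. It assumes that all 48 layers are the same size and that the whole ~34.63 GB file consists of layer weights; neither is stated in the README, and the estimate ignores the KV cache and compute buffers, so treat the numbers as a starting point only.

```shell
#!/usr/bin/env bash
# Rough split of the model file between GPU and CPU for a given -ngl value.
# ASSUMPTION: all layers are the same size and the whole file is layer weights.
file_gb=34.63   # approximate file size quoted on this page
layers=48       # layer count from the REAP pruning section
ngl=40          # value passed to -ngl in the example command

# awk handles the floating-point math; LC_ALL=C keeps "." as the decimal separator.
gpu_gb=$(LC_ALL=C awk -v s="$file_gb" -v l="$layers" -v n="$ngl" 'BEGIN { printf "%.2f", s * n / l }')
cpu_gb=$(LC_ALL=C awk -v s="$file_gb" -v g="$gpu_gb" 'BEGIN { printf "%.2f", s - g }')

echo "approx. weights on GPU with -ngl ${ngl}: ${gpu_gb} GB"
echo "approx. weights left on CPU: ${cpu_gb} GB"
```

If generation runs out of VRAM, lowering `ngl` in the sketch shows roughly how much weight data each step moves back to the CPU.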
Notes
This is an experimental physical expert-pruned build. It is intended for testing, iteration, and comparison with other REAP/APEX variants.
Future files may include:
APEX I-Compact
APEX I-Balanced
APEX I-Quality
other GGUF quantizations
Source: Hugging Face