ai2-alliance/qwen3-coder-next-reap-48b-a3b-gguf (Q6_K_XL GGUF) is indexed on GraySoft with repository links, GGUF quant files, and Hugging Face metadata. Use this page to pick a local model for guIDE or other runtimes. Related models in the same shard are listed below.
ai2-alliance/qwen3-coder-next-reap-48b-a3b-gguf overview
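Qwen3-Coder-Next-REAP-48B-A3B has the following specifications:

- Type: Causal Language Models
- Number of Parameters: 48B in total and 3B activated
- Hidden Dimension: 2048
- Number of Layers: 48
- Hybrid Layout: 12 * (3 * (Gated DeltaNet -> MoE) -> 1 * (Gated Attention -> MoE))
- Gated Attention:
  - Number of Attention Heads: 16 for Q and 2 for KV
  - Head Dimension: 256
  - Rotary Position Embedding Dimension: 64
- Gated DeltaNet:
  - Number of Linear Attention Heads: 32 for V and 16 for QK
  - Head Dimension: 128
- Mixture of Experts:
  - Number of Experts: 308 (uniformly pruned from 512)
  - Number of Activated Experts: 10
  - Number of Shared Experts: 1
- Context Length: 262,144 natively
- Compression Method: REAP (Router-weighted Expert Activation Pruning)
- Compression Ratio: 40% expert pruning

Test video 1 (agentic task) @Q4_K_XL: https://www.bilibili.com/video/BV1f8cNzcEHV/
Prompt: please clone the repository https://github.com/ggml-org/llama.cpp in /home/lovedheart/llama_ and review the PR 19435.

Test video 2 -> fastllm (int8 quantization), approx. Q8_0 in GGUF: https://www.bilibili.com/video/BV1hwFJzXEVP/
Prompt: Create a cosmic nebula background using Three.js with the following requirements: a deep black space background with twinkling white stars; 2–3 large semi-transparent purple/pink nebula clouds with a smoky texture; slow rotation animation; optimized for white text display. Implementation details: 1. Starfield: 5000 white particles randomly distributed with subtle twinkling; 2. Nebula: 2–3 large purple particle clusters using additive blending mode; 3. Colors: #8B5CF6, #C084FC, #F472B6 (purple to pink gradient); 4. Animation: overall rotation.y += 0.001, stars' opacity flickering; 5. Setup: WebGLRenderer with alpha:true and black background.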
Repository Files & Downloads
| File | Type | Quantization | Size | Link |
|---|---|---|---|---|
| Qwen3-Coder-Next-REAP-48B-A3B-BF16-00001-of-00006.gguf | GGUF | BF16 | 18.17 GB | Download |
| Qwen3-Coder-Next-REAP-48B-A3B-BF16-00002-of-00006.gguf | GGUF | BF16 | 18.06 GB | Download |
| Qwen3-Coder-Next-REAP-48B-A3B-BF16-00003-of-00006.gguf | GGUF | BF16 | 18.06 GB | Download |
| Qwen3-Coder-Next-REAP-48B-A3B-BF16-00004-of-00006.gguf | GGUF | BF16 | 18.13 GB | Download |
| Qwen3-Coder-Next-REAP-48B-A3B-BF16-00005-of-00006.gguf | GGUF | BF16 | 18.60 GB | Download |
| Qwen3-Coder-Next-REAP-48B-A3B-BF16-00006-of-00006.gguf | GGUF | BF16 | 34.01 MB | Download |
| Qwen3-Coder-Next-REAP-48B-A3B-Q2_K_XL.gguf | GGUF | Q2_K_XL | 21.31 GB | Download |
| Qwen3-Coder-Next-REAP-48B-A3B-Q3_K_XL.gguf | GGUF | Q3_K_XL | 25.97 GB | Download |
| Qwen3-Coder-Next-REAP-48B-A3B-Q4_K_XL.gguf | GGUF | Q4_K_XL | 31.08 GB | Download |
| Qwen3-Coder-Next-REAP-48B-A3B-Q5_K_XL.gguf | GGUF | Q5_K_XL | 35.60 GB | Download |
| Qwen3-Coder-Next-REAP-48B-A3B-Q6_K_XL.gguf | GGUF | Q6_K_XL | 40.39 GB | Download |
| Qwen3-Coder-Next-REAP-48B-A3B-Q8_0.gguf | GGUF | Q8_0 | 50.73 GB | Download |
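Since this page is meant for picking a local model, the sizes in the table above can be turned into a rough selection rule. The following is a minimal sketch, not a sizing guarantee: the function name `pick_quant` and the `headroom_gb` allowance are hypothetical, the 4 GB default is an illustrative placeholder rather than a measured figure, and converting the 34.01 MB shard to 0.03401 GB assumes decimal units. File size is only a lower bound on the memory a runtime needs.

```python
# Single-file quantizations and their sizes in GB, copied from the table above.
QUANT_SIZES_GB = {
    "Q2_K_XL": 21.31,
    "Q3_K_XL": 25.97,
    "Q4_K_XL": 31.08,
    "Q5_K_XL": 35.60,
    "Q6_K_XL": 40.39,
    "Q8_0": 50.73,
}

# BF16 shard sizes in GB. The sixth shard is listed as 34.01 MB;
# the conversion to 0.03401 GB assumes decimal units.
BF16_SHARDS_GB = [18.17, 18.06, 18.06, 18.13, 18.60, 0.03401]


def pick_quant(budget_gb, headroom_gb=4.0):
    """Return the largest quant whose file fits in budget_gb minus headroom_gb.

    headroom_gb is a hypothetical allowance for KV cache and runtime buffers,
    not a measured value. Returns None when no file fits.
    """
    usable = budget_gb - headroom_gb
    fitting = {name: size for name, size in QUANT_SIZES_GB.items() if size <= usable}
    if not fitting:
        return None
    # The largest file that still fits is taken as the highest-precision option.
    return max(fitting, key=fitting.get)


if __name__ == "__main__":
    for budget in (24, 32, 48, 64):
        print(f"{budget} GB budget -> {pick_quant(budget)}")
    print(f"BF16 total across shards: {sum(BF16_SHARDS_GB):.2f} GB")
```

With the default headroom, a 32 GB budget selects Q3_K_XL and a 48 GB budget selects Q6_K_XL, the quantization this page is titled after. Long contexts need more headroom than the placeholder used here.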
Model Details
Metadata Inspector
Normalized metadata (stored in metadata_json)
{
"metadata": {},
"card_data": {
"base_model": [
"Qwen/Qwen3-Coder-Next"
],
"tags": [
"text-generation-inference"
],
"license": "apache-2.0",
"frontmatter": {
"base_model": [
"Qwen/Qwen3-Coder-Next"
],
"tags": [
"text-generation-inference"
],
"license": "apache-2.0"
},
"hero_image_url": "https://cdn-uploads.huggingface.co/production/uploads/68121d80da035a609e569a81/PUEUf6Zz1JToRJfgI7HMk.png",
"summary": "!Qwen3-coder-next-reap **Qwen3-Coder-Next-REAP-48B-A3B** has the following specifications: **Number of Linear Attention Heads: 32 for V and 16 for QK **Head Dimension: 128 Test video 1 (agentic task) @Q4_K_XL : https://www.bilibili.com/video/BV1f8cNzcEHV/ Prompt: please clone the repository https://github.com/ggml-org/llama.cpp in /home/lovedheart/llama_ and review the PR 19435. Test video 2 -> fastllm (int8 quantization) approx. Q8_0 in GGUF : https://www.bilibili.com/video/BV1hwFJzXEVP/ Prompt: Create a cosmic nebula background using Three.js with the following requirements: a deep black space background with twinkling white stars; 2–3 large semi-transparent purple/pink nebula clouds with a smoky texture; slow rotation animation; optimized for white text display. Implementation details: 1. Starfield: 5000 white particles randomly distributed with subtle twinkling; 2. Nebula: 2–3 large purple particle clusters using additive blending mode; 3. Colors: #8B5CF6, #C084FC, #F472B6 (purple to pink gradient); 4. Animation: overall rotation.y += 0.001, stars' opacity flickering; 5. Setup: WebGLRenderer with alpha:true and black background.",
"quick_links": [],
"benchmark_table_html": "",
"readme_markdown": "---\nbase_model:\n- Qwen/Qwen3-Coder-Next\ntags:\n- text-generation-inference\nlicense: apache-2.0\n---\n\n\n\n\n\n\n**Qwen3-Coder-Next-REAP-48B-A3B** has the following specifications:\n\n- **Type:** Causal Language Models\n- **Number of Parameters**: 48B in total and 3B activated\n- **Hidden Dimension**: 2048\n- **Number of Layers**: 48\n- **Hybrid Layout**: 12 * (3 * (Gated DeltaNet -> MoE) -> 1 * (Gated Attention -> MoE))\n- **Gated Attention**:\n- **Number of Attention Heads**: 16 for Q and 2 for KV\n- **Head Dimension**: 256\n- **Rotary Position Embedding Dimension**: 64\n- **Gated DeltaNet**: \n **Number of Linear Attention Heads: 32 for V and 16 for QK \n **Head Dimension: 128\n- **Mixture of Experts**:\n- **Number of Experts: 308 (uniformly pruned from 512)\n- **Number of Activated Experts: 10\n- **Number of Shared Experts: 1\n- **Context Length**: 262,144 natively\n- **Compression Method**: REAP (Router-weighted Expert Activation Pruning)\n- **Compression Ratio**: 40% expert pruning\n\nTest video 1 (agentic task) @Q4_K_XL : https://www.bilibili.com/video/BV1f8cNzcEHV/ \nPrompt: please clone the repository https://github.com/ggml-org/llama.cpp in /home/lovedheart/llama_ and review the PR 19435.\n\nTest video 2 -> fastllm (int8 quantization) approx. Q8_0 in GGUF : https://www.bilibili.com/video/BV1hwFJzXEVP/ \nPrompt: Create a cosmic nebula background using Three.js with the following requirements: a deep black space background with twinkling white stars; 2–3 large semi-transparent purple/pink nebula clouds with a smoky texture; slow rotation animation; optimized for white text display. Implementation details: 1. Starfield: 5000 white particles randomly distributed with subtle twinkling; 2. Nebula: 2–3 large purple particle clusters using additive blending mode; 3. Colors: #8B5CF6, #C084FC, #F472B6 (purple to pink gradient); 4. Animation: overall rotation.y += 0.001, stars' opacity flickering; 5. Setup: WebGLRenderer with alpha:true and black background.",
"related_quantizations": []
},
"tags": [
"gguf",
"text-generation-inference",
"base_model:Qwen/Qwen3-Coder-Next",
"base_model:quantized:Qwen/Qwen3-Coder-Next",
"license:apache-2.0",
"endpoints_compatible",
"region:us",
"conversational"
],
"likes": 2,
"downloads": 148,
"gated": false,
"private": false,
"last_modified": "2026-02-12T13:36:43.000Z",
"created_at": "2026-02-12T13:36:43.000Z",
"pipeline_tag": "",
"library_name": ""
}
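Two figures in the README stored above can be checked with simple arithmetic: the "40% expert pruning" ratio, and the attention KV cache needed at the native context length. The sketch below is a back-of-the-envelope estimate under stated assumptions, not a description of how any particular runtime allocates memory. It assumes 16-bit cache values (`BYTES_PER_VALUE` is not given in the README) and counts only the gated-attention layers; the Gated DeltaNet layers keep their own state, which is not modeled here. The function names are hypothetical.

```python
# Figures copied from the README stored in readme_markdown above.
EXPERTS_BEFORE = 512
EXPERTS_AFTER = 308
HYBRID_BLOCKS = 12          # 12 * (3 * DeltaNet layer -> 1 * gated-attention layer)
ATTN_LAYERS_PER_BLOCK = 1   # one gated-attention layer per hybrid block
KV_HEADS = 2
HEAD_DIM = 256
CONTEXT_LENGTH = 262_144

# Assumption, not from the README: the cache stores 16-bit values.
BYTES_PER_VALUE = 2


def pruned_fraction(before, after):
    """Fraction of experts removed by pruning."""
    return (before - after) / before


def kv_cache_bytes(tokens):
    """Estimate KV cache bytes for the gated-attention layers only."""
    attn_layers = HYBRID_BLOCKS * ATTN_LAYERS_PER_BLOCK
    # Factor 2 covers one key tensor and one value tensor per layer.
    per_token = 2 * attn_layers * KV_HEADS * HEAD_DIM * BYTES_PER_VALUE
    return tokens * per_token


if __name__ == "__main__":
    print(f"Experts pruned: {pruned_fraction(EXPERTS_BEFORE, EXPERTS_AFTER):.2%}")
    gib = kv_cache_bytes(CONTEXT_LENGTH) / 2**30
    print(f"KV cache at {CONTEXT_LENGTH} tokens: {gib:.1f} GiB")
```

Under these assumptions the pruned fraction is 204/512, about 39.8%, which matches the rounded 40% in the README, and the attention cache at the full 262,144-token context comes to 6 GiB on top of the model file itself.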
Source payload excerpt (from Hugging Face API)
{
"_id": "698dd76b7ef99fad0e6bb284",
"id": "Ai2-Alliance/Qwen3-Coder-Next-REAP-48B-A3B-GGUF",
"modelId": "Ai2-Alliance/Qwen3-Coder-Next-REAP-48B-A3B-GGUF",
"sha": "239b52073e59478dc82becce8ff2aa7ed8f37b34",
"createdAt": "2026-02-12T13:36:43.000Z",
"lastModified": "2026-02-12T13:36:43.000Z",
"author": "Ai2-Alliance",
"downloads": 148,
"likes": 2,
"gated": false,
"private": false,
"pipeline_tag": "",
"library_name": "",
"siblings_count": 14
}
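The "Download" links in the file table lost their targets in this copy of the page, but they can be rebuilt from the `id` and `sha` fields in the payload above. The sketch below uses the Hugging Face `resolve` URL pattern; the helper names `resolve_url` and `bf16_shards` are hypothetical, and defaulting to the `main` revision is an assumption about the repository's branch. Passing the `sha` from the payload as the revision pins the download to the indexed commit.

```python
from urllib.parse import quote

# "id" and "sha" fields from the source payload above.
REPO_ID = "Ai2-Alliance/Qwen3-Coder-Next-REAP-48B-A3B-GGUF"
COMMIT_SHA = "239b52073e59478dc82becce8ff2aa7ed8f37b34"


def resolve_url(repo_id, filename, revision="main"):
    """Build a direct-download URL for one file in a Hugging Face model repo.

    The "main" default is an assumption; pass a commit hash to pin a version.
    """
    return (
        f"https://huggingface.co/{quote(repo_id)}"
        f"/resolve/{quote(revision, safe='')}/{quote(filename)}"
    )


def bf16_shards(total=6):
    """List the BF16 shard filenames using the naming scheme in the file table."""
    return [
        f"Qwen3-Coder-Next-REAP-48B-A3B-BF16-{i:05d}-of-{total:05d}.gguf"
        for i in range(1, total + 1)
    ]


if __name__ == "__main__":
    print(resolve_url(REPO_ID, "Qwen3-Coder-Next-REAP-48B-A3B-Q6_K_XL.gguf"))
    for name in bf16_shards():
        print(resolve_url(REPO_ID, name, revision=COMMIT_SHA))
```

All six BF16 shards need to be downloaded into the same directory for a split GGUF to load; the single-file quantizations need only one URL each.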