juanml82/huihui-qwen3-next-80b-a3b-thinking-abliterated-gguf is indexed on GraySoft with repository links, GGUF quant files (including Q4_K_M), and Hugging Face metadata. This page helps you pick a local model for guIDE or other runtimes. See related models in the same shard below.

Model Intelligence Sheet

juanml82/huihui-qwen3-next-80b-a3b-thinking-abliterated-gguf overview

GGUF quants for Huihui-Qwen3-Next-80B-A3B-Thinking-abliterated.

I've recreated them after the late December 2025 llama.cpp update that speeds up Qwen 3 Next, so these quants should perform better than the earlier quants for this model. I've uploaded three quants:

IQ3_M: should fit (tightly) in systems with 32 GB of RAM plus an 8-12 GB GPU, with RAM offloading. Possibly the lowest useful quant.

MXFP4_MOE: a tight fit for systems with 32 GB of RAM plus a GPU with 16 GB or more. It can also be loaded fully in system RAM, with cpu_moe, on systems with 64 GB of RAM.

Q6_K: will work well on systems with 64 GB of RAM plus RAM offloading. Quality is supposed to be almost indistinguishable from Q8.

I didn't do a Q8. It could be a tight fit on systems with 64 GB of RAM and a 24 GB VRAM GPU, but I have that system and it freezes when I try to load it.

The Q4_K_M file is older and slower than these three new quants, so I see no reason to use it instead of the MXFP4_MOE.

Enjoy!
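The sizing notes above can be turned into a rough rule of thumb. The following is a minimal sketch, not the author's method: it assumes a quant "fits" when its file size plus some headroom is no larger than RAM plus VRAM. The 7 GB headroom is an assumption, picked only so that the author's "tight fit" cases just pass; real memory use depends on context length, KV cache, and the runtime. File sizes are the ones listed in the repository file table on this page, and the older Q4_K_M file is left out because the author advises against it. The function name `largest_fitting_quant` is hypothetical.

```python
# File sizes in GB, taken from the repository file table on this page.
# Q4_K_M is omitted: the model card calls it older and slower.
QUANT_SIZES_GB = {
    "IQ3_M": 32.66,
    "MXFP4_MOE": 40.74,
    "Q6_K": 61.03,
}

# Assumed headroom for the OS, KV cache, and runtime buffers.
# This is a guess fitted to the card's "tight fit" remarks, not a measurement.
HEADROOM_GB = 7.0


def largest_fitting_quant(ram_gb, vram_gb, headroom_gb=HEADROOM_GB):
    """Return the largest quant whose file fits in RAM + VRAM minus headroom."""
    budget = ram_gb + vram_gb - headroom_gb
    fitting = {name: size for name, size in QUANT_SIZES_GB.items() if size <= budget}
    if not fitting:
        return None  # nothing fits under this heuristic
    return max(fitting, key=fitting.get)
```

For example, `largest_fitting_quant(32, 16)` selects MXFP4_MOE under these assumptions, which matches the card's note for a 32 GB system with a 16 GB GPU. Treat the result as a starting point and verify by actually loading the model.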

gguf, en, zh, base_model:huihui-ai/Huihui-Qwen3-Next-80B-A3B-Thinking-abliterated, base_model:quantized:huihui-ai/Huihui-Qwen3-Next-80B-A3B-Thinking-abliterated, license:apache-2.0, endpoints_compatible, region:us, conversational

Downloads

665

Likes

3

Pipeline

—

Library

—

Visibility

Public

Access

Open

Repository Files & Downloads

4 files detected

Direct downloads for all repository files

File	Type	Quantization	Size	Link
Huihui-Qwen3-Next-80B-A3B-Thinking-abliterated-q4_K_M.gguf	GGUF	Q4_K_M	45.09 GB	Download
qwen3-next-80b-a3b-thinking-IQ3_M.gguf	GGUF	IQ3_M	32.66 GB	Download
qwen3-next-80b-a3b-thinking-mxfp4_moe.gguf	GGUF	MXFP4_MOE	40.74 GB	Download
qwen3-next-80b-a3b-thinking-q6_k.gguf	GGUF	Q6_K	61.03 GB	Download
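The "Download" links are not preserved in this text, so here is a hedged sketch of how a direct link for one of the files above could be built and fetched with the Python standard library. It assumes the usual Hugging Face `resolve` URL layout and the `main` branch; the repository id is the one shown in the source payload on this page. The helper names `resolve_url` and `download` are hypothetical. For files this large, a client with resume support (for example the `huggingface_hub` package) is likely a better choice than this plain fetch.

```python
from urllib.parse import quote
from urllib.request import urlretrieve

# Repository id as it appears in the Hugging Face API payload on this page.
REPO_ID = "juanml82/Huihui-Qwen3-Next-80B-A3B-Thinking-abliterated-gguf"


def resolve_url(filename, revision="main"):
    """Build a direct file URL; the 'main' revision is an assumption."""
    return f"https://huggingface.co/{REPO_ID}/resolve/{revision}/{quote(filename)}"


def download(filename, dest=None):
    """Fetch one repository file to disk (no resume, no progress reporting)."""
    target = dest or filename
    urlretrieve(resolve_url(filename), target)
    return target
```

Calling `download("qwen3-next-80b-a3b-thinking-IQ3_M.gguf")` would write the 32.66 GB file to the current directory, so check free disk space first.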

Model Details

Model Slug

juanml82/huihui-qwen3-next-80b-a3b-thinking-abliterated-gguf

Author

juanml82

Pipeline Task

—

Library

—

Created

2025-12-12

Last Modified

2026-01-16

Gated

No

Private

No

HF SHA

7bed9b6d5b42ea74b47ad87c3eb2356a0b001416

License

apache-2.0

Language

en, zh

Base Model

huihui-ai/Huihui-Qwen3-Next-80B-A3B-Thinking-abliterated

Metadata Inspector

Normalized metadata (stored in metadata_json)

{
  "metadata": {},
  "card_data": {
    "license": "apache-2.0",
    "language": [
      "en",
      "zh"
    ],
    "base_model": [
      "huihui-ai/Huihui-Qwen3-Next-80B-A3B-Thinking-abliterated"
    ],
    "frontmatter": {
      "license": "apache-2.0",
      "language": [
        "en",
        "zh"
      ],
      "base_model": [
        "huihui-ai/Huihui-Qwen3-Next-80B-A3B-Thinking-abliterated"
      ]
    },
    "hero_image_url": "",
    "summary": "GGUF quants for Huihui-Qwen3-Next-80B-A3B-Thinking-abliterated I've recreated them after the late December 2025 llama.cpp update which speeds up Qwen 3 Next, so these quants should perform better than the early quants for this model. I've uploaded three quants: iQ3_M – should fit (tight) in systems with 32gb of ram plus an 8-12gb gpu with ram offloading. Possibly lowest useful quant. MXFP4_MOE – a tight fit for systems with 32gb of ram plus a 16gb or more gpu. Or to fully load it in system ram, with cpu_moe, in systems with 64gb of ram Q6K – will work well with systems with 64gb of ram plus ram offloading. Quality is supposed to very almost indistinguishable from Q8 I didn't do a Q8. it could be a tight fit in systems with 64gb of ram and a 24gb vram gpu, but I have that system and it's freezing when I try to load it. The q4_m file is older and slower than these new three quants, so I see no reason to use it instad of the mxfp4_moe Enjoy! --- license: apache-2.0 language: base_model: pipeline_tag: text-generation tags: ---",
    "quick_links": [],
    "benchmark_table_html": "",
    "readme_markdown": "---\nlicense: apache-2.0\nlanguage:\n- en\n- zh\nbase_model:\n- huihui-ai/Huihui-Qwen3-Next-80B-A3B-Thinking-abliterated\n---\nGGUF quants for [Huihui-Qwen3-Next-80B-A3B-Thinking-abliterated](https://huggingface.co/huihui-ai/Huihui-Qwen3-Next-80B-A3B-Thinking-abliterated)\n\nI've recreated them after the late December 2025 llama.cpp update which speeds up Qwen 3 Next, so these quants should perform better than the early quants for this model.\nI've uploaded three quants:\n\niQ3_M – should fit (tight) in systems with 32gb of ram plus an 8-12gb gpu with ram offloading. Possibly lowest useful quant.\n\nMXFP4_MOE – a tight fit for systems with 32gb of ram plus a 16gb or more gpu. Or to fully load it in system ram, with cpu_moe, in systems with 64gb of ram\n\nQ6K – will work well with systems with 64gb of ram plus ram offloading. Quality is supposed to very almost indistinguishable from Q8\n\nI didn't do a Q8. it could be a tight fit in systems with 64gb of ram and a 24gb vram gpu, but I have that system and it's freezing when I try to load it.\n\nThe q4_m file is older and slower than these new three quants, so I see no reason to use it instad of the mxfp4_moe\n\nEnjoy!\n\n\n---\nlicense: apache-2.0\nlanguage:\n- en\n- zh\nbase_model:\n- huihui-ai/Huihui-Qwen3-Next-80B-A3B-Thinking-abliterated\npipeline_tag: text-generation\ntags:\n- abliterated\n- uncensored\n---",
    "related_quantizations": []
  },
  "tags": [
    "gguf",
    "en",
    "zh",
    "base_model:huihui-ai/Huihui-Qwen3-Next-80B-A3B-Thinking-abliterated",
    "base_model:quantized:huihui-ai/Huihui-Qwen3-Next-80B-A3B-Thinking-abliterated",
    "license:apache-2.0",
    "endpoints_compatible",
    "region:us",
    "conversational"
  ],
  "likes": 3,
  "downloads": 665,
  "gated": false,
  "private": false,
  "last_modified": "2026-01-16T23:16:05.000Z",
  "created_at": "2025-12-12T22:46:52.000Z",
  "pipeline_tag": "",
  "library_name": ""
}
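The `tags` array above mixes plain labels (`gguf`, `en`) with `key:value` entries (`license:apache-2.0`). A minimal sketch of separating the two, assuming the first colon is the delimiter, might look like this; `split_tags` is a hypothetical helper, not part of any Hugging Face library.

```python
from collections import defaultdict

# Tags copied from the normalized metadata above.
TAGS = [
    "gguf",
    "en",
    "zh",
    "base_model:huihui-ai/Huihui-Qwen3-Next-80B-A3B-Thinking-abliterated",
    "base_model:quantized:huihui-ai/Huihui-Qwen3-Next-80B-A3B-Thinking-abliterated",
    "license:apache-2.0",
    "endpoints_compatible",
    "region:us",
    "conversational",
]


def split_tags(tags):
    """Split tags into plain labels and a key -> list-of-values mapping."""
    plain = []
    keyed = defaultdict(list)
    for tag in tags:
        # Only the first colon separates key from value, so a value such as
        # "quantized:huihui-ai/..." keeps its own inner colon.
        key, sep, value = tag.partition(":")
        if sep:
            keyed[key].append(value)
        else:
            plain.append(tag)
    return plain, dict(keyed)


plain, keyed = split_tags(TAGS)
```

With this split, `keyed["license"]` holds the license string and `keyed["base_model"]` holds both the plain base model id and the `quantized:`-prefixed relation.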

Source payload excerpt (from Hugging Face API)

{
  "_id": "693c9b5c04c1c3de3f714e52",
  "id": "juanml82/Huihui-Qwen3-Next-80B-A3B-Thinking-abliterated-gguf",
  "modelId": "juanml82/Huihui-Qwen3-Next-80B-A3B-Thinking-abliterated-gguf",
  "sha": "7bed9b6d5b42ea74b47ad87c3eb2356a0b001416",
  "createdAt": "2025-12-12T22:46:52.000Z",
  "lastModified": "2026-01-16T23:16:05.000Z",
  "author": "juanml82",
  "downloads": 665,
  "likes": 3,
  "gated": false,
  "private": false,
  "pipeline_tag": "",
  "library_name": "",
  "siblings_count": 6
}

More models in this shard