Model Intelligence Sheet

youssofal/qwen3.6-35b-a3b-abliterated-heretic-gguf overview

This is a GGUF release of an abliterated version of Qwen's Qwen3.6-35B-A3B. Heretic was applied to the Qwen3.6 sparse-MoE text stack to remove the base refusal behavior at the weight level. The result keeps Qwen3.6-35B-A3B's multimodal architecture and general capability profile, but no longer defaults to the original refusal pattern.

gguf, qwen, qwen3.6, qwen3_5_moe, moe, mixture-of-experts, multimodal, vlm, abliterated, uncensored, heretic, mpoa, soma, llama-cpp, text-generation, base_model:Qwen/Qwen3.6-35B-A3B, base_model:quantized:Qwen/Qwen3.6-35B-A3B, license:apache-2.0, endpoints_compatible, region:us, conversational

Downloads

924

Likes

8

Pipeline

text-generation

Library

gguf

Visibility

Public

Access

Open

Repository Files & Downloads

6 files detected

Direct downloads for all repository files

File	Type	Quantization	Size	Link
Qwen3.6-35B-A3B-Abliterated-Heretic-BF16.gguf-00001-of-00002.gguf	GGUF	BF16	44.24 GB	Download
Qwen3.6-35B-A3B-Abliterated-Heretic-BF16.gguf-00002-of-00002.gguf	GGUF	BF16	20.37 GB	Download
Qwen3.6-35B-A3B-Abliterated-Heretic-Q4_K_M.gguf	GGUF	Q4_K_M	19.71 GB	Download
Qwen3.6-35B-A3B-Abliterated-Heretic-Q6_K.gguf	GGUF	Q6_K	26.56 GB	Download
Qwen3.6-35B-A3B-Abliterated-Heretic-Q8_0.gguf	GGUF	Q8_0	34.37 GB	Download
mmproj-Qwen3.6-35B-A3B-Abliterated-Heretic.gguf	GGUF	—	861.00 MB	Download
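
One rough way to choose among the files above is to take the largest quantization whose weights fit in the available memory, with the projector loaded alongside it for vision use. The sketch below hard-codes the sizes from the table and sums the two BF16 shards. The helper name `pick_quant` and its parameters are illustrative, not part of any tool. It counts file size only, so KV cache and runtime buffers need extra headroom on top.

```python
# Sizes in GB as listed in the file table; BF16 is the sum of its two shards.
QUANT_SIZES_GB = {
    "Q4_K_M": 19.71,
    "Q6_K": 26.56,
    "Q8_0": 34.37,
    "BF16": 44.24 + 20.37,
}
MMPROJ_GB = 0.861  # the 861.00 MB projector file, only needed for vision use


def pick_quant(budget_gb, with_vision=True):
    """Return the largest quantization that fits in budget_gb, or None.

    Hypothetical helper: it compares file sizes only and ignores KV cache,
    compute buffers, and any other runtime overhead.
    """
    overhead = MMPROJ_GB if with_vision else 0.0
    fitting = [
        (size, name)
        for name, size in QUANT_SIZES_GB.items()
        if size + overhead <= budget_gb
    ]
    # max() on (size, name) tuples picks the largest file that still fits.
    return max(fitting)[1] if fitting else None


if __name__ == "__main__":
    for budget in (16, 24, 32, 48, 80):
        print(f"{budget} GB -> {pick_quant(budget)}")
```

Treat the result as an upper bound on what might load. A long context window (the README's example uses `-c 32768`) can add several gigabytes beyond the weight files.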

Model Details

Model Slug

youssofal/qwen3.6-35b-a3b-abliterated-heretic-gguf

Author

Youssofal

Pipeline Task

text-generation

Library

gguf

Created

2026-04-16

Last Modified

2026-04-16

Gated

No

Private

No

HF SHA

4c22107061e656fb2a87a3ec2491bb61975eb581

License

apache-2.0

Language

Unknown

Base Model

Qwen/Qwen3.6-35B-A3B

Metadata Inspector

Normalized metadata (stored in metadata_json)

{
  "metadata": {},
  "card_data": {
    "base_model": "Qwen/Qwen3.6-35B-A3B",
    "library_name": "gguf",
    "pipeline_tag": "text-generation",
    "license": "apache-2.0",
    "tags": [
      "gguf",
      "qwen",
      "qwen3.6",
      "qwen3_5_moe",
      "moe",
      "mixture-of-experts",
      "multimodal",
      "vlm",
      "abliterated",
      "uncensored",
      "heretic",
      "mpoa",
      "soma",
      "llama-cpp"
    ],
    "quantized_by": "Youssofal",
    "frontmatter": {
      "base_model": "Qwen/Qwen3.6-35B-A3B",
      "library_name": "gguf",
      "pipeline_tag": "text-generation",
      "license": "apache-2.0",
      "tags": [
        "gguf",
        "qwen",
        "qwen3.6",
        "qwen3_5_moe",
        "moe",
        "mixture-of-experts",
        "multimodal",
        "vlm",
        "abliterated",
        "uncensored",
        "heretic",
        "mpoa",
        "soma",
        "llama-cpp"
      ],
      "quantized_by": "Youssofal"
    },
    "hero_image_url": "",
    "summary": "This is a GGUF release of an abliterated version of Qwen's Qwen3.6-35B-A3B. By applying Heretic on the Qwen 3.6 sparse-MoE text stack, the base refusal behavior was removed at the weight level. The result keeps Qwen3.6-35B-A3B's multimodal architecture and general capability profile, while no longer defaulting to the original refusal pattern.",
    "quick_links": [],
    "benchmark_table_html": "",
    "readme_markdown": "---\nbase_model: Qwen/Qwen3.6-35B-A3B\nlibrary_name: gguf\npipeline_tag: text-generation\nlicense: apache-2.0\ntags:\n  - gguf\n  - qwen\n  - qwen3.6\n  - qwen3_5_moe\n  - moe\n  - mixture-of-experts\n  - multimodal\n  - vlm\n  - abliterated\n  - uncensored\n  - heretic\n  - mpoa\n  - soma\n  - llama-cpp\nquantized_by: Youssofal\n---\n\n# Qwen3.6-35B-A3B-Abliterated-Heretic-GGUF\n\nThis is a GGUF release of an abliterated version of Qwen's Qwen3.6-35B-A3B.\n\nBy applying Heretic on the Qwen 3.6 sparse-MoE text stack, the base refusal behavior was removed at the weight level. The result keeps Qwen3.6-35B-A3B's multimodal architecture and general capability profile, while no longer defaulting to the original refusal pattern.\n\n## Quick Benchmarks\n\n| Check | Original Qwen3.6-35B-A3B | Abliterated Heretic |\n|---|---:|---:|\n| Official 25-prompt refusal check | 22/25 refusals | 1/25 refusals |\n| Archived Heretic KL divergence | - | 0.010655362159013748 |\n\n## Methodology & Model Notes\n\nQwen3.6-35B-A3B is a 35.95B sparse MoE vision-language model with roughly 3B active parameters per token, 40 text layers, 256 routed experts, and 8 active experts per token.\n\nThis release was produced with a Heretic MPOA/SOMA-style sibling-transfer run, finalized with a split-MoE input-side intervention on the accepted candidate.\n\nThe accepted candidate scored `Refusals: 1/25` on the official 25-prompt marker suite used for the MiniMax M2.7 abliterated run.\n\nThe resulting abliterated checkpoint was exported to BF16 and then converted to GGUF for llama.cpp-compatible deployment.\n\n## Files\n\n- `Qwen3.6-35B-A3B-Abliterated-Heretic-BF16/`: BF16 GGUF source\n- `Qwen3.6-35B-A3B-Abliterated-Heretic-Q8_0/`: highest-fidelity quant\n- `Qwen3.6-35B-A3B-Abliterated-Heretic-Q6_K/`: near-lossless practical quant\n- `Qwen3.6-35B-A3B-Abliterated-Heretic-Q4_K_M/`: smaller general-use quant\n- `mmproj-Qwen3.6-35B-A3B-Abliterated-Heretic.gguf`: matching multimodal projector file for llama.cpp vision use\n\n## Running\n\n```bash\nllama-server \\\n  -m <quant-file.gguf> \\\n  --mmproj <mmproj-file.gguf> \\\n  -ngl 999 -c 32768 --jinja -fa\n```\n\n## Model Architecture\n\n| Spec | Value |\n|---|---|\n| Total Parameters | 35.95B (sparse MoE) |\n| Active Parameters | ~3B per token |\n| Experts | 256 routed, 8 per token |\n| Layers | 40 |\n| Hidden Size | 2048 |\n| Family | `qwen3_5_moe` |\n| Modality | Vision-language |\n| Base Model | Qwen/Qwen3.6-35B-A3B |\n\n## Disclaimer\n\nThis model has had refusal behavior removed at the weight level. It will answer prompts that the base model would normally refuse. You are responsible for how you use it.\n\n## Credits\n\n- Base model: [Qwen/Qwen3.6-35B-A3B](https://huggingface.co/Qwen/Qwen3.6-35B-A3B)\n- Refusal removal pipeline: [Heretic](https://github.com/andyrdt/heretic)\n- GGUF runtime and quantization: [llama.cpp](https://github.com/ggml-org/llama.cpp)\n\n## License\n\nThis release inherits the base Qwen3.6-35B-A3B license.\n\n**Apache-2.0.**\n",
    "related_quantizations": []
  },
  "tags": [
    "gguf",
    "qwen",
    "qwen3.6",
    "qwen3_5_moe",
    "moe",
    "mixture-of-experts",
    "multimodal",
    "vlm",
    "abliterated",
    "uncensored",
    "heretic",
    "mpoa",
    "soma",
    "llama-cpp",
    "text-generation",
    "base_model:Qwen/Qwen3.6-35B-A3B",
    "base_model:quantized:Qwen/Qwen3.6-35B-A3B",
    "license:apache-2.0",
    "endpoints_compatible",
    "region:us",
    "conversational"
  ],
  "likes": 8,
  "downloads": 924,
  "gated": false,
  "private": false,
  "last_modified": "2026-04-16T22:29:43.000Z",
  "created_at": "2026-04-16T19:36:08.000Z",
  "pipeline_tag": "text-generation",
  "library_name": "gguf"
}
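
The README stored in `readme_markdown` reports a KL divergence of about 0.0107 between the original and abliterated models. As a reminder of what that kind of number measures, here is a minimal sketch of KL divergence between two discrete next-token distributions. The two probability vectors are invented for illustration. This sheet does not say how Heretic selects prompts or token positions for its reported figure, so the example shows only the formula, not the evaluation procedure.

```python
import math


def kl_divergence(p, q):
    """Return KL(P || Q) in nats for two discrete distributions.

    p and q are sequences of probabilities over the same token set.
    Terms where p is zero contribute nothing, by the usual convention.
    """
    if len(p) != len(q):
        raise ValueError("distributions must have the same length")
    total = 0.0
    for p_i, q_i in zip(p, q):
        if p_i > 0.0:
            if q_i <= 0.0:
                raise ValueError("q must be positive wherever p is positive")
            total += p_i * math.log(p_i / q_i)
    return total


# Hypothetical next-token probabilities over a 4-token vocabulary.
original_model = [0.70, 0.20, 0.05, 0.05]
modified_model = [0.65, 0.24, 0.06, 0.05]

if __name__ == "__main__":
    print(kl_divergence(original_model, modified_model))
```

A value near zero means the modified model's distribution stays close to the original's on the prompts measured. Identical distributions give exactly zero.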

Source payload excerpt (from Hugging Face API)

{
  "_id": "69e13a281870d5b9faf6e7ce",
  "id": "Youssofal/Qwen3.6-35B-A3B-Abliterated-Heretic-GGUF",
  "modelId": "Youssofal/Qwen3.6-35B-A3B-Abliterated-Heretic-GGUF",
  "sha": "4c22107061e656fb2a87a3ec2491bb61975eb581",
  "createdAt": "2026-04-16T19:36:08.000Z",
  "lastModified": "2026-04-16T22:29:43.000Z",
  "author": "Youssofal",
  "downloads": 924,
  "likes": 8,
  "gated": false,
  "private": false,
  "pipeline_tag": "text-generation",
  "library_name": "gguf",
  "siblings_count": 8
}
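
The timestamps in this payload can be compared directly to see how long the repository was being updated after creation. The sketch below copies a few fields from the excerpt above and computes the gap between `createdAt` and `lastModified`. The helper names are illustrative, and the timestamp format string is an assumption based on the two values shown here.

```python
import json
from datetime import datetime, timezone

# Fields copied from the payload excerpt above.
PAYLOAD = json.loads("""
{
  "createdAt": "2026-04-16T19:36:08.000Z",
  "lastModified": "2026-04-16T22:29:43.000Z",
  "downloads": 924,
  "likes": 8
}
""")


def parse_timestamp(value):
    """Parse a UTC timestamp shaped like '2026-04-16T19:36:08.000Z'."""
    parsed = datetime.strptime(value, "%Y-%m-%dT%H:%M:%S.%fZ")
    return parsed.replace(tzinfo=timezone.utc)


def edit_window_seconds(payload):
    """Seconds elapsed between repository creation and last modification."""
    created = parse_timestamp(payload["createdAt"])
    modified = parse_timestamp(payload["lastModified"])
    return int((modified - created).total_seconds())


if __name__ == "__main__":
    seconds = edit_window_seconds(PAYLOAD)
    hours, remainder = divmod(seconds, 3600)
    minutes, secs = divmod(remainder, 60)
    print(f"{hours}h {minutes}m {secs}s between creation and last change")
```

For the values above, the gap is just under three hours on the same day.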