cahlen/qwen3.5-35b-a3b-uncensored-hauhaucs-aggressive-gguf (IQ2_XXS GGUF, free download) is indexed on GraySoft with repository links, GGUF quant files, and Hugging Face metadata. Use this page to pick a local model for guIDE or other runtimes. Related models in the same shard are listed below.

Model Intelligence Sheet

cahlen/qwen3.5-35b-a3b-uncensored-hauhaucs-aggressive-gguf overview

Full llama.cpp quantization ladder for HauhauCS/Qwen3.5-35B-A3B-Uncensored-HauhauCS-Aggressive. K-quants from Q8_0 through Q4 use standard llama-quantize without an importance matrix. Low-bit Q3_K_L / Q3_K_M / Q3_K_S, Q2_K, and all IQ* types use WikiText-2 importance-matrix calibration (200 chunks) when this workspace contains imatrix.dat.
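The selection rule in the summary can be sketched in Python. This is a minimal illustration, not the repository author's actual script: the function names and the file paths (`model-bf16.gguf`, `imatrix.dat`) are hypothetical, and the only CLI behavior assumed is that `llama-quantize` takes an optional `--imatrix FILE` followed by the input path, output path, and type name. The code only builds the argument list; it does not run anything.

```python
from typing import List, Optional


def needs_imatrix(quant: str) -> bool:
    """Rule from the summary: Q3_K_*, Q2_K and all IQ* types use an imatrix."""
    q = quant.upper()
    return q.startswith("IQ") or q.startswith("Q3_K") or q == "Q2_K"


def quantize_command(src: str, dst: str, quant: str,
                     imatrix: Optional[str] = None) -> List[str]:
    """Build (but do not execute) a llama-quantize argument list."""
    cmd = ["llama-quantize"]
    # The imatrix is only passed when one exists AND the type calls for it.
    if imatrix is not None and needs_imatrix(quant):
        cmd += ["--imatrix", imatrix]
    cmd += [src, dst, quant]
    return cmd


# Hypothetical paths, for illustration only.
print(quantize_command("model-bf16.gguf", "model-IQ2_XXS.gguf", "IQ2_XXS", "imatrix.dat"))
print(quantize_command("model-bf16.gguf", "model-Q4_K_S.gguf", "Q4_K_S", "imatrix.dat"))
```

Passing `imatrix=None` models the case where the workspace has no `imatrix.dat`, in which case every type falls back to plain quantization.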

gguf, quantized, llama-cpp, qwen, qwen3.5, moe, vision, multimodal, uncensored, en, zh, base_model:HauhauCS/Qwen3.5-35B-A3B-Uncensored-HauhauCS-Aggressive, base_model:quantized:HauhauCS/Qwen3.5-35B-A3B-Uncensored-HauhauCS-Aggressive, license:apache-2.0, endpoints_compatible, region:us, imatrix, conversational

Downloads

5,989

Likes

1

Pipeline

—

Library

—

Visibility

Public

Access

Open

Repository Files & Downloads

11 files detected

Direct downloads for all repository files

File	Type	Quantization	Size	Link
Qwen3.5-35B-A3B-Uncensored-HauhauCS-Aggressive-IQ1_M.gguf	GGUF	IQ1_M	7.67 GB	Download
Qwen3.5-35B-A3B-Uncensored-HauhauCS-Aggressive-IQ2_S.gguf	GGUF	IQ2_S	9.92 GB	Download
Qwen3.5-35B-A3B-Uncensored-HauhauCS-Aggressive-IQ2_XXS.gguf	GGUF	IQ2_XXS	8.85 GB	Download
Qwen3.5-35B-A3B-Uncensored-HauhauCS-Aggressive-IQ3_S.gguf	GGUF	IQ3_S	14.20 GB	Download
Qwen3.5-35B-A3B-Uncensored-HauhauCS-Aggressive-IQ3_XXS.gguf	GGUF	IQ3_XXS	12.69 GB	Download
Qwen3.5-35B-A3B-Uncensored-HauhauCS-Aggressive-Q2_K.gguf	GGUF	Q2_K	12.05 GB	Download
Qwen3.5-35B-A3B-Uncensored-HauhauCS-Aggressive-Q3_K_L.gguf	GGUF	Q3_K_L	16.87 GB	Download
Qwen3.5-35B-A3B-Uncensored-HauhauCS-Aggressive-Q3_K_S.gguf	GGUF	Q3_K_S	14.14 GB	Download
Qwen3.5-35B-A3B-Uncensored-HauhauCS-Aggressive-Q4_K_S.gguf	GGUF	Q4_K_S	18.52 GB	Download
Qwen3.5-35B-A3B-Uncensored-HauhauCS-Aggressive-Q5_K_S.gguf	GGUF	Q5_K_S	22.33 GB	Download
mmproj-Qwen3.5-35B-A3B-Uncensored-HauhauCS-Aggressive-f16.gguf	GGUF	F16	857.62 MB	Download
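As a rough aid for choosing among the files above, the following Python sketch picks the largest quantization whose file fits a given size budget. The sizes are copied from the table; the function name, the budget parameter, and the decimal MB-to-GB conversion for the mmproj file are assumptions made for illustration. File size is only a lower bound on memory use, since context length and the loader add overhead.

```python
from typing import Optional, Tuple

# Quantization -> file size in GB, copied from the download table.
QUANT_SIZES_GB = {
    "IQ1_M": 7.67,
    "IQ2_XXS": 8.85,
    "IQ2_S": 9.92,
    "Q2_K": 12.05,
    "IQ3_XXS": 12.69,
    "Q3_K_S": 14.14,
    "IQ3_S": 14.20,
    "Q3_K_L": 16.87,
    "Q4_K_S": 18.52,
    "Q5_K_S": 22.33,
}
# Vision projector (857.62 MB); decimal conversion is an assumption.
MMPROJ_GB = 857.62 / 1000


def largest_fitting(budget_gb: float,
                    with_vision: bool = False) -> Optional[Tuple[str, float]]:
    """Return (quant, size_gb) for the largest file within budget, or None."""
    extra = MMPROJ_GB if with_vision else 0.0
    candidates = [(size, name) for name, size in QUANT_SIZES_GB.items()
                  if size + extra <= budget_gb]
    if not candidates:
        return None
    size, name = max(candidates)
    return name, size


print(largest_fitting(10.0))
print(largest_fitting(10.0, with_vision=True))
```

Setting `with_vision=True` reserves room for the mmproj file, which can push the choice down one rung on the ladder.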

Model Details Live

Model Slug

cahlen/qwen3.5-35b-a3b-uncensored-hauhaucs-aggressive-gguf

Author

cahlen

Pipeline Task

—

Library

—

Created

2026-04-02

Last Modified

2026-04-03

Gated

No

Private

No

HF SHA

01b3d25eae945154168f7f0c821571d68d783345

License

apache-2.0

Language

en, zh

Base Model

HauhauCS/Qwen3.5-35B-A3B-Uncensored-HauhauCS-Aggressive

Metadata Inspector

Normalized metadata (stored in metadata_json)

{
  "metadata": {},
  "card_data": {
    "base_model": "HauhauCS/Qwen3.5-35B-A3B-Uncensored-HauhauCS-Aggressive",
    "tags": [
      "gguf",
      "quantized",
      "llama-cpp",
      "qwen",
      "qwen3.5",
      "moe",
      "vision",
      "multimodal",
      "uncensored"
    ],
    "license": "apache-2.0",
    "language": [
      "en",
      "zh"
    ],
    "frontmatter": {
      "base_model": "HauhauCS/Qwen3.5-35B-A3B-Uncensored-HauhauCS-Aggressive",
      "tags": [
        "gguf",
        "quantized",
        "llama-cpp",
        "qwen",
        "qwen3.5",
        "moe",
        "vision",
        "multimodal",
        "uncensored"
      ],
      "license": "apache-2.0",
      "language": [
        "en",
        "zh"
      ]
    },
    "hero_image_url": "",
    "summary": "Full **llama.cpp** quantization ladder for HauhauCS/Qwen3.5-35B-A3B-Uncensored-HauhauCS-Aggressive. **K-quants** from Q8_0 through Q4 use standard llama-quantize without an importance matrix. Low-bit Q3_K_L / Q3_K_M / Q3_K_S, Q2_K, and all IQ* types use **WikiText-2** importance-matrix calibration (200 chunks) when this workspace contains imatrix.dat.",
    "quick_links": [],
    "benchmark_table_html": "",
    "readme_markdown": "---\nbase_model: HauhauCS/Qwen3.5-35B-A3B-Uncensored-HauhauCS-Aggressive\ntags:\n  - gguf\n  - quantized\n  - llama-cpp\n  - qwen\n  - qwen3.5\n  - moe\n  - vision\n  - multimodal\n  - uncensored\nlicense: apache-2.0\nlanguage:\n  - en\n  - zh\n---\n\n# Qwen3.5-35B-A3B-Uncensored-HauhauCS-Aggressive-GGUF\n\nFull **llama.cpp** quantization ladder for [`HauhauCS/Qwen3.5-35B-A3B-Uncensored-HauhauCS-Aggressive`](https://huggingface.co/HauhauCS/Qwen3.5-35B-A3B-Uncensored-HauhauCS-Aggressive). **K-quants** from Q8_0 through Q4 use standard `llama-quantize` without an importance matrix. Low-bit `Q3_K_L` / `Q3_K_M` / `Q3_K_S`, `Q2_K`, and all `IQ*` types use **WikiText-2** importance-matrix calibration (200 chunks) when this workspace contains `imatrix.dat`.\n\n## About the Source Model\n\nThis repo is a **GGUF quantization ladder** for [`HauhauCS/Qwen3.5-35B-A3B-Uncensored-HauhauCS-Aggressive`](https://huggingface.co/HauhauCS/Qwen3.5-35B-A3B-Uncensored-HauhauCS-Aggressive): an **Aggressive** uncensored build based on [`Qwen/Qwen3.5-35B-A3B`](https://huggingface.co/Qwen/Qwen3.5-35B-A3B) (MoE, multimodal, long context). Low-bit K-quants (Q3_K*, Q2_K) and IQ-types use an importance matrix when `imatrix.dat` was produced in this run—same spirit as [our compacted Qwen3.5 GGUF ladder](https://huggingface.co/cahlen/qwen3.5-35b-a3b-compacted-GGUF).\n\nFor refusal behavior, recommended sampling settings, and **mmproj** vision tensors, follow the [HauhauCS model card](https://huggingface.co/HauhauCS/Qwen3.5-35B-A3B-Uncensored-HauhauCS-Aggressive) and Qwen docs. 
**Note:** LM Studio may show `256×2.6B` in the params column; HauhauCS reports this is a cosmetic metadata quirk.\n\n### Complementary files (read this if a quant is missing here)\n\nThe [HauhauCS weight index](https://huggingface.co/HauhauCS/Qwen3.5-35B-A3B-Uncensored-HauhauCS-Aggressive) already hosts **BF16**, **Q8_0** through **Q6_K**, several **Q4/Q5** variants, **IQ4_XS**, **IQ3_M**, **IQ2_M**, **Q3_K_M**, etc. This **cahlen** companion repo (HF names end with **-GGUF**) is **disk-aware**: it adds the *extra* ladder rungs we use on constrained hardware (e.g. **Q5_K_S**, **Q4_K_S**, **Q3_K_L** / **Q3_K_S**, **Q2_K**, **IQ3_S**, **IQ3_XXS**, **IQ2_S**, **IQ2_XXS**, **IQ1_M**) with the same **WikiText-2 / 200-chunk** imatrix workflow as [cahlen/qwen3.5-35b-a3b-compacted-GGUF](https://huggingface.co/cahlen/qwen3.5-35b-a3b-compacted-GGUF). Pull from HauhauCS if you need a size we do not mirror here.\n\n## Available Quantizations\n\n| Filename | Quant | Size | Notes |\n|----------|-------|------|-------|\n| Q5_K_S | Q5_K_S | 23G | K-quant |\n| Q4_K_S | Q4_K_S | 19G | K-quant |\n| Q3_K_L | Q3_K_L | 17G | imatrix |\n| Q3_K_S | Q3_K_S | 15G | imatrix |\n| IQ3_S | IQ3_S | 15G | imatrix |\n| IQ3_XXS | IQ3_XXS | 13G | imatrix |\n| Q2_K | Q2_K | 13G | imatrix |\n| IQ2_S | IQ2_S | 10G | imatrix |\n| IQ2_XXS | IQ2_XXS | 8.9G | imatrix |\n| IQ1_M | IQ1_M | 7.7G | imatrix |\n| mmproj-...-f16.gguf | mmproj (vision) | 858M | Pair with any quant above |\n\nAll filenames are prefixed with `Qwen3.5-35B-A3B-Uncensored-HauhauCS-Aggressive-`. The BF16 baseline (65G) was used locally for quantization but is **not** uploaded to save space; grab it from the [HauhauCS source repo](https://huggingface.co/HauhauCS/Qwen3.5-35B-A3B-Uncensored-HauhauCS-Aggressive) if needed. \"imatrix\" rows used WikiText-2 importance-matrix calibration (200 chunks).\n\n\n\n## Quality (WikiText-2 Perplexity)\n\nLower is better. 
First row is the unquantized baseline.\n\n| Quant | Size | Perplexity | vs Baseline |\n|-------|------|-----------|-------------|\n| BF16 (baseline) | 65G | 6.4393 | — |\n| Q5_K_S | 23G | 6.4871 | +0.7% |\n| Q4_K_S | 19G | 6.6214 | +2.8% |\n| Q3_K_L | 17G | 6.7204 | +4.4% |\n| IQ3_S | 15G | 6.7631 | +5.0% |\n| Q3_K_S | 15G | 6.9724 | +8.3% |\n| IQ3_XXS | 13G | 7.0490 | +9.5% |\n| Q2_K | 13G | 7.4896 | +16.3% |\n| IQ2_S | 10G | 8.1019 | +25.8% |\n| IQ2_XXS | 8.9G | 9.0738 | +40.9% |\n| IQ1_M | 7.7G | 11.1425 | +73.0% |\n\nMeasured with `llama-perplexity` on the WikiText-2 test set (580 chunks, context 512). BF16 baseline evaluated on CPU; quantized variants on NVIDIA RTX 5090.\n\n## How to Use\n\n### With llama.cpp (text)\n```bash\nllama-cli -m Qwen3.5-35B-A3B-Uncensored-HauhauCS-Aggressive-Q4_K_S.gguf --jinja -c 131072 -ngl 99 -p \"Hello\"\n```\n\n### With llama.cpp (vision)\n```bash\nllama-cli -m Qwen3.5-35B-A3B-Uncensored-HauhauCS-Aggressive-Q4_K_S.gguf \\\n  --mmproj mmproj-Qwen3.5-35B-A3B-Uncensored-HauhauCS-Aggressive-f16.gguf \\\n  --jinja -c 131072 -ngl 99\n```\n\n### With llama-server\n```bash\nllama-server -m Qwen3.5-35B-A3B-Uncensored-HauhauCS-Aggressive-Q4_K_S.gguf --jinja -c 131072 -ngl 99\n```\n\n### With Ollama\n```bash\nollama run hf.co/cahlen/Qwen3.5-35B-A3B-Uncensored-HauhauCS-Aggressive-GGUF:Q4_K_S\n```\n\n### With LM Studio\nDownload any GGUF from the table and load it.\n\n## Choosing a Quant\n\nRough **disk size / VRAM** guidance (actual usage varies by context length and loader). 
Quants marked ★ are in **this repo**; others are on the [HauhauCS source repo](https://huggingface.co/HauhauCS/Qwen3.5-35B-A3B-Uncensored-HauhauCS-Aggressive).\n\n| Your VRAM | Try | Size |\n|-----------|-----|------|\n| 24GB+ | Q8_0 or Q6_K (HauhauCS) | largest |\n| 16GB | ★ Q5_K_S / ★ Q4_K_S | 19–23G |\n| 12GB | ★ Q3_K_L / ★ IQ3_S | 15–17G |\n| 8GB | ★ IQ3_XXS / ★ Q2_K | 13G |\n| 6GB | ★ IQ2_S / ★ IQ2_XXS | 8.9–10G |\n\n## Quantization Details\n\n- **Quantized by**: [cahlen](https://huggingface.co/cahlen)\n- **Importance matrix**: WikiText-2 (`wikitext-2-raw-v1`, 200 chunks), when generated for this run\n- **Tool**: [llama.cpp](https://github.com/ggml-org/llama.cpp) @ `59d840209`\n- **Hardware**: NVIDIA RTX 5090 32GB / Intel Core Ultra 9 285K / 188GB RAM\n",
    "related_quantizations": []
  },
  "tags": [
    "gguf",
    "quantized",
    "llama-cpp",
    "qwen",
    "qwen3.5",
    "moe",
    "vision",
    "multimodal",
    "uncensored",
    "en",
    "zh",
    "base_model:HauhauCS/Qwen3.5-35B-A3B-Uncensored-HauhauCS-Aggressive",
    "base_model:quantized:HauhauCS/Qwen3.5-35B-A3B-Uncensored-HauhauCS-Aggressive",
    "license:apache-2.0",
    "endpoints_compatible",
    "region:us",
    "imatrix",
    "conversational"
  ],
  "likes": 1,
  "downloads": 5989,
  "gated": false,
  "private": false,
  "last_modified": "2026-04-03T18:30:51.000Z",
  "created_at": "2026-04-02T22:39:59.000Z",
  "pipeline_tag": "",
  "library_name": ""
}
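The "vs Baseline" column in the README's perplexity table (stored in `readme_markdown` above) appears to be the relative increase over the BF16 perplexity. The Python sketch below recomputes it from the listed values; the helper name is hypothetical and the formula is inferred from the numbers rather than stated by the author.

```python
BASELINE_PPL = 6.4393  # BF16 row of the README perplexity table

# Quant -> WikiText-2 perplexity, copied from the README table.
QUANT_PPL = {
    "Q5_K_S": 6.4871,
    "Q4_K_S": 6.6214,
    "Q3_K_L": 6.7204,
    "IQ3_S": 6.7631,
    "Q3_K_S": 6.9724,
    "IQ3_XXS": 7.0490,
    "Q2_K": 7.4896,
    "IQ2_S": 8.1019,
    "IQ2_XXS": 9.0738,
    "IQ1_M": 11.1425,
}


def pct_over_baseline(ppl: float, baseline: float = BASELINE_PPL) -> float:
    """Relative perplexity increase over the baseline, in percent (1 decimal)."""
    return round((ppl / baseline - 1.0) * 100.0, 1)


for name, ppl in QUANT_PPL.items():
    print(f"{name:8s} {ppl:8.4f} {pct_over_baseline(ppl):+.1f}%")
```

For the IQ2_XXS file this gives a 40.9% increase over BF16, matching the README row.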

Source payload excerpt (from Hugging Face API)

{
  "_id": "69cef03f9c435d0ffaf5c73f",
  "id": "cahlen/Qwen3.5-35B-A3B-Uncensored-HauhauCS-Aggressive-GGUF",
  "modelId": "cahlen/Qwen3.5-35B-A3B-Uncensored-HauhauCS-Aggressive-GGUF",
  "sha": "01b3d25eae945154168f7f0c821571d68d783345",
  "createdAt": "2026-04-02T22:39:59.000Z",
  "lastModified": "2026-04-03T18:30:51.000Z",
  "author": "cahlen",
  "downloads": 5989,
  "likes": 1,
  "gated": false,
  "private": false,
  "pipeline_tag": "",
  "library_name": "",
  "siblings_count": 13
}
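The `id` and `sha` fields in the payload above are enough to build a download link pinned to this exact revision. The Python sketch below assumes the usual Hugging Face Hub URL layout (`/<repo>/resolve/<revision>/<filename>`), which this page does not state itself; the function name is hypothetical, and the code only constructs the URL without fetching anything.

```python
from urllib.parse import quote

REPO_ID = "cahlen/Qwen3.5-35B-A3B-Uncensored-HauhauCS-Aggressive-GGUF"
SHA = "01b3d25eae945154168f7f0c821571d68d783345"
FILENAME = "Qwen3.5-35B-A3B-Uncensored-HauhauCS-Aggressive-IQ2_XXS.gguf"


def resolve_url(repo_id: str, filename: str, revision: str = "main") -> str:
    """Build a Hub download URL; the URL layout is an assumption, see lead-in."""
    # quote() keeps "/" by default, so the "author/name" repo id stays intact.
    return (f"https://huggingface.co/{quote(repo_id)}"
            f"/resolve/{quote(revision)}/{quote(filename)}")


# Pinning to the commit SHA keeps the link stable if the repo is updated later.
print(resolve_url(REPO_ID, FILENAME, SHA))
```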

More models in this shard