Qwen 3.6 35B A3B DFlash GGUF

GGUF files for use with ikawrakow/ik_llama.cpp, currently targeting PR #1970. The small quantizations provided here are intended for testing; feel free to create your own quantization.

Derived from the safetensors DFlash draft model z-lab/Qwen3.6-35B-A3B-DFlash.

Compatible target model

Qwen3.6-35B-A3B-UD.gguf - Mainly tested with Q4_K_M.

Files

| File | Quant | Size |
|---|---|---|
| qwen36-35b-a3b-dflash-F16.gguf | F16 | 915 MB |
| qwen36-35b-a3b-dflash-Q8_0.gguf | Q8_0 | 491 MB |
| qwen36-35b-a3b-dflash-Q4_K_M.gguf | Q4_K_M | 279 MB |

Usage

./build/bin/llama-server \
  --model <target.gguf> \
  --model-draft <draft.gguf> \
  --spec-type dflash:n_max=<N>,cross_ctx=<N> ...
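
The command above can be wrapped in a small helper so the target, draft, and both `<N>` values are filled in explicitly before anything is launched. This is a minimal sketch: the function name `dflash_command` is hypothetical, it only prints the command rather than running it, and the README does not say which values suit `n_max` or `cross_ctx`, so those remain yours to choose. Any further arguments implied by the trailing `...` are not covered.

```shell
# Hypothetical helper: print the llama-server command from the Usage section.
# Arguments: <target.gguf> <draft.gguf> <n_max> <cross_ctx>
# The two numeric values are not specified by the README.
dflash_command() {
  if [ "$#" -ne 4 ]; then
    echo "usage: dflash_command <target.gguf> <draft.gguf> <n_max> <cross_ctx>" >&2
    return 1
  fi
  # Only the flags shown in the README are used here.
  printf './build/bin/llama-server --model %s --model-draft %s --spec-type dflash:n_max=%s,cross_ctx=%s\n' \
    "$1" "$2" "$3" "$4"
}
```

Calling `dflash_command <target.gguf> qwen36-35b-a3b-dflash-Q8_0.gguf <N> <N>` prints a single line that can be inspected and then pasted into a shell. Printing instead of executing keeps a wrong path or a missing value from starting the server.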

Notes

This repo contains DFlash draft models, not a standalone instruct model.
Use it with the matching target family listed above.
Q4_K_M and Q8_0 are small test-oriented quants; create your own quant if you need a different tradeoff.
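
For creating your own quant, one possible route is the `llama-quantize` tool that llama.cpp-family builds ship, starting from the F16 file. The sketch below only prints the command. It assumes the binary sits next to `llama-server` under `./build/bin/` and that the build you are using can quantize this draft architecture; neither is confirmed by the README. The function name `quantize_command` is hypothetical.

```shell
# Hypothetical helper: print a llama-quantize command for a chosen quant type.
# Arguments: <source F16 GGUF> <quant type accepted by llama-quantize>
quantize_command() {
  local src="$1" qtype="$2"
  # Derive the output name by swapping the -F16.gguf suffix for the new type.
  # If the source name lacks that suffix, the new type is simply appended.
  local out="${src%-F16.gguf}-${qtype}.gguf"
  # The ./build/bin/ path is an assumption mirroring the Usage section.
  printf './build/bin/llama-quantize %s %s %s\n' "$src" "$out" "$qtype"
}
```

For example, `quantize_command qwen36-35b-a3b-dflash-F16.gguf Q5_K_M` prints a command that would write `qwen36-35b-a3b-dflash-Q5_K_M.gguf`.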

Question 2

What license applies to Radamanthys11/Qwen3.6-35B-A3B-DFlash-GGUF?

Accepted Answer

License: See model card. Verify terms on Hugging Face before commercial use.

Question 3

How do I run Radamanthys11/Qwen3.6-35B-A3B-DFlash-GGUF locally?

Accepted Answer

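Download a GGUF draft file from this page along with a compatible target model. The README targets ikawrakow/ik_llama.cpp (currently PR #1970), where the draft is passed to llama-server with --model-draft. The files are DFlash draft models, not standalone instruct models, so loading one by itself in guIDE or llama.cpp is not the intended use. Pipeline task: text-generation.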

Question 4

How much VRAM or disk space does Radamanthys11/Qwen3.6-35B-A3B-DFlash-GGUF need?

Accepted Answer

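The draft files on this page need about 278.2 MB (Q4_K_M) to 914.6 MB (F16) of disk space. Because a draft model runs alongside its target, total disk and VRAM requirements depend mainly on the target GGUF you pair it with; the 4 GB VRAM class estimate for llama.cpp / guIDE reflects the draft file alone.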

Radamanthys11/Qwen3.6-35B-A3B-DFlash-GGUF overview

Repository Files & Downloads

| File | Type | Quantization | Size |
|---|---|---|---|
| qwen36-35b-a3b-dflash-F16.gguf | GGUF | F16 | 914.6 MB |
| qwen36-35b-a3b-dflash-Q4_K_M.gguf | GGUF | Q4_K_M | 278.2 MB |
| qwen36-35b-a3b-dflash-Q8_0.gguf | GGUF | Q8_0 | 490.8 MB |

Model Details

| Field | Value |
|---|---|
| Model ID | Radamanthys11/Qwen3.6-35B-A3B-DFlash-GGUF |
| Author | Radamanthys11 |
| Pipeline | text-generation |
| License | Not specified |
| Base model | z-lab/Qwen3.6-35B-A3B-DFlash |
| Last modified | 2026-06-15T01:41:09.000Z |