What license applies to LuffyTheFox/Qwen3.6-35B-A3B-Uncensored-Genesis-V2-APEX-MTP-GGUF?

License: apache-2.0. Verify terms on Hugging Face before commercial use.

Model Intelligence Sheet

LuffyTheFox/Qwen3.6-35B-A3B-Uncensored-Genesis-V2-APEX-MTP-GGUF overview

Q: How do I run LuffyTheFox/Qwen3.6-35B-A3B-Uncensored-Genesis-V2-APEX-MTP-GGUF locally?

Download a GGUF file from this page and load it in guIDE or llama.cpp. Pipeline task: image-text-to-text.

⚡ https://web.tribute.tg/d/KIH https://web.tribute.tg/d/KIH ⚡ If you like this Genesis LLM release you can donate https://web.tribute.tg/d/KIH to me via @Tribu…

ggufuncensoredqwen3.6moevisionmultimodalgenesisimage-text-to-textconversationalenzhmultilingualbase_model:HauhauCS/Qwen3.6-35B-A3B-Uncensored-HauhauCS-Aggressivebase_model:quantized:HauhauCS/Qwen3.6-35B-A3B-Uncensored-HauhauCS-Aggressivelicense:apache-2.0endpoints_compatibleregion:usimatrix

Runs locally from ~857.6 MB disk (4 GB VRAM class GPUs with llama.cpp / guIDE).

Downloads

61,868

Likes

Pipeline

image-text-to-text

Author

LuffyTheFox

Repository Files & Downloads

7 GGUF files detected

Direct downloads for local inference

File	Type	Quantization	Size	Link
Qwen3.6-35B-A3B-Uncensored-Genesis-APEX-Compact.gguf	GGUF	GGUF	16.14 GB	Download
Qwen3.6-35B-A3B-Uncensored-Genesis-APEX.gguf	GGUF	GGUF	23.87 GB	Download
Qwen3.6-35B-A3B-Uncensored-Genesis-MTP-APEX-Compact.gguf	GGUF	GGUF	16.78 GB	Download
Qwen3.6-35B-A3B-Uncensored-Genesis-MTP-APEX.gguf	GGUF	GGUF	24.63 GB	Download
Qwen3.6-35B-A3B-Uncensored-Genesis-MTP-Q8_K_P.gguf	GGUF	Q8_K_P	41.45 GB	Download
Qwen3.6-35B-A3B-Uncensored-Genesis-Q8_K_P.gguf	GGUF	Q8_K_P	40.61 GB	Download
mmproj-Qwen3.6-35B-A3B-Uncensored-Genesis-f16.gguf	GGUF	F16	857.6 MB	Download

Model Details

Model ID	LuffyTheFox/Qwen3.6-35B-A3B-Uncensored-Genesis-V2-APEX-MTP-GGUF
Author	LuffyTheFox
Pipeline	image-text-to-text
License	apache-2.0
Base model	HauhauCS/Qwen3.6-35B-A3B-Uncensored-HauhauCS-Aggressive
Last modified	2026-06-16T06:22:58.000Z

Model README

---

license: apache-2.0

tags:

uncensored
qwen3.6
moe
gguf
vision
multimodal
genesis

language:

en
zh
multilingual

pipeline_tag: image-text-to-text

base_model:

HauhauCS/Qwen3.6-35B-A3B-Uncensored-HauhauCS-Aggressive

---

> ⚡ https://web.tribute.tg/d/KIH ⚡ If you like this Genesis LLM release you can donate to me via @Tribute bot in Telegram messenger and support future Genesis LLM development.

🌟 Qwen3.6-35B-A3B-Uncensored-HauhauCS-Aggressive -> Genesis-V2

> Key difference from Wasserstein release and old Genesis release is data regeneration in model via mathematical statistics based on what it's already learned and stored in tensors. I regenerated even more dead blocks from data in healthy blocks in this version. Also I distilled model dictionary from garbage tokens.

> Join the Discord for updates, roadmaps, projects, or just to chat.

Base model. HauhauCS/Qwen3.6-35B-A3B-Uncensored-HauhauCS-Aggressive- 0/465 refusals.

Thanks to HauhauCS

Usage

Ready to use. Recommended quant: APEX or MTP-APEX

On my RTX 3060 12GB and regular chatting, I have more tokens per second without MTP.

Tensor drift repair by me. Method: Sig-ScaleSync-Wasserstein

LLM models often have:

Saturated weights: the model's activations are stuck, gradients vanish, outputs degrade
Scale mismatches: one layer's weights are 10× larger than its peers for no good reason
Mean drift: weight distributions shifted positive or negative, breaking symmetry assumptions

My approach fixes all of that without retraining - pure numerical surgery on the raw bytes of the file.

Quantization script available here: https://pastebin.com/hXhcMJn9

Feel free to do your own quants if you want.

Qwen3.6-35B-A3B-Uncensored-HauhauCS-Aggressive: Diagnostic & Repair Summary

| Metric | Value |

|--------|-------|

| Weight tensors analyzed | 500 |

| Healthy (all criteria) | 497 |

| Repaired (C2 – scale misalignment) | 3 |

| Skipped | 233 |

Repair Effectiveness

|--------|--------|-------|-------------|

| S (saturation error) | 0.0023 | 0.0008 | 63.7% |

| W1 (Wasserstein‑1) | 0.0035 | 0.0008 | 76.2% |

Scale correction factors (α): min = 0.577, mean = 0.602, max = 0.653.

Repaired Tensors

All three are ssm_conv1d.weight layers – recurrent state transition layers responsible for long‑context memory.

|--------|---|---------------|-----------|----------|

| blk.36.ssm_conv1d.weight | 0.5765 | 0.553 | 0.0038 | 0.0009 |

| blk.37.ssm_conv1d.weight | 0.5768 | 0.725 | 0.0040 | 0.0009 |

| blk.38.ssm_conv1d.weight | 0.6533 | 0.649 | 0.0026 | 0.0006 |

Interpretation: All three layers were too loud (σ_w > σ_med by 50–100%). Scale correction restored them to peer median. W1 dropped by ≈80%, confirming distribution shape normalized.

---

Verdict: Model is clinically healthy. 497 out of 500 weight tensors passed all four criteria. Three SSM layers repaired successfully. No saturation, no W1 drift, no ReLU asymmetry. Ready for use.

---

Links:

---

Wanna fix your GGUF model?

Contact: luffythefox@mail.ru

My Telegram: @LuffyTheFox

🌟 Recommended Settings (LM Studio)

Set K Cache Quantization Type and V Cache Quantization Type in advanced model loading settings to Q8_0 or F16.

Chat template: chat_template.jinja

Chat template: chat_template_thinking.jinja

| Parameter | Value |

|-----------|-------|

| Temperature | 0.7 |

| Top K Sampling | 20 |

| Presence Penalty| 1.5 |

| Repeat Penalty| 1.0 |

| Top P Sampling | 0.8 |

| Min P Sampling | 0 |

| Seed | 42 |

System prompt: System_Prompt.txt

Or use this minimal string as the first line:

> You are Qwen, created by Alibaba Cloud. You are a helpful assistant.

Then add anything you want after.

About

No changes to datasets or capabilities. Fully functional - 100% of what the original authors intended, just without refusals and with the critical architecture bug fixed on output layers.

These are meant to be the best lossless uncensored models out there.

---

Specs

35B total parameters, ~3B active per forward pass (MoE)
256 experts, 8 routed + 1 shared per token
Hybrid architecture: Gated DeltaNet linear attention + full softmax attention (3:1 ratio)
40 layers, pattern: 10 × (3 × DeltaNet-MoE + 1 × Attention-MoE)
262K native context (extendable to 1M with YaRN)
Natively multimodal (text, image, video)
Multi-token prediction (MTP) support
248K vocabulary, 201 languages
Base model. HauhauCS/Qwen3.6-35B-A3B-Uncensored-HauhauCS-Aggressive

---

Recommended Settings (Official Qwen Authors)

Thinking mode (default):

General: temperature=1.0, top_p=0.95, top_k=20, min_p=0, presence_penalty=1.5
Coding/precise tasks: temperature=0.6, top_p=0.95, top_k=20, min_p=0, presence_penalty=0

Non-thinking mode:

General: temperature=0.7, top_p=0.8, top_k=20, min_p=0, presence_penalty=1.5
Reasoning tasks: temperature=1.0, top_p=1.0, top_k=40, min_p=0, presence_penalty=2.0

Important:

Keep at least 128K context to preserve thinking capabilities
Use --jinja flag with llama.cpp for proper chat template handling
Vision support requires the mmproj file alongside the main GGUF

---

Compatibility

Works with llama.cpp, LM Studio, koboldcpp, and other GGUF-compatible runtimes.

Run LuffyTheFox/Qwen3.6-35B-A3B-Uncensored-Genesis-V2-APEX-MTP-GGUF with guIDE

Download guIDE — the AI-native code editor with local LLM inference and 69 built-in tools.

Download guIDE → · Browse 524k+ models · Compare models

Source: Hugging Face · Compare models