Model Intelligence Sheet
prithivMLmods/Qwen3-4B-abliterated-f32-GGUFs overview
Qwen3-4B-abliterated is an experimental, uncensored variant of the Qwen/Qwen3-4B language model, built to explore how refusals and latent fine-tuning work in large language models. It uses a technique called "abliteration": a refusal direction is computed and then subtracted from the hidden states written by selected modules (such as o_proj), with the goal of minimizing refusals without degrading output quality. The refusal direction is found by comparing residual streams between harmful and harmless prompts. Hidden states are then orthogonalized against it, with weight factors distributed across layers, using iterative or accumulated orthogonalization for efficiency.
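The description above can be illustrated with a toy sketch. It assumes the refusal direction is the normalized difference between mean activations on harmful and harmless prompts, and that orthogonalization removes that direction from the output of a weight matrix (W' = W - scale * r r^T W). The function names, the `scale` parameter, and the tiny hand-written vectors are all hypothetical; this is not the author's actual pipeline.

```python
# Toy sketch of directional ablation on tiny hand-written vectors.
# All numbers are invented for illustration; a real run would use
# activations captured from the model's residual stream.

def mean_vector(rows):
    """Column-wise mean of a list of equal-length vectors."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]

def normalize(v):
    """Scale a vector to unit length."""
    norm = sum(x * x for x in v) ** 0.5
    return [x / norm for x in v]

def refusal_direction(harmful_acts, harmless_acts):
    """Unit vector along (mean harmful activation - mean harmless activation)."""
    mu_harmful = mean_vector(harmful_acts)
    mu_harmless = mean_vector(harmless_acts)
    return normalize([a - b for a, b in zip(mu_harmful, mu_harmless)])

def orthogonalize(weight, direction, scale=1.0):
    """Return W - scale * r r^T W.

    `weight` has shape (d_model, d_in): each column is a vector the module
    writes into the residual stream. With scale=1.0 every column ends up
    orthogonal to `direction`; a smaller scale removes only part of it.
    """
    d_model, d_in = len(weight), len(weight[0])
    # proj[j] is the component of column j along the refusal direction.
    proj = [sum(direction[i] * weight[i][j] for i in range(d_model))
            for j in range(d_in)]
    return [[weight[i][j] - scale * direction[i] * proj[j] for j in range(d_in)]
            for i in range(d_model)]

harmful = [[1.0, 2.0, 0.0], [1.0, 4.0, 0.0]]
harmless = [[1.0, 0.0, 0.0], [1.0, 0.0, 0.0]]
r = refusal_direction(harmful, harmless)

o_proj = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]  # stand-in for an o_proj weight
ablated = orthogonalize(o_proj, r)
print(r, ablated)
```

The `scale` argument stands in for the per-layer weight factors mentioned above: applying a different factor to each layer's matrices would spread the ablation across layers rather than removing the direction fully everywhere.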
Downloads: 405
Likes: 2
Pipeline: text-generation
Library: transformers
Visibility: Public
Access: Open
Repository Files & Downloads
13 files detected
Direct downloads for all repository files
| File | Type | Quantization | Size | Link |
|---|---|---|---|---|
| Qwen3-4B-abliterated.BF16.gguf | GGUF | BF16 | 7.50 GiB | Download |
| Qwen3-4B-abliterated.F16.gguf | GGUF | F16 | 7.50 GiB | Download |
| Qwen3-4B-abliterated.F32.gguf | GGUF | F32 | 14.99 GiB | Download |
| Qwen3-4B-abliterated.Q2_K.gguf | GGUF | Q2_K | 1.55 GiB | Download |
| Qwen3-4B-abliterated.Q3_K_L.gguf | GGUF | Q3_K_L | 2.09 GiB | Download |
| Qwen3-4B-abliterated.Q3_K_M.gguf | GGUF | Q3_K_M | 1.93 GiB | Download |
| Qwen3-4B-abliterated.Q3_K_S.gguf | GGUF | Q3_K_S | 1.76 GiB | Download |
| Qwen3-4B-abliterated.Q4_K_M.gguf | GGUF | Q4_K_M | 2.33 GiB | Download |
| Qwen3-4B-abliterated.Q4_K_S.gguf | GGUF | Q4_K_S | 2.22 GiB | Download |
| Qwen3-4B-abliterated.Q5_K_M.gguf | GGUF | Q5_K_M | 2.69 GiB | Download |
| Qwen3-4B-abliterated.Q5_K_S.gguf | GGUF | Q5_K_S | 2.63 GiB | Download |
| Qwen3-4B-abliterated.Q6_K.gguf | GGUF | Q6_K | 3.08 GiB | Download |
| Qwen3-4B-abliterated.Q8_0.gguf | GGUF | Q8_0 | 3.99 GiB | Download |
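The sizes in this table run about 7% below the figures in the repository README (14.99 vs 16.1 for the F32 file), which is what a conversion from decimal gigabytes (10^9 bytes) to binary gibibytes (2^30 bytes) would produce. The sketch below checks that arithmetic and derives a rough bits-per-weight figure for a few quantizations. It assumes the F32 file stores 4 bytes per parameter and ignores GGUF metadata overhead, so the parameter count and bits-per-weight values are estimates only; the helper names are hypothetical.

```python
GIB = 2 ** 30  # bytes in one gibibyte

def gb_to_gib(size_gb):
    """Convert decimal gigabytes (10**9 bytes) to gibibytes (2**30 bytes)."""
    return size_gb * 1e9 / GIB

def estimated_params(f32_size_gb):
    """Rough parameter count: assumes 4 bytes per weight, no metadata overhead."""
    return f32_size_gb * 1e9 / 4

def bits_per_weight(size_gb, n_params):
    """Average number of bits stored per parameter for a file of this size."""
    return size_gb * 1e9 * 8 / n_params

# A few sizes as listed in the repository README (decimal GB).
readme_sizes_gb = {"F32": 16.1, "BF16": 8.05, "Q8_0": 4.28, "Q4_K_M": 2.5, "Q2_K": 1.67}

n_params = estimated_params(readme_sizes_gb["F32"])
for name, gb in readme_sizes_gb.items():
    print(f"{name:7s} {gb_to_gib(gb):6.2f} GiB  ~{bits_per_weight(gb, n_params):.1f} bits/weight")
```

Comparing bits per weight rather than raw file size makes the trade-off between quantization levels easier to see, since it is independent of the model's parameter count.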
Model Details
Metadata Inspector
Normalized metadata (stored in metadata_json)
{
  "metadata": {},
  "card_data": {
    "license": "apache-2.0",
    "language": [
      "en"
    ],
    "base_model": [
      "Qwen/Qwen3-4B"
    ],
    "pipeline_tag": "text-generation",
    "library_name": "transformers",
    "tags": [
      "text-generation-inference"
    ],
    "frontmatter": {
      "license": "apache-2.0",
      "language": [
        "en"
      ],
      "base_model": [
        "Qwen/Qwen3-4B"
      ],
      "pipeline_tag": "text-generation",
      "library_name": "transformers",
      "tags": [
        "text-generation-inference"
      ]
    },
    "hero_image_url": "https://www.nethype.de/huggingface_embed/quantpplgraph.png",
    "summary": "> Qwen3-4B-abliterated is an experimental, uncensored version of the Qwen/Qwen3-4B language model that explores how refusals and latent fine-tuning work in large language models using a novel \"abliteration\" technique, which subtracts a computed refusal direction from hidden module states (such as o_proj) to minimize refusals without degrading output quality. The process involves comparing residual streams between harmful and harmless prompts, orthogonalizing hidden states with weight factors distributed across layers, and iterative or accumulated orthogonalization methods for efficiency.",
    "quick_links": [],
    "benchmark_table_html": "",
"readme_markdown": "---\nlicense: apache-2.0\nlanguage:\n- en\nbase_model:\n- Qwen/Qwen3-4B\npipeline_tag: text-generation\nlibrary_name: transformers\ntags:\n- text-generation-inference\n---\n\n# **Qwen3-4B-abliterated-f32-GGUFs**\n\n> Qwen3-4B-abliterated is an experimental, uncensored version of the Qwen/Qwen3-4B language model that explores how refusals and latent fine-tuning work in large language models using a novel \"abliteration\" technique, which subtracts a computed refusal direction from hidden module states (such as o_proj) to minimize refusals without degrading output quality. The process involves comparing residual streams between harmful and harmless prompts, orthogonalizing hidden states with weight factors distributed across layers, and iterative or accumulated orthogonalization methods for efficiency.\n\n## Model Files\n\n| File name | Size | Quant Type |\n|-----------|------|------------|\n| Qwen3-4B-abliterated.F32.gguf | 16.1 GB | F32 |\n| Qwen3-4B-abliterated.BF16.gguf | 8.05 GB | BF16 |\n| Qwen3-4B-abliterated.F16.gguf | 8.05 GB | F16 |\n| Qwen3-4B-abliterated.Q8_0.gguf | 4.28 GB | Q8_0 |\n| Qwen3-4B-abliterated.Q6_K.gguf | 3.31 GB | Q6_K |\n| Qwen3-4B-abliterated.Q5_K_M.gguf | 2.89 GB | Q5_K_M |\n| Qwen3-4B-abliterated.Q5_K_S.gguf | 2.82 GB | Q5_K_S |\n| Qwen3-4B-abliterated.Q4_K_M.gguf | 2.5 GB | Q4_K_M |\n| Qwen3-4B-abliterated.Q4_K_S.gguf | 2.38 GB | Q4_K_S |\n| Qwen3-4B-abliterated.Q3_K_L.gguf | 2.24 GB | Q3_K_L |\n| Qwen3-4B-abliterated.Q3_K_M.gguf | 2.08 GB | Q3_K_M |\n| Qwen3-4B-abliterated.Q3_K_S.gguf | 1.89 GB | Q3_K_S |\n| Qwen3-4B-abliterated.Q2_K.gguf | 1.67 GB | Q2_K |\n\n## Quants Usage \n\n(sorted by size, not necessarily quality. IQ-quants are often preferable over similar sized non-IQ quants)\n\nHere is a handy graph by ikawrakow comparing some lower-quality quant\ntypes (lower is better):\n\n",
"related_quantizations": []
  },
  "tags": [
    "transformers",
    "gguf",
    "qwen3",
    "text-generation-inference",
    "text-generation",
    "en",
    "base_model:Qwen/Qwen3-4B",
    "base_model:quantized:Qwen/Qwen3-4B",
    "license:apache-2.0",
    "endpoints_compatible",
    "region:us",
    "conversational"
  ],
  "likes": 2,
  "downloads": 405,
  "gated": false,
  "private": false,
  "last_modified": "2025-08-03T04:23:32.000Z",
  "created_at": "2025-08-02T07:45:41.000Z",
  "pipeline_tag": "text-generation",
  "library_name": "transformers"
}
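Several entries in the top-level "tags" array above look namespaced, with a key before the first colon (for example "license:apache-2.0" or "base_model:quantized:Qwen/Qwen3-4B"). A minimal sketch of separating those from plain tags follows. Treating the first colon as a key/value separator is an assumption drawn from the tag list itself, and `split_tags` is a hypothetical helper, not part of any library.

```python
def split_tags(tags):
    """Group "key:value" tags by key and collect un-namespaced tags separately."""
    namespaced, plain = {}, []
    for tag in tags:
        # Split on the first colon only, so values may contain further colons.
        key, sep, value = tag.partition(":")
        if sep:
            namespaced.setdefault(key, []).append(value)
        else:
            plain.append(tag)
    return namespaced, plain

tags = [
    "transformers", "gguf", "qwen3", "text-generation-inference",
    "text-generation", "en", "base_model:Qwen/Qwen3-4B",
    "base_model:quantized:Qwen/Qwen3-4B", "license:apache-2.0",
    "endpoints_compatible", "region:us", "conversational",
]
namespaced, plain = split_tags(tags)
print(namespaced)
print(plain)
```

Splitting only on the first colon keeps "quantized:Qwen/Qwen3-4B" intact as a value under "base_model", so the relationship to the base model is preserved for any later processing.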
Source payload excerpt (from Hugging Face API)
{
  "_id": "688dc225728dff723b522031",
  "id": "prithivMLmods/Qwen3-4B-abliterated-f32-GGUFs",
  "modelId": "prithivMLmods/Qwen3-4B-abliterated-f32-GGUFs",
  "sha": "705d645d642b0171e23ea273e250a4444df7a9f7",
  "createdAt": "2025-08-02T07:45:41.000Z",
  "lastModified": "2025-08-03T04:23:32.000Z",
  "author": "prithivMLmods",
  "downloads": 405,
  "likes": 2,
  "gated": false,
  "private": false,
  "pipeline_tag": "text-generation",
  "library_name": "transformers",
  "siblings_count": 16
}
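The "id" and "sha" fields in this payload, combined with a file name from the repository file table, are enough to build a download URL pinned to this exact revision. The sketch below assumes the Hub's usual `/{repo_id}/resolve/{revision}/{filename}` URL layout; that layout is not stated on this page and the resulting URL has not been checked against this repository. `resolve_url` is a hypothetical helper.

```python
from urllib.parse import quote

def resolve_url(repo_id, filename, revision="main"):
    """Build a file URL using the assumed Hub layout:
    https://huggingface.co/{repo_id}/resolve/{revision}/{filename}
    """
    return (
        "https://huggingface.co/"
        f"{repo_id}/resolve/{quote(revision, safe='')}/{quote(filename)}"
    )

payload = {
    "id": "prithivMLmods/Qwen3-4B-abliterated-f32-GGUFs",
    "sha": "705d645d642b0171e23ea273e250a4444df7a9f7",
}

# Pin to the commit from the payload so later pushes cannot change the file.
url = resolve_url(payload["id"], "Qwen3-4B-abliterated.Q4_K_M.gguf", payload["sha"])
print(url)
```

Pinning to the commit hash instead of "main" keeps a download reproducible even if the repository is updated after the "lastModified" timestamp shown here.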