khazarai/qwen3-4b-kimi2.5-reasoning-distilled-gguf (Q6_K GGUF, free download) is indexed on GraySoft with repository links, GGUF quant files, and Hugging Face metadata. Use this page to choose a local model for guIDE or other runtimes. Related models in the same shard are listed below.
Model Intelligence Sheet
khazarai/qwen3-4b-kimi2.5-reasoning-distilled-gguf overview
Downloads: 7,912
Likes: 5
Pipeline: text-generation
Library: (none listed)
Visibility: Public
Access: Open
Repository Files & Downloads
4 files detected
Direct downloads for all repository files
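The four GGUF files named in the repository README can be turned into direct download links. The following is a minimal sketch, assuming the files sit at the repository root on the `main` branch and that Hugging Face's `resolve` URL pattern applies to them. `gguf_url` is a hypothetical helper written for this illustration, not part of any library.

```python
from urllib.parse import quote

# Repository id as reported in the Hugging Face payload on this page.
REPO_ID = "khazarai/Qwen3-4B-Kimi2.5-Reasoning-Distilled-GGUF"

# File names as listed in the repository README.
GGUF_FILES = [
    "qwen3-4b-thinking-2507.BF16.gguf",
    "qwen3-4b-thinking-2507.Q8_0.gguf",
    "qwen3-4b-thinking-2507.Q6_K.gguf",
    "qwen3-4b-thinking-2507.Q4_K_M.gguf",
]


def gguf_url(filename, repo_id=REPO_ID, revision="main"):
    """Build a direct-download URL (assumes the file is at the repo root)."""
    return f"https://huggingface.co/{repo_id}/resolve/{revision}/{quote(filename)}"


if __name__ == "__main__":
    for name in GGUF_FILES:
        print(gguf_url(name))
```

The printed links can be passed to any downloader. The revision is exposed as a parameter because the branch name is an assumption here.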
Model Details Live
Metadata Inspector
Normalized metadata (stored in metadata_json)
{
"metadata": {},
"card_data": {
"tags": [
"gguf",
"llama.cpp",
"unsloth",
"reasoning",
"distillation",
"sft"
],
"license": "apache-2.0",
"datasets": [
"khazarai/kimi-2.5-high-reasoning-250x"
],
"language": [
"en"
],
"base_model": [
"khazarai/Qwen3-4B-Kimi2.5-Reasoning-Distilled"
],
"pipeline_tag": "text-generation",
"frontmatter": {
"tags": [
"gguf",
"llama.cpp",
"unsloth",
"reasoning",
"distillation",
"sft"
],
"license": "apache-2.0",
"datasets": [
"khazarai/kimi-2.5-high-reasoning-250x"
],
"language": [
"en"
],
"base_model": [
"khazarai/Qwen3-4B-Kimi2.5-Reasoning-Distilled"
],
"pipeline_tag": "text-generation"
},
"hero_image_url": "benchmark/Kimi_distilled.png",
"summary": "",
"quick_links": [],
"benchmark_table_html": "",
"readme_markdown": "---\ntags:\n- gguf\n- llama.cpp\n- unsloth\n- reasoning\n- distillation\n- sft\nlicense: apache-2.0\ndatasets:\n- khazarai/kimi-2.5-high-reasoning-250x\nlanguage:\n- en\nbase_model:\n- khazarai/Qwen3-4B-Kimi2.5-Reasoning-Distilled\npipeline_tag: text-generation\n---\n\n# Qwen3-4B-Kimi2.5-Reasoning-Distilled : GGUF\n\n## Model: khazarai/Qwen3-4B-Kimi2.5-Reasoning-Distilled\n\n\n\n- **Success Rate**: 76.09%\n\n## Model: Qwen/Qwen3-4B-Thinking-2507\n\n\n\n- **Success Rate**: 73.73%\n\n- **Benchmark**: khazarai/Multi-Domain-Reasoning-Benchmark\n- **Total Questions**: 100\n\n`Qwen3-4B-Kimi2.5-Reasoning-Distilled` is a fine-tuned language model optimized for structured, long-form reasoning. It is derived from the Qwen3-4b-Thinking-2507 base model and fine-tuned using a specialized distillation dataset generated by Kimi-2.5-thinking.\n\nThis model is designed to bridge the gap between small, efficient models (0.6B–4B range) and the complex reasoning capabilities typically found in much larger models. It excels at breaking down problems, self-correcting, and providing detailed analytical answers.\n\n**Base Model**:\tQwen3-4b-Thinking-2507\n\n**Training Technique**:\tUnsloth + QLoRa\n \n\n## Available Model files:\n- `qwen3-4b-thinking-2507.BF16.gguf`\n- `qwen3-4b-thinking-2507.Q8_0.gguf`\n- `qwen3-4b-thinking-2507.Q6_K.gguf`\n- `qwen3-4b-thinking-2507.Q4_K_M.gguf`\n\n## Ollama\nAn Ollama Modelfile is included for easy deployment.\n \n\n## Provided Quants\n\n(sorted by size, not necessarily quality. IQ-quants are often preferable over similar sized non-IQ quants)\n\n| Type | Size/GB | Notes |\n|:-----|--------:|:------|\n| Q4_K_M | 2.5 | fast, recommended |\n| Q6_K | 3.3 | very good quality |\n| Q8_0 | 4.2 | fast, best quality |\n| f16 | 8.0 | 16 bpw, overkill |\n\nHere is a handy graph by ikawrakow comparing some lower-quality quant\ntypes (lower is better):\n\n\n\nAnd here are Artefact2's thoughts on the matter:\nhttps://gist.github.com/Artefact2/b5f810600771265fc1e39442288e8ec9\n \n\n## Dataset\n\nThe model was fine-tuned on the [khazarai/kimi-2.5-high-reasoning-250x](https://huggingface.co/datasets/khazarai/kimi-2.5-high-reasoning-250x) \n\nDataset Composition:\n - Total Samples: 250\n - Total Tokens: 1,114,407\n - Teacher Model: Kimi-2.5-Thinking\n\n## Acknowledgements\n\n**Unsloth** for the incredibly fast and memory-efficient training framework.",
"related_quantizations": []
},
"tags": [
"gguf",
"qwen3",
"llama.cpp",
"unsloth",
"reasoning",
"distillation",
"sft",
"text-generation",
"en",
"dataset:khazarai/kimi-2.5-high-reasoning-250x",
"base_model:khazarai/Qwen3-4B-Kimi2.5-Reasoning-Distilled",
"base_model:quantized:khazarai/Qwen3-4B-Kimi2.5-Reasoning-Distilled",
"license:apache-2.0",
"endpoints_compatible",
"region:us",
"conversational"
],
"likes": 5,
"downloads": 7912,
"gated": false,
"private": false,
"last_modified": "2026-04-15T11:26:09.000Z",
"created_at": "2026-03-16T16:28:00.000Z",
"pipeline_tag": "text-generation",
"library_name": ""
}
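The "Provided Quants" table in the README above lists file sizes per quant type, which is enough to shortlist a file for a given memory budget. The sketch below is one way to do that, under two assumptions that do not come from the source: that file size is a usable proxy for memory needed to load the model, and that about 1 GB of headroom covers context and runtime overhead. `pick_quant` and `headroom_gb` are hypothetical names. The README itself notes that size order is not necessarily quality order, so treat the result as a starting point.

```python
# Sizes in GB, copied from the "Provided Quants" table in the README.
QUANT_SIZES_GB = {
    "Q4_K_M": 2.5,
    "Q6_K": 3.3,
    "Q8_0": 4.2,
    "f16": 8.0,
}


def pick_quant(budget_gb, headroom_gb=1.0):
    """Return the largest listed quant that fits in the budget, or None.

    headroom_gb is an assumed allowance for context and runtime overhead,
    not a measured figure.
    """
    usable = budget_gb - headroom_gb
    fitting = [(size, name) for name, size in QUANT_SIZES_GB.items() if size <= usable]
    if not fitting:
        return None
    # max() compares by size first, so the largest fitting file wins.
    return max(fitting)[1]


if __name__ == "__main__":
    for budget in (3.0, 4.5, 6.0, 16.0):
        print(budget, "GB ->", pick_quant(budget))
```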
Source payload excerpt (from Hugging Face API)
{
"_id": "69b82f90c8973a232d994b63",
"id": "khazarai/Qwen3-4B-Kimi2.5-Reasoning-Distilled-GGUF",
"modelId": "khazarai/Qwen3-4B-Kimi2.5-Reasoning-Distilled-GGUF",
"sha": "4c65129ebe481d33203d7b7fd65b3c2a930ffad3",
"createdAt": "2026-03-16T16:28:00.000Z",
"lastModified": "2026-04-15T11:26:09.000Z",
"author": "khazarai",
"downloads": 7912,
"likes": 5,
"gated": false,
"private": false,
"pipeline_tag": "text-generation",
"library_name": "",
"siblings_count": 10
}
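The summary fields near the top of this page (Visibility, Access) appear to follow from the `private` and `gated` flags in the payload. The sketch below shows one plausible mapping and also computes the time between `createdAt` and `lastModified`. The mapping and the `summarize` helper are assumptions made for illustration, not the indexer's actual logic. The payload is a subset of the excerpt above.

```python
import json
from datetime import datetime

# Subset of the payload excerpt shown above.
PAYLOAD = json.loads("""
{
  "id": "khazarai/Qwen3-4B-Kimi2.5-Reasoning-Distilled-GGUF",
  "createdAt": "2026-03-16T16:28:00.000Z",
  "lastModified": "2026-04-15T11:26:09.000Z",
  "downloads": 7912,
  "likes": 5,
  "gated": false,
  "private": false
}
""")


def parse_ts(value):
    """Parse an ISO 8601 timestamp with a trailing 'Z' as UTC."""
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


def summarize(payload):
    """Derive display fields from the raw payload (assumed mapping)."""
    span = parse_ts(payload["lastModified"]) - parse_ts(payload["createdAt"])
    return {
        "visibility": "Private" if payload["private"] else "Public",
        "access": "Gated" if payload["gated"] else "Open",
        "whole_days_between_create_and_update": span.days,
    }


if __name__ == "__main__":
    print(summarize(PAYLOAD))
```

Replacing the `Z` suffix before parsing keeps the sketch working on Python versions older than 3.11, where `datetime.fromisoformat` does not accept it.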