GraySoft

GraySoft indexes khazarai/qwen3-4b-kimi2.5-reasoning-distilled-gguf (Q6_K GGUF) with repository links, GGUF quant files, and Hugging Face metadata. Use this page to pick a local model for guIDE or other runtimes. Related models in the same shard are listed below.

Model Intelligence Sheet

khazarai/qwen3-4b-kimi2.5-reasoning-distilled-gguf overview


Tags: gguf, qwen3, llama.cpp, unsloth, reasoning, distillation, sft, text-generation, en, dataset:khazarai/kimi-2.5-high-reasoning-250x, base_model:khazarai/Qwen3-4B-Kimi2.5-Reasoning-Distilled, base_model:quantized:khazarai/Qwen3-4B-Kimi2.5-Reasoning-Distilled, license:apache-2.0, endpoints_compatible, region:us, conversational
Downloads
7,912
Likes
5
Pipeline
text-generation
Library
(not specified)
Visibility
Public
Access
Open

Repository Files & Downloads

4 files detected
Direct downloads for all repository files
File | Type | Quantization | Size
qwen3-4b-thinking-2507.BF16.gguf | GGUF | BF16 | 7.50 GB
qwen3-4b-thinking-2507.Q4_K_M.gguf | GGUF | Q4_K_M | 2.33 GB
qwen3-4b-thinking-2507.Q6_K.gguf | GGUF | Q6_K | 3.08 GB
qwen3-4b-thinking-2507.Q8_0.gguf | GGUF | Q8_0 | 3.99 GB
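
As a rough aid for choosing among the files listed above, the following Python sketch picks the largest quant whose file size fits a memory budget. The sizes are copied from the listing; the `pick_quant` helper and its `overhead_gb` allowance are hypothetical, and real memory use depends on context length and the runtime.

```python
# File sizes as listed on this page, in GB.
QUANT_SIZES_GB = {
    "Q4_K_M": 2.33,
    "Q6_K": 3.08,
    "Q8_0": 3.99,
    "BF16": 7.50,
}


def pick_quant(budget_gb, overhead_gb=1.0):
    """Return the largest listed quant whose file fits the budget, or None.

    overhead_gb is an assumed allowance for the KV cache and runtime
    buffers; it is a placeholder, not a measured value.
    """
    usable = budget_gb - overhead_gb
    fitting = {name: size for name, size in QUANT_SIZES_GB.items() if size <= usable}
    if not fitting:
        return None
    # Larger files correspond to less aggressive quantization.
    return max(fitting, key=fitting.get)


if __name__ == "__main__":
    for budget in (2, 4, 6, 12):
        print(f"{budget} GB budget -> {pick_quant(budget)}")
```

With the default 1 GB allowance, a 4 GB budget selects Q4_K_M and a 6 GB budget selects Q8_0; a budget too small for any file returns `None`.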

Model Details Live

Model Slug
khazarai/qwen3-4b-kimi2.5-reasoning-distilled-gguf
Author
khazarai
Pipeline Task
text-generation
Library
(not specified)
Created
2026-03-16
Last Modified
2026-04-15
Gated
No
Private
No
HF SHA
4c65129ebe481d33203d7b7fd65b3c2a930ffad3
License
apache-2.0
Language
en
Base Model
khazarai/Qwen3-4B-Kimi2.5-Reasoning-Distilled
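
The HF SHA above can be used to pin a download to the exact commit this page describes. The sketch below builds such a URL, assuming the usual Hugging Face `/resolve/<revision>/<filename>` path layout and the repository id as it appears in the Hugging Face API payload on this page; `resolve_url` is a hypothetical helper, not part of any library.

```python
from urllib.parse import quote

HF_BASE = "https://huggingface.co"


def resolve_url(repo_id, filename, revision="main"):
    """Build a file URL for a Hub repository at a given revision.

    The /resolve/ layout is assumed here; revision may be a branch
    name or a full commit SHA.
    """
    return f"{HF_BASE}/{repo_id}/resolve/{quote(revision, safe='')}/{quote(filename)}"


if __name__ == "__main__":
    url = resolve_url(
        "khazarai/Qwen3-4B-Kimi2.5-Reasoning-Distilled-GGUF",
        "qwen3-4b-thinking-2507.Q6_K.gguf",
        revision="4c65129ebe481d33203d7b7fd65b3c2a930ffad3",
    )
    print(url)
```

Pinning to the SHA rather than `main` keeps the download reproducible if the repository is modified after the "Last Modified" date shown here.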

Metadata Inspector

Normalized metadata (stored in metadata_json)
{
  "metadata": {},
  "card_data": {
    "tags": [
      "gguf",
      "llama.cpp",
      "unsloth",
      "reasoning",
      "distillation",
      "sft"
    ],
    "license": "apache-2.0",
    "datasets": [
      "khazarai/kimi-2.5-high-reasoning-250x"
    ],
    "language": [
      "en"
    ],
    "base_model": [
      "khazarai/Qwen3-4B-Kimi2.5-Reasoning-Distilled"
    ],
    "pipeline_tag": "text-generation",
    "frontmatter": {
      "tags": [
        "gguf",
        "llama.cpp",
        "unsloth",
        "reasoning",
        "distillation",
        "sft"
      ],
      "license": "apache-2.0",
      "datasets": [
        "khazarai/kimi-2.5-high-reasoning-250x"
      ],
      "language": [
        "en"
      ],
      "base_model": [
        "khazarai/Qwen3-4B-Kimi2.5-Reasoning-Distilled"
      ],
      "pipeline_tag": "text-generation"
    },
    "hero_image_url": "benchmark/Kimi_distilled.png",
    "summary": "",
    "quick_links": [],
    "benchmark_table_html": "",
    "readme_markdown": "---\ntags:\n- gguf\n- llama.cpp\n- unsloth\n- reasoning\n- distillation\n- sft\nlicense: apache-2.0\ndatasets:\n- khazarai/kimi-2.5-high-reasoning-250x\nlanguage:\n- en\nbase_model:\n- khazarai/Qwen3-4B-Kimi2.5-Reasoning-Distilled\npipeline_tag: text-generation\n---\n\n# Qwen3-4B-Kimi2.5-Reasoning-Distilled : GGUF\n\n## Model: khazarai/Qwen3-4B-Kimi2.5-Reasoning-Distilled\n\n![alt=\"General Benchmark Comparison Chart\"](benchmark/Kimi_distilled.png)\n\n- **Success Rate**: 76.09%\n\n## Model: Qwen/Qwen3-4B-Thinking-2507\n\n![alt=\"General Benchmark Comparison Chart\"](benchmark/BaseModel.png)\n\n- **Success Rate**: 73.73%\n\n- **Benchmark**: khazarai/Multi-Domain-Reasoning-Benchmark\n- **Total Questions**: 100\n\n`Qwen3-4B-Kimi2.5-Reasoning-Distilled` is a fine-tuned language model optimized for structured, long-form reasoning. It is derived from the Qwen3-4b-Thinking-2507 base model and fine-tuned using a specialized distillation dataset generated by Kimi-2.5-thinking.\n\nThis model is designed to bridge the gap between small, efficient models (0.6B–4B range) and the complex reasoning capabilities typically found in much larger models. It excels at breaking down problems, self-correcting, and providing detailed analytical answers.\n\n**Base Model**:\tQwen3-4b-Thinking-2507\n\n**Training Technique**:\tUnsloth + QLoRa\n \n\n## Available Model files:\n- `qwen3-4b-thinking-2507.BF16.gguf`\n- `qwen3-4b-thinking-2507.Q8_0.gguf`\n- `qwen3-4b-thinking-2507.Q6_K.gguf`\n- `qwen3-4b-thinking-2507.Q4_K_M.gguf`\n\n## Ollama\nAn Ollama Modelfile is included for easy deployment.\n \n\n## Provided Quants\n\n(sorted by size, not necessarily quality. 
IQ-quants are often preferable over similar sized non-IQ quants)\n\n| Type | Size/GB | Notes |\n|:-----|--------:|:------|\n| Q4_K_M | 2.5 | fast, recommended |\n| Q6_K | 3.3 | very good quality |\n| Q8_0 | 4.2 | fast, best quality |\n| f16 | 8.0 | 16 bpw, overkill |\n\nHere is a handy graph by ikawrakow comparing some lower-quality quant\ntypes (lower is better):\n\n![image.png](https://www.nethype.de/huggingface_embed/quantpplgraph.png)\n\nAnd here are Artefact2's thoughts on the matter:\nhttps://gist.github.com/Artefact2/b5f810600771265fc1e39442288e8ec9\n \n\n## Dataset\n\nThe model was fine-tuned on the [khazarai/kimi-2.5-high-reasoning-250x](https://huggingface.co/datasets/khazarai/kimi-2.5-high-reasoning-250x)  \n\nDataset Composition:\n - Total Samples: 250\n - Total Tokens: 1,114,407\n - Teacher Model: Kimi-2.5-Thinking\n\n## Acknowledgements\n\n**Unsloth** for the incredibly fast and memory-efficient training framework.",
    "related_quantizations": []
  },
  "tags": [
    "gguf",
    "qwen3",
    "llama.cpp",
    "unsloth",
    "reasoning",
    "distillation",
    "sft",
    "text-generation",
    "en",
    "dataset:khazarai/kimi-2.5-high-reasoning-250x",
    "base_model:khazarai/Qwen3-4B-Kimi2.5-Reasoning-Distilled",
    "base_model:quantized:khazarai/Qwen3-4B-Kimi2.5-Reasoning-Distilled",
    "license:apache-2.0",
    "endpoints_compatible",
    "region:us",
    "conversational"
  ],
  "likes": 5,
  "downloads": 7912,
  "gated": false,
  "private": false,
  "last_modified": "2026-04-15T11:26:09.000Z",
  "created_at": "2026-03-16T16:28:00.000Z",
  "pipeline_tag": "text-generation",
  "library_name": ""
}
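
The top-level `tags` array in the metadata above mixes plain labels (`gguf`, `reasoning`) with prefixed entries such as `license:apache-2.0`. A minimal Python sketch for separating the two follows; reading the text before the first colon as a key is an assumption inferred from the tag shapes, and `split_tags` is a hypothetical helper.

```python
def split_tags(tags):
    """Split tags into plain labels and a dict of prefixed key -> values.

    Only the first colon is treated as the separator, so
    "base_model:quantized:org/name" keeps "quantized:org/name" as its value.
    """
    plain, namespaced = [], {}
    for tag in tags:
        if ":" in tag:
            key, value = tag.split(":", 1)
            namespaced.setdefault(key, []).append(value)
        else:
            plain.append(tag)
    return plain, namespaced


if __name__ == "__main__":
    # A subset of the tags listed in the metadata above.
    tags = [
        "gguf",
        "qwen3",
        "reasoning",
        "dataset:khazarai/kimi-2.5-high-reasoning-250x",
        "base_model:khazarai/Qwen3-4B-Kimi2.5-Reasoning-Distilled",
        "base_model:quantized:khazarai/Qwen3-4B-Kimi2.5-Reasoning-Distilled",
        "license:apache-2.0",
        "region:us",
    ]
    plain, namespaced = split_tags(tags)
    print(plain)
    print(namespaced["license"])
    print(namespaced["base_model"])
```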
Source payload excerpt (from Hugging Face API)
{
  "_id": "69b82f90c8973a232d994b63",
  "id": "khazarai/Qwen3-4B-Kimi2.5-Reasoning-Distilled-GGUF",
  "modelId": "khazarai/Qwen3-4B-Kimi2.5-Reasoning-Distilled-GGUF",
  "sha": "4c65129ebe481d33203d7b7fd65b3c2a930ffad3",
  "createdAt": "2026-03-16T16:28:00.000Z",
  "lastModified": "2026-04-15T11:26:09.000Z",
  "author": "khazarai",
  "downloads": 7912,
  "likes": 5,
  "gated": false,
  "private": false,
  "pipeline_tag": "text-generation",
  "library_name": "",
  "siblings_count": 10
}