jackrong/qwen3-4b-thinking-2507-glm-4.7-distilled-gguf (Q3_K_L and other GGUF quantizations) is indexed on GraySoft with repository links, GGUF quant files, and Hugging Face metadata. Use this page to pick a local model for guIDE or other runtimes. Related models in the same shard are listed below.

Model Intelligence Sheet

jackrong/qwen3-4b-thinking-2507-glm-4.7-distilled-gguf overview

Qwen3-4B-Thinking-2507-GLM-4.7-Distilled is a fine-tuned model built upon the GRPO-optimized Jackrong/DASD-4B-Thinking-2507-GRPO-v2 (originally based on Qwen/Qwen3-4B-Thinking-2507). This model was developed using a Supervised Fine-Tuning (SFT) strategy heavily distilled from the GLM-4.7 model series (at a default temperature of 1.0), with a central focus on multi-turn conversational alignment and structured Chain-of-Thought (CoT) execution.

🎯 Core Improvement: The primary objective of this fine-tuning was to transform the model's reasoning pattern for everyday and lightweight tasks. Instead of the typical linear, free-associative, and highly self-correcting ("think-as-you-go") stream of consciousness, this model has learned to adopt a highly confident, "Plan-then-Execute" paradigm. It systematically breaks down tasks into logical outlines and executes modular, report-like responses without unnecessary self-doubt or hesitation.

Tags: gguf, qwen3, unsloth, text-generation, reasoning, math, grpo, sft, distillation, conversational, glm-4.7, en, zh, base_model:Jackrong/DASD-4B-Thinking-2507-GRPO-v2, base_model:quantized:Jackrong/DASD-4B-Thinking-2507-GRPO-v2, license:apache-2.0, endpoints_compatible, region:us
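The tag list mixes plain labels with `key:value` entries such as `license:apache-2.0`. The following is a minimal sketch of separating the two; the function name `split_tags` is hypothetical, and splitting on the first colon is an assumption about how these tags are meant to be read, not a documented rule.

```python
# Tags copied from this page's listing.
TAGS = [
    "gguf", "qwen3", "unsloth", "text-generation", "reasoning", "math",
    "grpo", "sft", "distillation", "conversational", "glm-4.7", "en", "zh",
    "base_model:Jackrong/DASD-4B-Thinking-2507-GRPO-v2",
    "base_model:quantized:Jackrong/DASD-4B-Thinking-2507-GRPO-v2",
    "license:apache-2.0", "endpoints_compatible", "region:us",
]


def split_tags(tags):
    """Split tags into plain labels and a dict of key -> list of values.

    Assumption: everything before the first ':' is the key.
    """
    plain, namespaced = [], {}
    for tag in tags:
        key, sep, value = tag.partition(":")
        if sep:
            namespaced.setdefault(key, []).append(value)
        else:
            plain.append(tag)
    return plain, namespaced


plain, namespaced = split_tags(TAGS)
print(namespaced["license"])     # license identifier(s)
print(namespaced["base_model"])  # base model entries, including the "quantized:" variant
```

Keeping the values as lists avoids silently dropping the second `base_model` entry.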

Downloads

658

Likes

2

Pipeline

text-generation

Library

—

Visibility

Public

Access

Open

Repository Files & Downloads

12 files detected

Direct downloads for all repository files

File	Type	Quantization	Size	Link
Qwen3-4B-Thinking-2507-GLM-4.7-Distilled-BF16.gguf	GGUF	BF16	7.50 GB	Download
Qwen3-4B-Thinking-2507-GLM-4.7-Distilled-IQ4_XS.gguf	GGUF	IQ4_XS	2.13 GB	Download
Qwen3-4B-Thinking-2507-GLM-4.7-Distilled-Q2_K.gguf	GGUF	Q2_K	1.55 GB	Download
Qwen3-4B-Thinking-2507-GLM-4.7-Distilled-Q3_K_L.gguf	GGUF	Q3_K_L	2.09 GB	Download
Qwen3-4B-Thinking-2507-GLM-4.7-Distilled-Q3_K_M.gguf	GGUF	Q3_K_M	1.93 GB	Download
Qwen3-4B-Thinking-2507-GLM-4.7-Distilled-Q3_K_S.gguf	GGUF	Q3_K_S	1.76 GB	Download
Qwen3-4B-Thinking-2507-GLM-4.7-Distilled-Q4_K_S.gguf	GGUF	Q4_K_S	2.22 GB	Download
Qwen3-4B-Thinking-2507-GLM-4.7-Distilled-Q5_K_M.gguf	GGUF	Q5_K_M	2.69 GB	Download
Qwen3-4B-Thinking-2507-GLM-4.7-Distilled-Q5_K_S.gguf	GGUF	Q5_K_S	2.63 GB	Download
Qwen3-4B-Thinking-2507-GLM-4.7-Distilled-Q6_K.gguf	GGUF	Q6_K	3.08 GB	Download
Qwen3-4B-Thinking-2507-GLM-4.7-Distilled.Q4_K_M.gguf	GGUF	Q4_K_M	2.33 GB	Download
Qwen3-4B-Thinking-2507-GLM-4.7-Distilled.Q8_0.gguf	GGUF	Q8_0	3.99 GB	Download
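As a rough aid for choosing among the files above, here is a minimal sketch that returns the largest listed quantization fitting a memory budget. The function `pick_quant` and the 1 GB `headroom_gb` default are assumptions for illustration: actual memory use also depends on the runtime, context length, and KV cache, and a larger file is only a heuristic for higher fidelity.

```python
# File sizes in GB, copied from the table above.
# The Q8_0 label is taken from the file name.
QUANT_SIZES_GB = {
    "BF16": 7.50,
    "IQ4_XS": 2.13,
    "Q2_K": 1.55,
    "Q3_K_L": 2.09,
    "Q3_K_M": 1.93,
    "Q3_K_S": 1.76,
    "Q4_K_S": 2.22,
    "Q5_K_M": 2.69,
    "Q5_K_S": 2.63,
    "Q6_K": 3.08,
    "Q4_K_M": 2.33,
    "Q8_0": 3.99,
}


def pick_quant(budget_gb, headroom_gb=1.0):
    """Return the largest listed quant whose file fits in the budget.

    headroom_gb is a hypothetical allowance for context and runtime
    overhead. Returns None when nothing fits.
    """
    limit = budget_gb - headroom_gb
    fitting = {name: size for name, size in QUANT_SIZES_GB.items() if size <= limit}
    if not fitting:
        return None
    return max(fitting, key=fitting.get)


print(pick_quant(4.0))  # largest file at or under 3.0 GB
print(pick_quant(2.0))  # nothing fits under 1.0 GB
```

Treat the result as a starting point and confirm it against the memory actually available on the target machine.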

Model Details

Model Slug

jackrong/qwen3-4b-thinking-2507-glm-4.7-distilled-gguf

Author

Jackrong

Pipeline Task

text-generation

Library

—

Created

2026-02-23

Last Modified

2026-02-24

Gated

No

Private

No

HF SHA

cf4722c669a1369d1b48e78a6a6c73ba031b057f

License

apache-2.0

Language

en, zh

Base Model

Jackrong/DASD-4B-Thinking-2507-GRPO-v2

Metadata Inspector

Normalized metadata (stored in metadata_json)

{
  "metadata": {},
  "card_data": {
    "language": [
      "en",
      "zh"
    ],
    "license": "apache-2.0",
    "base_model": "Jackrong/DASD-4B-Thinking-2507-GRPO-v2",
    "tags": [
      "qwen3",
      "unsloth",
      "text-generation",
      "reasoning",
      "math",
      "grpo",
      "sft",
      "distillation",
      "conversational",
      "glm-4.7"
    ],
    "pipeline_tag": "text-generation",
    "frontmatter": {
      "language": [
        "en",
        "zh"
      ],
      "license": "apache-2.0",
      "base_model": "Jackrong/DASD-4B-Thinking-2507-GRPO-v2",
      "tags": [
        "qwen3",
        "unsloth",
        "text-generation",
        "reasoning",
        "math",
        "grpo",
        "sft",
        "distillation",
        "conversational",
        "glm-4.7"
      ],
      "pipeline_tag": "text-generation"
    },
    "hero_image_url": "",
    "summary": "**Qwen3-4B-Thinking-2507-GLM-4.7-Distilled** is a fine-tuned model built upon the GRPO-optimized Jackrong/DASD-4B-Thinking-2507-GRPO-v2 (originally based on Qwen/Qwen3-4B-Thinking-2507). This model was developed using a Supervised Fine-Tuning (SFT) strategy heavily distilled from the GLM-4.7 model series (at a default temperature of 1.0), with a central focus on multi-turn conversational alignment and **structured Chain-of-Thought (CoT) execution**. 🎯 **Core Improvement:** The primary objective of this fine-tuning was to transform the model's reasoning pattern for everyday and lightweight tasks. Instead of the typical linear, free-associative, and highly self-correcting (\"think-as-you-go\") stream of consciousness, this model has learned to adopt a highly confident, **\"Plan-then-Execute\"** paradigm. It systematically breaks down tasks into logical outlines and executes modular, report-like responses without unnecessary self-doubt or hesitation. ---",
    "quick_links": [],
    "benchmark_table_html": "",
    "readme_markdown": "---\nlanguage:\n- en\n- zh\nlicense: apache-2.0\nbase_model: Jackrong/DASD-4B-Thinking-2507-GRPO-v2\ntags:\n- qwen3\n- unsloth\n- text-generation\n- reasoning\n- math\n- grpo\n- sft\n- distillation\n- conversational\n- glm-4.7\npipeline_tag: text-generation\n---\n\n# Qwen3-4B-Thinking-2507-GLM-4.7-Distilled\n\n**Qwen3-4B-Thinking-2507-GLM-4.7-Distilled** is a fine-tuned model built upon the GRPO-optimized `Jackrong/DASD-4B-Thinking-2507-GRPO-v2` (originally based on [`Qwen/Qwen3-4B-Thinking-2507`](https://huggingface.co/Qwen/Qwen3-4B-Thinking-2507)). This model was developed using a Supervised Fine-Tuning (SFT) strategy heavily distilled from the GLM-4.7 model series (at a default temperature of 1.0), with a central focus on multi-turn conversational alignment and **structured Chain-of-Thought (CoT) execution**.\n\n🎯 **Core Improvement:** \nThe primary objective of this fine-tuning was to transform the model's reasoning pattern for everyday and lightweight tasks. Instead of the typical linear, free-associative, and highly self-correcting (\"think-as-you-go\") stream of consciousness, this model has learned to adopt a highly confident, **\"Plan-then-Execute\"** paradigm. It systematically breaks down tasks into logical outlines and executes modular, report-like responses without unnecessary self-doubt or hesitation.\n\n---\n\n## 🧬 Training Pipeline Overview\n\nThis model is the culmination of two sequential training stages targeting mathematical reasoning and conversational CoT tracking:\n\n```text\nQwen/Qwen3-4B-Thinking-2507\n         │\n         ▼  Stage 0: GRPO (RL on Math & Reasoning)\nDASD-4B-Thinking-2507-GRPO-v2\n         │\n         ▼  Stage 1: SFT with GLM-4.7 Series Distilled Datasets (T=1.0)\nQwen3-4B-Thinking-2507-GLM-4.7-Distilled  ← (this model)\n```\n\n### 🧠 Chain of Thought (CoT) Evolution: Base vs. Distilled\n\nA significant shift in the model's reasoning style is observed after distillation from the GLM-4.7 series data. 
The model transitions from a **spontaneous thinker** into a **structured planner**:\n\n| 🎯 Feature | 🌀 Base Model (Qwen3-4B-Thinking) | ✨ Distilled Model (GLM-4.7-Distilled) |\n| :--- | :--- | :--- |\n| **Thinking Style** | 🌊 Linear, stream-of-consciousness | 🧱 Modularized, report-like |\n| **Execution** | 🏃 Thinks on the fly, writes as it thinks | 📝 \"Plan-then-Execute\" framework |\n| **Structure** | 🔀 Unstructured, organic self-correction mid-thought | 📑 Highly structured with headings & logical phases |\n| **Confidence** | 🤔 High self-doubt (\"Wait...\", \"Maybe...\", \"Should I...\") | 🚀 Highly confident, rarely hesitates |\n| **Output Tone** | 🗣️ Conversational, exploring multiple paths | 📊 Objective, direct, and systematic |\n\n**🌟 Key Takeaway:**\nThrough the GLM-4.7 dataset distillation, the model successfully learned the **modular thinking paradigm**. Instead of continuously questioning itself, it now *breaks down tasks, creates a clear outline, and systematically executes each step* like writing a formal report.\n\n---\n\n## 📚 Stage Details\n\n### Stage 0 — GRPO Reinforcement Learning: `DASD-4B-Thinking-2507-GRPO-v2`\n\nStarting from the base model `Qwen/Qwen3-4B-Thinking-2507`, Group Relative Policy Optimization (GRPO) was applied. 
This stage consisted of:\n- **Cold Start:** Fine-tuning on the [`unsloth/OpenMathReasoning-mini`](https://huggingface.co/datasets/unsloth/OpenMathReasoning-mini) dataset.\n- **Reinforcement Learning:** Applying GRPO via the [`open-r1/DAPO-Math-17k-Processed`](https://huggingface.co/datasets/open-r1/DAPO-Math-17k-Processed) dataset.\n\nThis stage significantly improved the model's:\n\n- Correctness on math problem solving\n- Step-by-step logical reasoning\n- Reward signal alignment for verifiable tasks\n\n---\n\n### Stage 1 — SFT GLM-4.7 Distillation (T=1.0): `Qwen3-4B-Thinking-2507-GLM-4.7-Distilled` (this model)\n\nBuilding on the reasoning foundation of `DASD-4B-Thinking-2507-GRPO-v2`, Stage 1 SFT was performed using a mixed dataset heavily utilizing **GLM-4.7** synthetic data generated at a **default temperature of 1.0**, along with multi-turn alignments.\n\nHigher-temperature data introduces greater **lexical diversity, broader mode coverage, and more formatted/structured chain-of-thought traces**, enabling the model to generalize better across diverse conversational reasoning patterns and problem domains. 
It helps the model handle multi-turn conversations effectively while protecting its internal structure of `<think>...</think>` tracking.\n\n---\n\n## 🗂️ All Datasets Used\n\n| Stage | Dataset | Purpose |\n|-------|---------|---------|\n| GRPO (Cold Start) | [`unsloth/OpenMathReasoning-mini`](https://huggingface.co/datasets/unsloth/OpenMathReasoning-mini) | Initial foundational mathematical reasoning |\n| GRPO (RL) | [`open-r1/DAPO-Math-17k-Processed`](https://huggingface.co/datasets/open-r1/DAPO-Math-17k-Processed) | Math & reasoning RL training via GRPO |\n| SFT Distillation | [`Alibaba-Apsara/Superior-Reasoning-SFT-gpt-oss-120b`](https://huggingface.co/datasets/Alibaba-Apsara/Superior-Reasoning-SFT-gpt-oss-120b) (Stage 2) | Diverse reasoning structures |\n| SFT Distillation | [`Jackrong/glm-4.7-multiturn-CoT`](https://huggingface.co/datasets/Jackrong/glm-4.7-multiturn-CoT) | Multi-turn CoT alignment |\n| SFT Distillation | [`Jackrong/glm-4.7-Superior-Reasoning-stage1`](https://huggingface.co/datasets/Jackrong/glm-4.7-Superior-Reasoning-stage1) | Enhanced fundamental reasoning |\n| SFT Distillation | [`TeichAI/glm-4.7-2000x`](https://huggingface.co/datasets/TeichAI/glm-4.7-2000x) | Generalization and lexical diversity |\n| SFT Distillation | [`Jackrong/MultiReason-ChatAlpaca`](https://huggingface.co/datasets/Jackrong/MultiReason-ChatAlpaca) | Conversational multi-turn tracking |\n\n---\n\n## 🏃 Quickstart\n\n```python\nfrom transformers import AutoModelForCausalLM, AutoTokenizer\n\nmodel_name = \"Jackrong/Qwen3-4B-Thinking-2507-GLM-4.7-Distilled\"\ntokenizer = AutoTokenizer.from_pretrained(model_name)\nmodel = AutoModelForCausalLM.from_pretrained(model_name, torch_dtype=\"auto\", device_map=\"auto\")\n\nmessages = [\n    {\"role\": \"user\", \"content\": \"Solve: find all real solutions to x^3 - 6x^2 + 11x - 6 = 0.\"}\n]\n\ntext = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)\ninputs = tokenizer([text], 
return_tensors=\"pt\").to(model.device)\noutputs = model.generate(**inputs, max_new_tokens=4096)\nresponse = tokenizer.decode(outputs[0][inputs.input_ids.shape[1]:], skip_special_tokens=True)\nprint(response)\n```\n\n> **Tip:** This model naturally generates `<think>...</think>` reasoning traces before the final answer. You can parse these to inspect the chain-of-thought.\n\n---\n\n## 📋 Model Details\n\n| Attribute | Value |\n|-----------|-------|\n| **Base Model** | `Jackrong/DASD-4B-Thinking-2507-GRPO-v2` |\n| **Architecture** | Qwen3 (4B Dense) |\n| **License** | Apache 2.0 |\n| **Language(s)** | English, Chinese |\n| **Training Framework** | [Unsloth](https://github.com/unslothai/unsloth) + Hugging Face TRL |\n| **RL Algorithm** | GRPO (Group Relative Policy Optimization) |\n| **Fine-tuning Method** | SFT (GLM-4.7 Distillation at T=1.0) |\n| **Developed by** | Jackrong |\n\n---\n\n## ⚠️ Limitations & Intended Use\n\n- This model is intended for **research and educational purposes** related to reasoning and mathematical problem-solving.\n- While mathematical and logical reasoning capabilities have been enhanced, the model may still produce incorrect answers or hallucinations — always verify outputs on critical tasks.\n- The model inherits the capabilities and limitations of the underlying `Qwen3-4B-Thinking-2507` architecture.\n- Not intended for deployment in high-stakes applications without additional safety evaluation.\n\n---\n\n## 📎 Related Models\n\n| Model | Description |\n|-------|-------------|\n| [`Qwen/Qwen3-4B-Thinking-2507`](https://huggingface.co/Qwen/Qwen3-4B-Thinking-2507) | Base model |\n| [`Jackrong/DASD-4B-Thinking-2507-GRPO-v2`](https://huggingface.co/Jackrong/DASD-4B-Thinking-2507-GRPO-v2) | After GRPO RL training |\n| [`Jackrong/Qwen3-4B-Thinking-2507-GLM-4.7-Distilled`](https://huggingface.co/Jackrong/Qwen3-4B-Thinking-2507-GLM-4.7-Distilled) | **This model** — GLM-4.7 Distilled |\n\n---\n\n## 🙏 Acknowledgements\n\n- [Zhipu 
AI](https://huggingface.co/THUDM) for the GLM-4.7 model series capability\n- [Alibaba Cloud Apsara Lab](https://huggingface.co/Alibaba-Apsara) for reasoning datasets\n- [Open-R1](https://huggingface.co/open-r1) for the DAPO Math dataset\n- [Unsloth](https://github.com/unslothai/unsloth) for efficient fine-tuning infrastructure\n- [Qwen Team](https://huggingface.co/Qwen) for the excellent base model\n",
    "related_quantizations": []
  },
  "tags": [
    "gguf",
    "qwen3",
    "unsloth",
    "text-generation",
    "reasoning",
    "math",
    "grpo",
    "sft",
    "distillation",
    "conversational",
    "glm-4.7",
    "en",
    "zh",
    "base_model:Jackrong/DASD-4B-Thinking-2507-GRPO-v2",
    "base_model:quantized:Jackrong/DASD-4B-Thinking-2507-GRPO-v2",
    "license:apache-2.0",
    "endpoints_compatible",
    "region:us"
  ],
  "likes": 2,
  "downloads": 658,
  "gated": false,
  "private": false,
  "last_modified": "2026-02-24T02:58:14.000Z",
  "created_at": "2026-02-23T15:05:12.000Z",
  "pipeline_tag": "text-generation",
  "library_name": ""
}
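The model card embedded in the metadata above notes that the model emits `<think>...</think>` reasoning traces before the final answer and that these can be parsed. A minimal sketch of doing so follows; `split_reasoning` is a hypothetical helper, and the fallback for a missing opening tag assumes the chat template may prefill `<think>` so that only the closing tag appears in the generated text.

```python
import re

# Non-greedy match so only the first think block is captured.
THINK_RE = re.compile(r"<think>(.*?)</think>", re.DOTALL)


def split_reasoning(text):
    """Return (reasoning, answer) from a generated response.

    reasoning is an empty string when no think block is found.
    """
    match = THINK_RE.search(text)
    if match:
        return match.group(1).strip(), text[match.end():].strip()
    # Assumption: the opening tag may have been supplied by the prompt
    # template, leaving only the closing tag in the output.
    head, sep, tail = text.partition("</think>")
    if sep:
        return head.strip(), tail.strip()
    return "", text.strip()


reasoning, answer = split_reasoning("<think>Outline, then solve.</think>\nx = 1, 2, 3")
print(reasoning)
print(answer)
```

If the response was decoded with special tokens skipped and the tags are absent, the whole text is returned as the answer.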

Source payload excerpt (from Hugging Face API)

{
  "_id": "699c6ca8dac985183e28ce38",
  "id": "Jackrong/Qwen3-4B-Thinking-2507-GLM-4.7-Distilled-GGUF",
  "modelId": "Jackrong/Qwen3-4B-Thinking-2507-GLM-4.7-Distilled-GGUF",
  "sha": "cf4722c669a1369d1b48e78a6a6c73ba031b057f",
  "createdAt": "2026-02-23T15:05:12.000Z",
  "lastModified": "2026-02-24T02:58:14.000Z",
  "author": "Jackrong",
  "downloads": 658,
  "likes": 2,
  "gated": false,
  "private": false,
  "pipeline_tag": "text-generation",
  "library_name": "",
  "siblings_count": 15
}
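The `id` and `sha` fields in this payload are enough to build a download link pinned to one revision. The sketch below follows the Hugging Face Hub `resolve` URL layout; `resolve_url` is a hypothetical helper, and whether the commit recorded on this page is still the latest one is not checked here.

```python
from urllib.parse import quote

# Values copied from the payload excerpt and the file table on this page.
REPO_ID = "Jackrong/Qwen3-4B-Thinking-2507-GLM-4.7-Distilled-GGUF"
SHA = "cf4722c669a1369d1b48e78a6a6c73ba031b057f"
FILENAME = "Qwen3-4B-Thinking-2507-GLM-4.7-Distilled-Q3_K_L.gguf"


def resolve_url(repo_id, filename, revision="main"):
    """Build a Hub download URL for one file at a given revision.

    Passing a commit SHA as the revision pins the download to that commit.
    """
    return (
        "https://huggingface.co/"
        f"{quote(repo_id)}/resolve/{quote(revision, safe='')}/{quote(filename)}"
    )


print(resolve_url(REPO_ID, FILENAME, revision=SHA))
```

Pinning to the SHA keeps a download reproducible even if the repository is updated later; use the default `main` to follow the current branch head instead.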

More models in this shard