khtsly/qwen3.5-27b-claude-4.6-opus-distilled-32k-gguf is indexed on GraySoft with repository links, GGUF quantization files (including Q5_K_M), and Hugging Face metadata. This page helps you pick a local model for guIDE or other runtimes. See related models in the same shard below.
Model Intelligence Sheet
khtsly/qwen3.5-27b-claude-4.6-opus-distilled-32k-gguf overview
| Field | Value |
|---|---|
| Downloads | 2,463 |
| Likes | 3 |
| Pipeline | image-text-to-text |
| Library | Not specified |
| Visibility | Public |
| Access | Open |
Repository Files & Downloads
12 files detected
Direct downloads for all repository files
| File | Type | Quantization | Size | Link |
|---|---|---|---|---|
| Qwen3.5-27B-Claude-4.6-Opus-Distilled-32k.BF16-00002-of-00002.gguf | GGUF | BF16 | 3.59 GB | Download |
| Qwen3.5-27B-Claude-4.6-Opus-Distilled-32k.BF16-mmproj.gguf | GGUF | BF16 | 888.01 MB | Download |
| Qwen3.5-27B-Claude-4.6-Opus-Distilled-32k.Q2_K.gguf | GGUF | Q2_K | 9.43 GB | Download |
| Qwen3.5-27B-Claude-4.6-Opus-Distilled-32k.Q3_K_L.gguf | GGUF | Q3_K_L | 13.07 GB | Download |
| Qwen3.5-27B-Claude-4.6-Opus-Distilled-32k.Q3_K_M.gguf | GGUF | Q3_K_M | 12.38 GB | Download |
| Qwen3.5-27B-Claude-4.6-Opus-Distilled-32k.Q3_K_S.gguf | GGUF | Q3_K_S | 11.24 GB | Download |
| Qwen3.5-27B-Claude-4.6-Opus-Distilled-32k.Q4_K_M.gguf | GGUF | Q4_K_M | 15.40 GB | Download |
| Qwen3.5-27B-Claude-4.6-Opus-Distilled-32k.Q4_K_S.gguf | GGUF | Q4_K_S | 14.50 GB | Download |
| Qwen3.5-27B-Claude-4.6-Opus-Distilled-32k.Q5_K_M.gguf | GGUF | Q5_K_M | 18.07 GB | Download |
| Qwen3.5-27B-Claude-4.6-Opus-Distilled-32k.Q5_K_S.gguf | GGUF | Q5_K_S | 17.40 GB | Download |
| Qwen3.5-27B-Claude-4.6-Opus-Distilled-32k.Q6_K.gguf | GGUF | Q6_K | 20.57 GB | Download |
| Qwen3.5-27B-Claude-4.6-Opus-Distilled-32k.Q8_0.gguf | GGUF | Q8_0 | 26.63 GB | Download |
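Since the table is mainly useful for choosing a file that fits your hardware, the selection can be sketched in a few lines of Python. This is a minimal illustration, not a GraySoft or guIDE feature: the sizes are copied from the table above, while the function name `pick_quant` and the 2 GB `headroom_gb` default are assumptions made for this example. Real memory use also depends on context length and runtime, and image input would presumably also need the separate `mmproj` file listed above.

```python
from typing import Optional

# File sizes in GB, copied from the table above (single-file quantizations only).
QUANT_SIZES_GB = {
    "Q2_K": 9.43,
    "Q3_K_S": 11.24,
    "Q3_K_M": 12.38,
    "Q3_K_L": 13.07,
    "Q4_K_S": 14.50,
    "Q4_K_M": 15.40,
    "Q5_K_S": 17.40,
    "Q5_K_M": 18.07,
    "Q6_K": 20.57,
    "Q8_0": 26.63,
}


def pick_quant(budget_gb: float, headroom_gb: float = 2.0) -> Optional[str]:
    """Return the largest listed quantization whose file fits the budget.

    headroom_gb is a hypothetical allowance for the KV cache and runtime
    overhead; it is an assumption of this sketch, not a measured value.
    """
    usable = budget_gb - headroom_gb
    fitting = {name: size for name, size in QUANT_SIZES_GB.items() if size <= usable}
    if not fitting:
        return None  # nothing in the table fits this budget
    # Larger files generally preserve more of the original weights' precision.
    return max(fitting, key=fitting.get)
```

For example, `pick_quant(24.0)` leaves 22 GB usable and selects `Q6_K`, while a budget too small for any listed file returns `None`.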
Model Details
Metadata Inspector
Normalized metadata (stored in metadata_json)
{
"metadata": {},
"card_data": {
"language": [
"en",
"zh"
],
"license": "apache-2.0",
"base_model": "Qwen/Qwen3.5-27B",
"tags": [
"unsloth",
"qwen",
"qwen3.5",
"reasoning",
"chain-of-thought",
"lora",
"luau",
"gguf",
"llama.cpp",
"vision-language-model"
],
"datasets": [
"nohurry/Opus-4.6-Reasoning-3000x-filtered"
],
"pipeline_tag": "image-text-to-text",
"frontmatter": {
"language": [
"en",
"zh"
],
"license": "apache-2.0",
"base_model": "Qwen/Qwen3.5-27B",
"tags": [
"unsloth",
"qwen",
"qwen3.5",
"reasoning",
"chain-of-thought",
"lora",
"luau",
"gguf",
"llama.cpp",
"vision-language-model"
],
"datasets": [
"nohurry/Opus-4.6-Reasoning-3000x-filtered"
],
"pipeline_tag": "image-text-to-text"
},
"hero_image_url": "",
"summary": "",
"quick_links": [],
"benchmark_table_html": "",
"readme_markdown": "---\nlanguage:\n- en\n- zh\nlicense: apache-2.0\nbase_model: Qwen/Qwen3.5-27B\ntags:\n- unsloth\n- qwen\n- qwen3.5\n- reasoning\n- chain-of-thought\n- lora\n- luau\n- gguf\n- llama.cpp\n- vision-language-model\ndatasets:\n- nohurry/Opus-4.6-Reasoning-3000x-filtered\npipeline_tag: image-text-to-text\n---\n\n# Qwen3.5-27B-Claude-4.6-Opus-Distilled-32k\n\n## # Model Introduction\n**Qwen3.5-27B-Claude-4.6-Opus-Distilled-32k** is a highly capable reasoning and coding model fine-tuned on top of the `Qwen3.5-27B` hybrid dense architecture. The model's core directive is to leverage state-of-the-art Chain-of-Thought (CoT) distillation primarily sourced from Claude-4.6 Opus interactions, with a specialized focus on extended output generation and improved Luau programming capability.\n\nThrough Supervised Fine-Tuning (SFT) focusing on structured reasoning logic and a massive 32k output length max, this model excels in breaking down complex user problems, planning step-by-step methodologies within strictly formatted `<think>` tags, and delivering comprehensive, nuanced solutions—even for highly extensive generation tasks.\n\n### # Benchmark\n| Benchmark | Baseline (27B) | Distilled (27B) | [Jackrong (27B)](https://huggingface.co/Jackrong/Qwen3.5-27B-Claude-4.6-Opus-Reasoning-Distilled) |\n| :--- | :---: | :---: | :---: |\n| GPQA Diamond CoT (0-shot) | 55.05 | **69.69** | 67.67 |\n| ARC-Challenge (25-shot) | 73.80 | **74.65** | 74.39 |\n| AIME 2026 (0-shot) | 0.0 | 16.66 | **33.33** |\n| AIME 2025 (0-shot) | 0.0 | 23.33 | **26.66** |\n| MMLU-CF (0-shot) | 72.30 | **73.75** | - |\n| *Humanities* | 78.28 | **79.80** | - |\n| *Social Sciences* | 70.83 | **71.91** | - |\n| *STEM* | 65.74 | **67.25** | - |\n| *Other* | 74.35 | **76.02** | - |\n| IFEval (0-shot) | 38.13 | 38.13 | **38.81** |\n| *Prompt-Level* | 31.05 | 31.05 | **31.7** |\n| *Instruction-Level* | 45.20 | 45.20 | **45.92** |\n\n*The benchmark is taken in 4-bit using `lm eval`. 
No chat-template enabled in this run. Higher the score is better.*\n\n## # Training Pipeline Overview\n\n```text\nBase Model (Qwen3.5-27B-FP8)\n │\n ▼\nSupervised Fine-Tuning (SFT) + LoRA (r=64, α=128)\n(Response-Only Training masked on \"<|im_start|>assistant\\n\")\n(Max 32k Output Length)\n+\nnohurry/Opus-4.6-Reasoning-3000x-filtered + luau coding samples\n(shuffled)\n │\n ▼\nFinal Model (Qwen3.5-27B-Claude-4.6-Opus-Distilled-32k)\n```\n\n### # Supervised Fine-Tuning (SFT) Details\n- **Objective:** To inject high-density reasoning logic, establish a strict internal thinking format prior to output, and train the model to sustain coherent generation over exceptionally long contexts.\n- **Extended Output Capacity:** Trained specifically to handle up to **32,768 (32k) tokens of maximum output** (recommended), allowing for massive codebases, comprehensive essays, and deeply detailed reasoning traces.\n- **LoRA Configuration:** Fine-tuned efficiently using LoRA (16-bit) with **Rank (r) set to 64** and **Alpha (α) set to 128**, ensuring strong adaptation and retention of complex Opus-level logic.\n- **Method:** Utilized **Unsloth** for highly efficient memory and compute optimization. 
A critical component was the `train_on_responses_only` strategy, masking instructions so the loss is purely calculated over the generation of the `<think>` sequences and the subsequent solutions.\n- **Format Enforcement:** All training samples were systematically normalized so the model strictly abides by the structure `<think> {internal reasoning} </think>\\n {final answer}`.\n\n### # Datasets Used\nThe dataset consists of highly curated, filtered reasoning distillation data, supplemented by specialized coding sets:\n\n| Dataset Name | Description / Purpose |\n|--------------|-----------------------|\n| [nohurry/Opus-4.6-Reasoning-3000x-filtered](https://huggingface.co/datasets/nohurry/Opus-4.6-Reasoning-3000x-filtered) | Provides comprehensive, high-quality Claude 4.6 Opus reasoning trajectories. |\n| **Custom Luau Coding Set** | 75 meticulously crafted various Luau coding samples generated natively by Opus 4.6, injecting specialized high-quality domain knowledge for Roblox/Luau scripting capability. |\n\n### # Training Compute & Loss Curve\n* **Hardware:** 1x NVIDIA H100 (80GB)\n* **Training Duration:** ~50 Minutes\n* **Estimated Total Cost:** $12.00\n* **Distillation Efficacy:** The loss curve demonstrated a strong, healthy downward trajectory throughout the run, confirming successful knowledge transfer from the Opus teacher model. The model converged steadily from an initial loss of **0.588880** down to a final loss of **0.176861**.\n\n## # Core Skills & Capabilities\n1. **Massive Output Generation:** Capable of sustaining coherent, high-quality output for up to 32k tokens, making it ideal for writing extensive code, documentation, or deep analytical reports in a single shot.\n2. **Modular & Structured Thinking:** Inheriting traits from Opus-level reasoning, the model confidently parses prompts and outlines plans sequentially in its `<think>` block, avoiding exploratory \"trial-and-error\" self-doubt.\n3. 
**Luau Proficiency:** Thanks to the targeted 75-sample dataset, the model exhibits improved syntax adherence and logic formulation for the Luau programming language.\n\n## # Limitations & Intended Use\n- **Hallucination Risk:** While reasoning is strong, the model remains an autoregressive LLM. Extended 32k outputs may experience minor drift or hallucinate external facts if relying on real-world verification without grounding.\n- **Intended Scenario:** Best suited for offline analytical tasks, heavy coding (especially Luau), math, and logic-dependent prompting where the user needs transparent internal logic and extremely long, continuous outputs.\n\n## # Acknowledgements\n\nThis model's development was made possible by the foundational tools and contributions from the broader AI ecosystem:\n\n* **[Unsloth AI](https://unsloth.ai/):** For their state-of-the-art framework, enabling highly efficient, memory-optimized LoRA tuning and seamless 32k context scaling.\n* **Qwen Team:** For engineering the robust and highly capable `Qwen3.5-27B` dense base architecture.\n* **Dataset Contributors:** Special recognition to `nohurry` for the rigorous curation of the Claude 4.6 Opus reasoning trajectories, which serves as the core cognitive engine for this project's SFT phase.\n\n-https://ko-fi.com/khtsly",
"related_quantizations": []
},
"tags": [
"gguf",
"qwen3_5",
"unsloth",
"qwen",
"qwen3.5",
"reasoning",
"chain-of-thought",
"lora",
"luau",
"llama.cpp",
"vision-language-model",
"image-text-to-text",
"en",
"zh",
"dataset:nohurry/Opus-4.6-Reasoning-3000x-filtered",
"base_model:Qwen/Qwen3.5-27B",
"base_model:adapter:Qwen/Qwen3.5-27B",
"license:apache-2.0",
"endpoints_compatible",
"region:us",
"conversational"
],
"likes": 3,
"downloads": 2463,
"gated": false,
"private": false,
"last_modified": "2026-03-22T20:30:37.000Z",
"created_at": "2026-03-06T13:29:11.000Z",
"pipeline_tag": "image-text-to-text",
"library_name": ""
}
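The README stored in the metadata above states that the model emits `<think> {internal reasoning} </think>` followed by the final answer. A client that wants to show only the answer could separate the two parts roughly as follows. This is a sketch under that stated format; the function name `split_reasoning` is hypothetical, and no particular runtime is assumed to return the tags verbatim.

```python
import re
from typing import Tuple

# Matches the first <think>...</think> block; DOTALL lets it span multiple lines.
_THINK_RE = re.compile(r"<think>(.*?)</think>", re.DOTALL)


def split_reasoning(text: str) -> Tuple[str, str]:
    """Split model output into (reasoning, answer).

    If no <think> block is present, the reasoning part is empty and the
    whole text is treated as the answer.
    """
    match = _THINK_RE.search(text)
    if match is None:
        return "", text.strip()
    reasoning = match.group(1).strip()
    # Everything after the closing tag is treated as the final answer.
    answer = text[match.end():].strip()
    return reasoning, answer
```

Output that was cut off before the closing `</think>` tag will not match and is returned whole as the answer, so callers may want to handle truncated generations separately.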
Source payload excerpt (from Hugging Face API)
{
"_id": "69aad6a7c13be952df862a9f",
"id": "khtsly/Qwen3.5-27B-Claude-4.6-Opus-Distilled-32k-GGUF",
"modelId": "khtsly/Qwen3.5-27B-Claude-4.6-Opus-Distilled-32k-GGUF",
"sha": "d0ffd5672a9cb54eb334c908cd46d6064065388d",
"createdAt": "2026-03-06T13:29:11.000Z",
"lastModified": "2026-03-22T20:30:37.000Z",
"author": "khtsly",
"downloads": 2463,
"likes": 3,
"gated": false,
"private": false,
"pipeline_tag": "image-text-to-text",
"library_name": "",
"siblings_count": 15
}
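The payload fields can be checked against the rest of the page with a short script. The sketch below uses only values shown above: it parses the two timestamps and compares `siblings_count` with the 12 files listed in the download table. The variable names are illustrative, and the page does not say which repository entries account for the difference.

```python
import json
from datetime import datetime

# Subset of the payload excerpt above; only the fields used here are copied.
PAYLOAD = json.loads("""
{
  "createdAt": "2026-03-06T13:29:11.000Z",
  "lastModified": "2026-03-22T20:30:37.000Z",
  "siblings_count": 15
}
""")

LISTED_FILES = 12  # "12 files detected" in the download table


def parse_ts(value: str) -> datetime:
    """Parse an ISO 8601 timestamp with a trailing 'Z' (UTC)."""
    # Older Python versions reject 'Z' in fromisoformat, so swap in an offset.
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


created = parse_ts(PAYLOAD["createdAt"])
modified = parse_ts(PAYLOAD["lastModified"])

# Whole days between repository creation and the last recorded change.
days_active = (modified - created).days

# Repository entries reported by the API but not shown in the table.
unlisted = PAYLOAD["siblings_count"] - LISTED_FILES

print(f"days between creation and last change: {days_active}")
print(f"repository entries not in the table: {unlisted}")
```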