What license applies to ronaldcmz/gemma-4-26B-A4B-it-Claude-Opus-Distill-GGUF?

License: apache-2.0. Verify terms on Hugging Face before commercial use.

Model Intelligence Sheet

ronaldcmz/gemma-4-26B-A4B-it-Claude-Opus-Distill-GGUF overview

Q: How do I run ronaldcmz/gemma-4-26B-A4B-it-Claude-Opus-Distill-GGUF locally?

Download a GGUF file from this page and load it in guIDE or llama.cpp. Pipeline task: text-generation.

🌟 Gemma 4 26B A4B x Claude Opus 4.6 Build Environment & Features: Fine tuning Framework : Unsloth Reasoning Effort : High This model bridges the gap between G…

gguftext-generation-inferencellama.cppunslothgemma4reasoningdataset:TeichAI/Claude-Opus-4.6-Reasoning-887xdataset:TeichAI/Claude-Sonnet-4.6-Reasoning-1100xdataset:TeichAI/claude-4.5-opus-high-reasoning-250xdataset:TeichAI/Claude-Opus-4.6-Reasoning-500xdataset:Crownelius/Opus-4.6-Reasoning-2100x-formattedbase_model:TeichAI/gemma-4-26B-A4B-it-Claude-Opus-Distillbase_model:quantized:TeichAI/gemma-4-26B-A4B-it-Claude-Opus-Distilllicense:apache-2.0endpoints_compatibleregion:usconversational

Runs locally from ~1.11 GB disk (4 GB VRAM class GPUs with llama.cpp / guIDE).

Downloads

Likes

Pipeline

—

Author

ronaldcmz

Repository Files & Downloads

11 GGUF files detected

Direct downloads for local inference

File	Type	Quantization	Size	Link
gemma-4-26B-A4B-it-Claude-Opus-Distill.bf16.gguf	GGUF	GGUF	47.04 GB	Download
gemma-4-26B-A4B-it-Claude-Opus-Distill.iq4_nl.gguf	GGUF	GGUF	13.58 GB	Download
gemma-4-26B-A4B-it-Claude-Opus-Distill.q3_k_m.gguf	GGUF	GGUF	12.37 GB	Download
gemma-4-26B-A4B-it-Claude-Opus-Distill.q3_k_s.gguf	GGUF	GGUF	11.38 GB	Download
gemma-4-26B-A4B-it-Claude-Opus-Distill.q4_k_m.gguf	GGUF	GGUF	15.64 GB	Download
gemma-4-26B-A4B-it-Claude-Opus-Distill.q5_k_m.gguf	GGUF	GGUF	17.82 GB	Download
gemma-4-26B-A4B-it-Claude-Opus-Distill.q6_k.gguf	GGUF	GGUF	21.08 GB	Download
gemma-4-26B-A4B-it-Claude-Opus-Distill.q8_0.gguf	GGUF	GGUF	25.02 GB	Download
mmproj-BF16.gguf	GGUF	BF16	1.11 GB	Download
mmproj-F16.gguf	GGUF	F16	1.11 GB	Download
mmproj-F32.gguf	GGUF	F32	2.13 GB	Download

Model Details

Model ID	ronaldcmz/gemma-4-26B-A4B-it-Claude-Opus-Distill-GGUF
Author	ronaldcmz
Pipeline	—
License	apache-2.0
Base model	TeichAI/gemma-4-26B-A4B-it-Claude-Opus-Distill
Last modified	2026-06-20T06:21:24.000Z

Model README

---

base_model: TeichAI/gemma-4-26B-A4B-it-Claude-Opus-Distill

tags:

text-generation-inference
llama.cpp
gguf
unsloth
gemma4
reasoning

license: apache-2.0

datasets:

TeichAI/Claude-Opus-4.6-Reasoning-887x
TeichAI/Claude-Sonnet-4.6-Reasoning-1100x
TeichAI/claude-4.5-opus-high-reasoning-250x
TeichAI/Claude-Opus-4.6-Reasoning-500x
Crownelius/Opus-4.6-Reasoning-2100x-formatted

---

🌟 Gemma 4 - 26B A4B x Claude Opus 4.6

> Build Environment & Features:

> - Fine-tuning Framework: Unsloth

> - Reasoning Effort: High

> - This model bridges the gap between Google's exceptional open-weights architecture and Claude 4.6's profound reasoning capabilities, leveraging cutting-edge fine-tuning environments.

!Gemma 4 Benchmarks

🚀 Important Update: Version 2 Now Available

> - Upgrade Alert: A significantly enhanced version of this distillation has been released. We highly recommend switching to v2 for a superior reasoning experience and improved stability.

> - Model Link: gemma-4-26B-A4B-it-Claude-Opus-Distill-v2-GGUF

🔄 Key Enhancements in v2

We have implemented critical upgrades to further refine the model's performance:

✨ Dataset Quality: Re-curated high-density reasoning paths, resulting in significantly higher quality responses and more nuanced logical depth.
🛠️ Chat Template Fixes: Comprehensive structural fixes to the chat template to improve formatting.
⚖️ Generalization vs. Style: v2 was trained with a large batch size, high rank/alpha, and a low learning rate (LR). This configuration prioritizes broad generalization over specific stylistic imitation. We are currently evaluating which approach offers the best real-world utility; we are also looking for community feedback, as numbers alone don't always tell the full story.

💡 Model Introduction

Gemma 4 - 26B A4B x Claude Opus 4.6 is a highly capable model fine-tuned on top of the powerful Gemma 4 architecture. The model's core directive is to absorb state-of-the-art reasoning distillation, primarily sourced from Claude-4.6 Opus interactions.

By utilizing datasets where the reasoning effort was explicitly set to High, this model excels in breaking down complex problems and delivering precise, nuanced solutions across a variety of demanding domains.

🗺️ Training Pipeline Overview

Base Model (unsloth/gemma-4-26B-A4B-it)
 │
 ▼
Supervised Fine-Tuning (SFT) + High-Effort Reasoning Datasets
 │
 ▼
Final Model (Gemma 4 - 26B A4B x Claude Opus 4.6)

📋 Stage Details & Benchmarks

Performance vs Size:

> Deep Dive Analysis: For more comprehensive insights regarding the base capabilities of the Gemma 4 architecture, please refer to this Analysis Document.

🔹 Supervised Fine-Tuning (Meeting Claude)

- Objective: To inject high-density reasoning logic and establish a strict format for complex problem-solving.

- Methodology: We utilized Unsloth for highly efficient memory and compute optimization during the fine-tuning process. The model was trained extensively on various reasoning trajectories from Claude Opus 4.6 to adopt a structured and efficient thinking pattern.

📚 All Datasets Used

The dataset consists of high-quality, high-effort reasoning distillation data:

| Dataset Name | Description / Purpose |

|--------------|-----------------------|

| TeichAI/Claude-Opus-4.6-Reasoning-887x | Core Claude 4.6 Opus reasoning trajectories. |

| TeichAI/Claude-Sonnet-4.6-Reasoning-1100x | Additional high-density reasoning instances from Claude 4.6 Sonnet. |

| TeichAI/claude-4.5-opus-high-reasoning-250x | Legacy high-intensity reasoning distillation. |

| TeichAI/Claude-Opus-4.6-Reasoning-500x | Additional Opus 4.6 reasoning traces targeting domain diversity |

| Crownelius/Opus-4.6-Reasoning-2100x-formatted | Crownelius's extensively formatted Opus reasoning dataset for structural reinforcement. |

🌟 Core Skills & Capabilities

Thanks to its robust base model and high-effort reasoning distillation, this model is highly optimized for the following use cases:

💻 Coding: Advanced code generation, debugging, and software architecture planning.
🔬 Science: Deep scientific reasoning, hypothesis evaluation, and analytical problem-solving.
🔎 Deep Research: Navigating complex, multi-step research queries and synthesizing vast amounts of information.
🧠 General Purpose: Highly capable instruction-following for everyday tasks requiring high logical coherence.

Best Practices

For the best performance, use these configurations and best practices:

1. Sampling Parameters

Use the following standardized sampling configuration across all use cases:

temperature=1.0
top_p=0.95
top_k=64

2. Thinking Mode Configuration

Compared to Gemma 3, the models use standard system, assistant, and user roles. To properly manage the thinking process, use the following control tokens:

Trigger Thinking: Thinking is enabled by including the <|think|> token at the start of the system prompt. To disable thinking, remove the token.
Standard Generation: When thinking is enabled, the model will output its internal reasoning followed by the final answer using this structure:

<|channel>thought\n[Internal reasoning]<channel|>

Disabled Thinking Behavior: For all models except for the E2B and E4B variants, if thinking is disabled, the model will still generate the tags but with an empty thought block:

<|channel>thought\n<channel|>[Final answer]

> [!Note]

> Note that many libraries like Transformers and llama.cpp handle the complexities of the chat template for you.

3. Multi-Turn Conversations

No Thinking Content in History: In multi-turn conversations, the historical model output should only include the final response. Thoughts from previous model turns must not be added before the next user turn begins.

4. Modality order

For optimal performance with multimodal inputs, place image and/or audio content before the text in your prompt.

5. Variable Image Resolution

Aside from variable aspect ratios, Gemma 4 supports variable image resolution through a configurable visual token budget, which controls how many tokens are used to represent an image. A higher token budget preserves more visual detail at the cost of additional compute, while a lower budget enables faster inference for tasks that don't require fine-grained understanding.

The supported token budgets are: 70, 140, 280, 560, and 1120.

Use lower budgets* for classification, captioning, or video understanding, where faster inference and processing many frames outweigh fine-grained detail.

Use higher budgets* for tasks like OCR, document parsing, or reading small text.

6. Audio

Use the following prompt structures for audio processing:

Audio Speech Recognition (ASR)

Transcribe the following speech segment in {LANGUAGE} into {LANGUAGE} text.

Follow these specific instructions for formatting the answer:
* Only output the transcription, with no newlines.
* When transcribing numbers, write the digits, i.e. write 1.7 and not one point seven, and write 3 instead of three.

Automatic Speech Translation (AST)

Transcribe the following speech segment in {SOURCE_LANGUAGE}, then translate it into {TARGET_LANGUAGE}.
When formatting the answer, first output the transcription in {SOURCE_LANGUAGE}, then one newline, then output the string '{TARGET_LANGUAGE}: ', then the translation in {TARGET_LANGUAGE}.

7. Audio and Video Length

All models support image inputs and can process videos as frames whereas the E2B and E4B models also support audio inputs. Audio supports a maximum length of 30 seconds. Video supports a maximum of 60 seconds assuming the images are processed at one frame per second.

🙏 Acknowledgements

- Google: For providing an exceptional open weights model. Read more about Gemma 4 on the Google Innovation Blog.

- Unsloth: For assembling ready-to-use, cutting-edge fine-tuning environments that make this work possible.

- Crownelius: For creating and sharing his awesome Opus reasoning dataset with the community.

📖 Citation

If you use this model in your research or projects, please cite:

@misc{teichai_gemma4_26b_a4b_opus_distilled,
  title        = {Gemma-4-26B-A4B-it-Claude-Opus-Distill},
  author       = {TeichAI},
  year         = {2026},
  publisher    = {Hugging Face},
  howpublished = {\url{https://huggingface.co/TeichAI/gemma-4-26B-A4B-it-Claude-Opus-Distill}}
}

Run ronaldcmz/gemma-4-26B-A4B-it-Claude-Opus-Distill-GGUF with guIDE

Download guIDE — the AI-native code editor with local LLM inference and 69 built-in tools.

Download guIDE → · Browse 524k+ models · Compare models

Source: Hugging Face · Compare models