Dzluck/gemma-4-E4B-Gemini-3.1-Pro-Reasoning-Distill-GGUF overview
gemma 4 E4B Gemini 3.1 Pro Reasoning Distill GGUF This repository contains GGUF format model files for Ayodele01's gemma 4 E4B Gemini 3.1 Pro Reasoning Distill…
Runs locally from ~4.52 GB disk (8 GB VRAM class GPUs with llama.cpp / guIDE).
Repository Files & Downloads
| File | Type | Quantization | Size | Link |
|---|---|---|---|---|
| gemma-4-E4B-Gemini-3.1-Pro-Reasoning-Distill-Q3_K_M.gguf | GGUF | Q3_K_M | 4.52 GB | Download |
| gemma-4-E4B-Gemini-3.1-Pro-Reasoning-Distill-Q4_K_M.gguf | GGUF | Q4_K_M | 4.97 GB | Download |
| gemma-4-E4B-Gemini-3.1-Pro-Reasoning-Distill-Q5_K_M.gguf | GGUF | Q5_K_M | 5.37 GB | Download |
| gemma-4-E4B-Gemini-3.1-Pro-Reasoning-Distill-Q6_K.gguf | GGUF | Q6_K | 5.79 GB | Download |
| gemma-4-E4B-Gemini-3.1-Pro-Reasoning-Distill-Q8_0.gguf | GGUF | Q8_0 | 7.48 GB | Download |
Model Details
| Model ID | Dzluck/gemma-4-E4B-Gemini-3.1-Pro-Reasoning-Distill-GGUF |
|---|---|
| Author | Dzluck |
| Pipeline | text-generation |
| License | — |
| Base model | Ayodele01/gemma-4-E4B-Gemini-3.1-Pro-Reasoning-Distill |
| Last modified | 2026-06-20T15:25:31.000Z |
Model README
---
base_model: Ayodele01/gemma-4-E4B-Gemini-3.1-Pro-Reasoning-Distill
library_name: gguf
pipeline_tag: text-generation
tags:
- gguf
- quantized
- gemma
---
gemma-4-E4B-Gemini-3.1-Pro-Reasoning-Distill-GGUF
This repository contains GGUF format model files for Ayodele01's gemma-4-E4B-Gemini-3.1-Pro-Reasoning-Distill.
These models were compiled and quantized via llama.cpp to enable efficient local inference on consumer hardware.
Available Quantizations
| File Name | Description |
|---|---|
| gemma-4-E4B-Gemini-3.1-Pro-Reasoning-Distill-Q8_0.gguf | 8-bit quantization. Near unquantized performance, largest file size. |
| gemma-4-E4B-Gemini-3.1-Pro-Reasoning-Distill-Q6_K.gguf | 6-bit quantization. Very high quality, minimal degradation from original. |
| gemma-4-E4B-Gemini-3.1-Pro-Reasoning-Distill-Q5_K_M.gguf | 5-bit quantization. Higher quality, slightly larger size and slower inference. |
| gemma-4-E4B-Gemini-3.1-Pro-Reasoning-Distill-Q4_K_M.gguf | 4-bit quantization. Recommended. Excellent balance of speed, memory usage, and quality. |
| gemma-4-E4B-Gemini-3.1-Pro-Reasoning-Distill-Q3_K_M.gguf | 3-bit quantization. Very high compression, fast inference, lower quality. |
Run Dzluck/gemma-4-E4B-Gemini-3.1-Pro-Reasoning-Distill-GGUF with guIDE
Download guIDE — the AI-native code editor with local LLM inference and 69 built-in tools.
Source: Hugging Face · Compare models