GraySoft
Model Intelligence Sheet

prithivMLmods/gemma-4-E2B-it-GGUF overview


Tags: transformers, gguf, gemma4, text-generation-inference, llama-cpp, moe, e2b, image-text-to-text, en, base_model:google/gemma-4-E2B-it, base_model:quantized:google/gemma-4-E2B-it, license:apache-2.0, endpoints_compatible, region:us, conversational

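Runs locally with llama.cpp / guIDE on 4 GB VRAM class GPUs. The smallest listed file (~531.5 MB) is the mmproj-q8_0 projector, not a standalone model; the smallest model quantization listed, Q2_K, is 2.78 GB.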

Downloads: 0
Likes: 1
Pipeline: image-text-to-text

Repository Files & Downloads

17 GGUF files detected
Direct downloads for local inference
| File | Type | Quantization | Size | Link |
|------|------|--------------|------|------|
| gemma-4-E2B-it.BF16.gguf | GGUF | BF16 | 8.67 GB | Download |
| gemma-4-E2B-it.F16.gguf | GGUF | F16 | 8.67 GB | Download |
| gemma-4-E2B-it.Q2_K.gguf | GGUF | Q2_K | 2.78 GB | Download |
| gemma-4-E2B-it.Q3_K_L.gguf | GGUF | Q3_K_L | 3.06 GB | Download |
| gemma-4-E2B-it.Q3_K_M.gguf | GGUF | Q3_K_M | 2.98 GB | Download |
| gemma-4-E2B-it.Q3_K_S.gguf | GGUF | Q3_K_S | 2.90 GB | Download |
| gemma-4-E2B-it.Q4_0.gguf | GGUF | Q4_0 | 3.13 GB | Download |
| gemma-4-E2B-it.Q4_K_M.gguf | GGUF | Q4_K_M | 3.19 GB | Download |
| gemma-4-E2B-it.Q4_K_S.gguf | GGUF | Q4_K_S | 3.13 GB | Download |
| gemma-4-E2B-it.Q5_0.gguf | GGUF | Q5_0 | 3.35 GB | Download |
| gemma-4-E2B-it.Q5_K_M.gguf | GGUF | Q5_K_M | 3.38 GB | Download |
| gemma-4-E2B-it.Q5_K_S.gguf | GGUF | Q5_K_S | 3.35 GB | Download |
| gemma-4-E2B-it.Q6_K.gguf | GGUF | Q6_K | 3.58 GB | Download |
| gemma-4-E2B-it.Q8_0.gguf | GGUF | Q8_0 | 4.61 GB | Download |
| gemma-4-E2B-it.mmproj-bf16.gguf | GGUF | BF16 | 941.1 MB | Download |
| gemma-4-E2B-it.mmproj-f16.gguf | GGUF | F16 | 941.1 MB | Download |
| gemma-4-E2B-it.mmproj-q8_0.gguf | GGUF | Q8_0 | 531.5 MB | Download |
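
Each file name in this listing carries its quantization label between the model name and the `.gguf` extension, and the three `mmproj-*` files are projector files rather than model weights. The following is a minimal sketch of recovering that label from a file name. It assumes the `<model>.<label>.gguf` naming pattern seen in this repository; `quant_label` and `is_projector` are hypothetical helper names, not part of llama.cpp or any Hugging Face library.

```python
import re

# Assumption: files are named "<model>.<label>.gguf", as in this repository.
_GGUF_NAME = re.compile(r"^(?P<model>.+)\.(?P<label>[^.]+)\.gguf$")


def quant_label(filename: str) -> str:
    """Return the label between the model name and the .gguf extension."""
    match = _GGUF_NAME.match(filename)
    if match is None:
        raise ValueError(f"not a '<model>.<label>.gguf' name: {filename!r}")
    return match.group("label")


def is_projector(filename: str) -> bool:
    """True for multimodal projector files, which use an 'mmproj-' label here."""
    return quant_label(filename).startswith("mmproj-")


if __name__ == "__main__":
    for name in (
        "gemma-4-E2B-it.Q4_K_M.gguf",
        "gemma-4-E2B-it.mmproj-q8_0.gguf",
    ):
        kind = "projector" if is_projector(name) else "model"
        print(f"{name}: label={quant_label(name)} ({kind})")
```

A helper like this makes it easy to separate the 14 model files from the 3 projector files before choosing what to download.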

Model Details

Model ID: prithivMLmods/gemma-4-E2B-it-GGUF
Author: prithivMLmods
Pipeline: image-text-to-text
License: apache-2.0
Base model: google/gemma-4-E2B-it
Last modified: 2026-06-07T04:54:13.000Z

Model README

---
license: apache-2.0
language:
- en
base_model:
- google/gemma-4-E2B-it
pipeline_tag: image-text-to-text
library_name: transformers
tags:
- text-generation-inference
- llama-cpp
- moe
- e2b
---

gemma-4-E2B-it-GGUF

> Gemma-4-E2B-it from Google is a multimodal dense model in the Gemma 4 family with 2.3B effective parameters (5.1B total including Per-Layer Embeddings). It is built for on-device deployment on smartphones, laptops, Raspberry Pi, and IoT edge hardware, with native support for text, images (variable aspect ratio and resolution), and audio, plus configurable thinking modes for advanced reasoning. The architecture has 35 layers, a 512-token sliding window, a 128K context length, and a 262K vocabulary. Target uses include agentic workflows, OCR (multilingual and handwriting), document/PDF parsing, UI/screen understanding, chart comprehension, object detection, and coding assistance, with low-latency inference optimized for Qualcomm/MediaTek chips via Android AICore. It is described as delivering frontier-level intelligence rivaling models 20x larger while consuming minimal RAM and battery. The instruction-tuned variant is aimed at mobile developers prototyping autonomous agents; it ships with safety protocols matching Google's proprietary standards, and its open weights allow local-first AI servers on consumer GPUs for reasoning-heavy tasks such as IDE assistance and structured data extraction.

Model Files

| File Name | Quant Type | File Size | File Link |
|-----------|------------|-----------|-----------|
| gemma-4-E2B-it.BF16.gguf | BF16 | 9.31 GB | Download |
| gemma-4-E2B-it.F16.gguf | F16 | 9.31 GB | Download |
| gemma-4-E2B-it.Q2_K.gguf | Q2_K | 2.99 GB | Download |
| gemma-4-E2B-it.Q3_K_L.gguf | Q3_K_L | 3.28 GB | Download |
| gemma-4-E2B-it.Q3_K_M.gguf | Q3_K_M | 3.2 GB | Download |
| gemma-4-E2B-it.Q3_K_S.gguf | Q3_K_S | 3.11 GB | Download |
| gemma-4-E2B-it.Q4_0.gguf | Q4_0 | 3.36 GB | Download |
| gemma-4-E2B-it.Q4_K_M.gguf | Q4_K_M | 3.43 GB | Download |
| gemma-4-E2B-it.Q4_K_S.gguf | Q4_K_S | 3.37 GB | Download |
| gemma-4-E2B-it.Q5_0.gguf | Q5_0 | 3.6 GB | Download |
| gemma-4-E2B-it.Q5_K_M.gguf | Q5_K_M | 3.63 GB | Download |
| gemma-4-E2B-it.Q5_K_S.gguf | Q5_K_S | 3.6 GB | Download |
| gemma-4-E2B-it.Q6_K.gguf | Q6_K | 3.85 GB | Download |
| gemma-4-E2B-it.Q8_0.gguf | Q8_0 | 4.95 GB | Download |
| gemma-4-E2B-it.mmproj-bf16.gguf | mmproj-bf16 | 987 MB | Download |
| gemma-4-E2B-it.mmproj-f16.gguf | mmproj-f16 | 987 MB | Download |
| gemma-4-E2B-it.mmproj-q8_0.gguf | mmproj-q8_0 | 557 MB | Download |
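
The sizes in this README table are larger than those in the listing under "Repository Files & Downloads" for the same files (for example 9.31 GB versus 8.67 GB for BF16). The gap is consistent with one table using decimal gigabytes (10^9 bytes) and the other using binary gibibytes (2^30 bytes) under a "GB" label. That reading is an assumption drawn from the numbers, not something either source states. A small sketch of the conversion, with the hypothetical helper name `gb_to_gib`:

```python
GB = 10**9   # bytes in one decimal gigabyte
GIB = 2**30  # bytes in one binary gibibyte


def gb_to_gib(size_gb: float) -> float:
    """Convert a size in decimal gigabytes to binary gibibytes."""
    return size_gb * GB / GIB


if __name__ == "__main__":
    # File sizes taken from the README table above.
    readme_sizes_gb = {"BF16": 9.31, "Q4_K_M": 3.43, "Q8_0": 4.95}
    for quant, size_gb in readme_sizes_gb.items():
        print(f"{quant}: {size_gb} GB is about {gb_to_gib(size_gb):.2f} GiB")
```

Under this assumption the converted values line up with the other listing (8.67, 3.19, and 4.61), so the two tables most likely describe the same files.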

llama.cpp

LLM inference in C/C++ — https://github.com/ggml-org/llama.cpp
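
As a rough sketch of how these files might be fetched and passed to llama.cpp, the snippet below builds direct download URLs and a `llama-server` argument list. The URL layout follows the Hugging Face Hub `resolve` convention, and `-m` and `--mmproj` are llama.cpp options. The choice of `Q4_K_M` with the `mmproj-q8_0` projector is only an illustration, `resolve_url` and `server_args` are hypothetical helper names, and whether a given llama.cpp build supports this architecture is not confirmed here.

```python
from urllib.parse import quote

HF_BASE = "https://huggingface.co"
REPO_ID = "prithivMLmods/gemma-4-E2B-it-GGUF"


def resolve_url(repo_id: str, filename: str, revision: str = "main") -> str:
    """Build a direct-download URL using the Hub's /resolve/<revision>/ layout."""
    return (
        f"{HF_BASE}/{quote(repo_id)}/resolve/"
        f"{quote(revision, safe='')}/{quote(filename)}"
    )


def server_args(model_path: str, mmproj_path: str) -> list[str]:
    """Argument list for llama-server with a model and its multimodal projector."""
    return ["llama-server", "-m", model_path, "--mmproj", mmproj_path]


if __name__ == "__main__":
    # Illustrative pairing only; any model quantization could be substituted.
    model = "gemma-4-E2B-it.Q4_K_M.gguf"
    projector = "gemma-4-E2B-it.mmproj-q8_0.gguf"
    print(resolve_url(REPO_ID, model))
    print(resolve_url(REPO_ID, projector))
    print(" ".join(server_args(model, projector)))
```

The argument list is returned rather than executed so it can be inspected or handed to `subprocess.run` once the files are present locally.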


Source: Hugging Face