NobodyWho/Google_Gemma4-E2B-GGUF overview
NobodyWho/Google Gemma4 E2B GGUF Overview GGUF quantization of Google's Gemma 4 E2B instruction tuned model, re hosted for NobodyWho https://github.com/nobodyw…
Runs locally from ~941.1 MB disk (4 GB VRAM class GPUs with llama.cpp / guIDE).
Repository Files & Downloads
Model Details
| Model ID | NobodyWho/Google_Gemma4-E2B-GGUF |
|---|---|
| Author | NobodyWho |
| Pipeline | image-text-to-text |
| License | apache-2.0 |
| Base model | google/gemma-4-E2B-it |
| Last modified | 2026-06-16T03:59:07.000Z |
Model README
---
license: apache-2.0
base_model: google/gemma-4-E2B-it
tags:
- gguf
- nobodywho
- tool-calling
- vision
- gemma
pipeline_tag: image-text-to-text
library_name: gguf
---
NobodyWho/Google_Gemma4-E2B-GGUF
Overview
GGUF quantization of Google's Gemma 4 E2B instruction-tuned model, re-hosted for
NobodyWho. The unsloth build already ships a
tool-calling setup and recommended sampling metadata (general.sampling: temp 1.0,
top_k 64, top_p 0.95), so nothing needs patching — the model is verified with NobodyWho's test
suite. E2B is the smallest, most on-device-friendly Gemma 4 variant — multimodal (text + image),
multilingual, and Apache 2.0 licensed.
Model Capabilities
- Text generation — instruction-following chat
- Tool calling — native function calling with grammar-constrained output
- Vision — image understanding via the companion
mmproj-BF16.ggufprojection model - Long context — 128k tokens
- Multilingual — 140+ languages
Available Quantizations
| File | Approach | Tool-calling tests |
|------|----------|--------------------|
| gemma-4-E2B-it-BF16.gguf | Sampling embedded upstream | 14/14 |
| gemma-4-E2B-it-Q8_0.gguf | Sampling embedded upstream | 14/14 |
| gemma-4-E2B-it-Q4_K_M.gguf | Sampling embedded upstream | 14/14 |
| mmproj-BF16.gguf | Vision projection (use with any of the above) | — |
> Verified with NobodyWho's tool-calling suite across BF16 / Q8_0 / Q4_K_M (14/14 each, June
> 2026); vision and multilingual verified per-model. Quant names follow the unsloth gemma-4-E2B-it-GGUF repo.
Quick Start
Using the NobodyWho library:
from nobodywho import Chat
chat = Chat("huggingface:NobodyWho/Google_Gemma4-E2B-GGUF/gemma-4-E2B-it-Q4_K_M.gguf")
response = chat.ask("What is the capital of Denmark?").completed()
print(response) # The capital of Denmark is Copenhagen.
Vision
from nobodywho import Model, Chat, Prompt, Image, Text
model = Model(
"huggingface:NobodyWho/Google_Gemma4-E2B-GGUF/gemma-4-E2B-it-Q4_K_M.gguf",
projection_model_path="huggingface:NobodyWho/Google_Gemma4-E2B-GGUF/mmproj-BF16.gguf",
)
chat = Chat(model=model, system_prompt="You are a helpful assistant.")
response = chat.ask(Prompt([
Text("What is in this image?"),
Image("./photo.png"),
])).completed()
print(response)
llama-cpp-python
from llama_cpp import Llama
llm = Llama.from_pretrained(
repo_id="NobodyWho/Google_Gemma4-E2B-GGUF",
filename="gemma-4-E2B-it-Q4_K_M.gguf",
)
Model Specifications
- Parameters: ~2.3B effective (E2B)
- Context length: 131,072 tokens
- License: Apache 2.0
- Base model: google/gemma-4-E2B-it
- Architecture: gemma4 (vision-capable)
Licensing / Credits
Licensed under Apache 2.0 (unchanged from upstream). All model credit belongs to Google
DeepMind. GGUF quantizations provided by unsloth.
Run NobodyWho/Google_Gemma4-E2B-GGUF with guIDE
Download guIDE — the AI-native code editor with local LLM inference and 69 built-in tools.
Source: Hugging Face · Compare models