What license applies to Dzluck/gemma4-e2b-claude-coder-GGUF?

License: apache-2.0. Verify terms on Hugging Face before commercial use.

How do I run Dzluck/gemma4-e2b-claude-coder-GGUF locally?

Download a GGUF file from this page and load it in guIDE or llama.cpp. Pipeline task: text-generation.

Model Intelligence Sheet

Dzluck/gemma4-e2b-claude-coder-GGUF overview

Gemma 4 Claude Coder — local model family A family of custom models built on Gemma 4 edge variants E2B and E4B , tuned to act as autonomous coding and administ…

ggufollamaclaude-codecodingagentfunction-callinggemmatext-generationenlicense:apache-2.0endpoints_compatibleregion:us

Runs locally from ~6.67 GB disk (8 GB VRAM class GPUs with llama.cpp / guIDE).

Downloads

Likes

Pipeline

text-generation

Author

Dzluck

Repository Files & Downloads

1 GGUF files detected

Direct downloads for local inference

File	Type	Quantization	Size	Link
gemma4-e2b-claude-coder.Q4_K_M.gguf	GGUF	GGUF	6.67 GB	Download

Model Details

Model ID	Dzluck/gemma4-e2b-claude-coder-GGUF
Author	Dzluck
Pipeline	text-generation
License	apache-2.0
Base model	google/gemma-3n-e4b
Last modified	2026-06-22T05:17:58.000Z

Model README

---

license: apache-2.0

base_model: google/gemma-3n-e4b

library_name: gguf

tags:

- gguf

- ollama

- claude-code

- coding

- agent

- function-calling

- gemma

language:

- en

pipeline_tag: text-generation

---

Gemma 4 Claude Coder — local model family

A family of custom models built on Gemma 4 (edge variants E2B and E4B), tuned to act as

autonomous coding and administration agents. The models speak the Anthropic-compatible API,

so they drive Claude Code fully locally — your code never leaves your machine and cloud token

cost drops to zero.

Each model ships with a system prompt focused on real work inside a codebase: use tools instead

of guessing, make minimal and precise code changes, return complete and runnable output, and

verify after acting. Sampling follows Google's official Gemma 4 recommendation

(temperature 1.0, top_k 64, top_p 0.95), with thinking mode enabled for better planning before

a tool call.

The idea

The whole point of this family is to run Claude Code on small, popular, consumer-grade hardware.

No datacenter GPU, no cloud bill — just an everyday Mac Mini (or similar 16 GB machine) acting as a

fully local, agentic coding assistant. These models make that practical: light enough to fit, smart

enough to drive real tool-calling agent loops.

In a time of RAM shortages and the big tech giants tightening usage limits and quotas, owning a

capable agent that runs entirely on your own modest hardware stops being a hobby and becomes

leverage: no rate limits, no surprise pricing, no dependency on someone else's quota.

Models in the family

|---|---|---|---|

| gemma4-e4b-claude-coder-admin | Gemma 4 E4B | 32K | Administration and system tasks (scripts, shell, devops). Smaller context fits 100% in GPU for higher, stable throughput. |

What it's for

Driving Claude Code locally (ollama launch claude --model <name>).
Agentic code writing and editing with native function calling / tool use.
Administration and devops tasks on a server (the admin variant).
Full privacy and offline operation — no code sent to the cloud.

Context

Coders (E2B / E4B): 64K tokens — matching Claude Code's recommendation (64K minimum).
Admin (E4B): 32K tokens — a deliberate trade-off for 16 GB hardware that keeps the model

entirely on the GPU.

Base Gemma 4 E2B/E4B natively supports up to 128K, so context can be raised on stronger hardware.

Test hardware

The models were built and tested on:

Mac Mini (Apple Silicon, M-series), 16 GB RAM, macOS 15.6
Ollama 0.24, GPU (Metal) inference

Measured performance (16 GB RAM)

|---|---|---|---|

| gemma4-e4b-claude-coder-admin (32K) | 100% GPU | ~30 tok/s (stable) | ✅ |

All three passed an end-to-end test through Claude Code: real turns with tool calls and correct

responses (HTTP 200 on /v1/messages).

How they were made

These models were designed, built and tested with the help of Claude Opus 4.8 — the best

coding model in the world. Their system prompts, parameter choices and context configuration draw

directly on its knowledge. In other words: the world's best coding model prepared local models

that take that work over right on your desk.

License

Apache 2.0 (inherited from the base Gemma 4).

Run Dzluck/gemma4-e2b-claude-coder-GGUF with guIDE

Download guIDE — the AI-native code editor with local LLM inference and 69 built-in tools.

Download guIDE → · Browse 524k+ models · Compare models

Source: Hugging Face · Compare models