GraySoft
Model Intelligence Sheet

bandtor/gemma-4-E4B-it-GGUF overview

Gemma 4 E4B Instruct, GGUF Q4_K_M: Q4_K_M quantization of the google/gemma-4-E4B-it model (https://huggingface.co/google/gemma-4-E4B-it), obtained from unsloth/gemma 4 …

Tags: gguf, ollama, gemma4, q4_k_m, moe, llama-cpp, multimodal, image-text-to-text, en, pt, multilingual, base_model:google/gemma-4-E4B-it, base_model:quantized:google/gemma-4-E4B-it, license:gemma, endpoints_compatible, region:us, imatrix, conversational

Runs locally from a ~4.64 GB file on disk; suited to 8 GB VRAM-class GPUs with llama.cpp or guIDE.

Downloads: 0
Likes: 0
Pipeline: image-text-to-text
Author: bandtor

Repository Files & Downloads

1 GGUF file detected
Direct downloads for local inference
| File | Type | Quantization | Size |
|---|---|---|---|
| gemma-4-E4B-it-Q4_K_M.gguf | GGUF | Q4_K_M | 4.64 GB |
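After downloading the file, a quick sanity check can catch a truncated transfer or a saved HTML error page. The sketch below relies on the fact that GGUF files begin with the four ASCII bytes `GGUF`; the helper name `is_gguf` is hypothetical, and the download command shown in the comment assumes the `huggingface-cli` tool from `huggingface_hub` is installed.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeed if the file starts with the 4-byte GGUF magic.
is_gguf() {
  [ "$(head -c 4 "$1" 2>/dev/null)" = "GGUF" ]
}

# Download step (needs network access; shown as a comment, not executed here):
#   huggingface-cli download bandtor/gemma-4-E4B-it-GGUF \
#     gemma-4-E4B-it-Q4_K_M.gguf --local-dir .

model="gemma-4-E4B-it-Q4_K_M.gguf"
if [ -f "$model" ]; then
  if is_gguf "$model"; then
    echo "$model looks like a GGUF file"
  else
    echo "$model does not start with the GGUF magic" >&2
  fi
fi
```

This only inspects the header; it does not validate the tensors or confirm that the download is complete.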

Model Details

Model ID: bandtor/gemma-4-E4B-it-GGUF
Author: bandtor
Pipeline: image-text-to-text
License: gemma
Base model: google/gemma-4-E4B-it
Last modified: 2026-06-08T03:06:37.000Z

Model README

---
base_model: google/gemma-4-E4B-it
license: gemma
tags:
- gguf
- ollama
- gemma4
- q4_k_m
- moe
- llama-cpp
- multimodal
- image-text-to-text
language:
- en
- pt
- multilingual
library_name: gguf
pipeline_tag: image-text-to-text
---

Gemma 4 E4B Instruct — GGUF Q4_K_M

Q4_K_M quantization of the google/gemma-4-E4B-it model, obtained from unsloth/gemma-4-E4B-it-GGUF.

| File | Size | Type |
|---|---|---|
| gemma-4-E4B-it-Q4_K_M.gguf | ~4.6 GB | Main model (sparse MoE) |
| Modelfile | n/a | Ready-to-use Ollama template |

Specifications

| Field | Value |
|---|---|
| Parameters (total) | 7.5B (sparse MoE) |
| Parameters (active) | ~4.5B per token |
| Architecture | gemma4 MoE |
| Maximum context | 128K tokens (131,072) |
| Quantization | Q4_K_M |
| GGUF size | ~4.6 GB |
| License | Gemma Terms of Use |
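The listed figures can be cross-checked with simple arithmetic. The sketch below estimates the average bits per weight from the listed file size (4.64 GB) and total parameter count (7.5B), and confirms that 128K tokens equals 131,072. It assumes GB means 10^9 bytes and ignores that the file also stores metadata, so the result is only a rough upper estimate; `bits_per_weight` is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# Hypothetical helper: average bits per weight.
# Arguments: file size in GB (10^9 bytes), parameter count in billions.
bits_per_weight() {
  awk -v size_gb="$1" -v params_b="$2" \
    'BEGIN { printf "%.2f\n", size_gb * 8 / params_b }'
}

bits_per_weight 4.64 7.5   # listed file size and total parameter count
echo $(( 128 * 1024 ))     # 128K context expressed in tokens
```

A result a little under 5 bits per weight is in the range commonly seen for Q4_K_M files, which mix 4-bit blocks with some higher-precision tensors.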

Usage with Ollama

# Directly from the Hugging Face Hub (Ollama >= 0.3)
ollama run hf.co/bandtor/gemma-4-E4B-it-GGUF:Q4_K_M

# With a canonical local name
ollama cp "hf.co/bandtor/gemma-4-E4B-it-GGUF:Q4_K_M" "bandtor/gemma-4-e4b-it:q4_k_m"
ollama run bandtor/gemma-4-e4b-it:q4_k_m
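Once the model has been pulled, it can also be queried over Ollama's local HTTP API instead of the interactive prompt. The following is a minimal sketch, assuming the server listens on the default `localhost:11434`; `generate_payload` is a hypothetical helper, and the prompt is inserted without JSON escaping, so it must not contain double quotes or backslashes.

```shell
#!/usr/bin/env bash
MODEL="hf.co/bandtor/gemma-4-E4B-it-GGUF:Q4_K_M"

# Hypothetical helper: build the JSON body for Ollama's /api/generate endpoint.
# No escaping is performed, so keep quotes and backslashes out of the prompt.
generate_payload() {
  printf '{"model": "%s", "prompt": "%s", "stream": false}' "$1" "$2"
}

# Send the request only if a local Ollama server answers on the default port.
if curl -s -o /dev/null --max-time 2 http://localhost:11434/; then
  curl -s http://localhost:11434/api/generate \
    -d "$(generate_payload "$MODEL" "Say hello in Portuguese.")"
fi
```

Setting `"stream": false` returns a single JSON object rather than a stream of partial responses, which is easier to handle in scripts.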

Prerequisite: Q4 KV cache for 128K context

launchctl setenv OLLAMA_KV_CACHE_TYPE q4_0
# Make it persistent with a launchd plist: com.bandtor.ollama-env.plist
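To see why a quantized KV cache matters at this context length, the cache size can be estimated from the model dimensions. The sketch below assumes full attention in every layer, which overestimates models that use sliding-window layers. The layer count, KV head count, and head dimension in the usage lines are hypothetical placeholders, not values read from this model. The per-block sizes follow the ggml layouts: `q4_0` stores 32 elements in 18 bytes, `q8_0` in 34 bytes, and `f16` in 64 bytes.

```shell
#!/usr/bin/env bash
# Hypothetical helper: estimate KV cache size in bytes.
# Usage: kv_cache_bytes <n_layers> <n_kv_heads> <head_dim> <ctx_tokens> <f16|q8_0|q4_0>
kv_cache_bytes() {
  local layers=$1 kv_heads=$2 head_dim=$3 ctx=$4 type=$5
  local block_bytes
  # Bytes used per block of 32 elements for each cache type.
  case "$type" in
    f16)  block_bytes=64 ;;  # 32 x 2 bytes
    q8_0) block_bytes=34 ;;  # 32 x 1 byte + 2-byte scale
    q4_0) block_bytes=18 ;;  # 32 x 4 bits + 2-byte scale
    *) echo "unknown cache type: $type" >&2; return 1 ;;
  esac
  # The factor 2 accounts for the separate K and V tensors.
  echo $(( 2 * layers * kv_heads * head_dim * ctx * block_bytes / 32 ))
}

# Placeholder dimensions (NOT taken from this model): 32 layers, 8 KV heads, head_dim 128.
kv_cache_bytes 32 8 128 131072 f16
kv_cache_bytes 32 8 128 131072 q4_0
```

With these placeholder dimensions the `q4_0` cache is 18/64 of the `f16` size, roughly a 3.6x reduction. Ollama's documentation describes K/V cache quantization as taking effect when Flash Attention is enabled, so `OLLAMA_FLASH_ATTENTION=1` may also need to be set.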

Usage with llama.cpp

llama-cli -m gemma-4-E4B-it-Q4_K_M.gguf \
  --ctx-size 131072 \
  -fa 1 \
  -ngl 99 \
  --cache-type-k q4_0 \
  --cache-type-v q4_0 \
  -i



Source: Hugging Face