Model Intelligence Sheet
tianrui6641/gemma-4-26b-a4b-gguf-mxfp4-moe overview
This repo contains a GGUF export of google/gemma-4-26B-A4B quantized to MXFP4_MOE, plus the matching mmproj GGUF for image input support.
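As a rough illustration of how the two files fit together, the sketch below assembles a `llama-server` command line that loads the main model with `-m` and the projector with `--mmproj`. The file names come from the repo's README; the local directory, the port, and the helper name `build_server_command` are assumptions made for this example, and nothing here confirms that a given llama.cpp build supports this model.

```python
from pathlib import Path

# File names as listed in the repo's README.
MODEL_FILE = "gemma-4-26B-A4B.MXFP4_MOE.gguf"
MMPROJ_FILE = "mmproj-gemma-4-26B-A4B.f16.gguf"


def build_server_command(model_dir, port=8080):
    """Return an argv list for llama-server (hypothetical helper).

    model_dir is an assumed local folder that already holds both GGUF files.
    """
    model_dir = Path(model_dir)
    return [
        "llama-server",
        "-m", str(model_dir / MODEL_FILE),          # main text model
        "--mmproj", str(model_dir / MMPROJ_FILE),   # projector for image input
        "--port", str(port),
    ]


if __name__ == "__main__":
    # Print the command rather than running it, since the files may not exist locally.
    print(" ".join(build_server_command("models")))
```

The list form can be passed to `subprocess.run` once both files are present in the chosen directory.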
Downloads: 5,471
Likes: 2
Pipeline: (not set)
Library: llama.cpp
Visibility: Public
Access: Open
Metadata Inspector
Normalized metadata (stored in metadata_json)
{
  "metadata": {},
  "card_data": {
    "base_model": "google/gemma-4-26B-A4B",
    "license": "apache-2.0",
    "library_name": "llama.cpp",
    "tags": [
      "gguf",
      "gemma4",
      "multimodal",
      "llama.cpp",
      "lmstudio",
      "mxfp4_moe"
    ],
    "frontmatter": {},
    "hero_image_url": "",
    "summary": "This repo contains a GGUF export of google/gemma-4-26B-A4B quantized to MXFP4_MOE, plus the matching mmproj GGUF for image input support.",
    "quick_links": [],
    "benchmark_table_html": "",
    "readme_markdown": "---\r\nbase_model: google/gemma-4-26B-A4B\r\nlicense: apache-2.0\r\nlibrary_name: llama.cpp\r\ntags:\r\n - gguf\r\n - gemma4\r\n - multimodal\r\n - llama.cpp\r\n - lmstudio\r\n - mxfp4_moe\r\n---\r\n\r\n# Gemma 4 26B A4B GGUF MXFP4_MOE\r\n\r\nThis repo contains a GGUF export of `google/gemma-4-26B-A4B` quantized to `MXFP4_MOE`, plus the matching `mmproj` GGUF for image input support.\r\n\r\n## Files\r\n\r\n- `gemma-4-26B-A4B.MXFP4_MOE.gguf`: main text model quantized to `MXFP4_MOE`.\r\n- `mmproj-gemma-4-26B-A4B.f16.gguf`: multimodal projector required for image input.\r\n\r\n## LM Studio note\r\n\r\n`google/gemma-4-26B-A4B` is a Gemma 4 Mixture-of-Experts model, so the correct GGUF quantization target for the text model is `MXFP4_MOE` rather than dense `MXFP4`.\r\n\r\nThe final GGUF now embeds a Gemma 4-compatible chat template that forces thinking on, even in runtimes that pass `enable_thinking=false` or do not expose a separate reasoning toggle.\r\n\r\nThis repo also ships `tokenizer_config.json` and `chat_template.jinja` sidecars with the Gemma 4 `response_schema` for the `<|channel>thought\\n...<channel|>` reasoning block, so frontends that look beyond GGUF metadata can both elicit and parse reasoning more reliably.\r\n\r\nThis is the base `google/gemma-4-26B-A4B` checkpoint, not a separate instruction-tuned `-it` variant. The reasoning-aware, tool-capable template is embedded so the runtime keeps both tool formatting and thinking support, but runtime compatibility still depends on the GGUF engine supporting Gemma 4 multimodal MoE models and `MXFP4_MOE`.",
    "related_quantizations": []
  },
  "tags": [
    "llama.cpp",
    "gguf",
    "gemma4",
    "multimodal",
    "lmstudio",
    "mxfp4_moe",
    "base_model:google/gemma-4-26B-A4B",
    "base_model:quantized:google/gemma-4-26B-A4B",
    "license:apache-2.0",
    "endpoints_compatible",
    "region:us",
    "conversational"
  ],
  "likes": 2,
  "downloads": 5471,
  "gated": false,
  "private": false,
  "last_modified": "2026-04-03T22:17:18.000Z",
  "created_at": "2026-04-03T05:13:17.000Z",
  "pipeline_tag": "",
  "library_name": "llama.cpp"
}
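The README stored in `readme_markdown` describes a reasoning block delimited as `<|channel>thought\n...<channel|>`. A frontend that receives raw model text might separate that block from the visible answer along the lines of the sketch below. The delimiters are taken from the README as written; the function name `split_reasoning` and the assumption that at most one such block appears per response are illustrative only, and real runtime output may differ.

```python
import re

# Delimiters as quoted in the README: "<|channel>thought\n ... <channel|>".
THOUGHT_RE = re.compile(r"<\|channel>thought\n(.*?)<channel\|>", re.DOTALL)


def split_reasoning(text):
    """Split raw output into (reasoning, answer).

    Hypothetical helper: assumes at most one reasoning block per response.
    Returns an empty reasoning string when no block is found.
    """
    match = THOUGHT_RE.search(text)
    if match is None:
        return "", text.strip()
    reasoning = match.group(1).strip()
    # Everything outside the matched block is treated as the visible answer.
    answer = (text[:match.start()] + text[match.end():]).strip()
    return reasoning, answer


if __name__ == "__main__":
    raw = "<|channel>thought\nadd 2 and 2<channel|>4"
    print(split_reasoning(raw))
```

A non-greedy match is used so that a stray closing delimiter later in the answer does not swallow the visible text.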
Source payload excerpt (from Hugging Face API)
{
  "_id": "69cf4c6d132f6caf60a80045",
  "id": "tianrui6641/gemma-4-26b-a4b-gguf-mxfp4-moe",
  "modelId": "tianrui6641/gemma-4-26b-a4b-gguf-mxfp4-moe",
  "sha": "475cf59efab0192697edd00e90912db3ed5b5f4d",
  "createdAt": "2026-04-03T05:13:17.000Z",
  "lastModified": "2026-04-03T22:17:18.000Z",
  "author": "tianrui6641",
  "downloads": 5471,
  "likes": 2,
  "gated": false,
  "private": false,
  "pipeline_tag": "",
  "library_name": "llama.cpp",
  "siblings_count": 6
}
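The normalized metadata and the source payload repeat several fields under different key styles (`created_at` versus `createdAt`, for example). A small consistency check such as the sketch below could flag drift between the two records. The key mapping is inferred from the two JSON objects on this page, and `find_mismatches` is a hypothetical helper, not part of any Hugging Face library.

```python
# Mapping from normalized-metadata keys to source-payload keys,
# inferred from the two JSON objects shown on this page.
FIELD_MAP = {
    "likes": "likes",
    "downloads": "downloads",
    "gated": "gated",
    "private": "private",
    "pipeline_tag": "pipeline_tag",
    "library_name": "library_name",
    "created_at": "createdAt",
    "last_modified": "lastModified",
}


def find_mismatches(normalized, payload):
    """Return {normalized_key: (normalized_value, payload_value)} for fields that differ."""
    mismatches = {}
    for norm_key, api_key in FIELD_MAP.items():
        left = normalized.get(norm_key)
        right = payload.get(api_key)
        if left != right:
            mismatches[norm_key] = (left, right)
    return mismatches


if __name__ == "__main__":
    normalized = {"likes": 2, "downloads": 5471, "created_at": "2026-04-03T05:13:17.000Z"}
    payload = {"likes": 2, "downloads": 5471, "createdAt": "2026-04-03T05:13:17.000Z"}
    print(find_mismatches(normalized, payload))
```

Fields missing from both records compare as equal (`None` on each side), so partial records like the ones in the usage example do not produce false positives.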