Model Intelligence Sheet

tianrui6641/gemma-4-26b-a4b-gguf-mxfp4-moe overview

This repo contains a GGUF export of google/gemma-4-26B-A4B quantized to MXFP4_MOE, plus the matching mmproj GGUF for image input support.

Tags: llama.cpp, gguf, gemma4, multimodal, lmstudio, mxfp4_moe, base_model:google/gemma-4-26B-A4B, base_model:quantized:google/gemma-4-26B-A4B, license:apache-2.0, endpoints_compatible, region:us, conversational

Downloads

5,471

Likes

2

Pipeline

—

Library

llama.cpp

Visibility

Public

Access

Open

Repository Files & Downloads

2 GGUF files detected (the repository lists 6 files in total)

Direct downloads for all repository files

File	Type	Quantization	Size
gemma-4-26B-A4B.MXFP4_MOE.gguf	GGUF	MXFP4_MOE	13.72 GB
mmproj-gemma-4-26B-A4B.f16.gguf	GGUF	F16	1.11 GB
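
The two files above can be fetched directly from the Hugging Face Hub. The following is a minimal sketch that only builds the download URLs, using the Hub's standard `/resolve/<revision>/<filename>` URL layout and pinning the revision to the HF SHA reported on this page. The helper name `resolve_url` is illustrative, not part of any library.

```python
from urllib.parse import quote

HF_BASE = "https://huggingface.co"
REPO_ID = "tianrui6641/gemma-4-26b-a4b-gguf-mxfp4-moe"
# Commit SHA as reported in the "HF SHA" field of this page.
REVISION = "475cf59efab0192697edd00e90912db3ed5b5f4d"
FILES = [
    "gemma-4-26B-A4B.MXFP4_MOE.gguf",
    "mmproj-gemma-4-26B-A4B.f16.gguf",
]


def resolve_url(repo_id: str, filename: str, revision: str = "main") -> str:
    """Build a direct-download URL for one file in a Hub repository."""
    # Percent-encode the revision and filename; "/" is kept in the
    # filename so files inside subfolders still resolve.
    return (
        f"{HF_BASE}/{repo_id}/resolve/"
        f"{quote(revision, safe='')}/{quote(filename)}"
    )


if __name__ == "__main__":
    for name in FILES:
        print(resolve_url(REPO_ID, name, REVISION))
```

Pinning to the commit SHA rather than `main` keeps the download reproducible if the repository is updated later.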

Model Details

Model Slug

tianrui6641/gemma-4-26b-a4b-gguf-mxfp4-moe

Author

tianrui6641

Pipeline Task

—

Library

llama.cpp

Created

2026-04-03

Last Modified

2026-04-03

Gated

No

Private

No

HF SHA

475cf59efab0192697edd00e90912db3ed5b5f4d

License

apache-2.0

Language

Unknown

Base Model

google/gemma-4-26B-A4B

Metadata Inspector

Normalized metadata (stored in metadata_json)

{
  "metadata": {},
  "card_data": {
    "base_model": "google/gemma-4-26B-A4B",
    "license": "apache-2.0",
    "library_name": "llama.cpp",
    "tags": [
      "gguf",
      "gemma4",
      "multimodal",
      "llama.cpp",
      "lmstudio",
      "mxfp4_moe"
    ],
    "frontmatter": {},
    "hero_image_url": "",
    "summary": "This repo contains a GGUF export of google/gemma-4-26B-A4B quantized to MXFP4_MOE, plus the matching mmproj GGUF for image input support.",
    "quick_links": [],
    "benchmark_table_html": "",
    "readme_markdown": "---\r\nbase_model: google/gemma-4-26B-A4B\r\nlicense: apache-2.0\r\nlibrary_name: llama.cpp\r\ntags:\r\n  - gguf\r\n  - gemma4\r\n  - multimodal\r\n  - llama.cpp\r\n  - lmstudio\r\n  - mxfp4_moe\r\n---\r\n\r\n# Gemma 4 26B A4B GGUF MXFP4_MOE\r\n\r\nThis repo contains a GGUF export of `google/gemma-4-26B-A4B` quantized to `MXFP4_MOE`, plus the matching `mmproj` GGUF for image input support.\r\n\r\n## Files\r\n\r\n- `gemma-4-26B-A4B.MXFP4_MOE.gguf`: main text model quantized to `MXFP4_MOE`.\r\n- `mmproj-gemma-4-26B-A4B.f16.gguf`: multimodal projector required for image input.\r\n\r\n## LM Studio note\r\n\r\n`google/gemma-4-26B-A4B` is a Gemma 4 Mixture-of-Experts model, so the correct GGUF quantization target for the text model is `MXFP4_MOE` rather than dense `MXFP4`.\r\n\r\nThe final GGUF now embeds a Gemma 4-compatible chat template that forces thinking on, even in runtimes that pass `enable_thinking=false` or do not expose a separate reasoning toggle.\r\n\r\nThis repo also ships `tokenizer_config.json` and `chat_template.jinja` sidecars with the Gemma 4 `response_schema` for the `<|channel>thought\\n...<channel|>` reasoning block, so frontends that look beyond GGUF metadata can both elicit and parse reasoning more reliably.\r\n\r\nThis is the base `google/gemma-4-26B-A4B` checkpoint, not a separate instruction-tuned `-it` variant. The reasoning-aware, tool-capable template is embedded so the runtime keeps both tool formatting and thinking support, but runtime compatibility still depends on the GGUF engine supporting Gemma 4 multimodal MoE models and `MXFP4_MOE`.",
    "related_quantizations": []
  },
  "tags": [
    "llama.cpp",
    "gguf",
    "gemma4",
    "multimodal",
    "lmstudio",
    "mxfp4_moe",
    "base_model:google/gemma-4-26B-A4B",
    "base_model:quantized:google/gemma-4-26B-A4B",
    "license:apache-2.0",
    "endpoints_compatible",
    "region:us",
    "conversational"
  ],
  "likes": 2,
  "downloads": 5471,
  "gated": false,
  "private": false,
  "last_modified": "2026-04-03T22:17:18.000Z",
  "created_at": "2026-04-03T05:13:17.000Z",
  "pipeline_tag": "",
  "library_name": "llama.cpp"
}
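
The README embedded in the metadata above states that the main GGUF and the `mmproj` GGUF are used together for image input. The sketch below shows how that pairing might be passed to llama.cpp's `llama-server` through its `-m`, `--mmproj`, and `--port` options. It assumes both files are in the current directory and that the installed llama.cpp build supports Gemma 4 and `MXFP4_MOE`, which this page does not confirm. The helper name `build_server_command` and the port value are illustrative.

```python
import shlex

MODEL_FILE = "gemma-4-26B-A4B.MXFP4_MOE.gguf"
MMPROJ_FILE = "mmproj-gemma-4-26B-A4B.f16.gguf"


def build_server_command(
    model_path: str,
    mmproj_path: str,
    binary: str = "llama-server",  # assumed to be on PATH
    port: int = 8080,              # illustrative default
) -> list[str]:
    """Return the argument list for a llama-server launch with image input."""
    return [
        binary,
        "-m", model_path,          # main text model
        "--mmproj", mmproj_path,   # multimodal projector for images
        "--port", str(port),
    ]


if __name__ == "__main__":
    cmd = build_server_command(MODEL_FILE, MMPROJ_FILE)
    # Print a shell-safe command line; pass `cmd` to subprocess.run to launch.
    print(shlex.join(cmd))
```

Returning a list rather than a single string avoids shell quoting problems when the paths contain spaces.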

Source payload excerpt (from Hugging Face API)

{
  "_id": "69cf4c6d132f6caf60a80045",
  "id": "tianrui6641/gemma-4-26b-a4b-gguf-mxfp4-moe",
  "modelId": "tianrui6641/gemma-4-26b-a4b-gguf-mxfp4-moe",
  "sha": "475cf59efab0192697edd00e90912db3ed5b5f4d",
  "createdAt": "2026-04-03T05:13:17.000Z",
  "lastModified": "2026-04-03T22:17:18.000Z",
  "author": "tianrui6641",
  "downloads": 5471,
  "likes": 2,
  "gated": false,
  "private": false,
  "pipeline_tag": "",
  "library_name": "llama.cpp",
  "siblings_count": 6
}
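
The normalized metadata and the source payload carry the same values under different key styles (`createdAt` vs `created_at`, `lastModified` vs `last_modified`). The following sketch shows one way such a mapping could be done. It is not the code that produced this page, and the names `KEY_MAP`, `PASSTHROUGH`, `normalize`, and `parse_timestamp` are hypothetical.

```python
from datetime import datetime, timezone

# Hypothetical mapping from API payload keys to normalized keys.
KEY_MAP = {"createdAt": "created_at", "lastModified": "last_modified"}
# Keys that appear under the same name in both blocks.
PASSTHROUGH = ("likes", "downloads", "gated", "private",
               "pipeline_tag", "library_name")


def normalize(payload: dict) -> dict:
    """Rename camelCase keys and copy shared keys; ignore everything else."""
    out = {new: payload[old] for old, new in KEY_MAP.items() if old in payload}
    out.update({k: payload[k] for k in PASSTHROUGH if k in payload})
    return out


def parse_timestamp(value: str) -> datetime:
    """Parse timestamps of the form 2026-04-03T22:17:18.000Z as UTC."""
    parsed = datetime.strptime(value, "%Y-%m-%dT%H:%M:%S.%fZ")
    return parsed.replace(tzinfo=timezone.utc)


if __name__ == "__main__":
    sample = {
        "createdAt": "2026-04-03T05:13:17.000Z",
        "lastModified": "2026-04-03T22:17:18.000Z",
        "downloads": 5471,
        "likes": 2,
        "gated": False,
        "private": False,
    }
    meta = normalize(sample)
    age = (parse_timestamp(meta["last_modified"])
           - parse_timestamp(meta["created_at"]))
    print(meta)
    print("time between creation and last modification:", age)
```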