GraySoft
Model Intelligence Sheet

prithivmlmods/gelato-30b-a3b-f32-aio-gguf overview

Gelato-30B-A3B is a 30B-parameter Qwen3-VL MoE–based grounding model specialized for GUI computer-use tasks, trained on the Click-100k dataset to map natural language instructions and screen images to precise click coordinates on user interfaces. It achieves state-of-the-art accuracy on key grounding benchmarks, reaching 63.88% on ScreenSpot-Pro and 69.15% / 74.65% on OS-World-G / OS-World-G (Refined), outperforming prior dedicated computer grounding models such as GTA1-32B and even larger general-purpose VLMs like Qwen3-VL-235B-A22B-Instruct. The model is released with an open codebase and examples showing how to feed a GUI screenshot plus an instruction and obtain normalized (x, y) coordinates, making it a strong drop-in component for building computer-use agents that can reliably locate UI elements and interact with real software environments.
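
As a rough illustration of the last step in that pipeline, the sketch below converts a normalized click point into pixel coordinates for a given screenshot size. The function name `to_pixel_coords` and the `scale` parameter are hypothetical. This sheet does not state the normalization range, so the default of 1.0 (coordinates in [0, 1]) is an assumption that should be checked against the upstream model card.

```python
def to_pixel_coords(x_norm, y_norm, width, height, scale=1.0):
    """Map a normalized (x, y) point onto a width x height screenshot.

    `scale` is the upper bound of the normalized range. It is an assumption:
    pass 1.0 for outputs in [0, 1] or 1000.0 for outputs in [0, 1000].
    """
    if not (0 <= x_norm <= scale and 0 <= y_norm <= scale):
        raise ValueError("point lies outside the assumed normalized range")
    # Clamp to the last valid pixel index so a point on the far edge stays on screen.
    x_px = min(int(x_norm / scale * width), width - 1)
    y_px = min(int(y_norm / scale * height), height - 1)
    return x_px, y_px


if __name__ == "__main__":
    # Hypothetical model output for a 1920x1080 screenshot.
    print(to_pixel_coords(0.5, 0.25, 1920, 1080))
```

Keeping the range as a parameter means the same helper can be reused if the model turns out to emit a different convention.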

Tags: transformers, gguf, qwen3_vl_moe, llama.cpp, text-generation-inference, agent, image-text-to-text, en, base_model:mlfoundations/Gelato-30B-A3B, base_model:quantized:mlfoundations/Gelato-30B-A3B, license:apache-2.0, endpoints_compatible, region:us, conversational
Downloads
93
Likes
2
Pipeline
image-text-to-text
Library
transformers
Visibility
Public
Access
Open

Repository Files & Downloads

42 files detected
Direct downloads for all repository files
File Type Quantization Size Link
Gelato-30B-A3B-BF16.gguf GGUF BF16 56.90 GB Download
Gelato-30B-A3B-F16.gguf GGUF F16 56.90 GB Download
Gelato-30B-A3B-F32.gguf GGUF F32 113.75 GB Download
Gelato-30B-A3B-mmproj-bf16.gguf GGUF BF16 1.01 GB Download
Gelato-30B-A3B-mmproj-f16.gguf GGUF F16 1.01 GB Download
Gelato-30B-A3B-mmproj-f32.gguf GGUF F32 2.01 GB Download
Gelato-30B-A3B-mmproj-q8_0.gguf GGUF Q8_0 679.16 MB Download
Gelato-30B-A3B.IQ4_XS.gguf GGUF IQ4_XS 15.42 GB Download
Gelato-30B-A3B.Q2_K.gguf GGUF Q2_K 10.49 GB Download
Gelato-30B-A3B.Q3_K_L.gguf GGUF Q3_K_L 14.81 GB Download
Gelato-30B-A3B.Q3_K_M.gguf GGUF Q3_K_M 13.70 GB Download
Gelato-30B-A3B.Q3_K_S.gguf GGUF Q3_K_S 12.38 GB Download
Gelato-30B-A3B.Q4_K_M.gguf GGUF Q4_K_M 17.28 GB Download
Gelato-30B-A3B.Q4_K_S.gguf GGUF Q4_K_S 16.26 GB Download
Gelato-30B-A3B.Q5_K_M.gguf GGUF Q5_K_M 20.23 GB Download
Gelato-30B-A3B.Q5_K_S.gguf GGUF Q5_K_S 19.63 GB Download
Gelato-30B-A3B.Q6_K.gguf GGUF Q6_K 23.37 GB Download
Gelato-30B-A3B.Q8_0.gguf GGUF Q8_0 30.25 GB Download
Gelato-30B-A3B.i1-IQ1_M.gguf GGUF IQ1_M 6.59 GB Download
Gelato-30B-A3B.i1-IQ1_S.gguf GGUF IQ1_S 5.98 GB Download
Gelato-30B-A3B.i1-IQ2_M.gguf GGUF IQ2_M 9.47 GB Download
Gelato-30B-A3B.i1-IQ2_S.gguf GGUF IQ2_S 8.65 GB Download
Gelato-30B-A3B.i1-IQ2_XS.gguf GGUF IQ2_XS 8.45 GB Download
Gelato-30B-A3B.i1-IQ2_XXS.gguf GGUF IQ2_XXS 7.62 GB Download
Gelato-30B-A3B.i1-IQ3_M.gguf GGUF IQ3_M 12.59 GB Download
Gelato-30B-A3B.i1-IQ3_S.gguf GGUF IQ3_S 12.39 GB Download
Gelato-30B-A3B.i1-IQ3_XS.gguf GGUF IQ3_XS 11.73 GB Download
Gelato-30B-A3B.i1-IQ3_XXS.gguf GGUF IQ3_XXS 11.04 GB Download
Gelato-30B-A3B.i1-IQ4_XS.gguf GGUF IQ4_XS 15.24 GB Download
Gelato-30B-A3B.i1-Q2_K.gguf GGUF Q2_K 10.49 GB Download
Gelato-30B-A3B.i1-Q2_K_S.gguf GGUF Q2_K_S 9.80 GB Download
Gelato-30B-A3B.i1-Q3_K_L.gguf GGUF Q3_K_L 14.81 GB Download
Gelato-30B-A3B.i1-Q3_K_M.gguf GGUF Q3_K_M 13.70 GB Download
Gelato-30B-A3B.i1-Q3_K_S.gguf GGUF Q3_K_S 12.38 GB Download
Gelato-30B-A3B.i1-Q4_0.gguf GGUF Q4_0 16.19 GB Download
Gelato-30B-A3B.i1-Q4_1.gguf GGUF Q4_1 17.87 GB Download
Gelato-30B-A3B.i1-Q4_K_M.gguf GGUF Q4_K_M 17.28 GB Download
Gelato-30B-A3B.i1-Q4_K_S.gguf GGUF Q4_K_S 16.26 GB Download
Gelato-30B-A3B.i1-Q5_K_M.gguf GGUF Q5_K_M 20.23 GB Download
Gelato-30B-A3B.i1-Q5_K_S.gguf GGUF Q5_K_S 19.63 GB Download
Gelato-30B-A3B.i1-Q6_K.gguf GGUF Q6_K 23.37 GB Download
Gelato-30B-A3B.imatrix.gguf GGUF 116.38 MB Download
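
The sizes in this listing are smaller than those in the repository's README table reproduced in the metadata further down (for example, 56.90 GB here versus 61.1 GB there for the BF16 file). The figures agree if this listing is read as binary gibibytes and the README as decimal gigabytes. That reading is an inference from the numbers, not something the page states. A minimal sketch of the conversion, with `gib_to_gb` as a hypothetical helper name:

```python
GIB = 1024 ** 3  # bytes per gibibyte (binary unit)
GB = 1000 ** 3   # bytes per gigabyte (decimal unit)


def gib_to_gb(size_gib):
    """Convert a size expressed in GiB to decimal GB."""
    return size_gib * GIB / GB


if __name__ == "__main__":
    # Sizes copied from the listing above.
    for name, size_gib in [("BF16", 56.90), ("F32", 113.75), ("Q4_K_M", 17.28)]:
        print(f"{name}: {size_gib} GiB = {gib_to_gb(size_gib):.1f} GB")
```

When budgeting RAM or VRAM, the binary figure is usually the one to compare against what the operating system reports.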

Model Details

Model Slug
prithivmlmods/gelato-30b-a3b-f32-aio-gguf
Author
prithivMLmods
Pipeline Task
image-text-to-text
Library
transformers
Created
2025-11-16
Last Modified
2025-11-16
Gated
No
Private
No
HF SHA
736357bb1aa41995e60297ef3dde8dabaf8643cd
License
apache-2.0
Language
en
Base Model
mlfoundations/Gelato-30B-A3B

Metadata Inspector

Normalized metadata (stored in metadata_json)
{
  "metadata": {},
  "card_data": {
    "license": "apache-2.0",
    "language": [
      "en"
    ],
    "base_model": [
      "mlfoundations/Gelato-30B-A3B"
    ],
    "pipeline_tag": "image-text-to-text",
    "library_name": "transformers",
    "tags": [
      "llama.cpp",
      "text-generation-inference",
      "agent"
    ],
    "frontmatter": {
      "license": "apache-2.0",
      "language": [
        "en"
      ],
      "base_model": [
        "mlfoundations/Gelato-30B-A3B"
      ],
      "pipeline_tag": "image-text-to-text",
      "library_name": "transformers",
      "tags": [
        "llama.cpp",
        "text-generation-inference",
        "agent"
      ]
    },
    "hero_image_url": "https://www.nethype.de/huggingface_embed/quantpplgraph.png",
    "summary": "> Gelato-30B-A3B is a 30B-parameter Qwen3-VL MoE–based grounding model specialized for GUI computer-use tasks, trained on the Click-100k dataset to map natural language instructions and screen images to precise click coordinates on user interfaces. It achieves state-of-the-art accuracy on key grounding benchmarks, reaching about 63.88% on ScreenSpot-Pro and 69.15%/74.65% on OS-World-G / OS-World-G (Refined), outperforming prior dedicated computer grounding models such as GTA1-32B and even larger general-purpose VLMs like Qwen3-VL-235B-A22B-Instruct. The model is released with an open codebase and examples showing how to feed a GUI screenshot plus an instruction and obtain normalized (x,y) coordinates, making it a strong drop-in component for building computer-use agents that can reliably locate UI elements and interact with real software environments.",
    "quick_links": [],
    "benchmark_table_html": "",
    "readme_markdown": "---\nlicense: apache-2.0\nlanguage:\n- en\nbase_model:\n- mlfoundations/Gelato-30B-A3B\npipeline_tag: image-text-to-text\nlibrary_name: transformers\ntags:\n- llama.cpp\n- text-generation-inference\n- agent\n---\n\n# **Gelato-30B-A3B-f32-AIO-GGUF**\n\n> [Gelato-30B-A3B](https://huggingface.co/mlfoundations/Gelato-30B-A3B) is a 30B-parameter Qwen3-VL MoE–based grounding model specialized for GUI computer-use tasks, trained on the Click-100k dataset to map natural language instructions and screen images to precise click coordinates on user interfaces. It achieves state-of-the-art accuracy on key grounding benchmarks, reaching about 63.88% on ScreenSpot-Pro and 69.15%/74.65% on OS-World-G / OS-World-G (Refined), outperforming prior dedicated computer grounding models such as GTA1-32B and even larger general-purpose VLMs like Qwen3-VL-235B-A22B-Instruct. The model is released with an open codebase and examples showing how to feed a GUI screenshot plus an instruction and obtain normalized (x,y) coordinates, making it a strong drop-in component for building computer-use agents that can reliably locate UI elements and interact with real software environments.\n\n## Model Files\n\n| File Name | Quant Type | File Size |\n| - | - | - |\n| Gelato-30B-A3B-BF16.gguf | BF16 | 61.1 GB |\n| Gelato-30B-A3B-F16.gguf | F16 | 61.1 GB |\n| Gelato-30B-A3B-F32.gguf | F32 | 122 GB |\n| Gelato-30B-A3B.IQ4_XS.gguf | IQ4_XS | 16.6 GB |\n| Gelato-30B-A3B.Q2_K.gguf | Q2_K | 11.3 GB |\n| Gelato-30B-A3B.Q3_K_L.gguf | Q3_K_L | 15.9 GB |\n| Gelato-30B-A3B.Q3_K_M.gguf | Q3_K_M | 14.7 GB |\n| Gelato-30B-A3B.Q3_K_S.gguf | Q3_K_S | 13.3 GB |\n| Gelato-30B-A3B.Q4_K_M.gguf | Q4_K_M | 18.6 GB |\n| Gelato-30B-A3B.Q4_K_S.gguf | Q4_K_S | 17.5 GB |\n| Gelato-30B-A3B.Q5_K_M.gguf | Q5_K_M | 21.7 GB |\n| Gelato-30B-A3B.Q5_K_S.gguf | Q5_K_S | 21.1 GB |\n| Gelato-30B-A3B.Q6_K.gguf | Q6_K | 25.1 GB |\n| Gelato-30B-A3B.Q8_0.gguf | Q8_0 | 32.5 GB |\n| Gelato-30B-A3B.i1-IQ1_M.gguf | i1-IQ1_M | 7.08 GB |\n| Gelato-30B-A3B.i1-IQ1_S.gguf | i1-IQ1_S | 6.42 GB |\n| Gelato-30B-A3B.i1-IQ2_M.gguf | i1-IQ2_M | 10.2 GB |\n| Gelato-30B-A3B.i1-IQ2_S.gguf | i1-IQ2_S | 9.29 GB |\n| Gelato-30B-A3B.i1-IQ2_XS.gguf | i1-IQ2_XS | 9.08 GB |\n| Gelato-30B-A3B.i1-IQ2_XXS.gguf | i1-IQ2_XXS | 8.18 GB |\n| Gelato-30B-A3B.i1-IQ3_M.gguf | i1-IQ3_M | 13.5 GB |\n| Gelato-30B-A3B.i1-IQ3_S.gguf | i1-IQ3_S | 13.3 GB |\n| Gelato-30B-A3B.i1-IQ3_XS.gguf | i1-IQ3_XS | 12.6 GB |\n| Gelato-30B-A3B.i1-IQ3_XXS.gguf | i1-IQ3_XXS | 11.8 GB |\n| Gelato-30B-A3B.i1-IQ4_XS.gguf | i1-IQ4_XS | 16.4 GB |\n| Gelato-30B-A3B.i1-Q2_K.gguf | i1-Q2_K | 11.3 GB |\n| Gelato-30B-A3B.i1-Q2_K_S.gguf | i1-Q2_K_S | 10.5 GB |\n| Gelato-30B-A3B.i1-Q3_K_L.gguf | i1-Q3_K_L | 15.9 GB |\n| Gelato-30B-A3B.i1-Q3_K_M.gguf | i1-Q3_K_M | 14.7 GB |\n| Gelato-30B-A3B.i1-Q3_K_S.gguf | i1-Q3_K_S | 13.3 GB |\n| Gelato-30B-A3B.i1-Q4_0.gguf | i1-Q4_0 | 17.4 GB |\n| Gelato-30B-A3B.i1-Q4_1.gguf | i1-Q4_1 | 19.2 GB |\n| Gelato-30B-A3B.i1-Q4_K_M.gguf | i1-Q4_K_M | 18.6 GB |\n| Gelato-30B-A3B.i1-Q4_K_S.gguf | i1-Q4_K_S | 17.5 GB |\n| Gelato-30B-A3B.i1-Q5_K_M.gguf | i1-Q5_K_M | 21.7 GB |\n| Gelato-30B-A3B.i1-Q5_K_S.gguf | i1-Q5_K_S | 21.1 GB |\n| Gelato-30B-A3B.i1-Q6_K.gguf | i1-Q6_K | 25.1 GB |\n| Gelato-30B-A3B-mmproj-bf16.gguf | mmproj-bf16 | 1.09 GB |\n| Gelato-30B-A3B-mmproj-f16.gguf | mmproj-f16 | 1.08 GB |\n| Gelato-30B-A3B-mmproj-f32.gguf | mmproj-f32 | 2.15 GB |\n| Gelato-30B-A3B-mmproj-q8_0.gguf | mmproj-q8_0 | 712 MB |\n| Gelato-30B-A3B.imatrix.gguf | imatrix | 122 MB |\n\n## Quants Usage \n\n(sorted by size, not necessarily quality. IQ-quants are often preferable over similar sized non-IQ quants)\n\nHere is a handy graph by ikawrakow comparing some lower-quality quant\ntypes (lower is better):\n\n![image.png](https://www.nethype.de/huggingface_embed/quantpplgraph.png)",
    "related_quantizations": []
  },
  "tags": [
    "transformers",
    "gguf",
    "qwen3_vl_moe",
    "llama.cpp",
    "text-generation-inference",
    "agent",
    "image-text-to-text",
    "en",
    "base_model:mlfoundations/Gelato-30B-A3B",
    "base_model:quantized:mlfoundations/Gelato-30B-A3B",
    "license:apache-2.0",
    "endpoints_compatible",
    "region:us",
    "conversational"
  ],
  "likes": 2,
  "downloads": 93,
  "gated": false,
  "private": false,
  "last_modified": "2025-11-16T10:26:29.000Z",
  "created_at": "2025-11-16T08:53:51.000Z",
  "pipeline_tag": "image-text-to-text",
  "library_name": "transformers"
}
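
Several entries in the top-level `tags` array above are namespaced as `key:value` (for example `license:apache-2.0`), while others are bare labels. A minimal sketch of separating the two, using a hypothetical helper name `split_tags` and a shortened copy of the tag list:

```python
from collections import defaultdict


def split_tags(tags):
    """Split tags into bare labels and a dict of namespace -> list of values."""
    plain = []
    namespaced = defaultdict(list)
    for tag in tags:
        # Split on the first colon only, so "base_model:quantized:<repo>"
        # keeps "quantized:<repo>" as its value.
        key, sep, value = tag.partition(":")
        if sep:
            namespaced[key].append(value)
        else:
            plain.append(tag)
    return plain, dict(namespaced)


if __name__ == "__main__":
    sample = [
        "gguf",
        "agent",
        "base_model:mlfoundations/Gelato-30B-A3B",
        "base_model:quantized:mlfoundations/Gelato-30B-A3B",
        "license:apache-2.0",
    ]
    plain, namespaced = split_tags(sample)
    print(plain)
    print(namespaced)
```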
Source payload excerpt (from Hugging Face API)
{
  "_id": "6919911f118b0217fc1b0cec",
  "id": "prithivMLmods/Gelato-30B-A3B-f32-AIO-GGUF",
  "modelId": "prithivMLmods/Gelato-30B-A3B-f32-AIO-GGUF",
  "sha": "736357bb1aa41995e60297ef3dde8dabaf8643cd",
  "createdAt": "2025-11-16T08:53:51.000Z",
  "lastModified": "2025-11-16T10:26:29.000Z",
  "author": "prithivMLmods",
  "downloads": 93,
  "likes": 2,
  "gated": false,
  "private": false,
  "pipeline_tag": "image-text-to-text",
  "library_name": "transformers",
  "siblings_count": 45
}
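
Combining the `id` above with a file name from the repository listing gives a direct download address. The sketch below assumes the Hugging Face Hub's usual `resolve` URL layout and the `main` revision. Neither appears in this sheet, and `download_url` is a hypothetical helper.

```python
from urllib.parse import quote


def download_url(repo_id, filename, revision="main"):
    """Build a direct-download URL for one file in a Hub model repository.

    Assumes the https://huggingface.co/<repo>/resolve/<revision>/<file> layout.
    """
    # quote() leaves "/" intact by default, so "owner/name" stays a two-part path.
    return (
        "https://huggingface.co/"
        f"{quote(repo_id)}/resolve/{quote(revision, safe='')}/{quote(filename)}"
    )


if __name__ == "__main__":
    print(download_url(
        "prithivMLmods/Gelato-30B-A3B-f32-AIO-GGUF",
        "Gelato-30B-A3B.Q4_K_M.gguf",
    ))
```

For vision input, a language-model file and one of the `mmproj` files would normally be fetched together.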