makisekurisu-jp/comfyui-qwen3-vl-gguf is indexed on GraySoft with repository links, GGUF quant files (IQ4_XS model weights and F16 mmproj files), and Hugging Face metadata. This page helps you pick a local model for guIDE or other runtimes. See related models in the same shard below.

Model Intelligence Sheet

makisekurisu-jp/comfyui-qwen3-vl-gguf overview

https://github.com/KLL535/ComfyUI_Simple_Qwen3-VL-gguf

If you compile llama-cpp-python yourself (the build commands are in the README text under Metadata Inspector), do not delete the llama-cpp-python folder after compilation finishes: the build uses an editable install (pip install -e .), which loads the package from that folder.

---

Manual compilation is no longer necessary. You can use JamePeng’s precompiled WHL package, as long as the CUDA version matches exactly.

https://github.com/1038lab/ComfyUI-QwenVL
https://github.com/JamePeng/llama-cpp-python
https://developer.nvidia.com/cuda-toolkit-archive

---

Gemma 4 requires llama-cpp-python ≥ 0.3.35.

Model presets are defined in ComfyUI\custom_nodes\ComfyUI_Simple_Qwen3-VL-gguf\system_prompts_user.json
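The version requirement in the overview (Gemma 4 needs llama-cpp-python ≥ 0.3.35) can be checked before loading a model. The following is a minimal sketch, not part of the repository: the helper names are hypothetical, and it assumes the installed wheel reports a dotted numeric version, possibly followed by a build suffix, which is ignored.

```python
import re
from importlib import metadata
from typing import Optional, Tuple

MIN_FOR_GEMMA4 = "0.3.35"  # minimum stated in the overview


def parse_version(text: str) -> Tuple[int, ...]:
    """Return the leading dotted-integer part of a version string as a tuple."""
    match = re.match(r"\d+(?:\.\d+)*", text.strip())
    if match is None:
        raise ValueError(f"no numeric version found in {text!r}")
    return tuple(int(part) for part in match.group(0).split("."))


def meets_minimum(installed: str, required: str = MIN_FOR_GEMMA4) -> bool:
    """Compare two versions numerically, padding the shorter one with zeros."""
    a, b = parse_version(installed), parse_version(required)
    width = max(len(a), len(b))
    a += (0,) * (width - len(a))
    b += (0,) * (width - len(b))
    return a >= b


def installed_version(dist_name: str = "llama-cpp-python") -> Optional[str]:
    """Return the installed distribution version, or None if it is absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    found = installed_version()
    if found is None:
        print("llama-cpp-python is not installed in this environment")
    else:
        status = "OK" if meets_minimum(found) else "too old"
        print(f"llama-cpp-python {found}: {status} for Gemma 4")
```

Run it with the same Python interpreter that ComfyUI uses (for example the venv interpreter named in the README), since the check only sees packages in the current environment. It does not verify that the wheel's CUDA build matches the installed CUDA toolkit.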

Tags: gguf, endpoints_compatible, region:us, conversational

Downloads

6,551

Likes

1

Pipeline

—

Library

—

Visibility

Public

Access

Open

Repository Files & Downloads

6 files detected

Direct downloads for all repository files

File	Type	Quantization	Size	Link
Qwen3-VL-8B-Instruct-IQ4_XS.gguf	GGUF	IQ4_XS	4.27 GB	Download
Qwen3-VL-8B-Instruct-mmproj-F16.gguf	GGUF	F16	1.08 GB	Download
Qwen3.5-9B-IQ4_XS.gguf	GGUF	IQ4_XS	4.81 GB	Download
Qwen3.5-9B-mmproj-F16.gguf	GGUF	F16	875.63 MB	Download
llama-joycaption-beta-one-hf-llava-IQ4_XS.gguf	GGUF	IQ4_XS	4.18 GB	Download
llama-joycaption-beta-one-hf-llava-mmproj-F16.gguf	GGUF	F16	837.11 MB	Download
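Each preset in the repository README names both a model_path and an mmproj_path, so every IQ4_XS weight file in the table is meant to be used together with its mmproj (multimodal projector) file. The sketch below pairs the listed files by name prefix. The function name and the matching rule are illustrative assumptions, not something the repository defines.

```python
from typing import Dict, List, Optional

# File names copied from the table above.
FILES: List[str] = [
    "Qwen3-VL-8B-Instruct-IQ4_XS.gguf",
    "Qwen3-VL-8B-Instruct-mmproj-F16.gguf",
    "Qwen3.5-9B-IQ4_XS.gguf",
    "Qwen3.5-9B-mmproj-F16.gguf",
    "llama-joycaption-beta-one-hf-llava-IQ4_XS.gguf",
    "llama-joycaption-beta-one-hf-llava-mmproj-F16.gguf",
]

MMPROJ_MARKER = "-mmproj-"  # assumed naming convention, inferred from the listing


def pair_weights_with_projectors(files: List[str]) -> Dict[str, Optional[str]]:
    """Map each weight file to the projector file that shares its name prefix."""
    projectors: Dict[str, str] = {}
    weights: List[str] = []
    for name in files:
        if MMPROJ_MARKER in name:
            # Everything before "-mmproj-" identifies the model family.
            projectors[name.split(MMPROJ_MARKER, 1)[0]] = name
        else:
            weights.append(name)

    pairs: Dict[str, Optional[str]] = {}
    for name in weights:
        # Prefer the longest matching prefix in case one base name contains another.
        matches = [base for base in projectors if name.startswith(base + "-")]
        pairs[name] = projectors[max(matches, key=len)] if matches else None
    return pairs


pairs = pair_weights_with_projectors(FILES)
for weight, projector in pairs.items():
    print(f"{weight}  ->  {projector}")
```

Note that the presets in the README point at mmproj files with a BF16 suffix, while this listing shows F16 projector files, so the mmproj_path values would need adjusting to the file names actually downloaded.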

Model Details

Model Slug

makisekurisu-jp/comfyui-qwen3-vl-gguf

Author

makisekurisu-jp

Pipeline Task

—

Library

—

Created

2025-12-09

Last Modified

2026-04-11

Gated

No

Private

No

HF SHA

388c9f989bc6fccf6bb459201792373001642a4e

License

Unknown

Language

Unknown

Base Model

Unknown

Metadata Inspector

Normalized metadata (stored in metadata_json)

{
  "metadata": {},
  "card_data": {
    "frontmatter": {},
    "hero_image_url": "",
    "summary": "https://github.com/KLL535/ComfyUI_Simple_Qwen3-VL-gguf `` git clone https://github.com/JamePeng/llama-cpp-python cd llama-cpp-python git clone https://github.com/ggml-org/llama.cpp ./vendor/llama.cpp $env:CMAKE_ARGS = \"-DGGML_CUDA=on\" D:\\ComfyUI\\venv\\Scripts\\python -m pip install -e . --verbose ` **After compilation is finished, the llama-cpp-python folder must not be deleted.** --- **There is no need to compile it manually anymore. You can use JamePeng’s precompiled WHL package, as long as the CUDA version matches exactly.** https://github.com/1038lab/ComfyUI-QwenVL https://github.com/JamePeng/llama-cpp-python https://developer.nvidia.com/cuda-toolkit-archive --- **Gemma 4 requires llama‑cpp‑python ≥ 0.3.35** *ComfyUI\\custom_nodes\\ComfyUI_Simple_Qwen3-VL-gguf\\system_prompts_user.json* ` { \"_system_prompts\": { }, \"_user_prompt_styles\": { }, \"_camera_preset\": { }, \"_model_presets\": { \"gemma-4-E4B-it-IQ4_XS\": { \"model_path\": \"D:\\\\ComfyUI\\\\models\\\\LLM\\\\gemma-4-E4B-it-IQ4_XS.gguf\", \"mmproj_path\": \"D:\\\\ComfyUI\\\\models\\\\LLM\\\\gemma-4-E4B-it-mmproj-BF16.gguf\", \"output_max_tokens\": 2048, \"ctx\": 8192, \"n_batch\": 2048, \"n_ubatch\": 2048, \"gpu_layers\": -1, \"temperature\": 1.0, \"top_p\": 0.95, \"min_p\": 0.01, \"top_k\": 64, \"repeat_penalty\": 1.0, \"chat_handler\": \"gemma4\", \"script\": \"qwen3vl_run.py\", \"silent\": false, \"debug\": true, \"verbose\": true, \"raw_mode\": true, \"prompt_template\": \"system\\n{system}\\nuser\\n{images}\\n{user}\\nmodel\\n\", \"stop\": [\"\", \"\", \"\"] }, \"Huihui-Qwen3.5-9B-abliterated.i1-IQ4_XS\": { \"model_path\": \"D:\\\\ComfyUI\\\\models\\\\LLM\\\\Huihui-Qwen3.5-9B-abliterated.i1-IQ4_XS.gguf\", \"mmproj_path\": \"D:\\\\ComfyUI\\\\models\\\\LLM\\\\Qwen3.5-9B-mmproj-BF16.gguf\", \"output_max_tokens\": 2048, \"image_min_tokens\": 1024, \"image_max_tokens\": 2048, \"ctx\": 8192, \"n_batch\": 2048, \"n_ubatch\": 512, \"gpu_layers\": -1, \"temperature\": 0.7, \"top_p\": 
0.8, \"min_p\": 0.05, \"top_k\": 20, \"repeat_penalty\": 1.0, \"present_penalty\": 1.5, \"pool_size\": 4194304, \"chat_handler\": \"qwen35\", \"enable_thinking\": false, \"script\": \"qwen3vl_run.py\", \"silent\": false, \"debug\": true }, \"Qwen3-VL-8B-Instruct-IQ4_XS\": { \"model_path\": \"D:\\\\ComfyUI\\\\models\\\\LLM\\\\Qwen3-VL-8B-Instruct-IQ4_XS.gguf\", \"mmproj_path\": \"D:\\\\ComfyUI\\\\models\\\\LLM\\\\Qwen3-VL-8B-Instruct-mmproj-BF16.gguf\", \"output_max_tokens\": 2048, \"image_min_tokens\": 1024, \"image_max_tokens\": 2048, \"ctx\": 8192, \"n_batch\": 2048, \"n_ubatch\": 512, \"gpu_layers\": -1, \"temperature\": 0.7, \"top_p\": 0.92, \"min_p\": 0.01, \"top_k\": 40, \"repeat_penalty\": 1.1, \"pool_size\": 4194304, \"chat_handler\": \"qwen3\", \"script\": \"qwen3vl_run.py\", \"silent\": false, \"debug\": true } } } ``",
    "quick_links": [],
    "benchmark_table_html": "",
    "readme_markdown": "https://github.com/KLL535/ComfyUI_Simple_Qwen3-VL-gguf\n```\ngit clone https://github.com/JamePeng/llama-cpp-python\ncd llama-cpp-python\ngit clone https://github.com/ggml-org/llama.cpp ./vendor/llama.cpp\n$env:CMAKE_ARGS = \"-DGGML_CUDA=on\"\nD:\\ComfyUI\\venv\\Scripts\\python -m pip install -e . --verbose\n```\n**After compilation is finished, the `llama-cpp-python` folder must not be deleted.**\n\n---\n**There is no need to compile it manually anymore. You can use JamePeng’s precompiled WHL package, as long as the CUDA version matches exactly.**\n\nhttps://github.com/1038lab/ComfyUI-QwenVL\n\nhttps://github.com/JamePeng/llama-cpp-python\n\nhttps://developer.nvidia.com/cuda-toolkit-archive\n\n---\n**Gemma 4 requires llama‑cpp‑python ≥ 0.3.35**\n\n*ComfyUI\\custom_nodes\\ComfyUI_Simple_Qwen3-VL-gguf\\system_prompts_user.json*\n```\n{\n    \"_system_prompts\": {\n    },\n    \"_user_prompt_styles\": {\n    },\n    \"_camera_preset\": {\n    },\n    \"_model_presets\": {\n        \"gemma-4-E4B-it-IQ4_XS\": {\n            \"model_path\": \"D:\\\\ComfyUI\\\\models\\\\LLM\\\\gemma-4-E4B-it-IQ4_XS.gguf\",\n            \"mmproj_path\": \"D:\\\\ComfyUI\\\\models\\\\LLM\\\\gemma-4-E4B-it-mmproj-BF16.gguf\",\n            \"output_max_tokens\": 2048,\n            \"ctx\": 8192,\n            \"n_batch\": 2048,\n            \"n_ubatch\": 2048,\n            \"gpu_layers\": -1,\n            \"temperature\": 1.0,\n            \"top_p\": 0.95,\n            \"min_p\": 0.01,\n            \"top_k\": 64,\n            \"repeat_penalty\": 1.0,\n            \"chat_handler\": \"gemma4\",\n            \"script\": \"qwen3vl_run.py\",\n            \"silent\": false,\n            \"debug\": true,\n            \"verbose\": true,\n            \"raw_mode\": true,\n            \"prompt_template\": \"<|turn>system\\n{system}<turn|>\\n<|turn>user\\n{images}\\n{user}<turn|>\\n<|turn>model\\n\",\n            \"stop\": [\"<turn|>\", \"<eos>\", \"<|end_of_turn|>\"]\n        },\n 
       \"Huihui-Qwen3.5-9B-abliterated.i1-IQ4_XS\": {\n            \"model_path\": \"D:\\\\ComfyUI\\\\models\\\\LLM\\\\Huihui-Qwen3.5-9B-abliterated.i1-IQ4_XS.gguf\",\n            \"mmproj_path\": \"D:\\\\ComfyUI\\\\models\\\\LLM\\\\Qwen3.5-9B-mmproj-BF16.gguf\",\n            \"output_max_tokens\": 2048,\n            \"image_min_tokens\": 1024,\n            \"image_max_tokens\": 2048,\n            \"ctx\": 8192,\n            \"n_batch\": 2048,\n            \"n_ubatch\": 512,\n            \"gpu_layers\": -1,\n            \"temperature\": 0.7,\n            \"top_p\": 0.8,\n            \"min_p\": 0.05,\n            \"top_k\": 20,\n            \"repeat_penalty\": 1.0,\n            \"present_penalty\": 1.5,\n            \"pool_size\": 4194304,\n            \"chat_handler\": \"qwen35\",\n            \"enable_thinking\": false,\n            \"script\": \"qwen3vl_run.py\",\n            \"silent\": false,\n            \"debug\": true\n        },\n        \"Qwen3-VL-8B-Instruct-IQ4_XS\": {\n            \"model_path\": \"D:\\\\ComfyUI\\\\models\\\\LLM\\\\Qwen3-VL-8B-Instruct-IQ4_XS.gguf\",\n            \"mmproj_path\": \"D:\\\\ComfyUI\\\\models\\\\LLM\\\\Qwen3-VL-8B-Instruct-mmproj-BF16.gguf\",\n            \"output_max_tokens\": 2048,\n            \"image_min_tokens\": 1024,\n            \"image_max_tokens\": 2048,\n            \"ctx\": 8192,\n            \"n_batch\": 2048,\n            \"n_ubatch\": 512,\n            \"gpu_layers\": -1,\n            \"temperature\": 0.7,\n            \"top_p\": 0.92,\n            \"min_p\": 0.01,\n            \"top_k\": 40,\n            \"repeat_penalty\": 1.1,\n            \"pool_size\": 4194304,\n            \"chat_handler\": \"qwen3\",\n            \"script\": \"qwen3vl_run.py\",\n            \"silent\": false,\n            \"debug\": true\n        }\n    }\n}\n```",
    "related_quantizations": []
  },
  "tags": [
    "gguf",
    "endpoints_compatible",
    "region:us",
    "conversational"
  ],
  "likes": 1,
  "downloads": 6551,
  "gated": false,
  "private": false,
  "last_modified": "2026-04-11T04:36:53.000Z",
  "created_at": "2025-12-09T05:37:34.000Z",
  "pipeline_tag": "",
  "library_name": ""
}
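The README stored in readme_markdown defines presets under "_model_presets" in system_prompts_user.json. As a rough sketch, a preset entry for the files listed on this page could be generated as follows. The key names and values are copied from the README's "Qwen3-VL-8B-Instruct-IQ4_XS" preset; the helper make_preset is hypothetical, the D:\ComfyUI\models\LLM directory is the README's example location, and whether the node behaves identically with the F16 projector file is an assumption that has not been verified here.

```python
import json
from pathlib import PureWindowsPath
from typing import Any, Dict

# Values copied from the README's "Qwen3-VL-8B-Instruct-IQ4_XS" preset.
QWEN3_VL_DEFAULTS: Dict[str, Any] = {
    "output_max_tokens": 2048,
    "image_min_tokens": 1024,
    "image_max_tokens": 2048,
    "ctx": 8192,
    "n_batch": 2048,
    "n_ubatch": 512,
    "gpu_layers": -1,
    "temperature": 0.7,
    "top_p": 0.92,
    "min_p": 0.01,
    "top_k": 40,
    "repeat_penalty": 1.1,
    "pool_size": 4194304,
    "chat_handler": "qwen3",
    "script": "qwen3vl_run.py",
    "silent": False,
    "debug": True,
}


def make_preset(model_dir: str, model_file: str, mmproj_file: str,
                **overrides: Any) -> Dict[str, Any]:
    """Build one preset entry; keyword overrides replace the copied defaults."""
    base = PureWindowsPath(model_dir)  # Windows-style paths, as in the README
    preset: Dict[str, Any] = {
        "model_path": str(base / model_file),
        "mmproj_path": str(base / mmproj_file),
    }
    preset.update(QWEN3_VL_DEFAULTS)
    preset.update(overrides)
    return preset


presets = {
    "_system_prompts": {},
    "_user_prompt_styles": {},
    "_camera_preset": {},
    "_model_presets": {
        "Qwen3-VL-8B-Instruct-IQ4_XS": make_preset(
            r"D:\ComfyUI\models\LLM",
            "Qwen3-VL-8B-Instruct-IQ4_XS.gguf",
            "Qwen3-VL-8B-Instruct-mmproj-F16.gguf",  # F16 file from this listing
        ),
    },
}

# json.dumps escapes the backslashes, matching the doubled form in the README.
print(json.dumps(presets, indent=4))
```

Building the dictionary in Python and serializing it avoids hand-escaping Windows backslashes, which is an easy place to introduce an invalid JSON file.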

Source payload excerpt (from Hugging Face API)

{
  "_id": "6937b59e70d7c92a80d04de6",
  "id": "makisekurisu-jp/ComfyUI-Qwen3-VL-GGUF",
  "modelId": "makisekurisu-jp/ComfyUI-Qwen3-VL-GGUF",
  "sha": "388c9f989bc6fccf6bb459201792373001642a4e",
  "createdAt": "2025-12-09T05:37:34.000Z",
  "lastModified": "2026-04-11T04:36:53.000Z",
  "author": "makisekurisu-jp",
  "downloads": 6551,
  "likes": 1,
  "gated": false,
  "private": false,
  "pipeline_tag": "",
  "library_name": "",
  "siblings_count": 8
}
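The fields shown under Model Details appear to be derived from this payload: the slug is the lowercased id, and the Created and Last Modified dates are the date parts of the ISO timestamps. The sketch below reproduces that mapping under those assumptions. The function names, the Yes/No rendering of booleans, and the dash placeholder for empty fields are illustrative choices, not GraySoft's actual code.

```python
from datetime import datetime, timezone
from typing import Any, Dict

PLACEHOLDER = "\u2014"  # the dash the sheet shows for empty fields

# Fields copied from the payload excerpt above.
payload: Dict[str, Any] = {
    "id": "makisekurisu-jp/ComfyUI-Qwen3-VL-GGUF",
    "sha": "388c9f989bc6fccf6bb459201792373001642a4e",
    "createdAt": "2025-12-09T05:37:34.000Z",
    "lastModified": "2026-04-11T04:36:53.000Z",
    "author": "makisekurisu-jp",
    "downloads": 6551,
    "likes": 1,
    "gated": False,
    "private": False,
    "pipeline_tag": "",
    "library_name": "",
}


def iso_to_date(timestamp: str) -> str:
    """Convert a UTC timestamp such as 2025-12-09T05:37:34.000Z to YYYY-MM-DD."""
    parsed = datetime.strptime(timestamp, "%Y-%m-%dT%H:%M:%S.%fZ")
    return parsed.replace(tzinfo=timezone.utc).date().isoformat()


def to_sheet(p: Dict[str, Any]) -> Dict[str, str]:
    """Render raw API fields as the display strings used on the page."""
    return {
        "Model Slug": p["id"].lower(),
        "Author": p["author"],
        "Downloads": f"{p['downloads']:,}",
        "Likes": str(p["likes"]),
        "Pipeline Task": p["pipeline_tag"] or PLACEHOLDER,
        "Library": p["library_name"] or PLACEHOLDER,
        "Created": iso_to_date(p["createdAt"]),
        "Last Modified": iso_to_date(p["lastModified"]),
        "Gated": "Yes" if p["gated"] else "No",
        "Private": "Yes" if p["private"] else "No",
        "HF SHA": p["sha"],
    }


sheet = to_sheet(payload)
for label, value in sheet.items():
    print(f"{label}: {value}")
```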

More models in this shard