GraySoft

makisekurisu-jp/comfyui-qwen3-vl-gguf is indexed on GraySoft with repository links, GGUF quant files (IQ4_XS model weights and F16 mmproj projectors), and Hugging Face metadata. Use this page to pick a local model for guIDE or other runtimes. See related models in the same shard below.

Model Intelligence Sheet

makisekurisu-jp/comfyui-qwen3-vl-gguf overview

https://github.com/KLL535/ComfyUI_Simple_Qwen3-VL-gguf

If you compile llama-cpp-python yourself, do not delete the llama-cpp-python folder after compilation finishes.

Manual compilation is no longer necessary. You can install JamePeng's precompiled WHL package instead, as long as its CUDA version exactly matches the installed one.

https://github.com/1038lab/ComfyUI-QwenVL
https://github.com/JamePeng/llama-cpp-python
https://developer.nvidia.com/cuda-toolkit-archive

Gemma 4 requires llama-cpp-python ≥ 0.3.35.

Configuration file: ComfyUI\custom_nodes\ComfyUI_Simple_Qwen3-VL-gguf\system_prompts_user.json
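The Gemma 4 version requirement above can be checked before a preset is loaded. The following is a minimal sketch, assuming the package is registered under the distribution name `llama-cpp-python` (an assumption; a fork or custom wheel might register a different name). The helper names `numeric_parts`, `meets_minimum`, and `installed_version` are hypothetical, and any suffix after the dotted numbers is ignored.

```python
import re
from importlib import metadata

MIN_VERSION = "0.3.35"  # minimum stated on this page for Gemma 4


def numeric_parts(version):
    """Return the leading dotted-number part of a version string as ints."""
    match = re.match(r"\d+(?:\.\d+)*", version)
    if match is None:
        raise ValueError(f"no numeric version found in {version!r}")
    return tuple(int(part) for part in match.group(0).split("."))


def meets_minimum(installed, minimum=MIN_VERSION):
    """Compare versions numerically, so that 0.3.9 sorts below 0.3.35."""
    return numeric_parts(installed) >= numeric_parts(minimum)


def installed_version(dist_name="llama-cpp-python"):
    """Return the installed version string, or None if the package is absent.

    The default distribution name is an assumption, not taken from this page.
    """
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    found = installed_version()
    if found is None:
        print("llama-cpp-python is not installed in this environment")
    else:
        status = "OK for Gemma 4" if meets_minimum(found) else "too old for Gemma 4"
        print(f"llama-cpp-python {found}: {status}")
```

Comparing integer tuples rather than raw strings matters here: as plain text, "0.3.9" would sort above "0.3.35".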

Tags: gguf, endpoints_compatible, region:us, conversational
Downloads: 6,551
Likes: 1
Pipeline: (not set)
Library: (not set)
Visibility: Public
Access: Open

Repository Files & Downloads

6 files detected
Direct downloads for all repository files
File                                                  Type  Quantization  Size
Qwen3-VL-8B-Instruct-IQ4_XS.gguf                      GGUF  IQ4_XS        4.27 GB
Qwen3-VL-8B-Instruct-mmproj-F16.gguf                  GGUF  F16           1.08 GB
Qwen3.5-9B-IQ4_XS.gguf                                GGUF  IQ4_XS        4.81 GB
Qwen3.5-9B-mmproj-F16.gguf                            GGUF  F16           875.63 MB
llama-joycaption-beta-one-hf-llava-IQ4_XS.gguf        GGUF  IQ4_XS        4.18 GB
llama-joycaption-beta-one-hf-llava-mmproj-F16.gguf    GGUF  F16           837.11 MB
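Each model in the listing comes as two files: the quantized weights and an mmproj projector. The sketch below pairs them by file name and adds up the download size per pair. The pairing rule (a shared name stem before the quantization suffix or before `-mmproj-`) is a heuristic inferred from these six file names, not a documented convention, and the size conversion assumes decimal units (1 GB = 1000 MB). The function names are hypothetical.

```python
# File names and sizes copied from the file listing on this page.
FILES = {
    "Qwen3-VL-8B-Instruct-IQ4_XS.gguf": "4.27 GB",
    "Qwen3-VL-8B-Instruct-mmproj-F16.gguf": "1.08 GB",
    "Qwen3.5-9B-IQ4_XS.gguf": "4.81 GB",
    "Qwen3.5-9B-mmproj-F16.gguf": "875.63 MB",
    "llama-joycaption-beta-one-hf-llava-IQ4_XS.gguf": "4.18 GB",
    "llama-joycaption-beta-one-hf-llava-mmproj-F16.gguf": "837.11 MB",
}


def pair_models(names):
    """Map each weights file to the projector sharing its name stem."""
    projectors = {}
    weights = {}
    for name in names:
        if "-mmproj-" in name:
            # Stem is everything before the "-mmproj-<type>.gguf" suffix.
            projectors[name.split("-mmproj-")[0]] = name
        else:
            # Stem is everything before the final "-<quant>.gguf" segment.
            weights[name.rsplit("-", 1)[0]] = name
    return {weights[stem]: projectors.get(stem) for stem in sorted(weights)}


def size_in_gb(text):
    """Convert a size such as '875.63 MB' to GB, assuming decimal units."""
    value, unit = text.split()
    return float(value) / 1000 if unit == "MB" else float(value)


def combined_size_gb(weights_name, pairs, sizes):
    """Return the total size of a weights file plus its projector, if any."""
    total = size_in_gb(sizes[weights_name])
    projector = pairs.get(weights_name)
    if projector is not None:
        total += size_in_gb(sizes[projector])
    return total


if __name__ == "__main__":
    pairs = pair_models(FILES)
    for weights_name, projector in pairs.items():
        total = combined_size_gb(weights_name, pairs, FILES)
        print(f"{weights_name} + {projector}: about {total:.2f} GB")
```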

Model Details

Model Slug: makisekurisu-jp/comfyui-qwen3-vl-gguf
Author: makisekurisu-jp
Pipeline Task: (not set)
Library: (not set)
Created: 2025-12-09
Last Modified: 2026-04-11
Gated: No
Private: No
HF SHA: 388c9f989bc6fccf6bb459201792373001642a4e
License: Unknown
Language: Unknown
Base Model: Unknown

Metadata Inspector

Normalized metadata (stored in metadata_json)
{
  "metadata": {},
  "card_data": {
    "frontmatter": {},
    "hero_image_url": "",
    "summary": "https://github.com/KLL535/ComfyUI_Simple_Qwen3-VL-gguf `` git clone https://github.com/JamePeng/llama-cpp-python cd llama-cpp-python git clone https://github.com/ggml-org/llama.cpp ./vendor/llama.cpp $env:CMAKE_ARGS = \"-DGGML_CUDA=on\" D:\\ComfyUI\\venv\\Scripts\\python -m pip install -e . --verbose ` **After compilation is finished, the llama-cpp-python folder must not be deleted.** --- **There is no need to compile it manually anymore. You can use JamePeng’s precompiled WHL package, as long as the CUDA version matches exactly.** https://github.com/1038lab/ComfyUI-QwenVL https://github.com/JamePeng/llama-cpp-python https://developer.nvidia.com/cuda-toolkit-archive --- **Gemma 4 requires llama‑cpp‑python ≥ 0.3.35** *ComfyUI\\custom_nodes\\ComfyUI_Simple_Qwen3-VL-gguf\\system_prompts_user.json* ` { \"_system_prompts\": { }, \"_user_prompt_styles\": { }, \"_camera_preset\": { }, \"_model_presets\": { \"gemma-4-E4B-it-IQ4_XS\": { \"model_path\": \"D:\\\\ComfyUI\\\\models\\\\LLM\\\\gemma-4-E4B-it-IQ4_XS.gguf\", \"mmproj_path\": \"D:\\\\ComfyUI\\\\models\\\\LLM\\\\gemma-4-E4B-it-mmproj-BF16.gguf\", \"output_max_tokens\": 2048, \"ctx\": 8192, \"n_batch\": 2048, \"n_ubatch\": 2048, \"gpu_layers\": -1, \"temperature\": 1.0, \"top_p\": 0.95, \"min_p\": 0.01, \"top_k\": 64, \"repeat_penalty\": 1.0, \"chat_handler\": \"gemma4\", \"script\": \"qwen3vl_run.py\", \"silent\": false, \"debug\": true, \"verbose\": true, \"raw_mode\": true, \"prompt_template\": \"system\\n{system}\\nuser\\n{images}\\n{user}\\nmodel\\n\", \"stop\": [\"\", \"\", \"\"] }, \"Huihui-Qwen3.5-9B-abliterated.i1-IQ4_XS\": { \"model_path\": \"D:\\\\ComfyUI\\\\models\\\\LLM\\\\Huihui-Qwen3.5-9B-abliterated.i1-IQ4_XS.gguf\", \"mmproj_path\": \"D:\\\\ComfyUI\\\\models\\\\LLM\\\\Qwen3.5-9B-mmproj-BF16.gguf\", \"output_max_tokens\": 2048, \"image_min_tokens\": 1024, \"image_max_tokens\": 2048, \"ctx\": 8192, \"n_batch\": 2048, \"n_ubatch\": 512, \"gpu_layers\": -1, \"temperature\": 0.7, \"top_p\": 
0.8, \"min_p\": 0.05, \"top_k\": 20, \"repeat_penalty\": 1.0, \"present_penalty\": 1.5, \"pool_size\": 4194304, \"chat_handler\": \"qwen35\", \"enable_thinking\": false, \"script\": \"qwen3vl_run.py\", \"silent\": false, \"debug\": true }, \"Qwen3-VL-8B-Instruct-IQ4_XS\": { \"model_path\": \"D:\\\\ComfyUI\\\\models\\\\LLM\\\\Qwen3-VL-8B-Instruct-IQ4_XS.gguf\", \"mmproj_path\": \"D:\\\\ComfyUI\\\\models\\\\LLM\\\\Qwen3-VL-8B-Instruct-mmproj-BF16.gguf\", \"output_max_tokens\": 2048, \"image_min_tokens\": 1024, \"image_max_tokens\": 2048, \"ctx\": 8192, \"n_batch\": 2048, \"n_ubatch\": 512, \"gpu_layers\": -1, \"temperature\": 0.7, \"top_p\": 0.92, \"min_p\": 0.01, \"top_k\": 40, \"repeat_penalty\": 1.1, \"pool_size\": 4194304, \"chat_handler\": \"qwen3\", \"script\": \"qwen3vl_run.py\", \"silent\": false, \"debug\": true } } } ``",
    "quick_links": [],
    "benchmark_table_html": "",
    "readme_markdown": "https://github.com/KLL535/ComfyUI_Simple_Qwen3-VL-gguf\n```\ngit clone https://github.com/JamePeng/llama-cpp-python\ncd llama-cpp-python\ngit clone https://github.com/ggml-org/llama.cpp ./vendor/llama.cpp\n$env:CMAKE_ARGS = \"-DGGML_CUDA=on\"\nD:\\ComfyUI\\venv\\Scripts\\python -m pip install -e . --verbose\n```\n**After compilation is finished, the `llama-cpp-python` folder must not be deleted.**\n\n---\n**There is no need to compile it manually anymore. You can use JamePeng’s precompiled WHL package, as long as the CUDA version matches exactly.**\n\nhttps://github.com/1038lab/ComfyUI-QwenVL\n\nhttps://github.com/JamePeng/llama-cpp-python\n\nhttps://developer.nvidia.com/cuda-toolkit-archive\n\n---\n**Gemma 4 requires llama‑cpp‑python ≥ 0.3.35**\n\n*ComfyUI\\custom_nodes\\ComfyUI_Simple_Qwen3-VL-gguf\\system_prompts_user.json*\n```\n{\n    \"_system_prompts\": {\n    },\n    \"_user_prompt_styles\": {\n    },\n    \"_camera_preset\": {\n    },\n    \"_model_presets\": {\n        \"gemma-4-E4B-it-IQ4_XS\": {\n            \"model_path\": \"D:\\\\ComfyUI\\\\models\\\\LLM\\\\gemma-4-E4B-it-IQ4_XS.gguf\",\n            \"mmproj_path\": \"D:\\\\ComfyUI\\\\models\\\\LLM\\\\gemma-4-E4B-it-mmproj-BF16.gguf\",\n            \"output_max_tokens\": 2048,\n            \"ctx\": 8192,\n            \"n_batch\": 2048,\n            \"n_ubatch\": 2048,\n            \"gpu_layers\": -1,\n            \"temperature\": 1.0,\n            \"top_p\": 0.95,\n            \"min_p\": 0.01,\n            \"top_k\": 64,\n            \"repeat_penalty\": 1.0,\n            \"chat_handler\": \"gemma4\",\n            \"script\": \"qwen3vl_run.py\",\n            \"silent\": false,\n            \"debug\": true,\n            \"verbose\": true,\n            \"raw_mode\": true,\n            \"prompt_template\": \"<|turn>system\\n{system}<turn|>\\n<|turn>user\\n{images}\\n{user}<turn|>\\n<|turn>model\\n\",\n            \"stop\": [\"<turn|>\", \"<eos>\", \"<|end_of_turn|>\"]\n        },\n 
       \"Huihui-Qwen3.5-9B-abliterated.i1-IQ4_XS\": {\n            \"model_path\": \"D:\\\\ComfyUI\\\\models\\\\LLM\\\\Huihui-Qwen3.5-9B-abliterated.i1-IQ4_XS.gguf\",\n            \"mmproj_path\": \"D:\\\\ComfyUI\\\\models\\\\LLM\\\\Qwen3.5-9B-mmproj-BF16.gguf\",\n            \"output_max_tokens\": 2048,\n            \"image_min_tokens\": 1024,\n            \"image_max_tokens\": 2048,\n            \"ctx\": 8192,\n            \"n_batch\": 2048,\n            \"n_ubatch\": 512,\n            \"gpu_layers\": -1,\n            \"temperature\": 0.7,\n            \"top_p\": 0.8,\n            \"min_p\": 0.05,\n            \"top_k\": 20,\n            \"repeat_penalty\": 1.0,\n            \"present_penalty\": 1.5,\n            \"pool_size\": 4194304,\n            \"chat_handler\": \"qwen35\",\n            \"enable_thinking\": false,\n            \"script\": \"qwen3vl_run.py\",\n            \"silent\": false,\n            \"debug\": true\n        },\n        \"Qwen3-VL-8B-Instruct-IQ4_XS\": {\n            \"model_path\": \"D:\\\\ComfyUI\\\\models\\\\LLM\\\\Qwen3-VL-8B-Instruct-IQ4_XS.gguf\",\n            \"mmproj_path\": \"D:\\\\ComfyUI\\\\models\\\\LLM\\\\Qwen3-VL-8B-Instruct-mmproj-BF16.gguf\",\n            \"output_max_tokens\": 2048,\n            \"image_min_tokens\": 1024,\n            \"image_max_tokens\": 2048,\n            \"ctx\": 8192,\n            \"n_batch\": 2048,\n            \"n_ubatch\": 512,\n            \"gpu_layers\": -1,\n            \"temperature\": 0.7,\n            \"top_p\": 0.92,\n            \"min_p\": 0.01,\n            \"top_k\": 40,\n            \"repeat_penalty\": 1.1,\n            \"pool_size\": 4194304,\n            \"chat_handler\": \"qwen3\",\n            \"script\": \"qwen3vl_run.py\",\n            \"silent\": false,\n            \"debug\": true\n        }\n    }\n}\n```",
    "related_quantizations": []
  },
  "tags": [
    "gguf",
    "endpoints_compatible",
    "region:us",
    "conversational"
  ],
  "likes": 1,
  "downloads": 6551,
  "gated": false,
  "private": false,
  "last_modified": "2026-04-11T04:36:53.000Z",
  "created_at": "2025-12-09T05:37:34.000Z",
  "pipeline_tag": "",
  "library_name": ""
}
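The README text stored in the metadata above defines presets under `_model_presets`, each with a `model_path` and an `mmproj_path`. Those presets name `...-mmproj-BF16.gguf` projectors, while the file listing on this page shows `...-mmproj-F16.gguf`, so a path check before launching may save a failed load. The following is a minimal sketch that only reads the keys shown in the README; the function names are hypothetical and nothing here is part of the ComfyUI node itself.

```python
import json
from pathlib import Path


def missing_preset_files(config):
    """Return {preset name: [paths that are not existing files]}."""
    problems = {}
    for name, preset in config.get("_model_presets", {}).items():
        missing = [
            preset[key]
            for key in ("model_path", "mmproj_path")
            if key in preset and not Path(preset[key]).is_file()
        ]
        if missing:
            problems[name] = missing
    return problems


def check_preset_file(json_path):
    """Load a presets JSON file and report paths that do not exist."""
    with open(json_path, encoding="utf-8") as handle:
        return missing_preset_files(json.load(handle))


if __name__ == "__main__":
    # Illustrative config mirroring one preset from the README text.
    example = {
        "_model_presets": {
            "Qwen3-VL-8B-Instruct-IQ4_XS": {
                "model_path": "D:\\ComfyUI\\models\\LLM\\Qwen3-VL-8B-Instruct-IQ4_XS.gguf",
                "mmproj_path": "D:\\ComfyUI\\models\\LLM\\Qwen3-VL-8B-Instruct-mmproj-BF16.gguf",
            }
        }
    }
    for preset_name, paths in missing_preset_files(example).items():
        for path in paths:
            print(f"{preset_name}: file not found: {path}")
```

To check a real installation, pass the path of `system_prompts_user.json` to `check_preset_file` and either rename the downloaded projector or edit `mmproj_path` for any preset it reports.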
Source payload excerpt (from Hugging Face API)
{
  "_id": "6937b59e70d7c92a80d04de6",
  "id": "makisekurisu-jp/ComfyUI-Qwen3-VL-GGUF",
  "modelId": "makisekurisu-jp/ComfyUI-Qwen3-VL-GGUF",
  "sha": "388c9f989bc6fccf6bb459201792373001642a4e",
  "createdAt": "2025-12-09T05:37:34.000Z",
  "lastModified": "2026-04-11T04:36:53.000Z",
  "author": "makisekurisu-jp",
  "downloads": 6551,
  "likes": 1,
  "gated": false,
  "private": false,
  "pipeline_tag": "",
  "library_name": "",
  "siblings_count": 8
}
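The payload fields above can be read with the standard library alone. This is a small sketch using values copied from the excerpt; it assumes the timestamps are ISO 8601 in UTC with a trailing `Z`, and that the slug shown on this page is the lowercased repository ID, which matches here but is an observation rather than a documented rule. `parse_hf_timestamp` is a hypothetical helper name.

```python
from datetime import datetime

# Values copied from the payload excerpt above.
payload = {
    "id": "makisekurisu-jp/ComfyUI-Qwen3-VL-GGUF",
    "createdAt": "2025-12-09T05:37:34.000Z",
    "lastModified": "2026-04-11T04:36:53.000Z",
}


def parse_hf_timestamp(text):
    """Parse an ISO 8601 timestamp with a trailing Z into an aware datetime."""
    return datetime.fromisoformat(text.replace("Z", "+00:00"))


created = parse_hf_timestamp(payload["createdAt"])
modified = parse_hf_timestamp(payload["lastModified"])

slug = payload["id"].lower()  # assumed slug rule: lowercase the repo ID
days_between = (modified - created).days  # whole days from creation to last change

print(slug)
print(created.date().isoformat(), modified.date().isoformat(), days_between)
```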