makisekurisu-jp/comfyui-qwen3-vl-gguf is indexed on GraySoft with repository links, GGUF quant files (IQ4_XS models and F16 mmproj files), and Hugging Face metadata. Use this page to pick a local model for guIDE or other runtimes. See related models in the same shard below.
makisekurisu-jp/comfyui-qwen3-vl-gguf overview
https://github.com/KLL535/ComfyUI_Simple_Qwen3-VL-gguf

If you compile llama-cpp-python yourself, do not delete the llama-cpp-python folder after compilation finishes.

Manual compilation is no longer necessary. You can use JamePeng's precompiled WHL package, as long as the CUDA version matches exactly.

https://github.com/1038lab/ComfyUI-QwenVL
https://github.com/JamePeng/llama-cpp-python
https://developer.nvidia.com/cuda-toolkit-archive

Gemma 4 requires llama-cpp-python ≥ 0.3.35.

Model presets are defined in ComfyUI\custom_nodes\ComfyUI_Simple_Qwen3-VL-gguf\system_prompts_user.json
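The Gemma 4 version requirement can be checked before loading a preset. The following is a minimal sketch, not part of the node itself: it assumes the package is installed under the distribution name `llama-cpp-python` (a fork's wheel may register a different name), and the helper names `parse_version`, `meets_minimum`, and `installed_llama_cpp_version` are hypothetical.

```python
from importlib import metadata


def parse_version(text):
    """Turn '0.3.35' into (0, 3, 35).

    Only the leading digits of each dot-separated part are used, so a
    local suffix such as '+cu124' is ignored.
    """
    parts = []
    for piece in text.split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            break
        parts.append(int(digits))
    return tuple(parts)


def meets_minimum(installed, minimum="0.3.35"):
    """Return True if the installed version is at least the minimum."""
    return parse_version(installed) >= parse_version(minimum)


def installed_llama_cpp_version(dist_name="llama-cpp-python"):
    """Return the installed version string, or None if it is not installed.

    dist_name is an assumption; adjust it if your wheel uses another name.
    """
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    version = installed_llama_cpp_version()
    if version is None:
        print("llama-cpp-python is not installed in this environment")
    elif meets_minimum(version):
        print(f"{version} satisfies the Gemma 4 requirement")
    else:
        print(f"{version} is older than 0.3.35; upgrade before using Gemma 4")
```

Run it with the same Python interpreter that ComfyUI uses (for example the one in its venv), since a different environment may report a different version.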
Repository Files & Downloads
| File | Type | Quantization | Size | Link |
|---|---|---|---|---|
| Qwen3-VL-8B-Instruct-IQ4_XS.gguf | GGUF | IQ4_XS | 4.27 GB | Download |
| Qwen3-VL-8B-Instruct-mmproj-F16.gguf | GGUF | F16 | 1.08 GB | Download |
| Qwen3.5-9B-IQ4_XS.gguf | GGUF | IQ4_XS | 4.81 GB | Download |
| Qwen3.5-9B-mmproj-F16.gguf | GGUF | F16 | 875.63 MB | Download |
| llama-joycaption-beta-one-hf-llava-IQ4_XS.gguf | GGUF | IQ4_XS | 4.18 GB | Download |
| llama-joycaption-beta-one-hf-llava-mmproj-F16.gguf | GGUF | F16 | 837.11 MB | Download |
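Each quantized model in the table is listed next to an `mmproj` file that shares its name prefix. A minimal sketch of pairing them by filename, assuming the `-mmproj-` naming pattern seen in the table holds for every file (the function name `pair_models_with_projectors` is hypothetical):

```python
# Filenames copied from the table above.
FILES = [
    "Qwen3-VL-8B-Instruct-IQ4_XS.gguf",
    "Qwen3-VL-8B-Instruct-mmproj-F16.gguf",
    "Qwen3.5-9B-IQ4_XS.gguf",
    "Qwen3.5-9B-mmproj-F16.gguf",
    "llama-joycaption-beta-one-hf-llava-IQ4_XS.gguf",
    "llama-joycaption-beta-one-hf-llava-mmproj-F16.gguf",
]

MMPROJ_MARKER = "-mmproj-"  # assumed naming convention


def pair_models_with_projectors(filenames):
    """Map each model file to the mmproj file sharing its name prefix."""
    projectors = {}  # base name -> mmproj filename
    models = []      # (stem, filename) for non-mmproj files
    for name in filenames:
        stem = name[: -len(".gguf")] if name.endswith(".gguf") else name
        if MMPROJ_MARKER in stem:
            projectors[stem.split(MMPROJ_MARKER)[0]] = name
        else:
            models.append((stem, name))

    pairs = {}
    for stem, name in models:
        # Pick the longest projector base that is a prefix of the model stem.
        match = None
        for base in projectors:
            if stem.startswith(base) and (match is None or len(base) > len(match)):
                match = base
        pairs[name] = projectors.get(match)  # None when no projector matches
    return pairs


if __name__ == "__main__":
    for model, projector in pair_models_with_projectors(FILES).items():
        print(f"{model} -> {projector}")
```

A model with no matching projector maps to `None`, which makes a missing download easy to spot.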
Model Details
Metadata Inspector
Normalized metadata (stored in metadata_json)
{
"metadata": {},
"card_data": {
"frontmatter": {},
"hero_image_url": "",
"quick_links": [],
"benchmark_table_html": "",
"readme_markdown": "https://github.com/KLL535/ComfyUI_Simple_Qwen3-VL-gguf\n```\ngit clone https://github.com/JamePeng/llama-cpp-python\ncd llama-cpp-python\ngit clone https://github.com/ggml-org/llama.cpp ./vendor/llama.cpp\n$env:CMAKE_ARGS = \"-DGGML_CUDA=on\"\nD:\\ComfyUI\\venv\\Scripts\\python -m pip install -e . --verbose\n```\n**After compilation is finished, the `llama-cpp-python` folder must not be deleted.**\n\n---\n**There is no need to compile it manually anymore. You can use JamePeng’s precompiled WHL package, as long as the CUDA version matches exactly.**\n\nhttps://github.com/1038lab/ComfyUI-QwenVL\n\nhttps://github.com/JamePeng/llama-cpp-python\n\nhttps://developer.nvidia.com/cuda-toolkit-archive\n\n---\n**Gemma 4 requires llama‑cpp‑python ≥ 0.3.35**\n\n*ComfyUI\\custom_nodes\\ComfyUI_Simple_Qwen3-VL-gguf\\system_prompts_user.json*\n```\n{\n \"_system_prompts\": {\n },\n \"_user_prompt_styles\": {\n },\n \"_camera_preset\": {\n },\n \"_model_presets\": {\n \"gemma-4-E4B-it-IQ4_XS\": {\n \"model_path\": \"D:\\\\ComfyUI\\\\models\\\\LLM\\\\gemma-4-E4B-it-IQ4_XS.gguf\",\n \"mmproj_path\": \"D:\\\\ComfyUI\\\\models\\\\LLM\\\\gemma-4-E4B-it-mmproj-BF16.gguf\",\n \"output_max_tokens\": 2048,\n \"ctx\": 8192,\n \"n_batch\": 2048,\n \"n_ubatch\": 2048,\n \"gpu_layers\": -1,\n \"temperature\": 1.0,\n \"top_p\": 0.95,\n \"min_p\": 0.01,\n \"top_k\": 64,\n \"repeat_penalty\": 1.0,\n \"chat_handler\": \"gemma4\",\n \"script\": \"qwen3vl_run.py\",\n \"silent\": false,\n \"debug\": true,\n \"verbose\": true,\n \"raw_mode\": true,\n \"prompt_template\": \"<|turn>system\\n{system}<turn|>\\n<|turn>user\\n{images}\\n{user}<turn|>\\n<|turn>model\\n\",\n \"stop\": [\"<turn|>\", \"<eos>\", \"<|end_of_turn|>\"]\n },\n \"Huihui-Qwen3.5-9B-abliterated.i1-IQ4_XS\": {\n \"model_path\": \"D:\\\\ComfyUI\\\\models\\\\LLM\\\\Huihui-Qwen3.5-9B-abliterated.i1-IQ4_XS.gguf\",\n \"mmproj_path\": \"D:\\\\ComfyUI\\\\models\\\\LLM\\\\Qwen3.5-9B-mmproj-BF16.gguf\",\n 
\"output_max_tokens\": 2048,\n \"image_min_tokens\": 1024,\n \"image_max_tokens\": 2048,\n \"ctx\": 8192,\n \"n_batch\": 2048,\n \"n_ubatch\": 512,\n \"gpu_layers\": -1,\n \"temperature\": 0.7,\n \"top_p\": 0.8,\n \"min_p\": 0.05,\n \"top_k\": 20,\n \"repeat_penalty\": 1.0,\n \"present_penalty\": 1.5,\n \"pool_size\": 4194304,\n \"chat_handler\": \"qwen35\",\n \"enable_thinking\": false,\n \"script\": \"qwen3vl_run.py\",\n \"silent\": false,\n \"debug\": true\n },\n \"Qwen3-VL-8B-Instruct-IQ4_XS\": {\n \"model_path\": \"D:\\\\ComfyUI\\\\models\\\\LLM\\\\Qwen3-VL-8B-Instruct-IQ4_XS.gguf\",\n \"mmproj_path\": \"D:\\\\ComfyUI\\\\models\\\\LLM\\\\Qwen3-VL-8B-Instruct-mmproj-BF16.gguf\",\n \"output_max_tokens\": 2048,\n \"image_min_tokens\": 1024,\n \"image_max_tokens\": 2048,\n \"ctx\": 8192,\n \"n_batch\": 2048,\n \"n_ubatch\": 512,\n \"gpu_layers\": -1,\n \"temperature\": 0.7,\n \"top_p\": 0.92,\n \"min_p\": 0.01,\n \"top_k\": 40,\n \"repeat_penalty\": 1.1,\n \"pool_size\": 4194304,\n \"chat_handler\": \"qwen3\",\n \"script\": \"qwen3vl_run.py\",\n \"silent\": false,\n \"debug\": true\n }\n }\n}\n```",
"related_quantizations": []
},
"tags": [
"gguf",
"endpoints_compatible",
"region:us",
"conversational"
],
"likes": 1,
"downloads": 6551,
"gated": false,
"private": false,
"last_modified": "2026-04-11T04:36:53.000Z",
"created_at": "2025-12-09T05:37:34.000Z",
"pipeline_tag": "",
"library_name": ""
}
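The `_model_presets` block shown in the `readme_markdown` field can be sanity-checked before ComfyUI loads it. This is a minimal sketch under stated assumptions: the set of keys treated as required is inferred from the presets in the README, not from the node's code, reading `ctx` as the context window is also an assumption, and the names `EXPECTED_KEYS` and `check_presets` are hypothetical. The sample is one preset copied from the README.

```python
import json
from pathlib import PureWindowsPath

# Keys that appear in every preset in the README (assumed to be required).
EXPECTED_KEYS = ("model_path", "mmproj_path", "chat_handler", "script")

SAMPLE = r"""
{
  "_model_presets": {
    "Qwen3-VL-8B-Instruct-IQ4_XS": {
      "model_path": "D:\\ComfyUI\\models\\LLM\\Qwen3-VL-8B-Instruct-IQ4_XS.gguf",
      "mmproj_path": "D:\\ComfyUI\\models\\LLM\\Qwen3-VL-8B-Instruct-mmproj-BF16.gguf",
      "output_max_tokens": 2048,
      "ctx": 8192,
      "chat_handler": "qwen3",
      "script": "qwen3vl_run.py"
    }
  }
}
"""


def check_presets(config):
    """Return a list of human-readable problems found in the presets."""
    problems = []
    for name, preset in config.get("_model_presets", {}).items():
        for key in EXPECTED_KEYS:
            if key not in preset:
                problems.append(f"{name}: missing {key}")
        for key in ("model_path", "mmproj_path"):
            path = preset.get(key)
            if path is not None and PureWindowsPath(path).suffix.lower() != ".gguf":
                problems.append(f"{name}: {key} does not end in .gguf")
        # Assumption: ctx is the context window, so the output budget must fit in it.
        out_max, ctx = preset.get("output_max_tokens"), preset.get("ctx")
        if out_max is not None and ctx is not None and out_max > ctx:
            problems.append(f"{name}: output_max_tokens exceeds ctx")
    return problems


if __name__ == "__main__":
    issues = check_presets(json.loads(SAMPLE))
    print("no problems found" if not issues else "\n".join(issues))
```

To check a real file, replace `json.loads(SAMPLE)` with `json.load(...)` on your own `system_prompts_user.json`. The check only inspects the text of the paths; it does not verify that the files exist on disk.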
Source payload excerpt (from Hugging Face API)
{
"_id": "6937b59e70d7c92a80d04de6",
"id": "makisekurisu-jp/ComfyUI-Qwen3-VL-GGUF",
"modelId": "makisekurisu-jp/ComfyUI-Qwen3-VL-GGUF",
"sha": "388c9f989bc6fccf6bb459201792373001642a4e",
"createdAt": "2025-12-09T05:37:34.000Z",
"lastModified": "2026-04-11T04:36:53.000Z",
"author": "makisekurisu-jp",
"downloads": 6551,
"likes": 1,
"gated": false,
"private": false,
"pipeline_tag": "",
"library_name": "",
"siblings_count": 8
}