GraySoft
Model Comparison

ggml-org/smolvlm2-500m-video-instruct-gguf vs unsloth/qwen3-vl-4b-instruct-gguf

Side-by-side comparison of ggml-org/smolvlm2-500m-video-instruct-gguf and unsloth/qwen3-vl-4b-instruct-gguf: downloads, license, context length, tasks, and benchmarks.

ggml-org/smolvlm2-500m-video-instruct-gguf

ggml-org · —

Original model: https://huggingface.co/HuggingFaceTB/SmolVLM2-500M-Video-Instruct
For more information, refer to this PR: https://github.com/ggml-org/llama.cpp/pull/13050

unsloth/qwen3-vl-4b-instruct-gguf

unsloth · image-text-to-text

Meet Qwen3-VL — the most powerful vision-language model in the Qwen series to date. This generation delivers comprehensive upgrades across the board: superior text understanding & generation, deeper visual perception & reasoning, extended context length, enhanced spatial and vid…

Side-by-side Specifications

| Field | ggml-org/smolvlm2-500m-video-instruct-gguf | unsloth/qwen3-vl-4b-instruct-gguf |
| --- | --- | --- |
| Author | ggml-org | unsloth |
| Pipeline Task | — | image-text-to-text |
| Library | — | transformers |
| Downloads | 24,741 | 82,451 |
| Likes | 17 | 46 |
| License | Unknown | Unknown |
| Context Length | — | — |
| Created | 2025-04-21 | 2025-10-30 |
| Last Modified | 2025-04-30 | 2025-10-31 |
| Tags | gguf, base_model:HuggingFaceTB/SmolVLM2-500M-Video-Instruct, base_model:quantized:HuggingFaceTB/SmolVLM2-500M-Video-Instruct, license:apache-2.0, endpoints_compatible, region:us, conversational | transformers, gguf, unsloth, qwen, qwen3, image-text-to-text, arxiv:2505.09388, arxiv:2502.13923, arxiv:2409.12191, arxiv:2308.12966 |
