GraySoft
Model Intelligence Sheet

cstr/qwen3-vl-2b-crispembed-gguf overview

Qwen3-VL-2B CrispEmbed GGUF: GGUF conversions of Qwen/Qwen3-VL-2B-Instruct (https://huggingface.co/Qwen/Qwen3-VL-2B-Instruct) for use with CrispEmbed (https://gi…)

Tags: gguf, ocr, vision-language, crispembed, license:apache-2.0, region:us

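Runs locally on 4 GB VRAM class GPUs with llama.cpp / guIDE. The quantized models take about 1.48 GB (Q4_K) or 2.13 GB (Q8_0) of disk. The 37.4 MB figure refers to qwen3-vl-2b-diff-ref.gguf, the smallest file in the repository, which is not one of the models listed in the README.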

Downloads: 0
Likes: 0

Repository Files & Downloads

4 GGUF files detected
Direct downloads for local inference
| File | Type | Quantization | Size |
|------|------|--------------|------|
| qwen3-vl-2b-diff-ref.gguf | GGUF | GGUF | 37.4 MB |
| qwen3-vl-2b-f16.gguf | GGUF | F16 | 4.55 GB |
| qwen3-vl-2b-q4_k.gguf | GGUF | Q4_K | 1.48 GB |
| qwen3-vl-2b-q8_0.gguf | GGUF | Q8_0 | 2.13 GB |
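
After downloading one of the files above, a quick sanity check is to read its GGUF header. The sketch below assumes the GGUF layout used by versions 2 and 3 (4-byte magic `GGUF`, then a little-endian uint32 version, a uint64 tensor count, and a uint64 metadata key/value count). The function name `read_gguf_header` is hypothetical and is not part of CrispEmbed or llama.cpp.

```python
import struct
import sys

GGUF_MAGIC = b"GGUF"  # the first four bytes of every GGUF file


def read_gguf_header(path):
    """Return (version, tensor_count, metadata_kv_count) from a GGUF file.

    Assumes GGUF v2 or later, stored little-endian, where both counts
    are 64-bit unsigned integers.
    """
    with open(path, "rb") as f:
        raw = f.read(24)  # 4 magic + 4 version + 8 + 8 count bytes
    if len(raw) < 24 or raw[:4] != GGUF_MAGIC:
        raise ValueError(f"{path} does not start with a GGUF header")
    version, tensor_count, kv_count = struct.unpack("<IQQ", raw[4:24])
    return version, tensor_count, kv_count


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage: python check_gguf.py qwen3-vl-2b-q4_k.gguf
    print(read_gguf_header(sys.argv[1]))
```

A truncated or mis-saved download (for example, an HTML error page saved under a `.gguf` name) fails the magic check immediately, before any inference runtime is involved.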

Model Details

Model ID: cstr/qwen3-vl-2b-crispembed-gguf
Author: cstr
Pipeline: not listed
License: apache-2.0
Base model: not listed
Last modified: 2026-06-20T04:50:47.000Z

Model README

---
license: apache-2.0
tags:
- gguf
- ocr
- vision-language
- crispembed
---

Qwen3-VL-2B — CrispEmbed GGUF

GGUF conversions of Qwen/Qwen3-VL-2B-Instruct for use with CrispEmbed.

Models

| File | Quant | Size | Description |
|------|-------|------|-------------|
| qwen3-vl-2b-q4_k.gguf | Q4_K | 1.5 GB | Good quality/size balance |
| qwen3-vl-2b-q8_0.gguf | Q8_0 | 2.2 GB | Best quality |

Features

  • DeepStack vision fusion: intermediate vision-encoder features injected into LLM decoder layers
  • Fused flash attention: uses ggml_flash_attn_ext for efficient inference
  • Backend KV cache: decode stays on GPU (Metal/CUDA), no per-token CPU transfer
  • Interleaved mRoPE: improved position encoding vs Qwen2.5-VL
  • QK RMSNorm: per-head query/key normalization
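
To illustrate the QK RMSNorm item, here is a minimal pure-Python sketch of per-head normalization: each head's query (or key) vector is divided by its root mean square over the head dimension, then scaled by a learned per-dimension weight. The function names, the `eps` value of 1e-6, and the sharing of one weight vector across heads are assumptions for illustration. This is not CrispEmbed's implementation.

```python
import math


def rms_norm(vec, weight, eps=1e-6):
    """Scale vec to unit root mean square, then apply a per-dimension weight."""
    mean_sq = sum(x * x for x in vec) / len(vec)
    scale = 1.0 / math.sqrt(mean_sq + eps)
    return [x * scale * w for x, w in zip(vec, weight)]


def qk_rms_norm(heads, weight, eps=1e-6):
    """Apply RMSNorm independently to each head.

    heads:  list of per-head vectors, each of length head_dim
    weight: learned scale of length head_dim (assumed shared across heads)
    """
    return [rms_norm(h, weight, eps) for h in heads]


# Two toy heads with head_dim = 2 and an identity weight.
q_heads = [[3.0, 4.0], [0.0, 2.0]]
q_normed = qk_rms_norm(q_heads, weight=[1.0, 1.0])
```

Because the normalization is applied per head before the attention scores are computed, query and key magnitudes stay bounded regardless of how large the projection outputs grow.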

Usage

Converted with tooling from CrispEmbed.


Source: Hugging Face