Model Intelligence Sheet

cstr/paddleocr-vl-0.9b-GGUF overview


Tags: gguf, ocr, document-understanding, crispembed, paddleocr, multilingual, base_model:PaddlePaddle/PaddleOCR-VL, base_model:quantized:PaddlePaddle/PaddleOCR-VL, license:apache-2.0, region:us

Runs locally from ~1.21 GB of disk space, the size of the Q4_K file (4 GB VRAM-class GPUs with llama.cpp / guIDE).

Downloads: 0
Likes: 0

Repository Files & Downloads

3 GGUF files detected
Direct downloads for local inference
| File | Type | Quantization | Size |
|------|------|--------------|------|
| paddleocr-vl-0.9b-f16.gguf | GGUF | F16 | 2.23 GB |
| paddleocr-vl-0.9b-q4_k.gguf | GGUF | Q4_K | 1.21 GB |
| paddleocr-vl-0.9b-q8_0.gguf | GGUF | Q8_0 | 1.38 GB |

Model Details

Model ID: cstr/paddleocr-vl-0.9b-GGUF
Author: cstr
License: apache-2.0
Base model: PaddlePaddle/PaddleOCR-VL
Last modified: 2026-06-19T09:23:20.000Z

Model README

---
base_model: PaddlePaddle/PaddleOCR-VL
language:
- multilingual
license: apache-2.0
tags:
- gguf
- ocr
- document-understanding
- crispembed
- paddleocr
---

PaddleOCR-VL-0.9B — CrispEmbed GGUF

CrispEmbed-native GGUF quantizations of PaddlePaddle/PaddleOCR-VL.

End-to-end VLM-based OCR: text recognition, table extraction, formula recognition, chart understanding. 109 languages.

Files

| File | Size | Description |
|------|------|-------------|
| paddleocr-vl-0.9b-q4_k.gguf | 1.3 GB | 4-bit K-quant, smallest |
| paddleocr-vl-0.9b-q8_0.gguf | 1.4 GB | 8-bit quantization, recommended |
| paddleocr-vl-0.9b-f16.gguf | 2.3 GB | fp16 reference |
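Before pointing a runtime at one of these files, it can be worth checking that the download really is a GGUF container. The following is a minimal sketch, assuming the standard GGUF header layout of version 2 and later (4-byte magic `GGUF`, then a little-endian uint32 version, uint64 tensor count and uint64 metadata key-value count). The helper name `read_gguf_header` is hypothetical and is not part of CrispEmbed.

```python
import struct
from typing import NamedTuple


class GGUFHeader(NamedTuple):
    version: int
    tensor_count: int
    metadata_kv_count: int


def read_gguf_header(path: str) -> GGUFHeader:
    """Read the fixed-size GGUF header (assumes the GGUF v2+ layout)."""
    with open(path, "rb") as f:
        raw = f.read(24)  # 4 magic + 4 version + 8 tensor count + 8 kv count
    if len(raw) < 24 or raw[:4] != b"GGUF":
        raise ValueError(f"{path} does not look like a GGUF file")
    # "<" selects little-endian with no padding: uint32, uint64, uint64.
    version, tensor_count, kv_count = struct.unpack("<IQQ", raw[4:24])
    return GGUFHeader(version, tensor_count, kv_count)
```

Calling `read_gguf_header("paddleocr-vl-0.9b-q8_0.gguf")` on a complete download should return the version and counts; a truncated or HTML error page saved under the same name raises `ValueError` instead.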

Model

  • Architecture: NaViT-style ViT (27 layers, hidden size 1152, SigLIP 2D RoPE + learned position embeddings)
    + Projector (pre-norm → 2×2 spatial merge → MLP)
    + ERNIE-4.5-0.3B LLM decoder (18 layers, hidden size 1024, GQA with 16 query heads and 2 KV heads, MRoPE, SwiGLU)

  • Parameters: ~0.9B total
  • Languages: 109 (multilingual)
  • Tasks: OCR, Table Recognition, Formula Recognition, Chart Recognition
  • License: Apache 2.0
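Two of the architecture details above reduce to simple arithmetic: the 2×2 spatial merge cuts the number of visual tokens by four, and 16/2 GQA means eight query heads share each key/value head. The sketch below illustrates both. The patch size of 14 pixels is an assumption (the model card does not state it), the contiguous head grouping is the common GQA convention rather than something the card confirms, and both function names are hypothetical.

```python
def visual_tokens(height_px: int, width_px: int, patch: int = 14, merge: int = 2) -> int:
    """Visual tokens handed to the decoder after the spatial merge.

    patch=14 is an assumed ViT patch size, not taken from the model card.
    """
    rows, cols = height_px // patch, width_px // patch
    if rows % merge or cols % merge:
        raise ValueError("patch grid must be divisible by the merge factor")
    # Each merge x merge block of patches becomes a single token.
    return (rows // merge) * (cols // merge)


def kv_head_for_query(q_head: int, n_q_heads: int = 16, n_kv_heads: int = 2) -> int:
    """Index of the KV head a query head reads from (contiguous grouping assumed)."""
    if not 0 <= q_head < n_q_heads:
        raise ValueError("query head index out of range")
    group_size = n_q_heads // n_kv_heads  # 8 query heads per KV head for 16/2
    return q_head // group_size
```

Under these assumptions a 448×448 image yields a 32×32 patch grid and 256 tokens after merging, and query heads 0 to 7 map to KV head 0 while heads 8 to 15 map to KV head 1.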

Usage with CrispEmbed

# OCR
./crispembed -m paddleocr-vl-0.9b-q8_0.gguf --ocr document.png

# With specific prompt
./crispembed -m paddleocr-vl-0.9b-q8_0.gguf --ocr-prompt "Table Recognition:" table.png
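
To run the OCR command over a folder of scans, a thin Python wrapper around the CLI may be enough. This is a sketch only: it uses just the `-m` and `--ocr` flags shown above, and it assumes the recognized text is written to stdout, which the README does not state. The function names are hypothetical.

```python
import subprocess
from pathlib import Path
from typing import Dict, List


def ocr_command(binary: str, model: str, image: str) -> List[str]:
    """Build the argument list for one OCR call, using only the flags shown above."""
    return [binary, "-m", model, "--ocr", image]


def ocr_directory(binary: str, model: str, folder: str, pattern: str = "*.png") -> Dict[str, str]:
    """Run the CLI once per matching image and collect its output by file name."""
    results: Dict[str, str] = {}
    for image in sorted(Path(folder).glob(pattern)):
        proc = subprocess.run(
            ocr_command(binary, model, str(image)),
            capture_output=True,
            text=True,
            check=True,  # raise CalledProcessError on a non-zero exit status
        )
        # Assumption: the recognized text arrives on stdout.
        results[image.name] = proc.stdout
    return results
```

For example, `ocr_directory("./crispembed", "paddleocr-vl-0.9b-q8_0.gguf", "scans")` would process every PNG in `scans/` in name order.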

Conversion

git clone https://github.com/CrispStrobe/CrispEmbed
cd CrispEmbed

python models/convert-paddleocr-vl-to-gguf.py \
    --model PaddlePaddle/PaddleOCR-VL \
    --output paddleocr-vl-0.9b-f16.gguf --dtype f16

./build/crispembed-quantize paddleocr-vl-0.9b-f16.gguf paddleocr-vl-0.9b-q8_0.gguf q8_0
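
When producing more than one quantization from the same f16 file, the quantize call can be generated rather than retyped. The sketch below follows the positional form shown above (input, output, type). Only `q8_0` appears in the README as a type argument; passing `q4_k` is an assumption inferred from the published file name, and `quantize_commands` is a hypothetical helper.

```python
from typing import List

F16_SUFFIX = "-f16.gguf"


def quantize_commands(
    f16_path: str,
    quant_types: List[str],
    tool: str = "./build/crispembed-quantize",
) -> List[List[str]]:
    """Build one `tool <input> <output> <type>` argument list per quantization type."""
    if not f16_path.endswith(F16_SUFFIX):
        raise ValueError("expected an input file ending in -f16.gguf")
    stem = f16_path[: -len(F16_SUFFIX)]
    # Output names mirror the published files, e.g. <stem>-q8_0.gguf.
    return [[tool, f16_path, f"{stem}-{q}.gguf", q] for q in quant_types]
```

Each returned list can be passed to `subprocess.run(cmd, check=True)` once the binary has been built.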

License

Apache 2.0 — same as the base model.


Source: Hugging Face