cstr/deepseek-ocr2-crispembed-GGUF overview
Runs locally with about 2.07 GB of disk space (4 GB VRAM-class GPUs with llama.cpp / guIDE).
---
license: apache-2.0
tags:
- gguf
- ocr
- document-ocr
- crispembed
- moe
---
DeepSeek-OCR2 — CrispEmbed GGUF
GGUF conversion of deepseek-ai/DeepSeek-OCR-2 for use with CrispEmbed.
Architecture
SAM-ViT-B (12L, 768d) → Qwen2 encoder (24L, 896d, bidirectional) → Linear projector (896→1280) → DeepSeek-V2 MoE decoder (12L, 1280d, 64 experts top-6 + 2 shared, layer 0 dense) → lm_head
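The decoder's "64 experts top-6 + 2 shared" routing can be illustrated with a small sketch. This is not CrispEmbed's implementation: the function names, the toy scalar experts, and the choice to leave the selected gate weights unnormalised are all assumptions made for illustration. Only the counts (64 routed experts, 6 active per token, 2 shared) come from the architecture line above. Per that line, layer 0 is dense and would skip this routing.

```python
import math

N_EXPERTS = 64   # routed experts per MoE layer (from the architecture line)
TOP_K = 6        # routed experts activated per token
N_SHARED = 2     # shared experts that run for every token


def route_token(router_logits, top_k=TOP_K):
    """Return [(expert_index, gate_weight)] for the top_k routed experts.

    Softmax over all experts, then keep the top_k probabilities. Whether the
    kept weights are renormalised afterwards is model-specific; this sketch
    leaves them unchanged (an assumption).
    """
    m = max(router_logits)
    exps = [math.exp(x - m) for x in router_logits]
    total = sum(exps)
    probs = [e / total for e in exps]
    ranked = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    return [(i, probs[i]) for i in ranked[:top_k]]


def moe_forward(x, router_logits, routed_experts, shared_experts):
    """Add the always-on shared experts to the gated top-k routed experts."""
    out = sum(f(x) for f in shared_experts)
    for idx, gate in route_token(router_logits):
        out += gate * routed_experts[idx](x)
    return out


# Hypothetical toy experts: routed expert i multiplies its input by (i + 1),
# shared experts pass the input through unchanged.
routed = [lambda x, s=i + 1: s * x for i in range(N_EXPERTS)]
shared = [lambda x: x for _ in range(N_SHARED)]
logits = [0.1 * i for i in range(N_EXPERTS)]

selected = route_token(logits)
y = moe_forward(1.0, logits, routed, shared)
```

Only 6 of the 64 routed experts run for a given token, which is why per-token compute stays well below what the total parameter count suggests.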
Models
| File | Quant | Size | Description |
|------|-------|------|-------------|
| deepseek-ocr2-f16.gguf | F16 | 6.4 GB | Unquantized (16-bit float) |
| deepseek-ocr2-q8_0.gguf | Q8_0 | ~3.4 GB | Best quality/size balance |
| deepseek-ocr2-q4_k.gguf | Q4_K | ~2.0 GB | Smallest, good quality |
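As a rough cross-check of the sizes in the table, the F16 size can be scaled by the nominal bits per weight of each ggml block format. The sketch below assumes every tensor uses the named format, which real files usually do not satisfy exactly; `estimate_size_gb` is a hypothetical helper, not part of CrispEmbed.

```python
# Nominal bits per weight for the ggml block formats:
#   F16  : 16 bits per weight
#   Q8_0 : 32 int8 values + one fp16 scale = 34 bytes per 32 weights
#   Q4_K : one 144-byte super-block per 256 weights
BITS_PER_WEIGHT = {
    "F16": 16.0,
    "Q8_0": 34 * 8 / 32,    # 8.5
    "Q4_K": 144 * 8 / 256,  # 4.5
}

F16_SIZE_GB = 6.4  # F16 file size from the table


def estimate_size_gb(quant, f16_size_gb=F16_SIZE_GB):
    """Scale the F16 file size by the ratio of bits per weight."""
    return f16_size_gb * BITS_PER_WEIGHT[quant] / BITS_PER_WEIGHT["F16"]


for quant in ("Q8_0", "Q4_K"):
    print(f"{quant}: ~{estimate_size_gb(quant):.1f} GB")
```

The Q8_0 estimate (about 3.4 GB) matches the table. The Q4_K estimate (about 1.8 GB) is slightly below the listed ~2.0 GB, which would be consistent with some tensors being stored at higher precision, although the README does not say so.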
Performance features
- Per-row embedding dequantization (saves ~655 MB peak RSS compared with expanding the full table)
- MoE decoder on Metal via ggml_mul_mat_id
- SAM patch-embed + neck on Metal via ggml_conv_2d
- Qwen2 encoder on Metal graph
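The per-row embedding idea from the first bullet can be sketched as follows: rather than expanding the whole quantized embedding table to floats, only the rows for the token ids actually present are dequantized. The layout below is a simplified Q8_0-style format (one fp16 scale plus 32 int8 values per block), and the table size, function names, and data are hypothetical; CrispEmbed's actual code path may differ.

```python
import struct

QK = 32                # values per block
BLOCK_BYTES = 2 + QK   # one fp16 scale followed by 32 int8 values


def quantize_row(values):
    """Pack one row (length divisible by QK) into Q8_0-style blocks."""
    out = bytearray()
    for start in range(0, len(values), QK):
        block = values[start:start + QK]
        amax = max(abs(v) for v in block)
        # Round the scale through fp16 so quantization uses the stored value.
        scale = struct.unpack("<e", struct.pack("<e", amax / 127.0))[0]
        q = [max(-127, min(127, round(v / scale))) if scale else 0
             for v in block]
        out += struct.pack("<e", scale) + struct.pack(f"<{QK}b", *q)
    return bytes(out)


def dequant_row(table, row, n_cols):
    """Dequantize only `row`; every other row stays in packed form."""
    row_bytes = (n_cols // QK) * BLOCK_BYTES
    base = row * row_bytes
    out = []
    for off in range(base, base + row_bytes, BLOCK_BYTES):
        (scale,) = struct.unpack_from("<e", table, off)
        q = struct.unpack_from(f"<{QK}b", table, off + 2)
        out.extend(scale * v for v in q)
    return out


def embed(table, token_ids, n_cols):
    """Look up embeddings for token_ids, one dequantized row per token."""
    return [dequant_row(table, t, n_cols) for t in token_ids]


# Hypothetical toy table: 4 rows of 64 values in [-1, 1].
N_ROWS, N_COLS = 4, 64
ROWS = [[((7 * c + 13 * r) % 41 - 20) / 20.0 for c in range(N_COLS)]
        for r in range(N_ROWS)]
TABLE = b"".join(quantize_row(r) for r in ROWS)

vectors = embed(TABLE, [2, 0], N_COLS)
```

With this approach, the float working set grows with the number of tokens in the prompt rather than with the vocabulary size, which is where the peak-memory saving comes from.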
Converted with models/convert-deepseek-ocr2-to-gguf.py from CrispEmbed.
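After downloading or converting a file, a quick sanity check is to read the fixed-size GGUF header. The sketch below assumes a little-endian GGUF file of version 2 or later, where the magic is followed by a 32-bit version and two 64-bit counts. `read_gguf_header` is a hypothetical helper, not part of CrispEmbed or the conversion script.

```python
import struct

GGUF_MAGIC = b"GGUF"


def read_gguf_header(path):
    """Return (version, tensor_count, metadata_kv_count) from a GGUF file.

    Assumes GGUF version 2 or later (64-bit counts), little-endian.
    """
    with open(path, "rb") as f:
        head = f.read(24)
    if len(head) < 24 or head[:4] != GGUF_MAGIC:
        raise ValueError(f"{path} does not look like a GGUF file")
    version, n_tensors, n_kv = struct.unpack("<IQQ", head[4:24])
    return version, n_tensors, n_kv


# Example usage (file name taken from the table in this README):
# version, n_tensors, n_kv = read_gguf_header("deepseek-ocr2-q4_k.gguf")
```

A truncated download or a non-GGUF file fails the magic check immediately, before any attempt to load the model.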
Source: Hugging Face