cstr/text-super-resolution-gguf overview
Repository Files & Downloads
| File | Type | Quantization | Size |
|---|---|---|---|
| adair-ref.gguf | GGUF | GGUF | 0.1 MB |
| dat-light-x2-f16.gguf | GGUF | F16 | 37.7 MB |
| dat-light-x2-f32.gguf | GGUF | F32 | 38.8 MB |
| dat-ref.gguf | GGUF | GGUF | 18.0 MB |
| hat-ref.gguf | GGUF | GGUF | 5.8 MB |
| hat-sr-x4-f16.gguf | GGUF | F16 | 40.3 MB |
| hat-sr-x4-q4_k.gguf | GGUF | Q4_K | 39.4 MB |
| hat-sr-x4-q8_0.gguf | GGUF | Q8_0 | 39.7 MB |
| pan-ref.gguf | GGUF | GGUF | 0.2 MB |
| pan-x4-f16.gguf | GGUF | F16 | 0.5 MB |
| pan-x4-q4_k.gguf | GGUF | Q4_K | 0.5 MB |
| pan-x4-q8_0.gguf | GGUF | Q8_0 | 0.5 MB |
| restormer-denoise-f16.gguf | GGUF | F16 | 49.9 MB |
| restormer-denoise-q4_k.gguf | GGUF | Q4_K | 28.0 MB |
| restormer-denoise-q8_0.gguf | GGUF | Q8_0 | 35.6 MB |
| swinir-light-x4-f16.gguf | GGUF | F16 | 14.2 MB |
| swinir-ref.gguf | GGUF | GGUF | 5.5 MB |
| tbsrn-ref.gguf | GGUF | GGUF | 4.1 MB |
| tbsrn-telescope-f16.gguf | GGUF | F16 | 2.2 MB |
| tbsrn-telescope-q4_k.gguf | GGUF | Q4_K | 0.7 MB |
| tbsrn-telescope-q8_0.gguf | GGUF | Q8_0 | 1.2 MB |
Model Details
Model README
---
license: apache-2.0
tags:
- super-resolution
- image-restoration
- ocr
- text-enhancement
- gguf
- crispembed
library_name: crispembed
---
Text Super-Resolution & Restoration GGUF Models
Lightweight super-resolution and image restoration models converted to GGUF for CrispEmbed OCR preprocessing.
Models
| File | Architecture | Params | Scale | Size | License | Paper |
|------|-------------|--------|-------|------|---------|-------|
| tbsrn-telescope-f16.gguf | TBSRN (text-line SR) | 1.13M | 2x | 2.2 MB | Apache-2.0 | CVPR 2021 |
| pan-x4-f16.gguf | PAN (pixel attention) | 272K | 4x | 0.5 MB | Apache-2.0 | ECCV 2020W |
| hat-sr-x4-f16.gguf | HAT (hybrid attention transformer) | 21M | 4x | 40 MB | MIT | CVPR 2023 |
| dat-light-x2-f16.gguf | DAT-light (dual aggregation transformer) | 830K | 2x | 38 MB | Apache-2.0 | ICCV 2023 |
| restormer-denoise-f16.gguf | Restormer (denoising) | 26M | 1x | 50 MB | Apache-2.0 | CVPR 2022 |
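As a rough sanity check on the table, an F16 file stores about two bytes per weight, so its size can be estimated from the parameter count. The sketch below assumes every tensor is stored as F16, that metadata is negligible, and that the listed "MB" values are mebibytes. None of this is stated in the source. The helper name `estimate_f16_mib` is illustrative.

```python
# Parameter counts copied from the Models table above.
PARAMS = {
    "tbsrn-telescope-f16.gguf": 1_130_000,
    "pan-x4-f16.gguf": 272_000,
    "hat-sr-x4-f16.gguf": 21_000_000,
    "dat-light-x2-f16.gguf": 830_000,
    "restormer-denoise-f16.gguf": 26_000_000,
}


def estimate_f16_mib(n_params: int) -> float:
    """Approximate size in MiB if every weight is a 2-byte F16 value (assumption)."""
    return n_params * 2 / (1024 * 1024)


for name, n_params in PARAMS.items():
    print(f"{name}: ~{estimate_f16_mib(n_params):.1f} MiB")
```

Under these assumptions the estimates for TBSRN, PAN, HAT, and Restormer land close to the listed sizes. The listed 38 MB for `dat-light-x2-f16.gguf` is much larger than its 830K parameters would suggest; the source does not explain the difference.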
TBSRN Telescope (text-line SR)
- Task: Enhance individual detected text lines before recognition
- Input: Text-line crop resized to 16x64 -> Output: 32x128 (2x)
- Source: PaddleOCR `sr_telescope` (Apache-2.0)
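The fixed input size means every detected text-line crop has to be resized before it reaches the network. The following is a minimal, dependency-free sketch of that step. It assumes "16x64" means height x width and uses nearest-neighbour sampling purely for brevity; the interpolation CrispEmbed actually applies is not stated here, and `resize_nearest` is a hypothetical helper.

```python
from typing import List

Gray = List[List[int]]  # grayscale image as rows: image[y][x]

LR_H, LR_W = 16, 64  # network input size (assumed to be height x width)
SCALE = 2            # TBSRN upscaling factor from the table


def resize_nearest(img: Gray, out_h: int, out_w: int) -> Gray:
    """Resize by picking the nearest source pixel for each output pixel."""
    in_h, in_w = len(img), len(img[0])
    return [
        [img[y * in_h // out_h][x * in_w // out_w] for x in range(out_w)]
        for y in range(out_h)
    ]


# A synthetic 37x200 crop standing in for a detected text line.
crop = [[(x + y) % 256 for x in range(200)] for y in range(37)]
lr = resize_nearest(crop, LR_H, LR_W)
sr_shape = (LR_H * SCALE, LR_W * SCALE)  # size of the super-resolved output
print(len(lr), len(lr[0]), sr_shape)
```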
PAN (whole-image 4x SR)
- Task: Upscale full document pages (rescues 75 dpi text)
- Input: Any RGB image (tiled) -> Output: 4x upscale
- Source: PaddleGAN `pan_x4` (Apache-2.0)
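"Tiled" processing can be pictured as splitting the page into overlapping boxes, upscaling each box, and placing the result at the scaled coordinates. The sketch below only computes the boxes. The tile size of 128 and overlap of 8 are hypothetical values chosen for illustration, not CrispEmbed's actual settings, and `tile_boxes` and `scale_box` are made-up names.

```python
from typing import Iterator, List, Tuple

Box = Tuple[int, int, int, int]  # (x0, y0, x1, y1), end-exclusive


def _starts(length: int, tile: int, step: int) -> List[int]:
    """Tile start offsets along one axis; stops once a tile reaches the edge."""
    pos = [0]
    while pos[-1] + tile < length:
        pos.append(pos[-1] + step)
    return pos


def tile_boxes(width: int, height: int, tile: int = 128, overlap: int = 8) -> Iterator[Box]:
    """Yield overlapping input boxes that together cover the whole image."""
    step = tile - overlap
    if step <= 0:
        raise ValueError("overlap must be smaller than tile")
    for y0 in _starts(height, tile, step):
        for x0 in _starts(width, tile, step):
            yield x0, y0, min(x0 + tile, width), min(y0 + tile, height)


def scale_box(box: Box, scale: int = 4) -> Box:
    """Map an input box to the region it occupies in the upscaled output."""
    x0, y0, x1, y1 = box
    return x0 * scale, y0 * scale, x1 * scale, y1 * scale


boxes = list(tile_boxes(300, 200))
print(len(boxes), boxes[-1], scale_box(boxes[-1]))
```

The overlap gives neighbouring tiles shared context, which is a common way to hide seams when the upscaled tiles are blended back together.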
HAT (hybrid attention transformer, 4x SR)
- Task: High-quality 4x upscaling (reported state of the art on multiple SR benchmarks at CVPR 2023)
- Input: Any RGB image (tiled) -> Output: 4x upscale
- Architecture: Swin Transformer + overlapping cross-attention + channel attention
- Source: XPixelGroup/HAT (MIT)
DAT-light (dual aggregation transformer, 2x SR)
- Task: High-quality 2x upscaling with dual spatial+channel attention
- Input: Any RGB image (tiled) -> Output: 2x upscale
- Architecture: Split-channel windowed spatial attention + L2-normalized transposed channel attention + AIM + SGFN
- Source: zhengchen1999/DAT (Apache-2.0)
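To make "L2-normalized transposed channel attention" concrete: instead of attending between pixels, the attention map is computed between channels (C x C), after each channel's query and key vectors are scaled to unit length. The pure-Python sketch below shows the idea for a single head. The fixed `temperature` stands in for what is typically a learned scale, and none of the names correspond to CrispEmbed or DAT source code.

```python
import math
from typing import List, Tuple

Matrix = List[List[float]]  # shape (channels, pixels)


def l2_normalize_rows(m: Matrix, eps: float = 1e-12) -> Matrix:
    """Scale each channel vector to unit L2 norm."""
    out = []
    for row in m:
        norm = math.sqrt(sum(v * v for v in row))
        out.append([v / max(norm, eps) for v in row])
    return out


def softmax(row: List[float]) -> List[float]:
    peak = max(row)
    exps = [math.exp(v - peak) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def channel_attention(q: Matrix, k: Matrix, v: Matrix,
                      temperature: float = 1.0) -> Tuple[Matrix, Matrix]:
    """Return (attn, out): attn is C x C, out has the same shape as v."""
    qn, kn = l2_normalize_rows(q), l2_normalize_rows(k)
    # (C x N) times (N x C): similarity between every pair of channels.
    attn = [
        softmax([temperature * sum(a * b for a, b in zip(qi, kj)) for kj in kn])
        for qi in qn
    ]
    # (C x C) times (C x N): each output channel is a mix of value channels.
    n_pix = len(v[0])
    out = [
        [sum(attn[i][j] * v[j][p] for j in range(len(v))) for p in range(n_pix)]
        for i in range(len(attn))
    ]
    return attn, out


q = [[1.0, 0.0, 2.0, 1.0], [0.5, 1.5, 0.0, 1.0], [2.0, 2.0, 1.0, 0.0]]
attn, out = channel_attention(q, q, q)
print(len(attn), len(attn[0]), len(out), len(out[0]))
```

Because the map is C x C rather than N x N, its cost does not grow quadratically with the number of pixels, which is the usual motivation for this form of attention on large images.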
Restormer (image denoising/restoration)
- Task: Remove noise from document scans
- Input: Any RGB image -> Output: Denoised (same size)
- Architecture: Multi-Dconv head transposed attention, U-Net encoder-decoder
- Source: swz30/Restormer (Apache-2.0)
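A U-Net style encoder-decoder halves the resolution at each downsampling stage, so inputs are commonly padded until both sides are divisible by the total downsampling factor. The sketch below assumes a factor of 8 (three downsampling stages, as in the reference Restormer). Whether CrispEmbed pads internally is not stated in this card, and `padded_size` is a hypothetical helper.

```python
from typing import Tuple


def padded_size(width: int, height: int, multiple: int = 8) -> Tuple[int, int, int, int]:
    """Return (new_width, new_height, pad_right, pad_bottom) for the given multiple."""
    pad_w = (-width) % multiple   # pixels to add on the right
    pad_h = (-height) % multiple  # pixels to add at the bottom
    return width + pad_w, height + pad_h, pad_w, pad_h


print(padded_size(1241, 1754))
```

After denoising, the padded rows and columns would be cropped away so the output matches the input size, as the "same size" note above implies.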
Parity Verification
All models pass the CrispEmbed diff harness, which compares the Python reference output with the C++ engine output by cosine similarity (cos_sim):
| Model | cos_sim | Status |
|-------|---------|--------|
| TBSRN | 0.999985 | PASS |
| PAN | 0.999654 | PASS |
| HAT | 0.999990 | PASS |
| DAT-light | 0.999956 | PASS |
| Restormer | 1.000000 | PASS |
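For readers unfamiliar with the metric, cosine similarity treats both outputs as flat vectors and measures the angle between them; a value of 1.0 means they point in exactly the same direction. The sketch below is a plain-Python version of the formula. The `PASS_THRESHOLD` value is hypothetical, since the harness's actual cutoff is not given here.

```python
import math
from typing import Sequence

PASS_THRESHOLD = 0.999  # hypothetical cutoff, not taken from the harness


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


reference = [0.10, 0.52, 0.98, 0.33]  # stand-in for the Python reference output
engine = [0.10, 0.52, 0.97, 0.33]     # stand-in for the C++ engine output
score = cosine_similarity(reference, engine)
print("PASS" if score >= PASS_THRESHOLD else "FAIL")
```

Note that cosine similarity ignores overall scale, so it is usually paired with a visual or absolute-error check when exact pixel values matter.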
Usage with CrispEmbed
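from crispembed import CrispPanSr, CrispDatSr

# `pixels`, `width`, and `height` describe the input image and are supplied by the caller.
# Each call returns the output buffer together with its width and height.

# PAN: 4x upscale
sr = CrispPanSr("pan-x4-f16.gguf")
out, ow, oh = sr.process(pixels, width, height)

# DAT: 2x upscale (higher quality)
sr = CrispDatSr("dat-light-x2-f16.gguf")
out, ow, oh = sr.process(pixels, width, height)

# CLI (writes the result to stdout; redirect it to a .ppm file)
crispembed --pan-model pan-x4-f16.gguf --pan-sr input.png > output.ppm
crispembed --dat-model dat-light-x2-f16.gguf --dat-sr input.png > output.ppm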
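Since the CLI examples redirect the result into a `.ppm` file, it can be handy to read that file back without extra dependencies. The sketch below parses a binary PPM, assuming the tool emits the common "P6" variant with 8-bit channels; the card does not state which PPM variant is written. `parse_ppm` is an illustrative helper, not part of CrispEmbed.

```python
from typing import Tuple


def parse_ppm(data: bytes) -> Tuple[int, int, bytes]:
    """Parse a binary PPM (P6, maxval <= 255) into (width, height, rgb_bytes)."""
    tokens = []
    pos = 0
    while len(tokens) < 4:  # magic, width, height, maxval
        while data[pos:pos + 1].isspace():
            pos += 1
        if data[pos:pos + 1] == b"#":  # header comment runs to end of line
            while data[pos:pos + 1] not in (b"\n", b""):
                pos += 1
            continue
        start = pos
        while pos < len(data) and not data[pos:pos + 1].isspace():
            pos += 1
        if start == pos:
            raise ValueError("truncated PPM header")
        tokens.append(data[start:pos])
    magic = tokens[0]
    width, height, maxval = int(tokens[1]), int(tokens[2]), int(tokens[3])
    if magic != b"P6" or maxval > 255:
        raise ValueError("only 8-bit binary PPM (P6) is handled in this sketch")
    # Exactly one whitespace byte separates the header from the raster.
    raster = data[pos + 1:pos + 1 + width * height * 3]
    if len(raster) != width * height * 3:
        raise ValueError("PPM raster is shorter than the header claims")
    return width, height, raster


# Tiny in-memory example: a 2x1 image with one red and one green pixel.
sample = b"P6\n# demo\n2 1\n255\n" + bytes([255, 0, 0, 0, 255, 0])
w, h, rgb = parse_ppm(sample)
print(w, h, len(rgb))
```

For a real run, `parse_ppm(open("output.ppm", "rb").read())` should report a width and height that are 4x the input for PAN and 2x for DAT, if the assumptions above hold.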
License
Apache-2.0 for all models except HAT (MIT). Both licenses are permissive.
Source: Hugging Face