Model Intelligence Sheet
ZuzeTt/LFM2-VL-3B-heretic-Imatrix-GGUF overview
Comprehensive model page for ZuzeTt/LFM2-VL-3B-heretic-Imatrix-GGUF
Downloads
4,727
Likes
3
Pipeline
image-text-to-text
Library
transformers
Visibility
Public
Access
Open
Repository Files & Downloads
27 files detected
Direct downloads for all repository files
| File | Type | Quantization | Size | Link |
|---|---|---|---|---|
| LFM2-VL-3B-heretic-f16.gguf | GGUF | F16 | 4.79 GB | Download |
| LFM2-VL-3B-heretic-imatrix-IQ1_M.gguf | GGUF | IQ1_M | 608.23 MB | Download |
| LFM2-VL-3B-heretic-imatrix-IQ1_S.gguf | GGUF | IQ1_S | 556.22 MB | Download |
| LFM2-VL-3B-heretic-imatrix-IQ2_M.gguf | GGUF | IQ2_M | 846.31 MB | Download |
| LFM2-VL-3B-heretic-imatrix-IQ2_S.gguf | GGUF | IQ2_S | 776.97 MB | Download |
| LFM2-VL-3B-heretic-imatrix-IQ2_XS.gguf | GGUF | IQ2_XS | 765.25 MB | Download |
| LFM2-VL-3B-heretic-imatrix-IQ2_XXS.gguf | GGUF | IQ2_XXS | 694.91 MB | Download |
| LFM2-VL-3B-heretic-imatrix-IQ3_M.gguf | GGUF | IQ3_M | 1.09 GB | Download |
| LFM2-VL-3B-heretic-imatrix-IQ3_S.gguf | GGUF | IQ3_S | 1.08 GB | Download |
| LFM2-VL-3B-heretic-imatrix-IQ3_XS.gguf | GGUF | IQ3_XS | 1.03 GB | Download |
| LFM2-VL-3B-heretic-imatrix-IQ3_XXS.gguf | GGUF | IQ3_XXS | 979.17 MB | Download |
| LFM2-VL-3B-heretic-imatrix-IQ4_NL.gguf | GGUF | IQ4_NL | 1.38 GB | Download |
| LFM2-VL-3B-heretic-imatrix-IQ4_XS.gguf | GGUF | IQ4_XS | 1.31 GB | Download |
| LFM2-VL-3B-heretic-imatrix-Q2_K.gguf | GGUF | Q2_K | 938.23 MB | Download |
| LFM2-VL-3B-heretic-imatrix-Q3_K_L.gguf | GGUF | Q3_K_L | 1.25 GB | Download |
| LFM2-VL-3B-heretic-imatrix-Q3_K_M.gguf | GGUF | Q3_K_M | 1.17 GB | Download |
| LFM2-VL-3B-heretic-imatrix-Q3_K_S.gguf | GGUF | Q3_K_S | 1.08 GB | Download |
| LFM2-VL-3B-heretic-imatrix-Q4_0.gguf | GGUF | Q4_0 | 1.39 GB | Download |
| LFM2-VL-3B-heretic-imatrix-Q4_1.gguf | GGUF | Q4_1 | 1.52 GB | Download |
| LFM2-VL-3B-heretic-imatrix-Q4_K_M.gguf | GGUF | Q4_K_M | 1.46 GB | Download |
| LFM2-VL-3B-heretic-imatrix-Q4_K_S.gguf | GGUF | Q4_K_S | 1.39 GB | Download |
| LFM2-VL-3B-heretic-imatrix-Q5_0.gguf | GGUF | Q5_0 | 1.67 GB | Download |
| LFM2-VL-3B-heretic-imatrix-Q5_K_M.gguf | GGUF | Q5_K_M | 1.70 GB | Download |
| LFM2-VL-3B-heretic-imatrix-Q5_K_S.gguf | GGUF | Q5_K_S | 1.66 GB | Download |
| LFM2-VL-3B-heretic-imatrix-Q6_K.gguf | GGUF | Q6_K | 1.97 GB | Download |
| LFM2-VL-3B-heretic-imatrix-Q8_0.gguf | GGUF | Q8_0 | 2.55 GB | Download |
| LFM2-VL-3B-mmproj-f16.gguf | GGUF | F16 | 820.97 MB | Download |
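Choosing a file from this table can be sketched in a few lines of Python. This is a minimal illustration, not part of the repository: the helper names `pick_quant` and `download_url` and the memory budget are hypothetical, the sizes are copied from the table (treating 1 GB as 1024 MB, which is an assumption), and the URL assumes the default `main` branch with the usual Hugging Face `resolve` path. The `mmproj` file holds the vision projector and is typically loaded together with the language-model GGUF, so the sketch counts its size against the budget.

```python
from typing import Optional

# Repository ID as reported in the source payload on this page.
REPO_ID = "ZuzeTt/LFM2-VL-3B-heretic-Imatrix-GGUF"

# Size of the vision projector file in MB, copied from the table.
MMPROJ_MB = 820.97

# A subset of the table: quantization name -> size in MB.
# GB values are converted with 1 GB = 1024 MB (assumption).
QUANTS_MB = {
    "IQ2_M": 846.31,
    "IQ3_M": 1.09 * 1024,
    "Q4_K_M": 1.46 * 1024,
    "Q5_K_M": 1.70 * 1024,
    "Q6_K": 1.97 * 1024,
    "Q8_0": 2.55 * 1024,
}


def pick_quant(budget_mb: float) -> Optional[str]:
    """Return the largest listed quantization that fits the budget
    together with the mmproj file, or None if nothing fits."""
    fitting = {
        name: size
        for name, size in QUANTS_MB.items()
        if size + MMPROJ_MB <= budget_mb
    }
    if not fitting:
        return None
    return max(fitting, key=fitting.get)


def download_url(quant: str) -> str:
    """Build a direct download URL (assumes the 'main' branch)."""
    filename = f"LFM2-VL-3B-heretic-imatrix-{quant}.gguf"
    return f"https://huggingface.co/{REPO_ID}/resolve/main/{filename}"


choice = pick_quant(2560)  # hypothetical budget of 2.5 GB
if choice is not None:
    print(choice, download_url(choice))
```

File size is only a lower bound on memory use; the context cache and runtime buffers add to it, so a real budget should leave headroom.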
Model Details
Metadata Inspector
Normalized metadata (stored in metadata_json)
{
"metadata": {},
"card_data": {
"library_name": "transformers",
"license": "other",
"license_name": "lfm1.0",
"license_link": "LICENSE",
"language": [
"en",
"ja",
"fr",
"es",
"de",
"it",
"pt",
"ar",
"zh",
"ko"
],
"pipeline_tag": "image-text-to-text",
"tags": [
"liquid",
"lfm2",
"lfm2-vl",
"edge",
"heretic",
"uncensored",
"decensored",
"abliterated"
],
"base_model": [
"LiquidAI/LFM2-VL-3B"
],
"frontmatter": {
"library_name": "transformers",
"license": "other",
"license_name": "lfm1.0",
"license_link": "LICENSE",
"language": [
"en",
"ja",
"fr",
"es",
"de",
"it",
"pt",
"ar",
"zh",
"ko"
],
"pipeline_tag": "image-text-to-text",
"tags": [
"liquid",
"lfm2",
"lfm2-vl",
"edge",
"heretic",
"uncensored",
"decensored",
"abliterated"
],
"base_model": [
"LiquidAI/LFM2-VL-3B"
]
},
"hero_image_url": "https://cdn-uploads.huggingface.co/production/uploads/61b8e2ba285851687028d395/7_6D7rWrLxp2hb6OHSV1p.png",
"summary": "",
"quick_links": [],
"benchmark_table_html": "",
"readme_markdown": "---\nlibrary_name: transformers\nlicense: other\nlicense_name: lfm1.0\nlicense_link: LICENSE\nlanguage:\n- en\n- ja\n- fr\n- es\n- de\n- it\n- pt\n- ar\n- zh\n- ko\npipeline_tag: image-text-to-text\ntags:\n- liquid\n- lfm2\n- lfm2-vl\n- edge\n- heretic\n- uncensored\n- decensored\n- abliterated\nbase_model:\n- LiquidAI/LFM2-VL-3B\n---\n\nOriginal model: https://huggingface.co/pszemraj/LFM2-VL-3B-heretic\n\n# This is a decensored version of [LiquidAI/LFM2-VL-3B](https://huggingface.co/LiquidAI/LFM2-VL-3B), made using [Heretic](https://github.com/p-e-w/heretic) v1.0.1\n\n## Abliteration parameters\n\n| Parameter | Value |\n| :-------- | :---: |\n| **direction_index** | per layer |\n| **attn.o_proj.max_weight** | 1.78 |\n| **attn.o_proj.max_weight_position** | 20.88 |\n| **attn.o_proj.min_weight** | 1.52 |\n| **attn.o_proj.min_weight_distance** | 12.07 |\n| **conv.out_proj.max_weight** | 1.01 |\n| **conv.out_proj.max_weight_position** | 21.66 |\n| **conv.out_proj.min_weight** | 0.13 |\n| **conv.out_proj.min_weight_distance** | 4.90 |\n| **mlp.down_proj.max_weight** | 1.16 |\n| **mlp.down_proj.max_weight_position** | 20.83 |\n| **mlp.down_proj.min_weight** | 0.29 |\n| **mlp.down_proj.min_weight_distance** | 1.03 |\n\n## Performance\n\n| Metric | This model | Original model ([LiquidAI/LFM2-VL-3B](https://huggingface.co/LiquidAI/LFM2-VL-3B)) |\n| :----- | :--------: | :---------------------------: |\n| **KL divergence** | 0.02 | 0 *(by definition)* |\n| **Refusals** | 4/100 | 87/100 |\n\n\n```\n? Which trial do you want to use? 
(Use arrow keys)\n » [Trial 251] Refusals: 0/100, KL divergence: 0.08\n [Trial 386] Refusals: 1/100, KL divergence: 0.03\n [Trial 277] Refusals: 2/100, KL divergence: 0.03\n [Trial 389] Refusals: 3/100, KL divergence: 0.03\n -->[Trial 323] Refusals: 4/100, KL divergence: 0.02<--\n [Trial 324] Refusals: 6/100, KL divergence: 0.02\n [Trial 220] Refusals: 7/100, KL divergence: 0.02\n [Trial 357] Refusals: 8/100, KL divergence: 0.02\n [Trial 316] Refusals: 10/100, KL divergence: 0.01\n [Trial 230] Refusals: 12/100, KL divergence: 0.01\n [Trial 234] Refusals: 18/100, KL divergence: 0.01\n [Trial 379] Refusals: 27/100, KL divergence: 0.01\n [Trial 336] Refusals: 34/100, KL divergence: 0.01\n [Trial 345] Refusals: 35/100, KL divergence: 0.01\n [Trial 248] Refusals: 40/100, KL divergence: 0.01\n [Trial 398] Refusals: 60/100, KL divergence: 0.00\n [Trial 380] Refusals: 64/100, KL divergence: 0.00\n [Trial 363] Refusals: 66/100, KL divergence: 0.00\n [Trial 155] Refusals: 69/100, KL divergence: 0.00\n [Trial 310] Refusals: 70/100, KL divergence: 0.00\n```\n\n\n-----\n\n\n<center>\n<div style=\"text-align: center;\">\n <img \n src=\"https://cdn-uploads.huggingface.co/production/uploads/61b8e2ba285851687028d395/7_6D7rWrLxp2hb6OHSV1p.png\" \n alt=\"Liquid AI\"\n style=\"width: 100%; max-width: 66%; height: auto; display: inline-block; margin-bottom: 0.5em; margin-top: 0.5em;\"\n />\n</div>\n</center>\n\n# LFM2‑VL\n\n**LFM2-VL-3B** is the newest and most capable model in [Liquid AI](https://www.liquid.ai/)'s multimodal **LFM2-VL** series, designed to process text and images with variable resolutions. \nBuilt on the [LFM2](https://huggingface.co/collections/LiquidAI/lfm2-686d721927015b2ad73eaa38) backbone, it extends the architecture for higher-capacity reasoning and stronger visual understanding while retaining efficiency. 
\n\nWe are releasing the weights of the new [3B](https://huggingface.co/LiquidAI/LFM2-VL-3B) checkpoint—offering higher performance across benchmarks while remaining optimized for scalable deployment.\n\n* **Competitive multimodal performance** among lightweight open models.\n* **Enhanced visual understanding and reasoning**, particularly on fine-grained perception tasks\n* **Retains efficient inference** with the same flexible architecture and user-tunable speed-quality tradeoffs \n* **Processes native resolutions up to 512×512** with intelligent patch-based handling for larger inputs \n\nFor more details, see the [LFM2-VL-3B post](https://www.liquid.ai/blog/lfm2-vl-3b-a-new-efficient-vision-language-for-the-edge) and the [LFM2 blog post](https://www.liquid.ai/blog/liquid-foundation-models-v2-our-second-series-of-generative-ai-models).\n\n## 📄 Model details\n\nDue to their small size, **we recommend fine-tuning LFM2-VL models on narrow use cases** to maximize performance. \nThey were trained for instruction following and lightweight agentic flows. 
\nNot intended for safety‑critical decisions.\n\n| Property | [**LFM2-VL-450M**](https://huggingface.co/LiquidAI/LFM2-VL-450M) | [**LFM2-VL-1.6B**](https://huggingface.co/LiquidAI/LFM2-VL-1.6B) | [**LFM2-VL-3B**](https://huggingface.co/LiquidAI/LFM2-VL-3B) |\n|---|---:|---:|---:|\n| **Parameters (LM only)** | 350M | 1.2B | 2.6B |\n| **Vision encoder** | SigLIP2 NaFlex base (86M) | SigLIP2 NaFlex shape-optimized (400M) | SigLIP2 NaFlex large (400M) |\n| **Backbone layers** | hybrid conv+attention | hybrid conv+attention | hybrid conv+attention |\n| **Context (text)** | 32,768 tokens | 32,768 tokens | 32,768 tokens |\n| **Image tokens** | dynamic, user-tunable | dynamic, user-tunable | dynamic, user-tunable |\n| **Vocab size** | 65,536 | 65,536 | 65,536 |\n| **Precision** | bfloat16 | bfloat16 | bfloat16 |\n| **License** | LFM Open License v1.0 | LFM Open License v1.0 | LFM Open License v1.0 |\n\n**Supported languages:** English\n\n**Generation parameters**: We recommend the following parameters:\n- Text: `temperature=0.1`, `min_p=0.15`, `repetition_penalty=1.05`\n- Vision: `min_image_tokens=64` `max_image_tokens=256`, `do_image_splitting=True`\n\n**Chat template**: LFM2-VL uses a ChatML-like chat template as follows: \n\n```\n<|startoftext|><|im_start|>system\nYou are a helpful multimodal assistant by Liquid AI.<|im_end|>\n<|im_start|>user\n<image>Describe this image.<|im_end|>\n<|im_start|>assistant\nThis image shows a Caenorhabditis elegans (C. 
elegans) nematode.<|im_end|>\n```\n\nImages are referenced with a sentinel (`<image>`), which is automatically replaced with the image tokens by the processor.\n\nYou can apply it using the dedicated [`.apply_chat_template()`](https://huggingface.co/docs/transformers/en/chat_templating#applychattemplate) function from Hugging Face transformers.\n\n**Architecture**\n- **Hybrid backbone**: Language model tower (LFM2-2.6B) paired with SigLIP2 NaFlex vision encoders (400M shape-optimized)\n- **Native resolution processing**: Handles images up to 512×512 pixels without upscaling and preserves non-standard aspect ratios without distortion\n- **Tiling strategy**: Splits large images into non-overlapping 512×512 patches and includes thumbnail encoding for global context\n- **Efficient token mapping**: 2-layer MLP connector with pixel unshuffle reduces image tokens (e.g., 256×384 image → 96 tokens, 1000×3000 → 1,020 tokens)\n- **Inference-time flexibility**: User-tunable maximum image tokens and patch count for speed/quality tradeoff without retraining\n\n**Training approach**\n- Builds on the LFM2 base model with joint mid-training that fuses vision and language capabilities using a gradually adjusted text-to-image ratio\n- Applies joint SFT with emphasis on image understanding and vision tasks\n- Leverages large-scale open-source datasets combined with in-house synthetic vision data, selected for balanced task coverage\n- Follows a progressive training strategy: base model → joint mid-training → supervised fine-tuning\n\n## 🏃 How to run LFM2-VL\n\nYou can run LFM2-VL with Hugging Face [`transformers`](https://github.com/huggingface/transformers) via installing Transformers from source as follows:\n\n```bash\npip install git+https://github.com/huggingface/transformers.git@87be5595081364ef99393feeaa60d71db3652679 pillow\n```\n\nHere is an example of how to generate an answer with transformers in Python:\n\n```python\nfrom transformers import AutoProcessor, 
AutoModelForImageTextToText\nfrom transformers.image_utils import load_image\n\n# Load model and processor\nmodel_id = \"LiquidAI/LFM2-VL-3B\"\nmodel = AutoModelForImageTextToText.from_pretrained(\n model_id,\n device_map=\"auto\",\n dtype=\"bfloat16\"\n)\nprocessor = AutoProcessor.from_pretrained(model_id)\n\n# Load image and create conversation\nurl = \"https://www.ilankelman.org/stopsigns/australia.jpg\"\nimage = load_image(url)\nconversation = [\n {\n \"role\": \"user\",\n \"content\": [\n {\"type\": \"image\", \"image\": image},\n {\"type\": \"text\", \"text\": \"What is in this image?\"},\n ],\n },\n]\n\n# Generate Answer\ninputs = processor.apply_chat_template(\n conversation,\n add_generation_prompt=True,\n return_tensors=\"pt\",\n return_dict=True,\n tokenize=True,\n).to(model.device)\noutputs = model.generate(**inputs, max_new_tokens=64)\nprocessor.batch_decode(outputs, skip_special_tokens=True)[0]\n\n# This image captures a vibrant street scene in a Chinatown area. The focal point is a large red Chinese archway with gold and black accents, adorned with Chinese characters. Flanking the archway are two white stone lion statues, which are traditional guardians in Chinese culture.\n```\n\nYou can directly run and test the model with this [Colab notebook](https://colab.research.google.com/drive/11EMJhcVB6OTEuv--OePyGK86k-38WU3q?usp=sharing).\n\n\n## 🔧 How to fine-tune\n\nWe recommend fine-tuning LFM2-VL models on your use cases to maximize performance.\n\n| Notebook | Description | Link |\n|-----------|----------------------------------------------------------------------|------|\n| SFT (TRL) | Supervised Fine-Tuning (SFT) notebook with a LoRA adapter using TRL. 
| <a href=\"https://colab.research.google.com/drive/1csXCLwJx7wI7aruudBp6ZIcnqfv8EMYN?usp=sharing\"><img src=\"https://cdn-uploads.huggingface.co/production/uploads/61b8e2ba285851687028d395/vlOyMEjwHa_b_LXysEu2E.png\" width=\"110\" alt=\"Colab link\"></a> |\n\n\n## 📈 Performance\n\n| Model | Average | MMStar | RealWorldQA | MM-IFEval | BLINK | MMBench (dev en) | OCRBench | POPE |\n|-------------------|----------|--------|--------------|------------|--------|------------------|-----------|-------|\n| InternVL3_5-2B | 66.50 | 57.67 | 60.78 | 47.31 | 50.97 | 78.18 | 834.00 | 87.17 |\n| Qwen2.5-VL-3B | 65.42 | 56.13 | 65.23 | 38.62 | 48.97 | 80.41 | 824.00 | 86.17 |\n| InternVL3-2B | 67.44 | 61.10 | 65.10 | 38.49 | 53.10 | 81.10 | 831.00 | 90.10 |\n| SmolVLM2-2.2B | 56.01 | 46.00 | 57.50 | 19.42 | 42.30 | 69.24 | 725.00 | 85.10 |\n| LFM2-VL-3B | 69.00 | 57.73 | 71.37 | 51.83 | 51.03 | 79.81 | 822.00 | 89.01 |\n\nMore benchmark scores are reported in our [LFM2-VL-3B post](https://www.liquid.ai/blog/lfm2-vl-3b-a-new-efficient-vision-language-for-the-edge). We obtained the scores for competitive models using VLMEvalKit. Qwen3-VL-2B is not listed in the results table, as its release occurred the day before.\n\n## 📬 Contact\n\nIf you are interested in custom solutions with edge deployment, please contact [our sales team](https://www.liquid.ai/contact).",
"related_quantizations": []
},
"tags": [
"transformers",
"gguf",
"liquid",
"lfm2",
"lfm2-vl",
"edge",
"heretic",
"uncensored",
"decensored",
"abliterated",
"image-text-to-text",
"en",
"ja",
"fr",
"es",
"de",
"it",
"pt",
"ar",
"zh",
"ko",
"base_model:LiquidAI/LFM2-VL-3B",
"base_model:quantized:LiquidAI/LFM2-VL-3B",
"license:other",
"endpoints_compatible",
"region:us",
"conversational"
],
"likes": 3,
"downloads": 4727,
"gated": false,
"private": false,
"last_modified": "2026-02-23T17:55:33.000Z",
"created_at": "2026-01-01T14:06:40.000Z",
"pipeline_tag": "image-text-to-text",
"library_name": "transformers"
}
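The README stored in the metadata above describes a ChatML-like chat template. As an illustration only, the sketch below assembles a single-turn prompt by hand from the special tokens shown in that README. `build_prompt` is a hypothetical helper, not part of any library; in practice the processor's `apply_chat_template()` produces this layout and replaces the `<image>` sentinel with the actual image tokens.

```python
def build_prompt(system: str, user_text: str, with_image: bool = True) -> str:
    """Assemble a single-turn prompt in the ChatML-like layout
    shown in the README (hand-rolled sketch, for illustration)."""
    # The <image> sentinel marks where the processor would insert image tokens.
    image = "<image>" if with_image else ""
    return (
        "<|startoftext|>"
        f"<|im_start|>system\n{system}<|im_end|>\n"
        f"<|im_start|>user\n{image}{user_text}<|im_end|>\n"
        # Leave the assistant turn open so the model generates the reply.
        "<|im_start|>assistant\n"
    )


prompt = build_prompt(
    "You are a helpful multimodal assistant by Liquid AI.",
    "Describe this image.",
)
print(prompt)
```

Building the string manually is mainly useful for inspecting what the model sees; the sentinel is left as plain text here and is not expanded into image tokens.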
Source payload excerpt (from Hugging Face API)
{
"_id": "69567f7027b09753439ef072",
"id": "ZuzeTt/LFM2-VL-3B-heretic-Imatrix-GGUF",
"modelId": "ZuzeTt/LFM2-VL-3B-heretic-Imatrix-GGUF",
"sha": "83cc45e6d1dae26597197f1a1fd600820e644036",
"createdAt": "2026-01-01T14:06:40.000Z",
"lastModified": "2026-02-23T17:55:33.000Z",
"author": "ZuzeTt",
"downloads": 4727,
"likes": 3,
"gated": false,
"private": false,
"pipeline_tag": "image-text-to-text",
"library_name": "transformers",
"siblings_count": 29
}