GraySoft
Model Intelligence Sheet

zuzett/lfm2-vl-3b-heretic-imatrix-gguf overview

Comprehensive model page for zuzett/lfm2-vl-3b-heretic-imatrix-gguf

transformers, gguf, liquid, lfm2, lfm2-vl, edge, heretic, uncensored, decensored, abliterated, image-text-to-text, en, ja, fr, es, de, it, pt, ar, zh, ko, base_model:LiquidAI/LFM2-VL-3B, base_model:quantized:LiquidAI/LFM2-VL-3B, license:other, endpoints_compatible, region:us, conversational
Downloads
4,727
Likes
3
Pipeline
image-text-to-text
Library
transformers
Visibility
Public
Access
Open

Repository Files & Downloads

27 files detected
Direct downloads for all repository files
File Type Quantization Size Link
LFM2-VL-3B-heretic-f16.gguf GGUF F16 4.79 GB Download
LFM2-VL-3B-heretic-imatrix-IQ1_M.gguf GGUF IQ1_M 608.23 MB Download
LFM2-VL-3B-heretic-imatrix-IQ1_S.gguf GGUF IQ1_S 556.22 MB Download
LFM2-VL-3B-heretic-imatrix-IQ2_M.gguf GGUF IQ2_M 846.31 MB Download
LFM2-VL-3B-heretic-imatrix-IQ2_S.gguf GGUF IQ2_S 776.97 MB Download
LFM2-VL-3B-heretic-imatrix-IQ2_XS.gguf GGUF IQ2_XS 765.25 MB Download
LFM2-VL-3B-heretic-imatrix-IQ2_XXS.gguf GGUF IQ2_XXS 694.91 MB Download
LFM2-VL-3B-heretic-imatrix-IQ3_M.gguf GGUF IQ3_M 1.09 GB Download
LFM2-VL-3B-heretic-imatrix-IQ3_S.gguf GGUF IQ3_S 1.08 GB Download
LFM2-VL-3B-heretic-imatrix-IQ3_XS.gguf GGUF IQ3_XS 1.03 GB Download
LFM2-VL-3B-heretic-imatrix-IQ3_XXS.gguf GGUF IQ3_XXS 979.17 MB Download
LFM2-VL-3B-heretic-imatrix-IQ4_NL.gguf GGUF IQ4_NL 1.38 GB Download
LFM2-VL-3B-heretic-imatrix-IQ4_XS.gguf GGUF IQ4_XS 1.31 GB Download
LFM2-VL-3B-heretic-imatrix-Q2_K.gguf GGUF Q2_K 938.23 MB Download
LFM2-VL-3B-heretic-imatrix-Q3_K_L.gguf GGUF Q3_K_L 1.25 GB Download
LFM2-VL-3B-heretic-imatrix-Q3_K_M.gguf GGUF Q3_K_M 1.17 GB Download
LFM2-VL-3B-heretic-imatrix-Q3_K_S.gguf GGUF Q3_K_S 1.08 GB Download
LFM2-VL-3B-heretic-imatrix-Q4_0.gguf GGUF Q4_0 1.39 GB Download
LFM2-VL-3B-heretic-imatrix-Q4_1.gguf GGUF Q4_1 1.52 GB Download
LFM2-VL-3B-heretic-imatrix-Q4_K_M.gguf GGUF Q4_K_M 1.46 GB Download
LFM2-VL-3B-heretic-imatrix-Q4_K_S.gguf GGUF Q4_K_S 1.39 GB Download
LFM2-VL-3B-heretic-imatrix-Q5_0.gguf GGUF Q5_0 1.67 GB Download
LFM2-VL-3B-heretic-imatrix-Q5_K_M.gguf GGUF Q5_K_M 1.70 GB Download
LFM2-VL-3B-heretic-imatrix-Q5_K_S.gguf GGUF Q5_K_S 1.66 GB Download
LFM2-VL-3B-heretic-imatrix-Q6_K.gguf GGUF Q6_K 1.97 GB Download
LFM2-VL-3B-heretic-imatrix-Q8_0.gguf GGUF Q8_0 2.55 GB Download
LFM2-VL-3B-mmproj-f16.gguf GGUF F16 820.97 MB Download
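
The table lists one file per quantization level plus a separate vision projector (LFM2-VL-3B-mmproj-f16.gguf). As a rough sketch of how one might choose among them, the Python below hard-codes a few sizes copied from the table, estimates effective bits per weight relative to the F16 file, and picks the largest quant that fits a size budget. The helper names, the 16-bits-per-weight baseline, the MB-to-GB conversion, and the idea that the projector is loaded alongside the language model for image input are assumptions made for illustration; the estimate also ignores GGUF metadata and any tensors kept at higher precision.

```python
from typing import Optional

# Sizes in GB, copied from the file table above (subset, for brevity).
QUANT_SIZES_GB = {
    "F16": 4.79,
    "Q8_0": 2.55,
    "Q6_K": 1.97,
    "Q5_K_M": 1.70,
    "Q4_K_M": 1.46,
    "IQ4_XS": 1.31,
    "Q3_K_M": 1.17,
    "IQ3_M": 1.09,
    "IQ3_XS": 1.03,
}

# LFM2-VL-3B-mmproj-f16.gguf is listed as 820.97 MB; rounded here,
# assuming 1 GB = 1000 MB (the page does not state its unit convention).
MMPROJ_GB = 0.82


def effective_bits(quant: str, baseline_bits: float = 16.0) -> float:
    """Rough bits per weight: scale 16 bits by the size ratio to the F16 file."""
    return baseline_bits * QUANT_SIZES_GB[quant] / QUANT_SIZES_GB["F16"]


def pick_quant(budget_gb: float, include_mmproj: bool = True) -> Optional[str]:
    """Return the largest listed quant whose file(s) fit within budget_gb."""
    overhead = MMPROJ_GB if include_mmproj else 0.0
    fitting = {q: s for q, s in QUANT_SIZES_GB.items() if s + overhead <= budget_gb}
    return max(fitting, key=fitting.get) if fitting else None


if __name__ == "__main__":
    choice = pick_quant(2.5)
    print(choice, round(effective_bits(choice), 2))
```

The budget here covers file size only; runtime memory for context and image tokens comes on top and is not modeled.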

Model Details

Model Slug
zuzett/lfm2-vl-3b-heretic-imatrix-gguf
Author
ZuzeTt
Pipeline Task
image-text-to-text
Library
transformers
Created
2026-01-01
Last Modified
2026-02-23
Gated
No
Private
No
HF SHA
83cc45e6d1dae26597197f1a1fd600820e644036
License
other
Language
en, ja, fr, es, de, it, pt, ar, zh, ko
Base Model
LiquidAI/LFM2-VL-3B

Metadata Inspector

Normalized metadata (stored in metadata_json)
{
  "metadata": {},
  "card_data": {
    "library_name": "transformers",
    "license": "other",
    "license_name": "lfm1.0",
    "license_link": "LICENSE",
    "language": [
      "en",
      "ja",
      "fr",
      "es",
      "de",
      "it",
      "pt",
      "ar",
      "zh",
      "ko"
    ],
    "pipeline_tag": "image-text-to-text",
    "tags": [
      "liquid",
      "lfm2",
      "lfm2-vl",
      "edge",
      "heretic",
      "uncensored",
      "decensored",
      "abliterated"
    ],
    "base_model": [
      "LiquidAI/LFM2-VL-3B"
    ],
    "frontmatter": {
      "library_name": "transformers",
      "license": "other",
      "license_name": "lfm1.0",
      "license_link": "LICENSE",
      "language": [
        "en",
        "ja",
        "fr",
        "es",
        "de",
        "it",
        "pt",
        "ar",
        "zh",
        "ko"
      ],
      "pipeline_tag": "image-text-to-text",
      "tags": [
        "liquid",
        "lfm2",
        "lfm2-vl",
        "edge",
        "heretic",
        "uncensored",
        "decensored",
        "abliterated"
      ],
      "base_model": [
        "LiquidAI/LFM2-VL-3B"
      ]
    },
    "hero_image_url": "https://cdn-uploads.huggingface.co/production/uploads/61b8e2ba285851687028d395/7_6D7rWrLxp2hb6OHSV1p.png",
    "summary": "",
    "quick_links": [],
    "benchmark_table_html": "",
    "readme_markdown": "---\nlibrary_name: transformers\nlicense: other\nlicense_name: lfm1.0\nlicense_link: LICENSE\nlanguage:\n- en\n- ja\n- fr\n- es\n- de\n- it\n- pt\n- ar\n- zh\n- ko\npipeline_tag: image-text-to-text\ntags:\n- liquid\n- lfm2\n- lfm2-vl\n- edge\n- heretic\n- uncensored\n- decensored\n- abliterated\nbase_model:\n- LiquidAI/LFM2-VL-3B\n---\n\nOrigianl model: https://huggingface.co/pszemraj/LFM2-VL-3B-heretic\n\n# This is a decensored version of [LiquidAI/LFM2-VL-3B](https://huggingface.co/LiquidAI/LFM2-VL-3B), made using [Heretic](https://github.com/p-e-w/heretic) v1.0.1\n\n## Abliteration parameters\n\n| Parameter | Value |\n| :-------- | :---: |\n| **direction_index** | per layer |\n| **attn.o_proj.max_weight** | 1.78 |\n| **attn.o_proj.max_weight_position** | 20.88 |\n| **attn.o_proj.min_weight** | 1.52 |\n| **attn.o_proj.min_weight_distance** | 12.07 |\n| **conv.out_proj.max_weight** | 1.01 |\n| **conv.out_proj.max_weight_position** | 21.66 |\n| **conv.out_proj.min_weight** | 0.13 |\n| **conv.out_proj.min_weight_distance** | 4.90 |\n| **mlp.down_proj.max_weight** | 1.16 |\n| **mlp.down_proj.max_weight_position** | 20.83 |\n| **mlp.down_proj.min_weight** | 0.29 |\n| **mlp.down_proj.min_weight_distance** | 1.03 |\n\n## Performance\n\n| Metric | This model | Original model ([LiquidAI/LFM2-VL-3B](https://huggingface.co/LiquidAI/LFM2-VL-3B)) |\n| :----- | :--------: | :---------------------------: |\n| **KL divergence** | 0.02 | 0 *(by definition)* |\n| **Refusals** | 4/100 | 87/100 |\n\n\n```\n? Which trial do you want to use? 
(Use arrow keys)\n » [Trial 251] Refusals:  0/100, KL divergence: 0.08\n   [Trial 386] Refusals:  1/100, KL divergence: 0.03\n   [Trial 277] Refusals:  2/100, KL divergence: 0.03\n   [Trial 389] Refusals:  3/100, KL divergence: 0.03\n   -->[Trial 323] Refusals:  4/100, KL divergence: 0.02<--\n   [Trial 324] Refusals:  6/100, KL divergence: 0.02\n   [Trial 220] Refusals:  7/100, KL divergence: 0.02\n   [Trial 357] Refusals:  8/100, KL divergence: 0.02\n   [Trial 316] Refusals: 10/100, KL divergence: 0.01\n   [Trial 230] Refusals: 12/100, KL divergence: 0.01\n   [Trial 234] Refusals: 18/100, KL divergence: 0.01\n   [Trial 379] Refusals: 27/100, KL divergence: 0.01\n   [Trial 336] Refusals: 34/100, KL divergence: 0.01\n   [Trial 345] Refusals: 35/100, KL divergence: 0.01\n   [Trial 248] Refusals: 40/100, KL divergence: 0.01\n   [Trial 398] Refusals: 60/100, KL divergence: 0.00\n   [Trial 380] Refusals: 64/100, KL divergence: 0.00\n   [Trial 363] Refusals: 66/100, KL divergence: 0.00\n   [Trial 155] Refusals: 69/100, KL divergence: 0.00\n   [Trial 310] Refusals: 70/100, KL divergence: 0.00\n```\n\n\n-----\n\n\n<center>\n<div style=\"text-align: center;\">\n  <img \n    src=\"https://cdn-uploads.huggingface.co/production/uploads/61b8e2ba285851687028d395/7_6D7rWrLxp2hb6OHSV1p.png\" \n    alt=\"Liquid AI\"\n    style=\"width: 100%; max-width: 66%; height: auto; display: inline-block; margin-bottom: 0.5em; margin-top: 0.5em;\"\n  />\n</div>\n</center>\n\n# LFM2‑VL\n\n**LFM2-VL-3B** is the newest and most capable model in [Liquid AI](https://www.liquid.ai/)'s multimodal **LFM2-VL** series, designed to process text and images with variable resolutions.  \nBuilt on the [LFM2](https://huggingface.co/collections/LiquidAI/lfm2-686d721927015b2ad73eaa38) backbone, it extends the architecture for higher-capacity reasoning and stronger visual understanding while retaining efficiency.  
\n\nWe are releasing the weights of the new [3B](https://huggingface.co/LiquidAI/LFM2-VL-3B) checkpoint—offering higher performance across benchmarks while remaining optimized for scalable deployment.\n\n* **Competitive multimodal performance** among lightweight open models.\n* **Enhanced visual understanding and reasoning**, particularly on fine-grained perception tasks\n* **Retains efficient inference** with the same flexible architecture and user-tunable speed-quality tradeoffs  \n* **Processes native resolutions up to 512×512** with intelligent patch-based handling for larger inputs  \n\nFor more details, see the [LFM2-VL-3B post](https://www.liquid.ai/blog/lfm2-vl-3b-a-new-efficient-vision-language-for-the-edge) and the [LFM2 blog post](https://www.liquid.ai/blog/liquid-foundation-models-v2-our-second-series-of-generative-ai-models).\n\n## 📄 Model details\n\nDue to their small size, **we recommend fine-tuning LFM2-VL models on narrow use cases** to maximize performance. \nThey were trained for instruction following and lightweight agentic flows. 
\nNot intended for safety‑critical decisions.\n\n| Property | [**LFM2-VL-450M**](https://huggingface.co/LiquidAI/LFM2-VL-450M) | [**LFM2-VL-1.6B**](https://huggingface.co/LiquidAI/LFM2-VL-1.6B) | [**LFM2-VL-3B**](https://huggingface.co/LiquidAI/LFM2-VL-3B) |\n|---|---:|---:|---:|\n| **Parameters (LM only)** | 350M | 1.2B | 2.6B |\n| **Vision encoder** | SigLIP2 NaFlex base (86M) | SigLIP2 NaFlex shape-optimized (400M) | SigLIP2 NaFlex large (400M) |\n| **Backbone layers** | hybrid conv+attention | hybrid conv+attention | hybrid conv+attention |\n| **Context (text)** | 32,768 tokens | 32,768 tokens | 32,768 tokens |\n| **Image tokens** | dynamic, user-tunable | dynamic, user-tunable | dynamic, user-tunable |\n| **Vocab size** | 65,536 | 65,536 | 65,536 |\n| **Precision** | bfloat16 | bfloat16 | bfloat16 |\n| **License** | LFM Open License v1.0 | LFM Open License v1.0 | LFM Open License v1.0 |\n\n**Supported languages:** English\n\n**Generation parameters**: We recommend the following parameters:\n- Text: `temperature=0.1`, `min_p=0.15`, `repetition_penalty=1.05`\n- Vision: `min_image_tokens=64` `max_image_tokens=256`, `do_image_splitting=True`\n\n**Chat template**: LFM2-VL uses a ChatML-like chat template as follows:  \n\n```\n<|startoftext|><|im_start|>system\nYou are a helpful multimodal assistant by Liquid AI.<|im_end|>\n<|im_start|>user\n<image>Describe this image.<|im_end|>\n<|im_start|>assistant\nThis image shows a Caenorhabditis elegans (C. 
elegans) nematode.<|im_end|>\n```\n\nImages are referenced with a sentinel (`<image>`), which is automatically replaced with the image tokens by the processor.\n\nYou can apply it using the dedicated [`.apply_chat_template()`](https://huggingface.co/docs/transformers/en/chat_templating#applychattemplate) function from Hugging Face transformers.\n\n**Architecture**\n- **Hybrid backbone**: Language model tower (LFM2-2.6B) paired with SigLIP2 NaFlex vision encoders (400M shape-optimized)\n- **Native resolution processing**: Handles images up to 512×512 pixels without upscaling and preserves non-standard aspect ratios without distortion\n- **Tiling strategy**: Splits large images into non-overlapping 512×512 patches and includes thumbnail encoding for global context\n- **Efficient token mapping**: 2-layer MLP connector with pixel unshuffle reduces image tokens (e.g., 256×384 image → 96 tokens, 1000×3000 → 1,020 tokens)\n- **Inference-time flexibility**: User-tunable maximum image tokens and patch count for speed/quality tradeoff without retraining\n\n**Training approach**\n- Builds on the LFM2 base model with joint mid-training that fuses vision and language capabilities using a gradually adjusted text-to-image ratio\n- Applies joint SFT with emphasis on image understanding and vision tasks\n- Leverages large-scale open-source datasets combined with in-house synthetic vision data, selected for balanced task coverage\n- Follows a progressive training strategy: base model → joint mid-training → supervised fine-tuning\n\n## 🏃 How to run LFM2-VL\n\nYou can run LFM2-VL with Hugging Face [`transformers`](https://github.com/huggingface/transformers) via installing Transformers from source as follows:\n\n```bash\npip install git+https://github.com/huggingface/transformers.git@87be5595081364ef99393feeaa60d71db3652679 pillow\n```\n\nHere is an example of how to generate an answer with transformers in Python:\n\n```python\nfrom transformers import AutoProcessor, 
AutoModelForImageTextToText\nfrom transformers.image_utils import load_image\n\n# Load model and processor\nmodel_id = \"LiquidAI/LFM2-VL-3B\"\nmodel = AutoModelForImageTextToText.from_pretrained(\n    model_id,\n    device_map=\"auto\",\n    dtype=\"bfloat16\"\n)\nprocessor = AutoProcessor.from_pretrained(model_id)\n\n# Load image and create conversation\nurl = \"https://www.ilankelman.org/stopsigns/australia.jpg\"\nimage = load_image(url)\nconversation = [\n    {\n        \"role\": \"user\",\n        \"content\": [\n            {\"type\": \"image\", \"image\": image},\n            {\"type\": \"text\", \"text\": \"What is in this image?\"},\n        ],\n    },\n]\n\n# Generate Answer\ninputs = processor.apply_chat_template(\n    conversation,\n    add_generation_prompt=True,\n    return_tensors=\"pt\",\n    return_dict=True,\n    tokenize=True,\n).to(model.device)\noutputs = model.generate(**inputs, max_new_tokens=64)\nprocessor.batch_decode(outputs, skip_special_tokens=True)[0]\n\n# This image captures a vibrant street scene in a Chinatown area. The focal point is a large red Chinese archway with gold and black accents, adorned with Chinese characters. Flanking the archway are two white stone lion statues, which are traditional guardians in Chinese culture.\n```\n\nYou can directly run and test the model with this [Colab notebook](https://colab.research.google.com/drive/11EMJhcVB6OTEuv--OePyGK86k-38WU3q?usp=sharing).\n\n\n## 🔧 How to fine-tune\n\nWe recommend fine-tuning LFM2-VL models on your use cases to maximize performance.\n\n| Notebook  | Description                                                          | Link |\n|-----------|----------------------------------------------------------------------|------|\n| SFT (TRL) | Supervised Fine-Tuning (SFT) notebook with a LoRA adapter using TRL. 
| <a href=\"https://colab.research.google.com/drive/1csXCLwJx7wI7aruudBp6ZIcnqfv8EMYN?usp=sharing\"><img src=\"https://cdn-uploads.huggingface.co/production/uploads/61b8e2ba285851687028d395/vlOyMEjwHa_b_LXysEu2E.png\" width=\"110\" alt=\"Colab link\"></a> |\n\n\n## 📈 Performance\n\n| Model             | Average | MMStar | RealWorldQA | MM-IFEval | BLINK | MMBench (dev en) | OCRBench | POPE  |\n|-------------------|----------|--------|--------------|------------|--------|------------------|-----------|-------|\n| InternVL3_5-2B    | 66.50    | 57.67  | 60.78        | 47.31      | 50.97  | 78.18            | 834.00    | 87.17 |\n| Qwen2.5-VL-3B     | 65.42    | 56.13  | 65.23        | 38.62      | 48.97  | 80.41            | 824.00    | 86.17 |\n| InternVL3-2B      | 67.44    | 61.10  | 65.10        | 38.49      | 53.10  | 81.10            | 831.00    | 90.10 |\n| SmolVLM2-2.2B     | 56.01    | 46.00  | 57.50        | 19.42      | 42.30  | 69.24            | 725.00    | 85.10 |\n| LFM2-VL-3B        | 69.00    | 57.73  | 71.37        | 51.83      | 51.03  | 79.81            | 822.00    | 89.01 |\n\nMore benchmark scores are reported in our [LFM2-VL-3B post](https://www.liquid.ai/blog/lfm2-vl-3b-a-new-efficient-vision-language-for-the-edge). We obtained the scores for competitive models using VLMEvalKit. Qwen3-VL-2B is not listed in the results table, as its release occurred the day before.\n\n## 📬 Contact\n\nIf you are interested in custom solutions with edge deployment, please contact [our sales team](https://www.liquid.ai/contact).",
    "related_quantizations": []
  },
  "tags": [
    "transformers",
    "gguf",
    "liquid",
    "lfm2",
    "lfm2-vl",
    "edge",
    "heretic",
    "uncensored",
    "decensored",
    "abliterated",
    "image-text-to-text",
    "en",
    "ja",
    "fr",
    "es",
    "de",
    "it",
    "pt",
    "ar",
    "zh",
    "ko",
    "base_model:LiquidAI/LFM2-VL-3B",
    "base_model:quantized:LiquidAI/LFM2-VL-3B",
    "license:other",
    "endpoints_compatible",
    "region:us",
    "conversational"
  ],
  "likes": 3,
  "downloads": 4727,
  "gated": false,
  "private": false,
  "last_modified": "2026-02-23T17:55:33.000Z",
  "created_at": "2026-01-01T14:06:40.000Z",
  "pipeline_tag": "image-text-to-text",
  "library_name": "transformers"
}
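
The top-level "tags" array mixes plain labels (such as "gguf" or "en") with namespaced entries (such as "license:other" or "base_model:quantized:LiquidAI/LFM2-VL-3B"). A minimal sketch of separating the two is shown below; the function name and the choice to split on the last colon are assumptions for illustration, not a documented parsing rule.

```python
from collections import defaultdict

# A subset of the "tags" array from the normalized metadata above.
TAGS = [
    "transformers",
    "gguf",
    "heretic",
    "image-text-to-text",
    "en",
    "ja",
    "base_model:LiquidAI/LFM2-VL-3B",
    "base_model:quantized:LiquidAI/LFM2-VL-3B",
    "license:other",
    "region:us",
    "conversational",
]


def split_tags(tags):
    """Separate plain tags from 'namespace:value' tags."""
    plain = []
    namespaced = defaultdict(list)
    for tag in tags:
        if ":" in tag:
            # Split on the last colon so that 'base_model:quantized:X'
            # keeps 'base_model:quantized' as its namespace.
            key, value = tag.rsplit(":", 1)
            namespaced[key].append(value)
        else:
            plain.append(tag)
    return plain, dict(namespaced)


plain, namespaced = split_tags(TAGS)
print(plain)
print(namespaced)
```

Splitting on the last colon keeps the "quantized" relation distinct from the plain "base_model" entry, which matters here because both point at the same repository.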
Source payload excerpt (from Hugging Face API)
{
  "_id": "69567f7027b09753439ef072",
  "id": "ZuzeTt/LFM2-VL-3B-heretic-Imatrix-GGUF",
  "modelId": "ZuzeTt/LFM2-VL-3B-heretic-Imatrix-GGUF",
  "sha": "83cc45e6d1dae26597197f1a1fd600820e644036",
  "createdAt": "2026-01-01T14:06:40.000Z",
  "lastModified": "2026-02-23T17:55:33.000Z",
  "author": "ZuzeTt",
  "downloads": 4727,
  "likes": 3,
  "gated": false,
  "private": false,
  "pipeline_tag": "image-text-to-text",
  "library_name": "transformers",
  "siblings_count": 29
}
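
The "Model Details" fields earlier on this page appear to be derived from this payload: the slug matches the lowercased "id", and the Created and Last Modified dates match the date portion of "createdAt" and "lastModified". The sketch below reproduces that mapping under those assumptions. How the page actually computes its fields is not documented here, and the function and output key names are hypothetical.

```python
from datetime import datetime

# Fields copied from the source payload excerpt above.
PAYLOAD = {
    "id": "ZuzeTt/LFM2-VL-3B-heretic-Imatrix-GGUF",
    "sha": "83cc45e6d1dae26597197f1a1fd600820e644036",
    "createdAt": "2026-01-01T14:06:40.000Z",
    "lastModified": "2026-02-23T17:55:33.000Z",
    "author": "ZuzeTt",
    "gated": False,
    "private": False,
    "pipeline_tag": "image-text-to-text",
}


def to_day(timestamp: str) -> str:
    """Reduce an API timestamp like '2026-01-01T14:06:40.000Z' to 'YYYY-MM-DD'."""
    parsed = datetime.strptime(timestamp, "%Y-%m-%dT%H:%M:%S.%fZ")
    return parsed.date().isoformat()


def normalize(payload: dict) -> dict:
    """Map raw payload fields to display fields (hypothetical key names)."""
    return {
        "slug": payload["id"].lower(),
        "author": payload["author"],
        "created": to_day(payload["createdAt"]),
        "last_modified": to_day(payload["lastModified"]),
        "gated": "Yes" if payload["gated"] else "No",
        "private": "Yes" if payload["private"] else "No",
        "hf_sha": payload["sha"],
    }


if __name__ == "__main__":
    for key, value in normalize(PAYLOAD).items():
        print(f"{key}: {value}")
```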