GraySoft
Projects Models Compare Cloud benchmarks FAQ Download guIDE →
Model Intelligence Sheet

FreedomAISVR/Ministral-3-14B-Instruct-2512-NVFP4-GGUF overview

Ministral 3 14B Instruct 2512 — NVFP4 GGUF NVFP4 quantization of mistralai/Ministral 3 14B Instruct 2512 https://huggingface.co/mistralai/Ministral 3 14B Instr…

ggufmistralministralnvfp4visionmultimodalcodinginstructenmultilingualbase_model:mistralai/Ministral-3-14B-Instruct-2512base_model:quantized:mistralai/Ministral-3-14B-Instruct-2512license:apache-2.0endpoints_compatibleregion:usconversational

Runs locally from ~837.4 MB disk (4 GB VRAM class GPUs with llama.cpp / guIDE).

Downloads
0
Likes
1
Pipeline

Repository Files & Downloads

2 GGUF files detected
Direct downloads for local inference
FileTypeQuantizationSizeLink
ministral-3-14b-instruct-2512-nvfp4.ggufGGUFGGUF7.25 GBDownload
mmproj-ministral-3-14b-instruct-2512-f16.ggufGGUFF16837.4 MBDownload

Model Details

Model IDFreedomAISVR/Ministral-3-14B-Instruct-2512-NVFP4-GGUF
AuthorFreedomAISVR
Pipeline
Licenseapache-2.0
Base modelmistralai/Ministral-3-14B-Instruct-2512
Last modified2026-06-24T04:14:51.000Z

Model README

---

language:

  • en
  • multilingual

tags:

  • mistral
  • ministral
  • nvfp4
  • gguf
  • vision
  • multimodal
  • coding
  • instruct

license: apache-2.0

base_model: mistralai/Ministral-3-14B-Instruct-2512

---

Ministral 3 14B Instruct-2512 — NVFP4 GGUF

NVFP4 quantization of mistralai/Ministral-3-14B-Instruct-2512, a 14B parameter coding and vision model from Mistral AI.

About the Model

Ministral 3 14B is a dense transformer with 40 layers, 5120 hidden dimension, and 24-layer Pixtral ViT vision encoder. It supports:

  • Code generation and debugging across multiple languages
  • Vision understanding via multimodal image input
  • Tool calling with native function calling support
  • 131K context window
  • 393K maximum context length

Quantization

This GGUF was quantized from the FP8_E4M3 source weights using llama.cpp (build 537). The source safetensors were dequantized to F16 during conversion, then quantized to NVFP4 format.

NVFP4 (Microscaling FP4) uses block-wise quantization with shared exponents per block, providing better precision than standard FP4 for the same memory footprint.

Files

| File | Size | Description |

|------|------|-------------|

| ministral-3-14b-instruct-2512-nvfp4.gguf | ~6.9 GB | NVFP4 quantized model weights |

| mmproj-ministral-3-14b-instruct-2512-f16.gguf | ~878 MB | Vision projector (F16, unquantized) |

Usage

llama.cpp

# Server mode with OpenAI-compatible API
llama-server \
  -m ministral-3-14b-instruct-2512-nvfp4.gguf \
  --mmproj mmproj-ministral-3-14b-instruct-2512-f16.gguf \
  -ngl 99 \
  --host 0.0.0.0 \
  --port 8080

# Direct inference
llama-cli \
  -m ministral-3-14b-instruct-2512-nvfp4.gguf \
  --mmproj mmproj-ministral-3-14b-instruct-2512-f16.gguf \
  -ngl 99 \
  -p "Write a Python function to compute fibonacci numbers"

LM Studio

  1. Download both files from this repository
  2. Load the main GGUF file in LM Studio
  3. Load the mmproj file for vision support
  4. Set GPU offload layers to maximum

Architecture

  • Parameters: 14B (dense transformer)
  • Layers: 40
  • Hidden dimension: 5120
  • Attention heads: 32 (8 KV heads for GQA)
  • Vision encoder: 24-layer Pixtral ViT
  • Context: 131K (native), 393K (extended)
  • Vocabulary: Mistral Tekken tokenizer

Hardware Requirements

  • Minimum: 8 GB VRAM for text-only, 10 GB for vision
  • Recommended: 16 GB VRAM for full GPU offload
  • Disk: ~7.8 GB for model + mmproj

Quantization Details

| Metric | Value |

|--------|-------|

| Source format | FP8_E4M3 (safetensors) |

| Intermediate | F16 GGUF |

| Output format | NVFP4 |

| Approximate BPW | ~4.6 |

| Quantized with | llama.cpp build 537 |

License

Apache 2.0 — same as the base model.

Run FreedomAISVR/Ministral-3-14B-Instruct-2512-NVFP4-GGUF with guIDE

Download guIDE — the AI-native code editor with local LLM inference and 69 built-in tools.

Download guIDE → · Browse 524k+ models · Compare models

Source: Hugging Face · Compare models