What license applies to TobDeBer/Mellum2-12B-A2.5B-Instruct-Q3_K_L-GGUF?

License: apache-2.0. Verify terms on Hugging Face before commercial use.

Model Intelligence Sheet

TobDeBer/Mellum2-12B-A2.5B-Instruct-Q3_K_L-GGUF overview

Q: How do I run TobDeBer/Mellum2-12B-A2.5B-Instruct-Q3_K_L-GGUF locally?

Download a GGUF file from this page and load it in guIDE or llama.cpp. Pipeline task: text-generation.

TobDeBer/Mellum2 12B A2.5B Instruct Q3 K L GGUF This model was converted to GGUF format from JetBrains/Mellum2 12B A2.5B Instruct https://huggingface.co/JetBra…

transformersggufllama-cppgguf-my-repotext-generationenbase_model:JetBrains/Mellum2-12B-A2.5B-Instructbase_model:quantized:JetBrains/Mellum2-12B-A2.5B-Instructlicense:apache-2.0model-indexendpoints_compatibleregion:us

Runs locally from ~6.14 GB disk (8 GB VRAM class GPUs with llama.cpp / guIDE).

Downloads

Likes

Pipeline

text-generation

Author

TobDeBer

Repository Files & Downloads

1 GGUF files detected

Direct downloads for local inference

File	Type	Quantization	Size	Link
mellum2-12b-a2.5b-instruct-q3_k_l.gguf	GGUF	Q3_K_L	6.14 GB	Download

Model Details

Model ID	TobDeBer/Mellum2-12B-A2.5B-Instruct-Q3_K_L-GGUF
Author	TobDeBer
Pipeline	text-generation
License	apache-2.0
Base model	JetBrains/Mellum2-12B-A2.5B-Instruct
Last modified	2026-06-06T20:51:16.000Z

Model README

---

library_name: transformers

language:

pipeline_tag: text-generation

license: apache-2.0

base_model: JetBrains/Mellum2-12B-A2.5B-Instruct

tags:

llama-cpp
gguf-my-repo

model-index:

name: Mellum2 Instruct

results:

- task:

type: text-generation

dataset:

type: livecodebench

metrics:

- type: pass@1

value: 37.2

verified: false

- task:

type: text-generation

dataset:

type: evalplus

metrics:

- type: pass@1

value: 78.4

verified: false

- task:

type: text-generation

dataset:

type: multipl-e

metrics:

- type: pass@1

value: 67.1

verified: false

- task:

type: text-generation

dataset:

type: bfcl

metrics:

- type: acc

value: 66.3

verified: false

- type: acc

value: 44.2

verified: false

- task:

type: text-generation

dataset:

type: aime

metrics:

- type: exact_match

value: 41.7

verified: false

- task:

type: text-generation

dataset:

type: gsm-plus

metrics:

- type: exact_match

value: 80.5

verified: false

- task:

type: text-generation

dataset:

type: mmlu-redux

metrics:

- type: acc

value: 78.1

verified: false

- task:

type: text-generation

dataset:

type: gpqa

metrics:

- type: acc

value: 40.9

verified: false

- task:

type: text-generation

dataset:

type: ifeval

metrics:

- type: acc

value: 75.8

verified: false

- task:

type: text-generation

dataset:

type: mixeval

metrics:

- type: acc

value: 62.2

verified: false

- task:

type: text-generation

dataset:

type: bs-bench

metrics:

- type: detection_rate

value: 18.0

verified: false

- task:

type: text-generation

dataset:

type: harmbench

metrics:

- type: harmful_rate

value: 23.1

verified: false

- task:

type: text-generation

dataset:

type: xstest

metrics:

- type: safe_compliance

value: 81.2

verified: false

---

TobDeBer/Mellum2-12B-A2.5B-Instruct-Q3_K_L-GGUF

This model was converted to GGUF format from JetBrains/Mellum2-12B-A2.5B-Instruct using llama.cpp via the ggml.ai's GGUF-my-repo space.

Refer to the original model card for more details on the model.

Use with llama.cpp

Install llama.cpp through brew (works on Mac and Linux)

brew install llama.cpp

Invoke the llama.cpp server or the CLI.

CLI:

llama-cli --hf-repo TobDeBer/Mellum2-12B-A2.5B-Instruct-Q3_K_L-GGUF --hf-file mellum2-12b-a2.5b-instruct-q3_k_l.gguf -p "The meaning to life and the universe is"

Server:

llama-server --hf-repo TobDeBer/Mellum2-12B-A2.5B-Instruct-Q3_K_L-GGUF --hf-file mellum2-12b-a2.5b-instruct-q3_k_l.gguf -c 2048

Note: You can also use this checkpoint directly through the usage steps listed in the Llama.cpp repo as well.

Step 1: Clone llama.cpp from GitHub.

git clone https://github.com/ggerganov/llama.cpp

Step 2: Move into the llama.cpp folder and build it with LLAMA_CURL=1 flag along with other hardware-specific flags (for ex: LLAMA_CUDA=1 for Nvidia GPUs on Linux).

cd llama.cpp && LLAMA_CURL=1 make

Step 3: Run inference through the main binary.

./llama-cli --hf-repo TobDeBer/Mellum2-12B-A2.5B-Instruct-Q3_K_L-GGUF --hf-file mellum2-12b-a2.5b-instruct-q3_k_l.gguf -p "The meaning to life and the universe is"

./llama-server --hf-repo TobDeBer/Mellum2-12B-A2.5B-Instruct-Q3_K_L-GGUF --hf-file mellum2-12b-a2.5b-instruct-q3_k_l.gguf -c 2048

Run TobDeBer/Mellum2-12B-A2.5B-Instruct-Q3_K_L-GGUF with guIDE

Download guIDE — the AI-native code editor with local LLM inference and 69 built-in tools.

Download guIDE → · Browse 524k+ models · Compare models

Source: Hugging Face · Compare models