What license applies to cyberandy/sangue-e-grafi-gemma4-e2b-gguf?

License: apache-2.0. Verify terms on Hugging Face before commercial use.

How do I run cyberandy/sangue-e-grafi-gemma4-e2b-gguf locally?

Download a GGUF file from this page and load it in guIDE or llama.cpp. Pipeline task: text-generation.

Model Intelligence Sheet

cyberandy/sangue-e-grafi-gemma4-e2b-gguf overview

How much VRAM or disk space does cyberandy/sangue-e-grafi-gemma4-e2b-gguf need?

The single GGUF file takes about 3.18 GB of disk space and targets 4 GB VRAM-class GPUs when run with llama.cpp or guIDE.


Pipeline

text-generation

Author

cyberandy

Repository Files & Downloads

1 GGUF file detected

Direct downloads for local inference

File	Type	Quantization	Size	Link
sangue-e-grafi-gemma-Q4_K_M.gguf	GGUF	Q4_K_M	3.18 GB	Download
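
For scripted downloads, the file listed above can be addressed directly. The sketch below builds the URL under the assumption that the repository follows the Hub's usual `/resolve/<revision>/<filename>` pattern and that the file lives on the `main` branch; neither detail is stated on this page, and the helper name `gguf_download_url` is illustrative only.

```python
from urllib.parse import quote

# Repository and file name as listed in the table above.
REPO_ID = "cyberandy/sangue-e-grafi-gemma4-e2b-gguf"
FILENAME = "sangue-e-grafi-gemma-Q4_K_M.gguf"


def gguf_download_url(repo_id: str, filename: str, revision: str = "main") -> str:
    """Build a direct-download URL (assumes the Hub's /resolve/ URL pattern)."""
    return (
        "https://huggingface.co/"
        f"{quote(repo_id)}/resolve/{quote(revision, safe='')}/{quote(filename)}"
    )


url = gguf_download_url(REPO_ID, FILENAME)
print(url)

# To fetch the file (about 3.18 GB), one option is:
#   import urllib.request
#   urllib.request.urlretrieve(url, FILENAME)
```

The download call is left commented out so the snippet can be run without transferring the full file.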

Model Details

Model ID	cyberandy/sangue-e-grafi-gemma4-e2b-gguf
Author	cyberandy
Pipeline	text-generation
License	apache-2.0
Base model	google/gemma-4-E2B-it
Last modified	2026-06-13T06:15:58.000Z

Model README

---
license: apache-2.0
base_model: google/gemma-4-E2B-it
tags:
- gguf
- quantized
- q4_k_m
- gemma
- kinship-reasoning
- knowledge-graph
- italian-law
- inheritance
- build-small-hackathon
- llama-cpp
- ollama
language:
- it
- en
pipeline_tag: text-generation
model-index:
- name: sangue-e-grafi-gemma4-e2b-gguf
  results:
  - task:
      type: question-answering
      name: Adversarial Kinship QA
    metrics:
    - name: Easy Benchmark Accuracy (Agent)
      type: accuracy
      value: 100.0
      verified: false
    - name: Hard Dev-Set Accuracy (Agent)
      type: accuracy
      value: 50.0
      verified: false
---

🩸 Sangue e Grafi — Gemma 4 E2B GGUF (Q4_K_M)

> Ready-to-run quantized model — Gemma 4B with SFT + GRPO fully merged and converted to GGUF for local inference.


Model Description

This is the fully merged and quantized version of the Sangue e Grafi Gemma pipeline:

google/gemma-4-E2B-it
  + SFT adapter (merged)
  + GRPO adapter (merged)
  → GGUF Q4_K_M quantization
  → ~3.3 GB single file

All training stages (SFT on 500 adversarial scenarios + GRPO reinforcement learning) are baked into a single GGUF file, ready for local inference with llama.cpp, llama-cpp-python, or ollama.

File Details

| Property | Value |
|---|---|
| Format | GGUF (Q4_K_M quantization) |
| Size | ~3.3 GB |
| Base model | google/gemma-4-E2B-it (4B params) |
| Training | SFT + GRPO, fully merged before quantization |
| Compatible with | llama.cpp, llama-cpp-python, ollama, LM Studio |

Benchmark Results 📊

| Benchmark | KG Agent (this model) | Gemini 2.5 Flash (no KG) |
|---|---|---|
| Easy (10 seeds) | 10/10 (100%) | 3/10 (30%) |
| Hard dev-set (10 seeds) | 5/10 (50%) | n/a |
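
The percentages in the table follow directly from the correct/total counts. The short sketch below recomputes them; the `RESULTS` dictionary and the `accuracy_percent` helper are illustrative names, and the counts are copied from the table rather than measured again.

```python
from fractions import Fraction

# (correct, total) counts copied from the benchmark table above.
RESULTS = {
    ("Easy", "KG Agent (this model)"): (10, 10),
    ("Easy", "Gemini 2.5 Flash (no KG)"): (3, 10),
    ("Hard dev-set", "KG Agent (this model)"): (5, 10),
}


def accuracy_percent(correct: int, total: int) -> float:
    """Return accuracy as a percentage, computed exactly before converting."""
    if total <= 0:
        raise ValueError("total must be positive")
    return float(Fraction(correct, total) * 100)


for (benchmark, system), (correct, total) in RESULTS.items():
    pct = accuracy_percent(correct, total)
    print(f"{benchmark:<13} {system:<26} {correct}/{total} = {pct:.0f}%")

# With 10 seeds per benchmark, one answer changes the score by this many points.
points_per_seed = accuracy_percent(1, 10)
```

Because each benchmark uses only 10 seeds, a single answer moves the reported accuracy by 10 percentage points, which is worth keeping in mind when comparing the rows.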

Usage

With llama-cpp-python

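from llama_cpp import Llama

# Download the GGUF from the Hub (requires huggingface_hub) and load it.
llm = Llama.from_pretrained(
    repo_id="cyberandy/sangue-e-grafi-gemma4-e2b-gguf",
    filename="*.gguf",  # glob pattern; the repo contains a single GGUF file
    n_ctx=4096,  # context window, in tokens
)

output = llm.create_chat_completion(
    messages=[{"role": "user", "content": "Your kinship question here..."}]
)

# The response follows the OpenAI chat-completion layout.
print(output["choices"][0]["message"]["content"])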

With ollama

# Download and run
ollama run hf.co/cyberandy/sangue-e-grafi-gemma4-e2b-gguf
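
Once the model has been pulled with the command above, it can also be queried programmatically. The following is a minimal sketch that assumes an Ollama server is running at its default local address (`http://localhost:11434`) and uses its `/api/chat` endpoint; the helper names `build_chat_request` and `ask` are hypothetical.

```python
import json
import urllib.request

# Assumption: Ollama is listening on its default local port.
OLLAMA_URL = "http://localhost:11434/api/chat"
MODEL = "hf.co/cyberandy/sangue-e-grafi-gemma4-e2b-gguf"


def build_chat_request(question: str, model: str = MODEL,
                       url: str = OLLAMA_URL) -> urllib.request.Request:
    """Prepare a non-streaming chat request carrying a single user message."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": question}],
        "stream": False,  # ask for one JSON object instead of a token stream
    }
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def ask(question: str) -> str:
    """Send the question to the local server and return the reply text."""
    with urllib.request.urlopen(build_chat_request(question)) as resp:
        body = json.load(resp)
    return body["message"]["content"]


# Example (needs a running Ollama server with the model available):
#   print(ask("Your kinship question here..."))
```

Building the request separately from sending it keeps the payload easy to inspect without a server.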

With llama.cpp CLI

# Download the GGUF file, then:
./llama-cli -m sangue-e-grafi-gemma-Q4_K_M.gguf -p "Your prompt here" -n 512
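
Before pointing llama.cpp at a multi-gigabyte download, it can help to confirm that the file really is a GGUF container. The sketch below relies on the GGUF layout, in which a file starts with the four ASCII bytes `GGUF` followed by a little-endian 32-bit version number; the function name `read_gguf_version` is illustrative and not part of any library.

```python
import struct

GGUF_MAGIC = b"GGUF"  # first four bytes of every GGUF file


def read_gguf_version(path: str) -> int:
    """Return the GGUF format version, or raise ValueError if the header is wrong."""
    with open(path, "rb") as f:
        header = f.read(8)
    if len(header) < 8 or header[:4] != GGUF_MAGIC:
        raise ValueError(f"{path} does not look like a GGUF file")
    # Bytes 4..7 hold the format version as an unsigned little-endian integer.
    (version,) = struct.unpack("<I", header[4:8])
    return version


# Example, using the file name from the repository's file list:
#   print(read_gguf_version("sangue-e-grafi-gemma-Q4_K_M.gguf"))
```

A truncated or HTML error-page download fails this check immediately, which is cheaper than waiting for the model loader to reject it.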

Intended Uses & Limitations

Intended uses:

- Local/edge deployment of the KG-grounded agent
- Quick experimentation without a GPU or adapter merging
- Integration with llama.cpp-based toolchains

Limitations:

- Q4_K_M quantization may slightly reduce accuracy compared with full precision
- Still requires the KG agent framework for full pipeline performance
- Domain-specific to Italian kinship / inheritance law

Source Adapters

This GGUF was built from:

- SFT adapter: sangue-e-grafi-gemma4-e2b-sft-adversarial-v7
- GRPO adapter: sangue-e-grafi-gemma4-e2b-grpo-run-f-v7

Project Links

| Resource | Link |
|---|---|
| 🚀 Live Demo | HF Space |
| 📦 GitHub | cyberandy/sangue-e-grafi |
| 📄 Paper | RLM-on-KG (arXiv:2604.17056) |
| 📊 Agent Traces Dataset | sangue-e-grafi-agent-traces |

Citation

@misc{sangue-e-grafi-2026,
  title   = {Sangue e Grafi: Small Models Beat Frontier LLMs on Adversarial Kinship Reasoning with Knowledge Graph Agents},
  author  = {Andrea Volpini},
  year    = {2026},
  url     = {https://github.com/cyberandy/sangue-e-grafi},
  note    = {Hugging Face Build Small Hackathon 2026}
}

Run cyberandy/sangue-e-grafi-gemma4-e2b-gguf with guIDE

Download guIDE — the AI-native code editor with local LLM inference and 69 built-in tools.


Source: Hugging Face