cyberandy/sangue-e-grafi-gemma4-e2b-gguf overview
🩸 Sangue e Grafi: Gemma 4 E2B GGUF Q4_K_M. Ready-to-run quantized model: Gemma 4B with SFT + GRPO fully merged and converted to GGUF for local inference.
Runs locally from a ~3.18 GB file on disk; suitable for 4 GB VRAM-class GPUs with llama.cpp or guIDE.
Repository Files & Downloads
| File | Type | Quantization | Size | Link |
|---|---|---|---|---|
| sangue-e-grafi-gemma-Q4_K_M.gguf | GGUF | Q4_K_M | 3.18 GB | Download |
Model Details
| Model ID | cyberandy/sangue-e-grafi-gemma4-e2b-gguf |
|---|---|
| Author | cyberandy |
| Pipeline | text-generation |
| License | apache-2.0 |
| Base model | google/gemma-4-E2B-it |
| Last modified | 2026-06-13T06:15:58.000Z |
Model README
---
license: apache-2.0
base_model: google/gemma-4-E2B-it
tags:
- gguf
- quantized
- q4_k_m
- gemma
- kinship-reasoning
- knowledge-graph
- italian-law
- inheritance
- build-small-hackathon
- llama-cpp
- ollama
language:
- it
- en
pipeline_tag: text-generation
model-index:
- name: sangue-e-grafi-gemma4-e2b-gguf
  results:
  - task:
      type: question-answering
      name: Adversarial Kinship QA
    metrics:
    - name: Easy Benchmark Accuracy (Agent)
      type: accuracy
      value: 100.0
      verified: false
    - name: Hard Dev-Set Accuracy (Agent)
      type: accuracy
      value: 50.0
      verified: false
---
🩸 Sangue e Grafi: Gemma 4 E2B GGUF (Q4_K_M)
> Ready-to-run quantized model: Gemma 4B with SFT + GRPO fully merged and converted to GGUF for local inference.
<p align="center">
<img src="https://huggingface.co/spaces/cyberandy/sangue-e-grafi/resolve/main/banner.png" alt="Sangue e Grafi banner" width="700"/>
</p>
Model Description
This is the fully merged and quantized version of the Sangue e Grafi Gemma pipeline:
google/gemma-4-E2B-it
+ SFT adapter (merged)
+ GRPO adapter (merged)
→ GGUF Q4_K_M quantization
→ ~3.3 GB single file
All training stages (SFT on 500 adversarial scenarios + GRPO reinforcement learning) are baked into a single GGUF file, ready for local inference with llama.cpp, llama-cpp-python, or ollama.
File Details
| Property | Value |
|---|---|
| Format | GGUF (Q4_K_M quantization) |
| Size | ~3.3 GB |
| Base model | google/gemma-4-E2B-it (4B params) |
| Training | SFT + GRPO, fully merged before quantization |
| Compatible with | llama.cpp, llama-cpp-python, ollama, LM Studio |
Benchmark Results
| Benchmark | KG Agent (this model) | Gemini 2.5 Flash (no KG) |
|---|---|---|
| Easy (10 seeds) | 10/10 (100%) | 3/10 (30%) |
| Hard dev-set (10 seeds) | 5/10 (50%) | - |
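The percentages in the table follow directly from the per-seed counts, and they match the `value` fields in the model-index metadata. A minimal sketch of that arithmetic is below; the function name and the `RESULTS` dictionary are illustrative and not part of the project.

```python
def accuracy_percent(correct: int, total: int) -> float:
    """Return accuracy as a percentage of correct answers over total seeds."""
    if total <= 0:
        raise ValueError("total must be a positive number of seeds")
    return 100.0 * correct / total


# Counts copied from the benchmark table above: (correct, total seeds).
RESULTS = {
    "Easy / KG Agent": (10, 10),
    "Easy / Gemini 2.5 Flash (no KG)": (3, 10),
    "Hard dev-set / KG Agent": (5, 10),
}

for name, (correct, total) in RESULTS.items():
    print(f"{name}: {accuracy_percent(correct, total):.1f}%")
```

With 10 seeds per benchmark, each seed accounts for 10 percentage points.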
Usage
With llama-cpp-python
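from llama_cpp import Llama
# Download the GGUF from the Hub (cached after the first call) and load it.
llm = Llama.from_pretrained(
    repo_id="cyberandy/sangue-e-grafi-gemma4-e2b-gguf",
    filename="*.gguf",  # glob pattern; matches the repository's GGUF file
    n_ctx=4096,  # context window in tokens
)
# Send a single-turn chat request.
output = llm.create_chat_completion(
    messages=[{"role": "user", "content": "Your kinship question here..."}]
)
# The reply text is in the first choice of the OpenAI-style response dict.
print(output["choices"][0]["message"]["content"])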
With ollama
# Download and run
ollama run hf.co/cyberandy/sangue-e-grafi-gemma4-e2b-gguf
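Once the model has been pulled with the command above, it can also be queried from Python through Ollama's local HTTP chat endpoint. The following is a sketch under two assumptions: the Ollama server is listening on its default address (`localhost:11434`), and the model is registered under the same `hf.co/...` name used in the `ollama run` command. The helper names `build_chat_request` and `ask` are illustrative.

```python
import json
import urllib.request

# Assumption: default address of a locally running Ollama server.
OLLAMA_CHAT_URL = "http://localhost:11434/api/chat"
# Assumption: the model keeps the name used in the `ollama run` command.
MODEL_NAME = "hf.co/cyberandy/sangue-e-grafi-gemma4-e2b-gguf"


def build_chat_request(question, model=MODEL_NAME, url=OLLAMA_CHAT_URL):
    """Build a non-streaming POST request for Ollama's /api/chat endpoint."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": question}],
        "stream": False,  # ask for a single JSON response instead of chunks
    }
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def ask(question):
    """Send the question to the local server and return the reply text."""
    with urllib.request.urlopen(build_chat_request(question)) as response:
        body = json.load(response)
    return body["message"]["content"]
```

Calling `ask("Your kinship question here...")` requires the Ollama server to be running; building the request does not.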
With llama.cpp CLI
# Download the GGUF file, then:
./llama-cli -m sangue-e-grafi-gemma-Q4_K_M.gguf -p "Your prompt here" -n 512
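The download step can be scripted as well. The sketch below fetches the file with `huggingface_hub.hf_hub_download` and assembles the same `llama-cli` invocation. It assumes `huggingface_hub` is installed and that a `llama-cli` binary sits in the current directory; the file name is taken from the Repository Files table, and the helper names are illustrative.

```python
import subprocess

REPO_ID = "cyberandy/sangue-e-grafi-gemma4-e2b-gguf"
# File name as listed in the Repository Files table.
GGUF_FILENAME = "sangue-e-grafi-gemma-Q4_K_M.gguf"


def download_gguf(repo_id=REPO_ID, filename=GGUF_FILENAME):
    """Download the GGUF into the local Hub cache and return its path."""
    # Imported lazily so the command builder works without the dependency.
    from huggingface_hub import hf_hub_download

    return hf_hub_download(repo_id=repo_id, filename=filename)


def build_llama_cli_command(model_path, prompt, n_predict=512, binary="./llama-cli"):
    """Return the argument list for a one-shot llama-cli generation."""
    return [binary, "-m", str(model_path), "-p", prompt, "-n", str(n_predict)]


def run_prompt(prompt):
    """Download the model if needed, then run llama-cli on the prompt."""
    model_path = download_gguf()
    subprocess.run(build_llama_cli_command(model_path, prompt), check=True)
```

Passing the arguments as a list avoids shell quoting problems when the prompt contains spaces or quotes.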
Intended Uses & Limitations
Intended uses:
- Local/edge deployment of the KG-grounded agent
- Quick experimentation without GPU or adapter merging
- Integration with llama.cpp-based toolchains
Limitations:
- Q4_K_M quantization may slightly reduce accuracy relative to the full-precision model
- Still requires the KG agent framework for full pipeline performance
- Domain-specific to Italian kinship / inheritance law
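To illustrate the kind of lookup a knowledge graph can answer deterministically, here is a toy sketch. It is not the project's KG agent framework. The family graph and names are invented, and the degree rule is an assumption: the generation-counting convention common in civil-law systems, where you count the steps up from one relative to the shared ancestor and down to the other.

```python
from collections import deque

# Hypothetical toy graph (invented names): child -> list of parents.
PARENTS = {
    "Anna": ["Carlo", "Dina"],
    "Bruno": ["Carlo", "Dina"],
    "Elena": ["Anna"],
    "Fabio": ["Bruno"],
}


def ancestors_with_depth(person, parents):
    """Map the person (depth 0) and each ancestor to generations above them."""
    depth = {person: 0}
    queue = deque([person])
    while queue:
        current = queue.popleft()
        for parent in parents.get(current, []):
            if parent not in depth:
                depth[parent] = depth[current] + 1
                queue.append(parent)
    return depth


def kinship_degree(a, b, parents):
    """Return the smallest up-plus-down generation count, or None if unrelated."""
    up_a = ancestors_with_depth(a, parents)
    up_b = ancestors_with_depth(b, parents)
    shared = set(up_a) & set(up_b)
    if not shared:
        return None  # no common ancestor in the graph
    return min(up_a[x] + up_b[x] for x in shared)
```

For example, `kinship_degree("Elena", "Fabio", PARENTS)` walks two generations up to the shared grandparents and two back down. A graph query like this returns the same answer every time, whereas a language model reasoning from prose alone may miscount.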
Source Adapters
This GGUF was built from:
- SFT adapter: sangue-e-grafi-gemma4-e2b-sft-adversarial-v7
- GRPO adapter: sangue-e-grafi-gemma4-e2b-grpo-run-f-v7
Project Links
| Resource | Link |
|---|---|
| Live Demo | HF Space |
| GitHub | cyberandy/sangue-e-grafi |
| Paper | RLM-on-KG (arXiv:2604.17056) |
| Agent Traces Dataset | sangue-e-grafi-agent-traces |
Citation
@misc{sangue-e-grafi-2026,
title = {Sangue e Grafi: Small Models Beat Frontier LLMs on Adversarial Kinship Reasoning with Knowledge Graph Agents},
author = {Andrea Volpini},
year = {2026},
url = {https://github.com/cyberandy/sangue-e-grafi},
note = {Hugging Face Build Small Hackathon 2026}
}
Source: Hugging Face