Model Intelligence Sheet

cyberandy/sangue-e-grafi-gemma4-e2b-gguf overview

🩸 Sangue e Grafi: Gemma 4 E2B GGUF (Q4_K_M). Ready-to-run quantized model: Gemma 4 E2B with SFT + GRPO fully merged and converted to GGUF for local inference.

Tags: gguf, quantized, q4_k_m, gemma, kinship-reasoning, knowledge-graph, italian-law, inheritance, build-small-hackathon, llama-cpp, ollama, text-generation, it, en, arxiv:2604.17056, base_model:google/gemma-4-E2B-it, base_model:quantized:google/gemma-4-E2B-it, license:apache-2.0, model-index, endpoints_compatible, region:us, conversational

Runs locally from a ~3.18 GB file on disk; suited to GPUs in the 4 GB VRAM class with llama.cpp or guIDE.

Downloads: 0
Likes: 0
Pipeline: text-generation
Repository Files & Downloads

1 GGUF file detected
Direct downloads for local inference

| File | Type | Quantization | Size |
|---|---|---|---|
| sangue-e-grafi-gemma-Q4_K_M.gguf | GGUF | Q4_K_M | 3.18 GB |

Model Details

Model ID: cyberandy/sangue-e-grafi-gemma4-e2b-gguf
Author: cyberandy
Pipeline: text-generation
License: apache-2.0
Base model: google/gemma-4-E2B-it
Last modified: 2026-06-13T06:15:58.000Z

Model README

---
license: apache-2.0
base_model: google/gemma-4-E2B-it
tags:
- gguf
- quantized
- q4_k_m
- gemma
- kinship-reasoning
- knowledge-graph
- italian-law
- inheritance
- build-small-hackathon
- llama-cpp
- ollama
language:
- it
- en
pipeline_tag: text-generation
model-index:
- name: sangue-e-grafi-gemma4-e2b-gguf
  results:
  - task:
      type: question-answering
      name: Adversarial Kinship QA
    metrics:
    - name: Easy Benchmark Accuracy (Agent)
      type: accuracy
      value: 100.0
      verified: false
    - name: Hard Dev-Set Accuracy (Agent)
      type: accuracy
      value: 50.0
      verified: false
---

🩸 Sangue e Grafi: Gemma 4 E2B GGUF (Q4_K_M)

> Ready-to-run quantized model: Gemma 4 E2B with SFT + GRPO fully merged and converted to GGUF for local inference.

<p align="center">

<img src="https://huggingface.co/spaces/cyberandy/sangue-e-grafi/resolve/main/banner.png" alt="Sangue e Grafi banner" width="700"/>

</p>

Model Description

This is the fully merged and quantized version of the Sangue e Grafi Gemma pipeline:

google/gemma-4-E2B-it
  + SFT adapter (merged)
  + GRPO adapter (merged)
  → GGUF Q4_K_M quantization
  → ~3.3 GB single file

All training stages (SFT on 500 adversarial scenarios + GRPO reinforcement learning) are baked into a single GGUF file, ready for local inference with llama.cpp, llama-cpp-python, or ollama.

File Details

| Property | Value |
|---|---|
| Format | GGUF (Q4_K_M quantization) |
| Size | ~3.3 GB |
| Base model | google/gemma-4-E2B-it (4B params) |
| Training | SFT + GRPO, fully merged before quantization |
| Compatible with | llama.cpp, llama-cpp-python, ollama, LM Studio |

Benchmark Results 📊

| Benchmark | KG Agent (this model) | Gemini 2.5 Flash (no KG) |
|---|---|---|
| Easy (10 seeds) | 10/10 (100%) | 3/10 (30%) |
| Hard dev-set (10 seeds) | 5/10 (50%) | n/a |
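
The percentages in the benchmark table, and the `value` fields in the model-index metadata, are plain ratios of correct answers to seeds. A minimal sketch of that arithmetic, using the counts from the table; the function name `accuracy_percent` is illustrative and not part of the project:

```python
def accuracy_percent(correct: int, total: int) -> float:
    """Return accuracy as a percentage of correct answers over total seeds."""
    if total <= 0:
        raise ValueError("total must be positive")
    return 100.0 * correct / total


# Counts copied from the benchmark table (correct answers, seeds).
results = {
    "Easy, KG Agent (this model)": (10, 10),
    "Easy, Gemini 2.5 Flash (no KG)": (3, 10),
    "Hard dev-set, KG Agent (this model)": (5, 10),
}

for name, (correct, total) in results.items():
    print(f"{name}: {correct}/{total} = {accuracy_percent(correct, total):.1f}%")
```

With only 10 seeds per benchmark, a single answer moves the score by 10 percentage points, so these figures are best read as coarse indicators.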

Usage

With llama-cpp-python

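from llama_cpp import Llama

# Download the GGUF from the Hugging Face Hub (needs huggingface-hub installed) and load it.
llm = Llama.from_pretrained(
    repo_id="cyberandy/sangue-e-grafi-gemma4-e2b-gguf",
    filename="*.gguf",  # glob pattern; the repository holds a single GGUF file
    n_ctx=4096,         # context window, in tokens
)

# Run a single-turn chat completion; the result is an OpenAI-style dict.
output = llm.create_chat_completion(
    messages=[{"role": "user", "content": "Your kinship question here..."}]
)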
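
`create_chat_completion` returns an OpenAI-style response dictionary, so the generated text sits under `choices[0]["message"]["content"]`. A small sketch of pulling it out follows; `extract_reply` is a hypothetical helper name, and the sample dictionary is a stand-in with the same shape, not real model output:

```python
def extract_reply(output: dict) -> str:
    """Return the assistant text from an OpenAI-style chat completion dict."""
    choices = output.get("choices") or []
    if not choices:
        raise ValueError("completion contains no choices")
    return choices[0]["message"]["content"].strip()


# Stand-in response shaped like a chat completion result;
# the content string is a placeholder, not actual model output.
sample_output = {
    "choices": [
        {
            "index": 0,
            "message": {"role": "assistant", "content": " placeholder answer "},
        }
    ]
}

print(extract_reply(sample_output))
```

In the snippet above, the same helper would be called as `extract_reply(output)`.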

With ollama

# Download and run
ollama run hf.co/cyberandy/sangue-e-grafi-gemma4-e2b-gguf
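
Once the model has been pulled, it can also be queried programmatically through Ollama's local REST API. The sketch below assumes a server listening on the default address `http://localhost:11434`; the helper names `build_chat_payload` and `ask` are illustrative, not part of the project:

```python
import json
import urllib.request

MODEL = "hf.co/cyberandy/sangue-e-grafi-gemma4-e2b-gguf"
OLLAMA_URL = "http://localhost:11434/api/chat"  # assumption: default local address


def build_chat_payload(question: str, model: str = MODEL) -> dict:
    """Build a non-streaming request body for Ollama's /api/chat endpoint."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": question}],
        "stream": False,
    }


def ask(question: str, url: str = OLLAMA_URL) -> str:
    """POST the question to a running Ollama server and return the reply text."""
    body = json.dumps(build_chat_payload(question)).encode("utf-8")
    request = urllib.request.Request(
        url, data=body, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read())["message"]["content"]


# Inspect the request body without contacting the server.
print(json.dumps(build_chat_payload("Your kinship question here..."), indent=2))
# reply = ask("Your kinship question here...")  # needs a running Ollama server
```

Setting `"stream": False` asks for one complete JSON response instead of a stream of partial chunks, which keeps the parsing to a single `json.loads` call.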

With llama.cpp CLI

# Download the GGUF file, then:
./llama-cli -m sangue-e-grafi-gemma-Q4_K_M.gguf -p "Your prompt here" -n 512
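
The download step and the CLI call can be scripted together. This is a sketch under stated assumptions: the filename is taken from the repository file listing, `huggingface_hub` is installed for the download, and `./llama-cli` is in the working directory; `download_gguf` and `build_llama_cli_command` are hypothetical helper names:

```python
import shlex
from typing import List

REPO_ID = "cyberandy/sangue-e-grafi-gemma4-e2b-gguf"
FILENAME = "sangue-e-grafi-gemma-Q4_K_M.gguf"  # as shown in the repository file listing


def download_gguf(repo_id: str = REPO_ID, filename: str = FILENAME) -> str:
    """Fetch the GGUF file from the Hugging Face Hub and return its local path."""
    from huggingface_hub import hf_hub_download  # third-party, imported lazily

    return hf_hub_download(repo_id=repo_id, filename=filename)


def build_llama_cli_command(model_path: str, prompt: str, n_predict: int = 512) -> List[str]:
    """Assemble the llama-cli argument list with the same flags as the command above."""
    return ["./llama-cli", "-m", model_path, "-p", prompt, "-n", str(n_predict)]


# Show the command that would be run, safely quoted for a shell.
print(shlex.join(build_llama_cli_command(FILENAME, "Your prompt here")))
# To run it for real (downloads the model first):
#   import subprocess
#   subprocess.run(build_llama_cli_command(download_gguf(), "Your prompt here"), check=True)
```

Passing the arguments as a list keeps prompts containing quotes or spaces intact without manual shell escaping.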

Intended Uses & Limitations

Intended uses:

  • Local/edge deployment of the KG-grounded agent
  • Quick experimentation without GPU or adapter merging
  • Integration with llama.cpp-based toolchains

Limitations:

  • Q4_K_M quantization may slightly reduce accuracy relative to the full-precision model
  • Still requires the KG agent framework for full pipeline performance
  • Domain-specific to Italian kinship / inheritance law

Source Adapters

This GGUF was built from:

  1. SFT adapter: sangue-e-grafi-gemma4-e2b-sft-adversarial-v7
  2. GRPO adapter: sangue-e-grafi-gemma4-e2b-grpo-run-f-v7

Project Links

| Resource | Link |
|---|---|
| 🚀 Live Demo | HF Space |
| 📦 GitHub | cyberandy/sangue-e-grafi |
| 📄 Paper | RLM-on-KG (arXiv:2604.17056) |
| 📊 Agent Traces Dataset | sangue-e-grafi-agent-traces |

Citation

@misc{sangue-e-grafi-2026,
  title   = {Sangue e Grafi: Small Models Beat Frontier LLMs on Adversarial Kinship Reasoning with Knowledge Graph Agents},
  author  = {Andrea Volpini},
  year    = {2026},
  url     = {https://github.com/cyberandy/sangue-e-grafi},
  note    = {Hugging Face Build Small Hackathon 2026}
}

Source: Hugging Face