abderrahmanskiredj1/gemmaroc-4b-tulu-q4_k_m-gguf is a Q4_K_M GGUF quantization indexed on GraySoft, with repository links, GGUF quant files, and Hugging Face metadata. This page helps you pick a local model for guIDE or other runtimes.
Model Intelligence Sheet
abderrahmanskiredj1/gemmaroc-4b-tulu-q4_k_m-gguf overview
This model was converted to GGUF format from GemMaroc/GemMaroc-4b-tulu using llama.cpp via ggml.ai's GGUF-my-repo space. Refer to the original model card for more details on the model.
| Field | Value |
|---|---|
| Downloads | 99 |
| Likes | 0 |
| Pipeline | text-generation |
| Library | — |
| Visibility | Public |
| Access | Open |
Repository Files & Downloads
1 file detected
Direct downloads for all repository files
| File | Type | Quantization | Size | Link |
|---|---|---|---|---|
| gemmaroc-4b-tulu-q4_k_m-imat.gguf | GGUF | Q4_K_M | 2.32 GB | Download |
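The file listed above can be fetched directly or pulled by llama.cpp at run time. The following is a minimal sketch, not the page's own tooling: it builds a direct download URL and the `llama-cli` invocation quoted in the repository README. The helper names `resolve_url` and `llama_cli_command` are hypothetical, and the `main` revision is an assumption.

```python
import shlex

# Repository and file name as listed in the table above.
REPO_ID = "AbderrahmanSkiredj1/GemMaroc-4b-tulu-Q4_K_M-GGUF"
GGUF_FILE = "gemmaroc-4b-tulu-q4_k_m-imat.gguf"


def resolve_url(repo_id, filename, revision="main"):
    """Build a Hugging Face direct-download URL (revision 'main' is assumed)."""
    return f"https://huggingface.co/{repo_id}/resolve/{revision}/{filename}"


def llama_cli_command(repo_id, filename, prompt):
    """Return a shell-quoted llama-cli command using the flags shown in the README."""
    args = ["llama-cli", "--hf-repo", repo_id, "--hf-file", filename, "-p", prompt]
    return shlex.join(args)


download_url = resolve_url(REPO_ID, GGUF_FILE)
command = llama_cli_command(
    REPO_ID, GGUF_FILE, "The meaning to life and the universe is"
)

print(download_url)
print(command)
```

Building the command with `shlex.join` keeps the prompt correctly quoted when it contains spaces or quotes, which matters once the prompt is Darija or mixed-script text.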
Model Details Live
Metadata Inspector
Normalized metadata (stored in metadata_json)
{
  "metadata": {},
  "card_data": {
    "base_model": "GemMaroc/GemMaroc-4b-tulu",
    "tags": [
      "llama-cpp",
      "gguf-my-repo",
      "Moroccan",
      "Darija",
      "GemMaroc",
      "conversational"
    ],
    "pipeline_tag": "text-generation",
    "language": [
      "ar",
      "ary",
      "en"
    ],
    "frontmatter": {
      "base_model": "GemMaroc/GemMaroc-4b-tulu",
      "tags": [
        "llama-cpp",
        "gguf-my-repo",
        "Moroccan",
        "Darija",
        "GemMaroc",
        "conversational"
      ],
      "pipeline_tag": "text-generation",
      "language": [
        "ar",
        "ary",
        "en"
      ]
    },
    "hero_image_url": "",
    "summary": "This model was converted to GGUF format from GemMaroc/GemMaroc-4b-tulu using llama.cpp via the ggml.ai's GGUF-my-repo space. Refer to the original model card for more details on the model.",
    "quick_links": [],
    "benchmark_table_html": "",
"readme_markdown": "---\nbase_model: GemMaroc/GemMaroc-4b-tulu\ntags:\n- llama-cpp\n- gguf-my-repo\n- Moroccan\n- Darija\n- GemMaroc\n- conversational\npipeline_tag: text-generation\nlanguage:\n- ar\n- ary\n- en\n---\n\n# AbderrahmanSkiredj1/GemMaroc-4b-tulu-Q4_K_M-GGUF\nThis model was converted to GGUF format from [`GemMaroc/GemMaroc-4b-tulu`](https://huggingface.co/GemMaroc/GemMaroc-4b-tulu) using llama.cpp via the ggml.ai's [GGUF-my-repo](https://huggingface.co/spaces/ggml-org/gguf-my-repo) space.\nRefer to the [original model card](https://huggingface.co/GemMaroc/GemMaroc-4b-tulu) for more details on the model.\n\n## Use with llama.cpp\nInstall llama.cpp through brew (works on Mac and Linux)\n\n```bash\nbrew install llama.cpp\n\n```\nInvoke the llama.cpp server or the CLI.\n\n### CLI:\n```bash\nllama-cli --hf-repo AbderrahmanSkiredj1/GemMaroc-4b-tulu-Q4_K_M-GGUF --hf-file gemmaroc-4b-tulu-q4_k_m-imat.gguf -p \"The meaning to life and the universe is\"\n```\n\n### Server:\n```bash\nllama-server --hf-repo AbderrahmanSkiredj1/GemMaroc-4b-tulu-Q4_K_M-GGUF --hf-file gemmaroc-4b-tulu-q4_k_m-imat.gguf -c 2048\n```\n\nNote: You can also use this checkpoint directly through the [usage steps](https://github.com/ggerganov/llama.cpp?tab=readme-ov-file#usage) listed in the Llama.cpp repo as well.\n\nStep 1: Clone llama.cpp from GitHub.\n```\ngit clone https://github.com/ggerganov/llama.cpp\n```\n\nStep 2: Move into the llama.cpp folder and build it with `LLAMA_CURL=1` flag along with other hardware-specific flags (for ex: LLAMA_CUDA=1 for Nvidia GPUs on Linux).\n```\ncd llama.cpp && LLAMA_CURL=1 make\n```\n\nStep 3: Run inference through the main binary.\n```\n./llama-cli --hf-repo AbderrahmanSkiredj1/GemMaroc-4b-tulu-Q4_K_M-GGUF --hf-file gemmaroc-4b-tulu-q4_k_m-imat.gguf -p \"The meaning to life and the universe is\"\n```\nor \n```\n./llama-server --hf-repo AbderrahmanSkiredj1/GemMaroc-4b-tulu-Q4_K_M-GGUF --hf-file gemmaroc-4b-tulu-q4_k_m-imat.gguf -c 
2048\n```\n\n\n---\nlibrary_name: transformers\ntags:\n- MoroccanArabic\n- Darija\n- GemMaroc\ndatasets:\n- GemMaroc/TULU-3-50k-darija-english\nlanguage:\n- ar\n- ary\n- en\nbase_model:\n- google/gemma-3-27b-it\n---\n\n\n\n# GemMaroc‑27B\n\nUnlocking **Moroccan Darija** proficiency in a state‑of‑the‑art large language model, trained with a *minimal‑data, green‑AI* recipe that preserves Gemma‑27B’s strong reasoning abilities while adding fluent Darija generation.\n\n---\n\n## Model at a glance\n\n| | Details |\n| ------------------- | ----------------------------------------------------------------------------------------------------------------------------- |\n| **Model ID** | `AbderrahmanSkiredj1/GemMaroc-27b-it` |\n| **Base model** | [`google/gemma-3-27b`](https://huggingface.co/google/gemma-3-27b) |\n| **Architecture** | Decoder‑only Transformer (Gemma 3) |\n| **Parameters** | 27 billion |\n| **Context length** | 2 048 tokens |\n| **Training regime** | Supervised fine‑tuning (LoRA → merged) on 50 K high‑quality Darija/English instructions TULU‑50K slice |\n| **Compute budget** | 48 GPU·h (8 × H100‑80GB × 6 h) – ≈ 26 kWh / 10 kg CO₂e |\n| **License** | Apache 2.0 |\n\n---\n\n## Why another Darija model?\n\n* **Inclusive AI** > 36 million speakers of Moroccan Arabic remain underserved by open LLMs.\n* **Quality‑over‑quantity** A carefully curated 50 K instruction set surfaces Darija competence without sacrificing cross‑lingual reasoning.\n* **Green AI** GemMaroc achieves Atlas‑Chat‑level Darija scores using < 2 % of the energy.\n\n---\n\n## Benchmark summary\n\n| Model | Darija MMLU | Darija HellaSwag | GSM8K @5 | HellaSwag (EN) |\n| ---------------- | ----------- | ---------------- | ---------- | -------------- |\n| Atlas‑Chat‑27B | **61.9 %** | 48.4 % | 82.0 % | 77.8 % |\n| **GemMaroc‑27B** | 61.6 % | **60.5 %** | **84.2 %** | **79.3 %** |\n\n<sub>Zero‑shot accuracy; full table in the paper.</sub>\n\n---\n\n## Quick start\n\n```python\nfrom transformers import 
AutoModelForCausalLM, AutoTokenizer, pipeline\n\nmodel_id = \"AbderrahmanSkiredj1/GemMaroc-27b-it\"\n\ntokenizer = AutoTokenizer.from_pretrained(model_id)\nmodel = AutoModelForCausalLM.from_pretrained(\n model_id,\n torch_dtype=\"auto\",\n device_map=\"auto\"\n)\n\npipe = pipeline(\n \"text-generation\",\n model=model,\n tokenizer=tokenizer,\n device_map=\"auto\",\n max_new_tokens=1024,\n temperature=0.7,\n repetition_penalty=1.2,\n no_repeat_ngram_size=3,\n)\n\nmessages = [\n {\"role\": \"user\", \"content\": \"شنو هي نظرية ‘butterfly effect’؟ فسّرها بدارجة ونقّط مثال بسيط.\"}\n]\n\nprompt = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)\nprint(pipe(prompt)[0][\"generated_text\"][len(prompt):])\n```\n\n### Chat template (Gemma 3 format)\n\nThe tokenizer provides a baked‑in Jinja template that starts with a **begin‑of‑sequence** token (`<bos>`), then alternates user/model turns, each wrapped by `<start_of_turn>` … `<end_of_turn>` markers. When you set `add_generation_prompt=True` it ends after the opening model tag so the model can continue:\n\n```\n<bos><start_of_turn>user\n{user message}<end_of_turn>\n<start_of_turn>model\n```\n\nThe assistant will keep generating tokens until it decides to emit `<end_of_turn>`.\n\n```python\nprompt = tokenizer.apply_chat_template(messages, add_generation_prompt=True, tokenize=False)\n```\n\nNo manual token juggling required—the call above handles BOS, turn delimiters, and newline placement automatically.\n\n---\n\nPre‑quantised checkpoints will be published under the same repo tags (`gemmaroc‑27b‑awq‑int4`, `gemmaroc‑27b‑gguf‑q4_k_m`).\n\n---\n\n## Training recipe (one‑paragraph recap)\n\n1. **Data** Translate a 44 K reasoning slice of TULU 50K into Darija, keeping 20 % English for cross‑lingual robustness.\n2. **LoRA SFT** Rank 16, α = 32, 3 epochs, bf16, context 2 048.\n3. 
**Merge & push** Merge LoRA into base weights (`peft.merge_and_unload`), convert to safetensors, upload.\n\n---\n\n## Limitations & ethical considerations\n\n* Sentiment and abstractive summarisation still trail state‑of‑the‑art.\n* Tokeniser is unchanged; rare Darija spellings may fragment.\n* Model may inherit societal biases present in pre‑training data.\n* No RLHF / RLAIF safety alignment yet – apply a moderation layer in production.\n\n---\n\n## Citation\n\nIf you use GemMaroc in your work, please cite:\n\n```bibtex\n@misc{skiredj2025gemmarocunlockingdarijaproficiency,\n title={GemMaroc: Unlocking Darija Proficiency in LLMs with Minimal Data}, \n author={Abderrahman Skiredj and Ferdaous Azhari and Houdaifa Atou and Nouamane Tazi and Ismail Berrada},\n year={2025},\n eprint={2505.17082},\n archivePrefix={arXiv},\n primaryClass={cs.CL},\n url={https://arxiv.org/abs/2505.17082}, \n}\n\n\n```\n",
    "related_quantizations": []
  },
  "tags": [
    "gguf",
    "gemma3",
    "llama-cpp",
    "gguf-my-repo",
    "Moroccan",
    "Darija",
    "GemMaroc",
    "conversational",
    "text-generation",
    "ar",
    "ary",
    "en",
    "arxiv:2505.17082",
    "base_model:GemMaroc/GemMaroc-4b-tulu",
    "base_model:quantized:GemMaroc/GemMaroc-4b-tulu",
    "endpoints_compatible",
    "region:us",
    "imatrix"
  ],
  "likes": 0,
  "downloads": 99,
  "gated": false,
  "private": false,
  "last_modified": "2025-06-18T07:55:22.000Z",
  "created_at": "2025-05-22T17:32:03.000Z",
  "pipeline_tag": "text-generation",
  "library_name": ""
}
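The README stored in `readme_markdown` describes the Gemma 3 chat format: a `<bos>` token, then user/model turns wrapped in `<start_of_turn>` and `<end_of_turn>`, ending on an open model turn when a generation prompt is requested. When passing a raw prompt to a runtime instead of calling `tokenizer.apply_chat_template`, the same layout could be assembled by hand. The sketch below is illustrative only: `build_gemma_prompt` and its parameters are hypothetical names, mapping the `assistant` role to `model` is an assumption based on the user/model turns the README mentions, and the newline after the final `model` tag is also assumed.

```python
def build_gemma_prompt(messages, add_generation_prompt=True, include_bos=True):
    """Assemble a Gemma-3-style chat prompt from a list of role/content dicts.

    Assumption: the 'assistant' role is rendered as 'model', matching the
    user/model turn markers described in the README.
    """
    parts = ["<bos>"] if include_bos else []
    for message in messages:
        role = "model" if message["role"] == "assistant" else message["role"]
        parts.append(f"<start_of_turn>{role}\n{message['content']}<end_of_turn>\n")
    if add_generation_prompt:
        # Leave the model turn open so generation continues from here.
        parts.append("<start_of_turn>model\n")
    return "".join(parts)


prompt = build_gemma_prompt([{"role": "user", "content": "Salam"}])
print(prompt)
```

Some runtimes may prepend the BOS token themselves; in that case `include_bos=False` avoids emitting it twice.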
Source payload excerpt (from Hugging Face API)
{
  "_id": "682f5f939c1dc16f3f6d25be",
  "id": "AbderrahmanSkiredj1/GemMaroc-4b-tulu-Q4_K_M-GGUF",
  "modelId": "AbderrahmanSkiredj1/GemMaroc-4b-tulu-Q4_K_M-GGUF",
  "sha": "704a41fa0e0f1e2a50fdfefbea3a115f5ad652ea",
  "createdAt": "2025-05-22T17:32:03.000Z",
  "lastModified": "2025-06-18T07:55:22.000Z",
  "author": "AbderrahmanSkiredj1",
  "downloads": 99,
  "likes": 0,
  "gated": false,
  "private": false,
  "pipeline_tag": "text-generation",
  "library_name": "",
  "siblings_count": 5
}
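The `createdAt` and `lastModified` fields in the payload are UTC timestamps with millisecond precision. As a small sketch of how they might be read, the snippet below parses a subset of the excerpt and computes the gap between the two. The helper name `parse_hf_timestamp` and the format string are assumptions chosen to match the values shown here, not a documented contract of the API.

```python
import json
from datetime import datetime, timezone

# Subset of the payload excerpt above, copied verbatim.
payload_text = """
{
  "id": "AbderrahmanSkiredj1/GemMaroc-4b-tulu-Q4_K_M-GGUF",
  "createdAt": "2025-05-22T17:32:03.000Z",
  "lastModified": "2025-06-18T07:55:22.000Z"
}
"""

# Assumed layout: ISO date and time, fractional seconds, literal 'Z' for UTC.
HF_TIMESTAMP_FORMAT = "%Y-%m-%dT%H:%M:%S.%fZ"


def parse_hf_timestamp(value):
    """Parse a timestamp like '2025-05-22T17:32:03.000Z' into an aware datetime."""
    return datetime.strptime(value, HF_TIMESTAMP_FORMAT).replace(tzinfo=timezone.utc)


payload = json.loads(payload_text)
created_at = parse_hf_timestamp(payload["createdAt"])
last_modified = parse_hf_timestamp(payload["lastModified"])
age = last_modified - created_at

print(f"Last modified {age.days} full days after creation")
```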