Model Intelligence Sheet

richarderkhov/zebrallama_-_zebra-llama-v0.2-gguf overview

Quantization made by Richard Erkhov.

zebra-Llama-v0.2 - GGUF

| Name | Quant method | Size |
| ---- | ---- | ---- |
| zebra-Llama-v0.2.Q2_K.gguf | Q2_K | 2.96GB |
| zebra-Llama-v0.2.IQ3_XS.gguf | IQ3_XS | 3.28GB |
| zebra-Llama-v0.2.IQ3_S.gguf | IQ3_S | 3.43GB |
| zebra-Llama-v0.2.Q3_K_S.gguf | Q3_K_S | 3.41GB |
| zebra-Llama-v0.2.IQ3_M.gguf | IQ3_M | 3.52GB |
| zebra-Llama-v0.2.Q3_K.gguf | Q3_K | 3.74GB |
| zebra-Llama-v0.2.Q3_K_M.gguf | Q3_K_M | 3.74GB |
| zebra-Llama-v0.2.Q3_K_L.gguf | Q3_K_L | 4.03GB |
| zebra-Llama-v0.2.IQ4_XS.gguf | IQ4_XS | 4.18GB |
| zebra-Llama-v0.2.Q4_0.gguf | Q4_0 | 4.34GB |
| zebra-Llama-v0.2.IQ4_NL.gguf | IQ4_NL | 4.38GB |
| zebra-Llama-v0.2.Q4_K_S.gguf | Q4_K_S | 4.37GB |
| zebra-Llama-v0.2.Q4_K.gguf | Q4_K | 4.58GB |
| zebra-Llama-v0.2.Q4_K_M.gguf | Q4_K_M | 4.58GB |
| zebra-Llama-v0.2.Q4_1.gguf | Q4_1 | 4.78GB |
| zebra-Llama-v0.2.Q5_0.gguf | Q5_0 | 5.21GB |
| zebra-Llama-v0.2.Q5_K_S.gguf | Q5_K_S | 5.21GB |
| zebra-Llama-v0.2.Q5_K.gguf | Q5_K | 5.34GB |
| zebra-Llama-v0.2.Q5_K_M.gguf | Q5_K_M | 5.34GB |
| zebra-Llama-v0.2.Q5_1.gguf | Q5_1 | 5.65GB |
| zebra-Llama-v0.2.Q6_K.gguf | Q6_K | 6.14GB |
| zebra-Llama-v0.2.Q8_0.gguf | Q8_0 | 7.95GB |

Original model description: library_name: transformers

Tags: gguf, arxiv:2411.02657, endpoints_compatible, region:us, conversational

Downloads

168

Likes

0

Pipeline

—

Library

—

Visibility

Public

Access

Open

Repository Files & Downloads

22 files detected

Direct downloads for all repository files

File	Type	Quantization	Size	Link
zebra-Llama-v0.2.IQ3_M.gguf	GGUF	IQ3_M	3.52 GB	Download
zebra-Llama-v0.2.IQ3_S.gguf	GGUF	IQ3_S	3.43 GB	Download
zebra-Llama-v0.2.IQ3_XS.gguf	GGUF	IQ3_XS	3.28 GB	Download
zebra-Llama-v0.2.IQ4_NL.gguf	GGUF	IQ4_NL	4.38 GB	Download
zebra-Llama-v0.2.IQ4_XS.gguf	GGUF	IQ4_XS	4.18 GB	Download
zebra-Llama-v0.2.Q2_K.gguf	GGUF	Q2_K	2.96 GB	Download
zebra-Llama-v0.2.Q3_K.gguf	GGUF	Q3_K	3.74 GB	Download
zebra-Llama-v0.2.Q3_K_L.gguf	GGUF	Q3_K_L	4.03 GB	Download
zebra-Llama-v0.2.Q3_K_M.gguf	GGUF	Q3_K_M	3.74 GB	Download
zebra-Llama-v0.2.Q3_K_S.gguf	GGUF	Q3_K_S	3.41 GB	Download
zebra-Llama-v0.2.Q4_0.gguf	GGUF	Q4_0	4.34 GB	Download
zebra-Llama-v0.2.Q4_1.gguf	GGUF	Q4_1	4.78 GB	Download
zebra-Llama-v0.2.Q4_K.gguf	GGUF	Q4_K	4.58 GB	Download
zebra-Llama-v0.2.Q4_K_M.gguf	GGUF	Q4_K_M	4.58 GB	Download
zebra-Llama-v0.2.Q4_K_S.gguf	GGUF	Q4_K_S	4.37 GB	Download
zebra-Llama-v0.2.Q5_0.gguf	GGUF	Q5_0	5.21 GB	Download
zebra-Llama-v0.2.Q5_1.gguf	GGUF	Q5_1	5.65 GB	Download
zebra-Llama-v0.2.Q5_K.gguf	GGUF	Q5_K	5.34 GB	Download
zebra-Llama-v0.2.Q5_K_M.gguf	GGUF	Q5_K_M	5.34 GB	Download
zebra-Llama-v0.2.Q5_K_S.gguf	GGUF	Q5_K_S	5.21 GB	Download
zebra-Llama-v0.2.Q6_K.gguf	GGUF	Q6_K	6.14 GB	Download
zebra-Llama-v0.2.Q8_0.gguf	GGUF	Q8_0	7.95 GB	Download
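
The table above can be used programmatically to pick a file. The following is a minimal sketch, not part of the original card: it reads the quantization label out of a file name, picks the largest file under a size budget, and builds a direct-download URL. The helper names (`quant_label`, `largest_fitting`, `download_url`) are hypothetical, the `resolve/main` URL pattern is assumed from the Hub's usual layout (the card itself links to `blob/main` pages), and only a few rows of the table are copied in.

```python
import re

REPO_ID = "RichardErkhov/zebraLLAMA_-_zebra-Llama-v0.2-gguf"

# A few (file name, size in GB) rows copied from the table above.
FILES = [
    ("zebra-Llama-v0.2.Q2_K.gguf", 2.96),
    ("zebra-Llama-v0.2.Q4_0.gguf", 4.34),
    ("zebra-Llama-v0.2.Q4_K_M.gguf", 4.58),
    ("zebra-Llama-v0.2.Q6_K.gguf", 6.14),
    ("zebra-Llama-v0.2.Q8_0.gguf", 7.95),
]

# The quantization label is the dotted segment just before ".gguf",
# e.g. "Q4_0", "Q3_K_L" or "IQ3_XS".
QUANT_RE = re.compile(r"\.((?:IQ|Q)\d+(?:_[A-Z0-9]+)*)\.gguf$")


def quant_label(filename):
    """Return the quantization label embedded in a GGUF file name, or None."""
    match = QUANT_RE.search(filename)
    return match.group(1) if match else None


def largest_fitting(files, budget_gb):
    """Return the largest (name, size) entry whose size fits the budget, or None."""
    fitting = [entry for entry in files if entry[1] <= budget_gb]
    return max(fitting, key=lambda entry: entry[1]) if fitting else None


def download_url(repo_id, filename):
    """Build a direct-download URL (assumed 'resolve/main' pattern)."""
    return f"https://huggingface.co/{repo_id}/resolve/main/{filename}"


if __name__ == "__main__":
    choice = largest_fitting(FILES, budget_gb=5.0)
    if choice is not None:
        name, size = choice
        print(quant_label(name), size, download_url(REPO_ID, name))
```

The file size is only a lower bound on the memory needed at inference time, since the runtime also allocates a context cache, so a real budget should leave headroom.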

Model Details

Model Slug

richarderkhov/zebrallama_-_zebra-llama-v0.2-gguf

Author

RichardErkhov

Pipeline Task

—

Library

—

Created

2025-06-08

Last Modified

2025-06-08

Gated

No

Private

No

HF SHA

062fdf83f5cfafb6c3cbfea8cf7498cd0562d701

License

Unknown

Language

Unknown

Base Model

zebraLLAMA/zebra-Llama-v0.2 (fine-tuned from meta-llama/Llama-3.1-8B-Instruct, per the README)

Metadata Inspector

Normalized metadata (stored in metadata_json)

{
  "metadata": {},
  "card_data": {
    "frontmatter": {},
    "hero_image_url": "https://cdn-uploads.huggingface.co/production/uploads/6515dc0cca07b261439e8f0d/QB1jsyiNpiolGRFy6AItA.png",
    "summary": "Quantization made by Richard Erkhov. Github Discord Request more models zebra-Llama-v0.2 - GGUF | Name | Quant method | Size | | ---- | ---- | ---- | | zebra-Llama-v0.2.Q2_K.gguf | Q2_K | 2.96GB | | zebra-Llama-v0.2.IQ3_XS.gguf | IQ3_XS | 3.28GB | | zebra-Llama-v0.2.IQ3_S.gguf | IQ3_S | 3.43GB | | zebra-Llama-v0.2.Q3_K_S.gguf | Q3_K_S | 3.41GB | | zebra-Llama-v0.2.IQ3_M.gguf | IQ3_M | 3.52GB | | zebra-Llama-v0.2.Q3_K.gguf | Q3_K | 3.74GB | | zebra-Llama-v0.2.Q3_K_M.gguf | Q3_K_M | 3.74GB | | zebra-Llama-v0.2.Q3_K_L.gguf | Q3_K_L | 4.03GB | | zebra-Llama-v0.2.IQ4_XS.gguf | IQ4_XS | 4.18GB | | zebra-Llama-v0.2.Q4_0.gguf | Q4_0 | 4.34GB | | zebra-Llama-v0.2.IQ4_NL.gguf | IQ4_NL | 4.38GB | | zebra-Llama-v0.2.Q4_K_S.gguf | Q4_K_S | 4.37GB | | zebra-Llama-v0.2.Q4_K.gguf | Q4_K | 4.58GB | | zebra-Llama-v0.2.Q4_K_M.gguf | Q4_K_M | 4.58GB | | zebra-Llama-v0.2.Q4_1.gguf | Q4_1 | 4.78GB | | zebra-Llama-v0.2.Q5_0.gguf | Q5_0 | 5.21GB | | zebra-Llama-v0.2.Q5_K_S.gguf | Q5_K_S | 5.21GB | | zebra-Llama-v0.2.Q5_K.gguf | Q5_K | 5.34GB | | zebra-Llama-v0.2.Q5_K_M.gguf | Q5_K_M | 5.34GB | | zebra-Llama-v0.2.Q5_1.gguf | Q5_1 | 5.65GB | | zebra-Llama-v0.2.Q6_K.gguf | Q6_K | 6.14GB | | zebra-Llama-v0.2.Q8_0.gguf | Q8_0 | 7.95GB | Original model description: --- library_name: transformers tags: ---",
    "quick_links": [],
    "benchmark_table_html": "",
    "readme_markdown": "Quantization made by Richard Erkhov.\n\n[Github](https://github.com/RichardErkhov)\n\n[Discord](https://discord.gg/pvy7H8DZMG)\n\n[Request more models](https://github.com/RichardErkhov/quant_request)\n\n\nzebra-Llama-v0.2 - GGUF\n- Model creator: https://huggingface.co/zebraLLAMA/\n- Original model: https://huggingface.co/zebraLLAMA/zebra-Llama-v0.2/\n\n\n| Name | Quant method | Size |\n| ---- | ---- | ---- |\n| [zebra-Llama-v0.2.Q2_K.gguf](https://huggingface.co/RichardErkhov/zebraLLAMA_-_zebra-Llama-v0.2-gguf/blob/main/zebra-Llama-v0.2.Q2_K.gguf) | Q2_K | 2.96GB |\n| [zebra-Llama-v0.2.IQ3_XS.gguf](https://huggingface.co/RichardErkhov/zebraLLAMA_-_zebra-Llama-v0.2-gguf/blob/main/zebra-Llama-v0.2.IQ3_XS.gguf) | IQ3_XS | 3.28GB |\n| [zebra-Llama-v0.2.IQ3_S.gguf](https://huggingface.co/RichardErkhov/zebraLLAMA_-_zebra-Llama-v0.2-gguf/blob/main/zebra-Llama-v0.2.IQ3_S.gguf) | IQ3_S | 3.43GB |\n| [zebra-Llama-v0.2.Q3_K_S.gguf](https://huggingface.co/RichardErkhov/zebraLLAMA_-_zebra-Llama-v0.2-gguf/blob/main/zebra-Llama-v0.2.Q3_K_S.gguf) | Q3_K_S | 3.41GB |\n| [zebra-Llama-v0.2.IQ3_M.gguf](https://huggingface.co/RichardErkhov/zebraLLAMA_-_zebra-Llama-v0.2-gguf/blob/main/zebra-Llama-v0.2.IQ3_M.gguf) | IQ3_M | 3.52GB |\n| [zebra-Llama-v0.2.Q3_K.gguf](https://huggingface.co/RichardErkhov/zebraLLAMA_-_zebra-Llama-v0.2-gguf/blob/main/zebra-Llama-v0.2.Q3_K.gguf) | Q3_K | 3.74GB |\n| [zebra-Llama-v0.2.Q3_K_M.gguf](https://huggingface.co/RichardErkhov/zebraLLAMA_-_zebra-Llama-v0.2-gguf/blob/main/zebra-Llama-v0.2.Q3_K_M.gguf) | Q3_K_M | 3.74GB |\n| [zebra-Llama-v0.2.Q3_K_L.gguf](https://huggingface.co/RichardErkhov/zebraLLAMA_-_zebra-Llama-v0.2-gguf/blob/main/zebra-Llama-v0.2.Q3_K_L.gguf) | Q3_K_L | 4.03GB |\n| [zebra-Llama-v0.2.IQ4_XS.gguf](https://huggingface.co/RichardErkhov/zebraLLAMA_-_zebra-Llama-v0.2-gguf/blob/main/zebra-Llama-v0.2.IQ4_XS.gguf) | IQ4_XS | 4.18GB |\n| 
[zebra-Llama-v0.2.Q4_0.gguf](https://huggingface.co/RichardErkhov/zebraLLAMA_-_zebra-Llama-v0.2-gguf/blob/main/zebra-Llama-v0.2.Q4_0.gguf) | Q4_0 | 4.34GB |\n| [zebra-Llama-v0.2.IQ4_NL.gguf](https://huggingface.co/RichardErkhov/zebraLLAMA_-_zebra-Llama-v0.2-gguf/blob/main/zebra-Llama-v0.2.IQ4_NL.gguf) | IQ4_NL | 4.38GB |\n| [zebra-Llama-v0.2.Q4_K_S.gguf](https://huggingface.co/RichardErkhov/zebraLLAMA_-_zebra-Llama-v0.2-gguf/blob/main/zebra-Llama-v0.2.Q4_K_S.gguf) | Q4_K_S | 4.37GB |\n| [zebra-Llama-v0.2.Q4_K.gguf](https://huggingface.co/RichardErkhov/zebraLLAMA_-_zebra-Llama-v0.2-gguf/blob/main/zebra-Llama-v0.2.Q4_K.gguf) | Q4_K | 4.58GB |\n| [zebra-Llama-v0.2.Q4_K_M.gguf](https://huggingface.co/RichardErkhov/zebraLLAMA_-_zebra-Llama-v0.2-gguf/blob/main/zebra-Llama-v0.2.Q4_K_M.gguf) | Q4_K_M | 4.58GB |\n| [zebra-Llama-v0.2.Q4_1.gguf](https://huggingface.co/RichardErkhov/zebraLLAMA_-_zebra-Llama-v0.2-gguf/blob/main/zebra-Llama-v0.2.Q4_1.gguf) | Q4_1 | 4.78GB |\n| [zebra-Llama-v0.2.Q5_0.gguf](https://huggingface.co/RichardErkhov/zebraLLAMA_-_zebra-Llama-v0.2-gguf/blob/main/zebra-Llama-v0.2.Q5_0.gguf) | Q5_0 | 5.21GB |\n| [zebra-Llama-v0.2.Q5_K_S.gguf](https://huggingface.co/RichardErkhov/zebraLLAMA_-_zebra-Llama-v0.2-gguf/blob/main/zebra-Llama-v0.2.Q5_K_S.gguf) | Q5_K_S | 5.21GB |\n| [zebra-Llama-v0.2.Q5_K.gguf](https://huggingface.co/RichardErkhov/zebraLLAMA_-_zebra-Llama-v0.2-gguf/blob/main/zebra-Llama-v0.2.Q5_K.gguf) | Q5_K | 5.34GB |\n| [zebra-Llama-v0.2.Q5_K_M.gguf](https://huggingface.co/RichardErkhov/zebraLLAMA_-_zebra-Llama-v0.2-gguf/blob/main/zebra-Llama-v0.2.Q5_K_M.gguf) | Q5_K_M | 5.34GB |\n| [zebra-Llama-v0.2.Q5_1.gguf](https://huggingface.co/RichardErkhov/zebraLLAMA_-_zebra-Llama-v0.2-gguf/blob/main/zebra-Llama-v0.2.Q5_1.gguf) | Q5_1 | 5.65GB |\n| [zebra-Llama-v0.2.Q6_K.gguf](https://huggingface.co/RichardErkhov/zebraLLAMA_-_zebra-Llama-v0.2-gguf/blob/main/zebra-Llama-v0.2.Q6_K.gguf) | Q6_K | 6.14GB |\n| 
[zebra-Llama-v0.2.Q8_0.gguf](https://huggingface.co/RichardErkhov/zebraLLAMA_-_zebra-Llama-v0.2-gguf/blob/main/zebra-Llama-v0.2.Q8_0.gguf) | Q8_0 | 7.95GB |\n\n\n\n\nOriginal model description:\n---\nlibrary_name: transformers\ntags:\n- biology\n- medical\n---\n\n## zebra-Llama/zebra-Llama-v0.2\n\nZebra-Llama v0.2 is a specialized version of the Llama-3.1-8b-instruct model, fine-tuned with data specific to the rare disease Ehlers-Danlos Syndrome (EDS) - a rare connective tissue disorder. We utilized textual information from over 4,000 EDS papers from PubMed, more than 8,000 Reddit posts about EDS, and over 5,000 posts from the Inspire forum to gather real-world concerns/questions related to EDS, which were used to fine-tune the model. As a result, this model is adept at providing accurate responses to questions regarding EDS.\n\nThe model is trained using a specialized approach called \"context-aware training,\" where we provided context for each question from a custom vector database during the training phase. This approach enabled the model to demonstrate high precision and recall during the inference phase when utilizing the RAG context. Additionally, the model showed a higher likelihood of generating correct citations compared to the base model.\n\n\n## What is new in this version of zebra-Llama?\n- Compared to the previous version (zebraLLAMA/zebra-Llama-v0.1), the latest Zebra-Llama model delivers more comprehensive and in-depth explanations for questions about the rare disease Ehlers-Danlos Syndrome.\n\n- The latest version has a greater ability to provide citations consistently compared to the previous version. 
\n\n- In addition to improved citation ability, it has also been benchmarked against the base model (meta-llama/Llama-3.1-8B-Instruct) and demonstrates superior text generation capabilities in terms of thoroughness, accuracy, and clarity, based on expert evaluation.\n\n- From a modeling perspective, the latest version utilizes \"meta-llama/Llama-3.1-8B-Instruct\" as its base model, while the earlier version (v0.1) was built on \"meta-llama/Meta-Llama-3-8B-Instruct\".\n\n## Model Details\n\nBase model : meta-llama/Llama-3.1-8B-Instruct\n\n\n<img src=\"https://cdn-uploads.huggingface.co/production/uploads/6515dc0cca07b261439e8f0d/QB1jsyiNpiolGRFy6AItA.png\" alt=\"Model Diagram\" width=\"1000\"/>\n\n### Model Sources\n\n**Repository:** https://github.com/karthiksoman/zebra-Llama\n\n**Custom built RAG API for rare diseases (focused on EDS):**\n\n• Base URL: https://zebra-llama-rag.onrender.com\n\n• Endpoint: /search\n\n**Jupyter Notebook Demo of Zebra-Llama:**\n\nhttps://github.com/karthiksoman/zebra-Llama/blob/main/code/notebook/zebra_llama_v0.2_demo.ipynb\n\n## Uses\n\nZebra-Llama can be used to generate answers related to EDS questions.\n\n### Out-of-Scope Use\n\nThis Language Model is intended for academic and research purposes only. It is not for clinical use or medical decision-making. 
Consult a healthcare professional for medical advice.\n\n## Training Details\n\nFine tuning method : LoRA\n\nLoRA rank : 16\n\nLoRA alpha : 16\n\nLORA dropout : 0.01\n\nLORA target modules : [\"q_proj\", \"k_proj\", \"v_proj\"]\n\nTrain epochs : 2\n\nLearning rate : 1e-4\n\nLR scheduler type : constant\n\nMax grad norm : 1\n\nBATCH_SIZE_PER_GPU_FOR_TRAINING : 2\n\nGRADIENT_ACCUMULATION_STEPS : 1\n\n## Citation\n```\n@misc{soman2024zebrallamacontextawarelargelanguage,\n      title={Zebra-Llama: A Context-Aware Large Language Model for Democratizing Rare Disease Knowledge}, \n      author={Karthik Soman and Andrew Langdon and Catalina Villouta and Chinmay Agrawal and Lashaw Salta and Braian Peetoom and Gianmarco Bellucci and Orion J Buske},\n      year={2024},\n      eprint={2411.02657},\n      archivePrefix={arXiv},\n      primaryClass={cs.CL},\n      url={https://arxiv.org/abs/2411.02657}, \n}\n```\n\n## Contact\n\nDr. Karthik Soman - karthi.soman@gmail.com\n\nAndrew Langdon - andrewlngdn@gmail.com\n\nChinmay Agrawal - chag7212@colorado.edu\n\nCatalina Villouta - catalina.villouta.r@gmail.com\n\nDr. Orion Buske - orion@phenotips.com\n\nLashaw Salta - lashawsalta@gmail.com\n\n",
    "related_quantizations": []
  },
  "tags": [
    "gguf",
    "arxiv:2411.02657",
    "endpoints_compatible",
    "region:us",
    "conversational"
  ],
  "likes": 0,
  "downloads": 168,
  "gated": false,
  "private": false,
  "last_modified": "2025-06-08T02:53:42.000Z",
  "created_at": "2025-06-08T01:38:55.000Z",
  "pipeline_tag": "",
  "library_name": ""
}
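
The Training Details section of the README embedded above lists LoRA hyperparameters (rank 16, alpha 16, dropout 0.01, two epochs, batch size 2 per GPU). As a rough sketch of what those numbers imply, the snippet below records them in a hypothetical `LoraSettings` dataclass and derives two quantities. It assumes the standard LoRA convention of scaling the low-rank update by alpha / rank, which the card does not confirm, and the number of GPUs is a made-up parameter because the card does not state it.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LoraSettings:
    """Hyperparameters copied from the README's Training Details section."""
    rank: int = 16
    alpha: int = 16
    dropout: float = 0.01
    target_modules: tuple = ("q_proj", "k_proj", "v_proj")
    train_epochs: int = 2
    learning_rate: float = 1e-4
    per_gpu_batch_size: int = 2
    grad_accum_steps: int = 1

    @property
    def scaling(self):
        # Assumption: standard LoRA scaling factor alpha / rank.
        return self.alpha / self.rank

    def effective_batch_size(self, num_gpus):
        # num_gpus is hypothetical; the card does not report it.
        return self.per_gpu_batch_size * self.grad_accum_steps * num_gpus


if __name__ == "__main__":
    settings = LoraSettings()
    print("scaling:", settings.scaling)
    print("effective batch on 4 GPUs:", settings.effective_batch_size(4))
```

With rank and alpha both set to 16 the assumed scaling factor is 1, so under that convention the adapter update is applied unscaled.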

Source payload excerpt (from Hugging Face API)

{
  "_id": "6844e9af14c7a1b16f5d1ad9",
  "id": "RichardErkhov/zebraLLAMA_-_zebra-Llama-v0.2-gguf",
  "modelId": "RichardErkhov/zebraLLAMA_-_zebra-Llama-v0.2-gguf",
  "sha": "062fdf83f5cfafb6c3cbfea8cf7498cd0562d701",
  "createdAt": "2025-06-08T01:38:55.000Z",
  "lastModified": "2025-06-08T02:53:42.000Z",
  "author": "RichardErkhov",
  "downloads": 168,
  "likes": 0,
  "gated": false,
  "private": false,
  "pipeline_tag": "",
  "library_name": "",
  "siblings_count": 24
}
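
The fields shown under Model Details can be derived from this payload. The sketch below illustrates one way to do that mapping; it is not the page's actual pipeline. The `normalize` and `parse_ts` helpers and the output key names are hypothetical, and empty strings are mapped to `None` as an assumed convention for "not set".

```python
import json
from datetime import datetime, timezone

# The payload excerpt above, reproduced as a JSON string.
PAYLOAD = """
{
  "_id": "6844e9af14c7a1b16f5d1ad9",
  "id": "RichardErkhov/zebraLLAMA_-_zebra-Llama-v0.2-gguf",
  "modelId": "RichardErkhov/zebraLLAMA_-_zebra-Llama-v0.2-gguf",
  "sha": "062fdf83f5cfafb6c3cbfea8cf7498cd0562d701",
  "createdAt": "2025-06-08T01:38:55.000Z",
  "lastModified": "2025-06-08T02:53:42.000Z",
  "author": "RichardErkhov",
  "downloads": 168,
  "likes": 0,
  "gated": false,
  "private": false,
  "pipeline_tag": "",
  "library_name": "",
  "siblings_count": 24
}
"""


def parse_ts(value):
    """Parse a timestamp such as '2025-06-08T01:38:55.000Z' as UTC."""
    parsed = datetime.strptime(value, "%Y-%m-%dT%H:%M:%S.%fZ")
    return parsed.replace(tzinfo=timezone.utc)


def normalize(payload):
    """Map raw payload keys to display fields (hypothetical key names)."""
    created = parse_ts(payload["createdAt"])
    modified = parse_ts(payload["lastModified"])
    return {
        "slug": payload["id"].lower(),
        "author": payload["author"],
        "created": created.date().isoformat(),
        "last_modified": modified.date().isoformat(),
        "seconds_between": int((modified - created).total_seconds()),
        # Empty strings in the payload mean the field is not set.
        "pipeline": payload.get("pipeline_tag") or None,
        "library": payload.get("library_name") or None,
        "gated": bool(payload["gated"]),
        "private": bool(payload["private"]),
    }


if __name__ == "__main__":
    print(json.dumps(normalize(json.loads(PAYLOAD)), indent=2))
```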