RichardErkhov/zebraLLAMA_-_zebra-Llama-v0.2-gguf overview
Quantization made by Richard Erkhov. This repository contains GGUF quantizations of zebra-Llama-v0.2 (original model: zebraLLAMA/zebra-Llama-v0.2, library: transformers). The 22 available files, from Q2_K (2.96 GB) to Q8_0 (7.95 GB), are listed in the table below.
Repository Files & Downloads
| File | Type | Quantization | Size | Link |
|---|---|---|---|---|
| zebra-Llama-v0.2.IQ3_M.gguf | GGUF | IQ3_M | 3.52 GB | Download |
| zebra-Llama-v0.2.IQ3_S.gguf | GGUF | IQ3_S | 3.43 GB | Download |
| zebra-Llama-v0.2.IQ3_XS.gguf | GGUF | IQ3_XS | 3.28 GB | Download |
| zebra-Llama-v0.2.IQ4_NL.gguf | GGUF | IQ4_NL | 4.38 GB | Download |
| zebra-Llama-v0.2.IQ4_XS.gguf | GGUF | IQ4_XS | 4.18 GB | Download |
| zebra-Llama-v0.2.Q2_K.gguf | GGUF | Q2_K | 2.96 GB | Download |
| zebra-Llama-v0.2.Q3_K.gguf | GGUF | Q3_K | 3.74 GB | Download |
| zebra-Llama-v0.2.Q3_K_L.gguf | GGUF | Q3_K_L | 4.03 GB | Download |
| zebra-Llama-v0.2.Q3_K_M.gguf | GGUF | Q3_K_M | 3.74 GB | Download |
| zebra-Llama-v0.2.Q3_K_S.gguf | GGUF | Q3_K_S | 3.41 GB | Download |
| zebra-Llama-v0.2.Q4_0.gguf | GGUF | Q4_0 | 4.34 GB | Download |
| zebra-Llama-v0.2.Q4_1.gguf | GGUF | Q4_1 | 4.78 GB | Download |
| zebra-Llama-v0.2.Q4_K.gguf | GGUF | Q4_K | 4.58 GB | Download |
| zebra-Llama-v0.2.Q4_K_M.gguf | GGUF | Q4_K_M | 4.58 GB | Download |
| zebra-Llama-v0.2.Q4_K_S.gguf | GGUF | Q4_K_S | 4.37 GB | Download |
| zebra-Llama-v0.2.Q5_0.gguf | GGUF | Q5_0 | 5.21 GB | Download |
| zebra-Llama-v0.2.Q5_1.gguf | GGUF | Q5_1 | 5.65 GB | Download |
| zebra-Llama-v0.2.Q5_K.gguf | GGUF | Q5_K | 5.34 GB | Download |
| zebra-Llama-v0.2.Q5_K_M.gguf | GGUF | Q5_K_M | 5.34 GB | Download |
| zebra-Llama-v0.2.Q5_K_S.gguf | GGUF | Q5_K_S | 5.21 GB | Download |
| zebra-Llama-v0.2.Q6_K.gguf | GGUF | Q6_K | 6.14 GB | Download |
| zebra-Llama-v0.2.Q8_0.gguf | GGUF | Q8_0 | 7.95 GB | Download |
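The table above can be turned into a small lookup for choosing a file. The following is a minimal sketch, not part of the original repository: the helper names (`filename_for`, `download_url`, `largest_quant_under`) are hypothetical, the sizes are copied from the table, and the `resolve/main` URL pattern is assumed from the usual Hugging Face direct-download convention rather than taken from this page. Note that the budget here is compared against file size only; actual memory use at inference time will be higher.

```python
from typing import Optional

# Repository id as it appears in the metadata further down this page.
REPO_ID = "RichardErkhov/zebraLLAMA_-_zebra-Llama-v0.2-gguf"

# Quantization name -> file size in GB, copied from the table above.
QUANT_SIZES_GB = {
    "Q2_K": 2.96, "IQ3_XS": 3.28, "Q3_K_S": 3.41, "IQ3_S": 3.43,
    "IQ3_M": 3.52, "Q3_K": 3.74, "Q3_K_M": 3.74, "Q3_K_L": 4.03,
    "IQ4_XS": 4.18, "Q4_0": 4.34, "Q4_K_S": 4.37, "IQ4_NL": 4.38,
    "Q4_K": 4.58, "Q4_K_M": 4.58, "Q4_1": 4.78, "Q5_0": 5.21,
    "Q5_K_S": 5.21, "Q5_K": 5.34, "Q5_K_M": 5.34, "Q5_1": 5.65,
    "Q6_K": 6.14, "Q8_0": 7.95,
}


def filename_for(quant: str) -> str:
    """Build the GGUF file name used in this repository for a quantization."""
    return f"zebra-Llama-v0.2.{quant}.gguf"


def download_url(quant: str) -> str:
    """Build a direct-download URL (assumed `resolve/main` convention)."""
    return f"https://huggingface.co/{REPO_ID}/resolve/main/{filename_for(quant)}"


def largest_quant_under(budget_gb: float) -> Optional[str]:
    """Return the largest quantization whose file fits the budget, or None.

    Ties in size are broken by name, so the result is deterministic.
    """
    fitting = [(size, quant) for quant, size in QUANT_SIZES_GB.items()
               if size <= budget_gb]
    if not fitting:
        return None
    return max(fitting)[1]


if __name__ == "__main__":
    choice = largest_quant_under(5.0)
    print(choice, download_url(choice) if choice else "no file fits")
```

Picking the largest file under a budget is only one possible heuristic; larger quantizations generally keep more of the original precision, but the right trade-off depends on the hardware and the task.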
Model Details
Metadata Inspector
Normalized metadata (stored in metadata_json)
{
"metadata": {},
"card_data": {
"frontmatter": {},
"hero_image_url": "https://cdn-uploads.huggingface.co/production/uploads/6515dc0cca07b261439e8f0d/QB1jsyiNpiolGRFy6AItA.png",
"summary": "Quantization made by Richard Erkhov. Github Discord Request more models zebra-Llama-v0.2 - GGUF | Name | Quant method | Size | | ---- | ---- | ---- | | zebra-Llama-v0.2.Q2_K.gguf | Q2_K | 2.96GB | | zebra-Llama-v0.2.IQ3_XS.gguf | IQ3_XS | 3.28GB | | zebra-Llama-v0.2.IQ3_S.gguf | IQ3_S | 3.43GB | | zebra-Llama-v0.2.Q3_K_S.gguf | Q3_K_S | 3.41GB | | zebra-Llama-v0.2.IQ3_M.gguf | IQ3_M | 3.52GB | | zebra-Llama-v0.2.Q3_K.gguf | Q3_K | 3.74GB | | zebra-Llama-v0.2.Q3_K_M.gguf | Q3_K_M | 3.74GB | | zebra-Llama-v0.2.Q3_K_L.gguf | Q3_K_L | 4.03GB | | zebra-Llama-v0.2.IQ4_XS.gguf | IQ4_XS | 4.18GB | | zebra-Llama-v0.2.Q4_0.gguf | Q4_0 | 4.34GB | | zebra-Llama-v0.2.IQ4_NL.gguf | IQ4_NL | 4.38GB | | zebra-Llama-v0.2.Q4_K_S.gguf | Q4_K_S | 4.37GB | | zebra-Llama-v0.2.Q4_K.gguf | Q4_K | 4.58GB | | zebra-Llama-v0.2.Q4_K_M.gguf | Q4_K_M | 4.58GB | | zebra-Llama-v0.2.Q4_1.gguf | Q4_1 | 4.78GB | | zebra-Llama-v0.2.Q5_0.gguf | Q5_0 | 5.21GB | | zebra-Llama-v0.2.Q5_K_S.gguf | Q5_K_S | 5.21GB | | zebra-Llama-v0.2.Q5_K.gguf | Q5_K | 5.34GB | | zebra-Llama-v0.2.Q5_K_M.gguf | Q5_K_M | 5.34GB | | zebra-Llama-v0.2.Q5_1.gguf | Q5_1 | 5.65GB | | zebra-Llama-v0.2.Q6_K.gguf | Q6_K | 6.14GB | | zebra-Llama-v0.2.Q8_0.gguf | Q8_0 | 7.95GB | Original model description: --- library_name: transformers tags: ---",
"quick_links": [],
"benchmark_table_html": "",
"readme_markdown": "Quantization made by Richard Erkhov.\n\n[Github](https://github.com/RichardErkhov)\n\n[Discord](https://discord.gg/pvy7H8DZMG)\n\n[Request more models](https://github.com/RichardErkhov/quant_request)\n\n\nzebra-Llama-v0.2 - GGUF\n- Model creator: https://huggingface.co/zebraLLAMA/\n- Original model: https://huggingface.co/zebraLLAMA/zebra-Llama-v0.2/\n\n\n| Name | Quant method | Size |\n| ---- | ---- | ---- |\n| [zebra-Llama-v0.2.Q2_K.gguf](https://huggingface.co/RichardErkhov/zebraLLAMA_-_zebra-Llama-v0.2-gguf/blob/main/zebra-Llama-v0.2.Q2_K.gguf) | Q2_K | 2.96GB |\n| [zebra-Llama-v0.2.IQ3_XS.gguf](https://huggingface.co/RichardErkhov/zebraLLAMA_-_zebra-Llama-v0.2-gguf/blob/main/zebra-Llama-v0.2.IQ3_XS.gguf) | IQ3_XS | 3.28GB |\n| [zebra-Llama-v0.2.IQ3_S.gguf](https://huggingface.co/RichardErkhov/zebraLLAMA_-_zebra-Llama-v0.2-gguf/blob/main/zebra-Llama-v0.2.IQ3_S.gguf) | IQ3_S | 3.43GB |\n| [zebra-Llama-v0.2.Q3_K_S.gguf](https://huggingface.co/RichardErkhov/zebraLLAMA_-_zebra-Llama-v0.2-gguf/blob/main/zebra-Llama-v0.2.Q3_K_S.gguf) | Q3_K_S | 3.41GB |\n| [zebra-Llama-v0.2.IQ3_M.gguf](https://huggingface.co/RichardErkhov/zebraLLAMA_-_zebra-Llama-v0.2-gguf/blob/main/zebra-Llama-v0.2.IQ3_M.gguf) | IQ3_M | 3.52GB |\n| [zebra-Llama-v0.2.Q3_K.gguf](https://huggingface.co/RichardErkhov/zebraLLAMA_-_zebra-Llama-v0.2-gguf/blob/main/zebra-Llama-v0.2.Q3_K.gguf) | Q3_K | 3.74GB |\n| [zebra-Llama-v0.2.Q3_K_M.gguf](https://huggingface.co/RichardErkhov/zebraLLAMA_-_zebra-Llama-v0.2-gguf/blob/main/zebra-Llama-v0.2.Q3_K_M.gguf) | Q3_K_M | 3.74GB |\n| [zebra-Llama-v0.2.Q3_K_L.gguf](https://huggingface.co/RichardErkhov/zebraLLAMA_-_zebra-Llama-v0.2-gguf/blob/main/zebra-Llama-v0.2.Q3_K_L.gguf) | Q3_K_L | 4.03GB |\n| [zebra-Llama-v0.2.IQ4_XS.gguf](https://huggingface.co/RichardErkhov/zebraLLAMA_-_zebra-Llama-v0.2-gguf/blob/main/zebra-Llama-v0.2.IQ4_XS.gguf) | IQ4_XS | 4.18GB |\n| 
[zebra-Llama-v0.2.Q4_0.gguf](https://huggingface.co/RichardErkhov/zebraLLAMA_-_zebra-Llama-v0.2-gguf/blob/main/zebra-Llama-v0.2.Q4_0.gguf) | Q4_0 | 4.34GB |\n| [zebra-Llama-v0.2.IQ4_NL.gguf](https://huggingface.co/RichardErkhov/zebraLLAMA_-_zebra-Llama-v0.2-gguf/blob/main/zebra-Llama-v0.2.IQ4_NL.gguf) | IQ4_NL | 4.38GB |\n| [zebra-Llama-v0.2.Q4_K_S.gguf](https://huggingface.co/RichardErkhov/zebraLLAMA_-_zebra-Llama-v0.2-gguf/blob/main/zebra-Llama-v0.2.Q4_K_S.gguf) | Q4_K_S | 4.37GB |\n| [zebra-Llama-v0.2.Q4_K.gguf](https://huggingface.co/RichardErkhov/zebraLLAMA_-_zebra-Llama-v0.2-gguf/blob/main/zebra-Llama-v0.2.Q4_K.gguf) | Q4_K | 4.58GB |\n| [zebra-Llama-v0.2.Q4_K_M.gguf](https://huggingface.co/RichardErkhov/zebraLLAMA_-_zebra-Llama-v0.2-gguf/blob/main/zebra-Llama-v0.2.Q4_K_M.gguf) | Q4_K_M | 4.58GB |\n| [zebra-Llama-v0.2.Q4_1.gguf](https://huggingface.co/RichardErkhov/zebraLLAMA_-_zebra-Llama-v0.2-gguf/blob/main/zebra-Llama-v0.2.Q4_1.gguf) | Q4_1 | 4.78GB |\n| [zebra-Llama-v0.2.Q5_0.gguf](https://huggingface.co/RichardErkhov/zebraLLAMA_-_zebra-Llama-v0.2-gguf/blob/main/zebra-Llama-v0.2.Q5_0.gguf) | Q5_0 | 5.21GB |\n| [zebra-Llama-v0.2.Q5_K_S.gguf](https://huggingface.co/RichardErkhov/zebraLLAMA_-_zebra-Llama-v0.2-gguf/blob/main/zebra-Llama-v0.2.Q5_K_S.gguf) | Q5_K_S | 5.21GB |\n| [zebra-Llama-v0.2.Q5_K.gguf](https://huggingface.co/RichardErkhov/zebraLLAMA_-_zebra-Llama-v0.2-gguf/blob/main/zebra-Llama-v0.2.Q5_K.gguf) | Q5_K | 5.34GB |\n| [zebra-Llama-v0.2.Q5_K_M.gguf](https://huggingface.co/RichardErkhov/zebraLLAMA_-_zebra-Llama-v0.2-gguf/blob/main/zebra-Llama-v0.2.Q5_K_M.gguf) | Q5_K_M | 5.34GB |\n| [zebra-Llama-v0.2.Q5_1.gguf](https://huggingface.co/RichardErkhov/zebraLLAMA_-_zebra-Llama-v0.2-gguf/blob/main/zebra-Llama-v0.2.Q5_1.gguf) | Q5_1 | 5.65GB |\n| [zebra-Llama-v0.2.Q6_K.gguf](https://huggingface.co/RichardErkhov/zebraLLAMA_-_zebra-Llama-v0.2-gguf/blob/main/zebra-Llama-v0.2.Q6_K.gguf) | Q6_K | 6.14GB |\n| 
[zebra-Llama-v0.2.Q8_0.gguf](https://huggingface.co/RichardErkhov/zebraLLAMA_-_zebra-Llama-v0.2-gguf/blob/main/zebra-Llama-v0.2.Q8_0.gguf) | Q8_0 | 7.95GB |\n\n\n\n\nOriginal model description:\n---\nlibrary_name: transformers\ntags:\n- biology\n- medical\n---\n\n## zebra-Llama/zebra-Llama-v0.2\n\nZebra-Llama v0.2 is a specialized version of the Llama-3.1-8b-instruct model, fine-tuned with data specific to the rare disease Ehlers-Danlos Syndrome (EDS) - a rare connective tissue disorder. We utilized textual information from over 4,000 EDS papers from PubMed, more than 8,000 Reddit posts about EDS, and over 5,000 posts from the Inspire forum to gather real-world concerns/questions related to EDS, which were used to fine-tune the model. As a result, this model is adept at providing accurate responses to questions regarding EDS.\n\nThe model is trained using a specialized approach called \"context-aware training,\" where we provided context for each question from a custom vector database during the training phase. This approach enabled the model to demonstrate high precision and recall during the inference phase when utilizing the RAG context. Additionally, the model showed a higher likelihood of generating correct citations compared to the base model.\n\n\n## What is new in this version of zebra-Llama?\n- Compared to the previous version (zebraLLAMA/zebra-Llama-v0.1), the latest Zebra-Llama model delivers more comprehensive and in-depth explanations for questions about the rare disease Ehlers-Danlos Syndrome.\n\n- The latest version has a greater ability to provide citations consistently compared to the previous version. 
\n\n- In addition to improved citation ability, it has also been benchmarked against the base model (meta-llama/Llama-3.1-8B-Instruct) and demonstrates superior text generation capabilities in terms of thoroughness, accuracy, and clarity, based on expert evaluation.\n\n- From a modeling perspective, the latest version utilizes \"meta-llama/Llama-3.1-8B-Instruct\" as its base model, while the earlier version (v0.1) was built on \"meta-llama/Meta-Llama-3-8B-Instruct\".\n\n## Model Details\n\nBase model : meta-llama/Llama-3.1-8B-Instruct\n\n\n<img src=\"https://cdn-uploads.huggingface.co/production/uploads/6515dc0cca07b261439e8f0d/QB1jsyiNpiolGRFy6AItA.png\" alt=\"Model Diagram\" width=\"1000\"/>\n\n### Model Sources\n\n**Repository:** https://github.com/karthiksoman/zebra-Llama\n\n**Custom built RAG API for rare diseases (focused on EDS):**\n\n• Base URL: https://zebra-llama-rag.onrender.com\n\n• Endpoint: /search\n\n**Jupyter Notebook Demo of Zebra-Llama:**\n\nhttps://github.com/karthiksoman/zebra-Llama/blob/main/code/notebook/zebra_llama_v0.2_demo.ipynb\n\n## Uses\n\nZebra-Llama can be used to generate answers related to EDS questions.\n\n### Out-of-Scope Use\n\nThis Language Model is intended for academic and research purposes only. It is not for clinical use or medical decision-making. 
Consult a healthcare professional for medical advice.\n\n## Training Details\n\nFine tuning method : LoRA\n\nLoRA rank : 16\n\nLoRA alpha : 16\n\nLORA dropout : 0.01\n\nLORA target modules : [\"q_proj\", \"k_proj\", \"v_proj\"]\n\nTrain epochs : 2\n\nLearning rate : 1e-4\n\nLR scheduler type : constant\n\nMax grad norm : 1\n\nBATCH_SIZE_PER_GPU_FOR_TRAINING : 2\n\nGRADIENT_ACCUMULATION_STEPS : 1\n\n## Citation\n```\n@misc{soman2024zebrallamacontextawarelargelanguage,\n title={Zebra-Llama: A Context-Aware Large Language Model for Democratizing Rare Disease Knowledge}, \n author={Karthik Soman and Andrew Langdon and Catalina Villouta and Chinmay Agrawal and Lashaw Salta and Braian Peetoom and Gianmarco Bellucci and Orion J Buske},\n year={2024},\n eprint={2411.02657},\n archivePrefix={arXiv},\n primaryClass={cs.CL},\n url={https://arxiv.org/abs/2411.02657}, \n}\n```\n\n## Contact\n\nDr. Karthik Soman - karthi.soman@gmail.com\n\nAndrew Langdon - andrewlngdn@gmail.com\n\nChinmay Agrawal - chag7212@colorado.edu\n\nCatalina Villouta - catalina.villouta.r@gmail.com\n\nDr. Orion Buske - orion@phenotips.com\n\nLashaw Salta - lashawsalta@gmail.com\n\n",
"related_quantizations": []
},
"tags": [
"gguf",
"arxiv:2411.02657",
"endpoints_compatible",
"region:us",
"conversational"
],
"likes": 0,
"downloads": 168,
"gated": false,
"private": false,
"last_modified": "2025-06-08T02:53:42.000Z",
"created_at": "2025-06-08T01:38:55.000Z",
"pipeline_tag": "",
"library_name": ""
}
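The Training Details in the README text above list LoRA rank 16, alpha 16, and target modules `q_proj`, `k_proj`, `v_proj`. As a rough illustration of what those settings imply, the sketch below estimates the number of trainable LoRA parameters. The layer shapes (hidden size 4096, key/value projection width 1024, 32 layers) are assumptions about the Llama-3.1-8B-Instruct base model and are not stated in the model card; the `alpha / rank` scaling is the convention from the standard LoRA formulation, which the authors may or may not have used unchanged.

```python
# Values taken from the "Training Details" section of the README.
LORA_RANK = 16
LORA_ALPHA = 16
BATCH_SIZE_PER_GPU = 2
GRADIENT_ACCUMULATION_STEPS = 1

# ASSUMPTION: Llama-3.1-8B-Instruct shapes, not stated in the model card.
HIDDEN_SIZE = 4096   # model width
KV_DIM = 1024        # 8 key/value heads x 128 head dimension
NUM_LAYERS = 32

# (input features, output features) for each targeted projection.
TARGET_SHAPES = {
    "q_proj": (HIDDEN_SIZE, HIDDEN_SIZE),
    "k_proj": (HIDDEN_SIZE, KV_DIM),
    "v_proj": (HIDDEN_SIZE, KV_DIM),
}


def lora_param_count(in_features: int, out_features: int, rank: int) -> int:
    """Parameters in one LoRA adapter: A is (rank x in), B is (out x rank)."""
    return rank * (in_features + out_features)


per_layer = sum(lora_param_count(i, o, LORA_RANK)
                for i, o in TARGET_SHAPES.values())
total_trainable = per_layer * NUM_LAYERS

# Standard LoRA scales the low-rank update by alpha / rank.
scaling = LORA_ALPHA / LORA_RANK

# Samples contributing to each optimizer step on a single GPU.
effective_batch_per_gpu = BATCH_SIZE_PER_GPU * GRADIENT_ACCUMULATION_STEPS

if __name__ == "__main__":
    print(f"trainable LoRA parameters: {total_trainable:,}")
    print(f"scaling factor: {scaling}")
    print(f"effective batch size per GPU: {effective_batch_per_gpu}")
```

Under these assumed shapes the adapters add roughly 9.4 million trainable parameters, a small fraction of the 8B-parameter base model, and the scaling factor of 1.0 means the low-rank update is applied without extra amplification.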
Source payload excerpt (from Hugging Face API)
{
"_id": "6844e9af14c7a1b16f5d1ad9",
"id": "RichardErkhov/zebraLLAMA_-_zebra-Llama-v0.2-gguf",
"modelId": "RichardErkhov/zebraLLAMA_-_zebra-Llama-v0.2-gguf",
"sha": "062fdf83f5cfafb6c3cbfea8cf7498cd0562d701",
"createdAt": "2025-06-08T01:38:55.000Z",
"lastModified": "2025-06-08T02:53:42.000Z",
"author": "RichardErkhov",
"downloads": 168,
"likes": 0,
"gated": false,
"private": false,
"pipeline_tag": "",
"library_name": "",
"siblings_count": 24
}
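The payload fields above are plain JSON values, so they can be inspected with the standard library alone. This is a minimal sketch that parses the two timestamps and reports how long after creation the repository was last modified; the helper name `parse_hf_timestamp` is hypothetical, and the timestamp format is inferred from the two values shown rather than from any API documentation.

```python
from datetime import datetime, timedelta, timezone

# Subset of the payload excerpt above, copied verbatim.
payload = {
    "id": "RichardErkhov/zebraLLAMA_-_zebra-Llama-v0.2-gguf",
    "createdAt": "2025-06-08T01:38:55.000Z",
    "lastModified": "2025-06-08T02:53:42.000Z",
    "downloads": 168,
    "siblings_count": 24,
}


def parse_hf_timestamp(value: str) -> datetime:
    """Parse a 'YYYY-MM-DDTHH:MM:SS.fffZ' string as a UTC datetime.

    The trailing 'Z' is matched literally, then UTC is attached explicitly.
    """
    parsed = datetime.strptime(value, "%Y-%m-%dT%H:%M:%S.%fZ")
    return parsed.replace(tzinfo=timezone.utc)


created = parse_hf_timestamp(payload["createdAt"])
modified = parse_hf_timestamp(payload["lastModified"])
elapsed: timedelta = modified - created

if __name__ == "__main__":
    print(f"{payload['id']} was last modified {elapsed} after creation")
```

Using `strptime` with an explicit format avoids relying on `datetime.fromisoformat` accepting the `Z` suffix, which depends on the Python version.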