GraySoft
Model Intelligence Sheet

richarderkhov/zebrallama_-_zebra-llama-v0.2-gguf overview

Quantization made by Richard Erkhov.

[Github](https://github.com/RichardErkhov) | [Discord](https://discord.gg/pvy7H8DZMG) | [Request more models](https://github.com/RichardErkhov/quant_request)

zebra-Llama-v0.2 - GGUF

| Name | Quant method | Size |
| ---- | ---- | ---- |
| zebra-Llama-v0.2.Q2_K.gguf | Q2_K | 2.96GB |
| zebra-Llama-v0.2.IQ3_XS.gguf | IQ3_XS | 3.28GB |
| zebra-Llama-v0.2.IQ3_S.gguf | IQ3_S | 3.43GB |
| zebra-Llama-v0.2.Q3_K_S.gguf | Q3_K_S | 3.41GB |
| zebra-Llama-v0.2.IQ3_M.gguf | IQ3_M | 3.52GB |
| zebra-Llama-v0.2.Q3_K.gguf | Q3_K | 3.74GB |
| zebra-Llama-v0.2.Q3_K_M.gguf | Q3_K_M | 3.74GB |
| zebra-Llama-v0.2.Q3_K_L.gguf | Q3_K_L | 4.03GB |
| zebra-Llama-v0.2.IQ4_XS.gguf | IQ4_XS | 4.18GB |
| zebra-Llama-v0.2.Q4_0.gguf | Q4_0 | 4.34GB |
| zebra-Llama-v0.2.IQ4_NL.gguf | IQ4_NL | 4.38GB |
| zebra-Llama-v0.2.Q4_K_S.gguf | Q4_K_S | 4.37GB |
| zebra-Llama-v0.2.Q4_K.gguf | Q4_K | 4.58GB |
| zebra-Llama-v0.2.Q4_K_M.gguf | Q4_K_M | 4.58GB |
| zebra-Llama-v0.2.Q4_1.gguf | Q4_1 | 4.78GB |
| zebra-Llama-v0.2.Q5_0.gguf | Q5_0 | 5.21GB |
| zebra-Llama-v0.2.Q5_K_S.gguf | Q5_K_S | 5.21GB |
| zebra-Llama-v0.2.Q5_K.gguf | Q5_K | 5.34GB |
| zebra-Llama-v0.2.Q5_K_M.gguf | Q5_K_M | 5.34GB |
| zebra-Llama-v0.2.Q5_1.gguf | Q5_1 | 5.65GB |
| zebra-Llama-v0.2.Q6_K.gguf | Q6_K | 6.14GB |
| zebra-Llama-v0.2.Q8_0.gguf | Q8_0 | 7.95GB |

Original model description:
---
library_name: transformers
tags:
- biology
- medical
---

Tags: gguf, arxiv:2411.02657, endpoints_compatible, region:us, conversational
richarderkhov/zebrallama_-_zebra-llama-v0.2-gguf visual
Downloads: 168
Likes: 0
Pipeline: (not set)
Library: (not set)
Visibility: Public
Access: Open

Repository Files & Downloads

22 files detected
Direct downloads for all repository files
| File | Type | Quantization | Size |
| ---- | ---- | ---- | ---- |
| zebra-Llama-v0.2.IQ3_M.gguf | GGUF | IQ3_M | 3.52 GB |
| zebra-Llama-v0.2.IQ3_S.gguf | GGUF | IQ3_S | 3.43 GB |
| zebra-Llama-v0.2.IQ3_XS.gguf | GGUF | IQ3_XS | 3.28 GB |
| zebra-Llama-v0.2.IQ4_NL.gguf | GGUF | IQ4_NL | 4.38 GB |
| zebra-Llama-v0.2.IQ4_XS.gguf | GGUF | IQ4_XS | 4.18 GB |
| zebra-Llama-v0.2.Q2_K.gguf | GGUF | Q2_K | 2.96 GB |
| zebra-Llama-v0.2.Q3_K.gguf | GGUF | Q3_K | 3.74 GB |
| zebra-Llama-v0.2.Q3_K_L.gguf | GGUF | Q3_K_L | 4.03 GB |
| zebra-Llama-v0.2.Q3_K_M.gguf | GGUF | Q3_K_M | 3.74 GB |
| zebra-Llama-v0.2.Q3_K_S.gguf | GGUF | Q3_K_S | 3.41 GB |
| zebra-Llama-v0.2.Q4_0.gguf | GGUF | Q4_0 | 4.34 GB |
| zebra-Llama-v0.2.Q4_1.gguf | GGUF | Q4_1 | 4.78 GB |
| zebra-Llama-v0.2.Q4_K.gguf | GGUF | Q4_K | 4.58 GB |
| zebra-Llama-v0.2.Q4_K_M.gguf | GGUF | Q4_K_M | 4.58 GB |
| zebra-Llama-v0.2.Q4_K_S.gguf | GGUF | Q4_K_S | 4.37 GB |
| zebra-Llama-v0.2.Q5_0.gguf | GGUF | Q5_0 | 5.21 GB |
| zebra-Llama-v0.2.Q5_1.gguf | GGUF | Q5_1 | 5.65 GB |
| zebra-Llama-v0.2.Q5_K.gguf | GGUF | Q5_K | 5.34 GB |
| zebra-Llama-v0.2.Q5_K_M.gguf | GGUF | Q5_K_M | 5.34 GB |
| zebra-Llama-v0.2.Q5_K_S.gguf | GGUF | Q5_K_S | 5.21 GB |
| zebra-Llama-v0.2.Q6_K.gguf | GGUF | Q6_K | 6.14 GB |
| zebra-Llama-v0.2.Q8_0.gguf | GGUF | Q8_0 | 7.95 GB |
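
As a rough illustration of how the file list above might be used, the sketch below picks the largest quantization whose file size fits a given budget and builds a download URL for it. The sizes are copied from the table. The helper names (`largest_quant_within`, `download_url`) are hypothetical, and the `/resolve/main/` URL pattern is an assumption based on the usual Hugging Face direct-download convention; this page itself only shows `/blob/main/` links. File size is only a lower bound on memory use, since the context cache and runtime overhead come on top.

```python
# Sizes in GB, copied from the repository file table above.
QUANT_SIZES_GB = {
    "IQ3_M": 3.52, "IQ3_S": 3.43, "IQ3_XS": 3.28, "IQ4_NL": 4.38,
    "IQ4_XS": 4.18, "Q2_K": 2.96, "Q3_K": 3.74, "Q3_K_L": 4.03,
    "Q3_K_M": 3.74, "Q3_K_S": 3.41, "Q4_0": 4.34, "Q4_1": 4.78,
    "Q4_K": 4.58, "Q4_K_M": 4.58, "Q4_K_S": 4.37, "Q5_0": 5.21,
    "Q5_1": 5.65, "Q5_K": 5.34, "Q5_K_M": 5.34, "Q5_K_S": 5.21,
    "Q6_K": 6.14, "Q8_0": 7.95,
}

REPO_ID = "RichardErkhov/zebraLLAMA_-_zebra-Llama-v0.2-gguf"


def largest_quant_within(budget_gb, sizes=QUANT_SIZES_GB):
    """Return the quantization with the largest file not exceeding budget_gb.

    Returns None when nothing fits. Size ties are broken alphabetically so
    the result is deterministic.
    """
    fitting = {name: size for name, size in sizes.items() if size <= budget_gb}
    if not fitting:
        return None
    # max() keeps the first maximal element, so iterating over sorted names
    # resolves ties in favour of the alphabetically first one.
    return max(sorted(fitting), key=fitting.get)


def download_url(quant, repo_id=REPO_ID):
    """Build a direct-download URL (ASSUMPTION: the /resolve/main/ pattern)."""
    filename = f"zebra-Llama-v0.2.{quant}.gguf"
    return f"https://huggingface.co/{repo_id}/resolve/main/{filename}"


if __name__ == "__main__":
    choice = largest_quant_within(5.0)
    print(choice, download_url(choice))
```

Picking by file size alone is a deliberately crude heuristic; in practice the choice between, say, an IQ4 and a Q4_K variant of similar size also depends on the runtime and hardware.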

Model Details

Model Slug: richarderkhov/zebrallama_-_zebra-llama-v0.2-gguf
Author: RichardErkhov
Pipeline Task: (not set)
Library: (not set)
Created: 2025-06-08
Last Modified: 2025-06-08
Gated: No
Private: No
HF SHA: 062fdf83f5cfafb6c3cbfea8cf7498cd0562d701
License: Unknown
Language: Unknown
Base Model: Unknown

Metadata Inspector

Normalized metadata (stored in metadata_json)
{
  "metadata": {},
  "card_data": {
    "frontmatter": {},
    "hero_image_url": "https://cdn-uploads.huggingface.co/production/uploads/6515dc0cca07b261439e8f0d/QB1jsyiNpiolGRFy6AItA.png",
    "summary": "Quantization made by Richard Erkhov. Github Discord Request more models zebra-Llama-v0.2 - GGUF | Name | Quant method | Size | | ---- | ---- | ---- | | zebra-Llama-v0.2.Q2_K.gguf | Q2_K | 2.96GB | | zebra-Llama-v0.2.IQ3_XS.gguf | IQ3_XS | 3.28GB | | zebra-Llama-v0.2.IQ3_S.gguf | IQ3_S | 3.43GB | | zebra-Llama-v0.2.Q3_K_S.gguf | Q3_K_S | 3.41GB | | zebra-Llama-v0.2.IQ3_M.gguf | IQ3_M | 3.52GB | | zebra-Llama-v0.2.Q3_K.gguf | Q3_K | 3.74GB | | zebra-Llama-v0.2.Q3_K_M.gguf | Q3_K_M | 3.74GB | | zebra-Llama-v0.2.Q3_K_L.gguf | Q3_K_L | 4.03GB | | zebra-Llama-v0.2.IQ4_XS.gguf | IQ4_XS | 4.18GB | | zebra-Llama-v0.2.Q4_0.gguf | Q4_0 | 4.34GB | | zebra-Llama-v0.2.IQ4_NL.gguf | IQ4_NL | 4.38GB | | zebra-Llama-v0.2.Q4_K_S.gguf | Q4_K_S | 4.37GB | | zebra-Llama-v0.2.Q4_K.gguf | Q4_K | 4.58GB | | zebra-Llama-v0.2.Q4_K_M.gguf | Q4_K_M | 4.58GB | | zebra-Llama-v0.2.Q4_1.gguf | Q4_1 | 4.78GB | | zebra-Llama-v0.2.Q5_0.gguf | Q5_0 | 5.21GB | | zebra-Llama-v0.2.Q5_K_S.gguf | Q5_K_S | 5.21GB | | zebra-Llama-v0.2.Q5_K.gguf | Q5_K | 5.34GB | | zebra-Llama-v0.2.Q5_K_M.gguf | Q5_K_M | 5.34GB | | zebra-Llama-v0.2.Q5_1.gguf | Q5_1 | 5.65GB | | zebra-Llama-v0.2.Q6_K.gguf | Q6_K | 6.14GB | | zebra-Llama-v0.2.Q8_0.gguf | Q8_0 | 7.95GB | Original model description: --- library_name: transformers tags: ---",
    "quick_links": [],
    "benchmark_table_html": "",
    "readme_markdown": "Quantization made by Richard Erkhov.\n\n[Github](https://github.com/RichardErkhov)\n\n[Discord](https://discord.gg/pvy7H8DZMG)\n\n[Request more models](https://github.com/RichardErkhov/quant_request)\n\n\nzebra-Llama-v0.2 - GGUF\n- Model creator: https://huggingface.co/zebraLLAMA/\n- Original model: https://huggingface.co/zebraLLAMA/zebra-Llama-v0.2/\n\n\n| Name | Quant method | Size |\n| ---- | ---- | ---- |\n| [zebra-Llama-v0.2.Q2_K.gguf](https://huggingface.co/RichardErkhov/zebraLLAMA_-_zebra-Llama-v0.2-gguf/blob/main/zebra-Llama-v0.2.Q2_K.gguf) | Q2_K | 2.96GB |\n| [zebra-Llama-v0.2.IQ3_XS.gguf](https://huggingface.co/RichardErkhov/zebraLLAMA_-_zebra-Llama-v0.2-gguf/blob/main/zebra-Llama-v0.2.IQ3_XS.gguf) | IQ3_XS | 3.28GB |\n| [zebra-Llama-v0.2.IQ3_S.gguf](https://huggingface.co/RichardErkhov/zebraLLAMA_-_zebra-Llama-v0.2-gguf/blob/main/zebra-Llama-v0.2.IQ3_S.gguf) | IQ3_S | 3.43GB |\n| [zebra-Llama-v0.2.Q3_K_S.gguf](https://huggingface.co/RichardErkhov/zebraLLAMA_-_zebra-Llama-v0.2-gguf/blob/main/zebra-Llama-v0.2.Q3_K_S.gguf) | Q3_K_S | 3.41GB |\n| [zebra-Llama-v0.2.IQ3_M.gguf](https://huggingface.co/RichardErkhov/zebraLLAMA_-_zebra-Llama-v0.2-gguf/blob/main/zebra-Llama-v0.2.IQ3_M.gguf) | IQ3_M | 3.52GB |\n| [zebra-Llama-v0.2.Q3_K.gguf](https://huggingface.co/RichardErkhov/zebraLLAMA_-_zebra-Llama-v0.2-gguf/blob/main/zebra-Llama-v0.2.Q3_K.gguf) | Q3_K | 3.74GB |\n| [zebra-Llama-v0.2.Q3_K_M.gguf](https://huggingface.co/RichardErkhov/zebraLLAMA_-_zebra-Llama-v0.2-gguf/blob/main/zebra-Llama-v0.2.Q3_K_M.gguf) | Q3_K_M | 3.74GB |\n| [zebra-Llama-v0.2.Q3_K_L.gguf](https://huggingface.co/RichardErkhov/zebraLLAMA_-_zebra-Llama-v0.2-gguf/blob/main/zebra-Llama-v0.2.Q3_K_L.gguf) | Q3_K_L | 4.03GB |\n| [zebra-Llama-v0.2.IQ4_XS.gguf](https://huggingface.co/RichardErkhov/zebraLLAMA_-_zebra-Llama-v0.2-gguf/blob/main/zebra-Llama-v0.2.IQ4_XS.gguf) | IQ4_XS | 4.18GB |\n| 
[zebra-Llama-v0.2.Q4_0.gguf](https://huggingface.co/RichardErkhov/zebraLLAMA_-_zebra-Llama-v0.2-gguf/blob/main/zebra-Llama-v0.2.Q4_0.gguf) | Q4_0 | 4.34GB |\n| [zebra-Llama-v0.2.IQ4_NL.gguf](https://huggingface.co/RichardErkhov/zebraLLAMA_-_zebra-Llama-v0.2-gguf/blob/main/zebra-Llama-v0.2.IQ4_NL.gguf) | IQ4_NL | 4.38GB |\n| [zebra-Llama-v0.2.Q4_K_S.gguf](https://huggingface.co/RichardErkhov/zebraLLAMA_-_zebra-Llama-v0.2-gguf/blob/main/zebra-Llama-v0.2.Q4_K_S.gguf) | Q4_K_S | 4.37GB |\n| [zebra-Llama-v0.2.Q4_K.gguf](https://huggingface.co/RichardErkhov/zebraLLAMA_-_zebra-Llama-v0.2-gguf/blob/main/zebra-Llama-v0.2.Q4_K.gguf) | Q4_K | 4.58GB |\n| [zebra-Llama-v0.2.Q4_K_M.gguf](https://huggingface.co/RichardErkhov/zebraLLAMA_-_zebra-Llama-v0.2-gguf/blob/main/zebra-Llama-v0.2.Q4_K_M.gguf) | Q4_K_M | 4.58GB |\n| [zebra-Llama-v0.2.Q4_1.gguf](https://huggingface.co/RichardErkhov/zebraLLAMA_-_zebra-Llama-v0.2-gguf/blob/main/zebra-Llama-v0.2.Q4_1.gguf) | Q4_1 | 4.78GB |\n| [zebra-Llama-v0.2.Q5_0.gguf](https://huggingface.co/RichardErkhov/zebraLLAMA_-_zebra-Llama-v0.2-gguf/blob/main/zebra-Llama-v0.2.Q5_0.gguf) | Q5_0 | 5.21GB |\n| [zebra-Llama-v0.2.Q5_K_S.gguf](https://huggingface.co/RichardErkhov/zebraLLAMA_-_zebra-Llama-v0.2-gguf/blob/main/zebra-Llama-v0.2.Q5_K_S.gguf) | Q5_K_S | 5.21GB |\n| [zebra-Llama-v0.2.Q5_K.gguf](https://huggingface.co/RichardErkhov/zebraLLAMA_-_zebra-Llama-v0.2-gguf/blob/main/zebra-Llama-v0.2.Q5_K.gguf) | Q5_K | 5.34GB |\n| [zebra-Llama-v0.2.Q5_K_M.gguf](https://huggingface.co/RichardErkhov/zebraLLAMA_-_zebra-Llama-v0.2-gguf/blob/main/zebra-Llama-v0.2.Q5_K_M.gguf) | Q5_K_M | 5.34GB |\n| [zebra-Llama-v0.2.Q5_1.gguf](https://huggingface.co/RichardErkhov/zebraLLAMA_-_zebra-Llama-v0.2-gguf/blob/main/zebra-Llama-v0.2.Q5_1.gguf) | Q5_1 | 5.65GB |\n| [zebra-Llama-v0.2.Q6_K.gguf](https://huggingface.co/RichardErkhov/zebraLLAMA_-_zebra-Llama-v0.2-gguf/blob/main/zebra-Llama-v0.2.Q6_K.gguf) | Q6_K | 6.14GB |\n| 
[zebra-Llama-v0.2.Q8_0.gguf](https://huggingface.co/RichardErkhov/zebraLLAMA_-_zebra-Llama-v0.2-gguf/blob/main/zebra-Llama-v0.2.Q8_0.gguf) | Q8_0 | 7.95GB |\n\n\n\n\nOriginal model description:\n---\nlibrary_name: transformers\ntags:\n- biology\n- medical\n---\n\n## zebra-Llama/zebra-Llama-v0.2\n\nZebra-Llama v0.2 is a specialized version of the Llama-3.1-8b-instruct model, fine-tuned with data specific to the rare disease Ehlers-Danlos Syndrome (EDS) - a rare connective tissue disorder. We utilized textual information from over 4,000 EDS papers from PubMed, more than 8,000 Reddit posts about EDS, and over 5,000 posts from the Inspire forum to gather real-world concerns/questions related to EDS, which were used to fine-tune the model. As a result, this model is adept at providing accurate responses to questions regarding EDS.\n\nThe model is trained using a specialized approach called \"context-aware training,\" where we provided context for each question from a custom vector database during the training phase. This approach enabled the model to demonstrate high precision and recall during the inference phase when utilizing the RAG context. Additionally, the model showed a higher likelihood of generating correct citations compared to the base model.\n\n\n## What is new in this version of zebra-Llama?\n- Compared to the previous version (zebraLLAMA/zebra-Llama-v0.1), the latest Zebra-Llama model delivers more comprehensive and in-depth explanations for questions about the rare disease Ehlers-Danlos Syndrome.\n\n- The latest version has a greater ability to provide citations consistently compared to the previous version. 
\n\n- In addition to improved citation ability, it has also been benchmarked against the base model (meta-llama/Llama-3.1-8B-Instruct) and demonstrates superior text generation capabilities in terms of thoroughness, accuracy, and clarity, based on expert evaluation.\n\n- From a modeling perspective, the latest version utilizes \"meta-llama/Llama-3.1-8B-Instruct\" as its base model, while the earlier version (v0.1) was built on \"meta-llama/Meta-Llama-3-8B-Instruct\".\n\n## Model Details\n\nBase model : meta-llama/Llama-3.1-8B-Instruct\n\n\n<img src=\"https://cdn-uploads.huggingface.co/production/uploads/6515dc0cca07b261439e8f0d/QB1jsyiNpiolGRFy6AItA.png\" alt=\"Model Diagram\" width=\"1000\"/>\n\n### Model Sources\n\n**Repository:** https://github.com/karthiksoman/zebra-Llama\n\n**Custom built RAG API for rare diseases (focused on EDS):**\n\n• Base URL: https://zebra-llama-rag.onrender.com\n\n• Endpoint: /search\n\n**Jupyter Notebook Demo of Zebra-Llama:**\n\nhttps://github.com/karthiksoman/zebra-Llama/blob/main/code/notebook/zebra_llama_v0.2_demo.ipynb\n\n## Uses\n\nZebra-Llama can be used to generate answers related to EDS questions.\n\n### Out-of-Scope Use\n\nThis Language Model is intended for academic and research purposes only. It is not for clinical use or medical decision-making. 
Consult a healthcare professional for medical advice.\n\n## Training Details\n\nFine tuning method : LoRA\n\nLoRA rank : 16\n\nLoRA alpha : 16\n\nLORA dropout : 0.01\n\nLORA target modules : [\"q_proj\", \"k_proj\", \"v_proj\"]\n\nTrain epochs : 2\n\nLearning rate : 1e-4\n\nLR scheduler type : constant\n\nMax grad norm : 1\n\nBATCH_SIZE_PER_GPU_FOR_TRAINING : 2\n\nGRADIENT_ACCUMULATION_STEPS : 1\n\n## Citation\n```\n@misc{soman2024zebrallamacontextawarelargelanguage,\n      title={Zebra-Llama: A Context-Aware Large Language Model for Democratizing Rare Disease Knowledge}, \n      author={Karthik Soman and Andrew Langdon and Catalina Villouta and Chinmay Agrawal and Lashaw Salta and Braian Peetoom and Gianmarco Bellucci and Orion J Buske},\n      year={2024},\n      eprint={2411.02657},\n      archivePrefix={arXiv},\n      primaryClass={cs.CL},\n      url={https://arxiv.org/abs/2411.02657}, \n}\n```\n\n## Contact\n\nDr. Karthik Soman - karthi.soman@gmail.com\n\nAndrew Langdon - andrewlngdn@gmail.com\n\nChinmay Agrawal - chag7212@colorado.edu\n\nCatalina Villouta - catalina.villouta.r@gmail.com\n\nDr. Orion Buske - orion@phenotips.com\n\nLashaw Salta - lashawsalta@gmail.com\n\n",
    "related_quantizations": []
  },
  "tags": [
    "gguf",
    "arxiv:2411.02657",
    "endpoints_compatible",
    "region:us",
    "conversational"
  ],
  "likes": 0,
  "downloads": 168,
  "gated": false,
  "private": false,
  "last_modified": "2025-06-08T02:53:42.000Z",
  "created_at": "2025-06-08T01:38:55.000Z",
  "pipeline_tag": "",
  "library_name": ""
}
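
The README embedded in the metadata above lists the LoRA settings used for fine-tuning (rank 16, alpha 16, target modules `q_proj`, `k_proj`, `v_proj`). As a back-of-the-envelope sketch, the snippet below estimates how many trainable adapter parameters those settings imply. The layer shapes are assumptions taken from the public Llama-3.1-8B-Instruct configuration (hidden size 4096, 8 key/value heads of dimension 128, 32 layers); they are not stated on this page. The `alpha / rank` scaling factor is the standard LoRA convention and is likewise assumed rather than confirmed by the source.

```python
# LoRA settings as listed under "Training Details" in the embedded README.
LORA_RANK = 16
LORA_ALPHA = 16
TARGET_MODULES = ["q_proj", "k_proj", "v_proj"]

# ASSUMPTION: shapes from the public Llama-3.1-8B-Instruct config,
# not from this page.
HIDDEN_SIZE = 4096
KV_DIM = 8 * 128  # 8 key/value heads x head dimension 128
NUM_LAYERS = 32

# (input features, output features) of each targeted projection.
MODULE_SHAPES = {
    "q_proj": (HIDDEN_SIZE, HIDDEN_SIZE),
    "k_proj": (HIDDEN_SIZE, KV_DIM),
    "v_proj": (HIDDEN_SIZE, KV_DIM),
}


def lora_params(in_features, out_features, rank):
    """Parameters in one LoRA pair: A is (rank x in), B is (out x rank)."""
    return rank * (in_features + out_features)


def total_lora_params(rank=LORA_RANK, modules=TARGET_MODULES):
    """Adapter parameters summed over targeted modules and all layers."""
    per_layer = sum(lora_params(*MODULE_SHAPES[name], rank) for name in modules)
    return per_layer * NUM_LAYERS


# Standard LoRA scaling applied to the low-rank update (assumed convention).
LORA_SCALING = LORA_ALPHA / LORA_RANK

if __name__ == "__main__":
    print(f"trainable adapter parameters: {total_lora_params():,}")
    print(f"scaling factor: {LORA_SCALING}")
```

Under these assumptions the adapters amount to roughly 9.4 million parameters, a small fraction of the 8B-parameter base model.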
Source payload excerpt (from Hugging Face API)
{
  "_id": "6844e9af14c7a1b16f5d1ad9",
  "id": "RichardErkhov/zebraLLAMA_-_zebra-Llama-v0.2-gguf",
  "modelId": "RichardErkhov/zebraLLAMA_-_zebra-Llama-v0.2-gguf",
  "sha": "062fdf83f5cfafb6c3cbfea8cf7498cd0562d701",
  "createdAt": "2025-06-08T01:38:55.000Z",
  "lastModified": "2025-06-08T02:53:42.000Z",
  "author": "RichardErkhov",
  "downloads": 168,
  "likes": 0,
  "gated": false,
  "private": false,
  "pipeline_tag": "",
  "library_name": "",
  "siblings_count": 24
}
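
For readers who want to work with this payload programmatically, here is a minimal sketch using only the Python standard library. It parses a subset of the excerpt above, derives the lowercase slug shown under "Model Slug", and computes the time between creation and last modification. The function names are hypothetical, and the timestamp format string is inferred from the two values shown here rather than from any API documentation.

```python
import json
from datetime import datetime

# Subset of the payload excerpt above, copied verbatim.
PAYLOAD_JSON = """
{
  "id": "RichardErkhov/zebraLLAMA_-_zebra-Llama-v0.2-gguf",
  "createdAt": "2025-06-08T01:38:55.000Z",
  "lastModified": "2025-06-08T02:53:42.000Z",
  "downloads": 168,
  "siblings_count": 24
}
"""

# ASSUMPTION: format inferred from the timestamps in this excerpt.
TIMESTAMP_FORMAT = "%Y-%m-%dT%H:%M:%S.%fZ"


def parse_timestamp(value):
    """Parse a timestamp such as '2025-06-08T01:38:55.000Z'."""
    return datetime.strptime(value, TIMESTAMP_FORMAT)


def model_slug(payload):
    """Lowercase the repository id, matching the slug shown on this page."""
    return payload["id"].lower()


def seconds_between_create_and_update(payload):
    """Elapsed seconds from createdAt to lastModified."""
    created = parse_timestamp(payload["createdAt"])
    modified = parse_timestamp(payload["lastModified"])
    return (modified - created).total_seconds()


if __name__ == "__main__":
    payload = json.loads(PAYLOAD_JSON)
    print(model_slug(payload))
    print(seconds_between_create_and_update(payload))
```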