lewdiculous/sovl_llama3_8b-gguf-iq-imatrix (Q6_K GGUF) is indexed on GraySoft with repository links, GGUF quant files, and Hugging Face metadata. Use this page to pick a local model for guIDE or other runtimes. See related models in the same shard below.
lewdiculous/sovl_llama3_8b-gguf-iq-imatrix overview
GGUF-IQ-Imatrix quants for @jeiku's ResplendentAI/SOVL_Llama3_8B (https://huggingface.co/ResplendentAI/SOVL_Llama3_8B). Give them some love! Updated: these quants have been redone with the fixes from llama.cpp/pull/6920 (https://github.com/ggerganov/llama.cpp/pull/6920) in mind. Use KoboldCpp version 1.64 or higher; the latest version is recommended. Well...! Turns out it was not just a hallucination and this model actually is pretty cool, so give it a chance! For 8GB VRAM GPUs, I recommend the Q4_K_M-imat quant for context sizes up to 12288. Use the provided presets. Compatible SillyTavern presets: simple (https://huggingface.co/Lewdiculous/Model-Requests/tree/main/data/presets/cope-llama-3-0.1) or Virt's roleplay (https://huggingface.co/Virt-io/SillyTavern-Presets).
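The 8GB VRAM recommendation above can be sanity-checked with a rough memory estimate. The sketch below is not from the model card: it assumes the standard Llama 3 8B layout (32 layers, 8 key/value heads, head dimension 128) and an unquantized fp16 KV cache, and it ignores compute buffers and other runtime overhead. The function names are hypothetical, and the 4.58 figure is the Q4_K_M-imat file size listed on this page, treated here as decimal GB.

```python
# Rough VRAM estimate: model weights plus KV cache.
# All architecture constants are ASSUMPTIONS (standard Llama 3 8B layout);
# they are not stated on this page.
N_LAYERS = 32
N_KV_HEADS = 8        # grouped-query attention
HEAD_DIM = 128
BYTES_PER_VALUE = 2   # assumes an fp16 (unquantized) KV cache


def kv_cache_bytes(context_tokens: int) -> int:
    """Bytes needed to store keys and values for all layers at this context length."""
    # Factor 2 covers one key vector and one value vector per KV head.
    per_token = 2 * N_LAYERS * N_KV_HEADS * HEAD_DIM * BYTES_PER_VALUE
    return per_token * context_tokens


def estimated_total_gb(weights_gb: float, context_tokens: int) -> float:
    """Weights plus KV cache in decimal GB; runtime overhead is not included."""
    return weights_gb + kv_cache_bytes(context_tokens) / 1e9


if __name__ == "__main__":
    total = estimated_total_gb(4.58, 12288)
    print(f"Q4_K_M-imat at 12288 context: about {total:.2f} GB before overhead")
```

Under these assumptions the KV cache at 12288 tokens comes to roughly 1.6 GB, which leaves some headroom on an 8 GB card once the 4.58 GB of weights are loaded. Actual usage depends on the runtime and its settings.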
Repository Files & Downloads
| File | Type | Quantization | Size | Link |
|---|---|---|---|---|
| SOVL_Llama3_8B-F16.gguf | GGUF | F16 | 14.97 GB | Download |
| SOVL_Llama3_8B-IQ3_M-imat.gguf | GGUF | IQ3_M | 3.52 GB | Download |
| SOVL_Llama3_8B-IQ3_XXS-imat.gguf | GGUF | IQ3_XXS | 3.05 GB | Download |
| SOVL_Llama3_8B-IQ4_NL-imat.gguf | GGUF | IQ4_NL | 4.36 GB | Download |
| SOVL_Llama3_8B-IQ4_XS-imat.gguf | GGUF | IQ4_XS | 4.14 GB | Download |
| SOVL_Llama3_8B-Q4_K_M-imat.gguf | GGUF | Q4_K_M | 4.58 GB | Download |
| SOVL_Llama3_8B-Q4_K_S-imat.gguf | GGUF | Q4_K_S | 4.37 GB | Download |
| SOVL_Llama3_8B-Q5_K_M-imat.gguf | GGUF | Q5_K_M | 5.34 GB | Download |
| SOVL_Llama3_8B-Q5_K_S-imat.gguf | GGUF | Q5_K_S | 5.21 GB | Download |
| SOVL_Llama3_8B-Q6_K-imat.gguf | GGUF | Q6_K | 6.14 GB | Download |
| SOVL_Llama3_8B-Q8_0-imat.gguf | GGUF | Q8_0 | 7.95 GB | Download |
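One way to use the table above is to pick the largest file that fits a given size budget. The following is a minimal sketch: the sizes are copied from the table, the helper name `largest_fitting` is hypothetical, and treating "largest file" as "best quality" is only a rough heuristic, since the budget should also leave room for context and runtime overhead.

```python
from typing import Optional

# File sizes in GB, copied from the table on this page.
QUANT_SIZES_GB = {
    "SOVL_Llama3_8B-F16.gguf": 14.97,
    "SOVL_Llama3_8B-IQ3_M-imat.gguf": 3.52,
    "SOVL_Llama3_8B-IQ3_XXS-imat.gguf": 3.05,
    "SOVL_Llama3_8B-IQ4_NL-imat.gguf": 4.36,
    "SOVL_Llama3_8B-IQ4_XS-imat.gguf": 4.14,
    "SOVL_Llama3_8B-Q4_K_M-imat.gguf": 4.58,
    "SOVL_Llama3_8B-Q4_K_S-imat.gguf": 4.37,
    "SOVL_Llama3_8B-Q5_K_M-imat.gguf": 5.34,
    "SOVL_Llama3_8B-Q5_K_S-imat.gguf": 5.21,
    "SOVL_Llama3_8B-Q6_K-imat.gguf": 6.14,
    "SOVL_Llama3_8B-Q8_0-imat.gguf": 7.95,
}


def largest_fitting(budget_gb: float) -> Optional[str]:
    """Return the largest listed file whose size is within budget_gb, or None."""
    fitting = {name: size for name, size in QUANT_SIZES_GB.items() if size <= budget_gb}
    if not fitting:
        return None
    return max(fitting, key=fitting.get)


if __name__ == "__main__":
    # Hypothetical budget: 5 GB reserved for weights.
    print(largest_fitting(5.0))
```

With a 5 GB budget for weights, this selects the Q4_K_M-imat file, which matches the quant the model card recommends for 8GB VRAM GPUs.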
Model Details
Metadata Inspector
Normalized metadata (stored in metadata_json)
{
"metadata": {},
"card_data": {
"license": "apache-2.0",
"frontmatter": {
"license": "apache-2.0"
},
"hero_image_url": "https://cdn-uploads.huggingface.co/production/uploads/626dfb8786671a29c715f8a9/N_1D87adbMuMlSIQ5rI3_.png",
"summary": "GGUF-IQ-Imatrix quants for @jeiku's ResplendentAI/SOVL_Llama3_8B. Give them some love! > [!IMPORTANT] > **Updated!** > These quants have been redone with the fixes from llama.cpp/pull/6920 in mind. > Use **KoboldCpp version 1.64** or higher. > [!NOTE] > **Well...!** > Turns out it was not just a hallucination and this model actually is pretty cool so **give it a chance!** > For **8GB VRAM** GPUs, I recommend the **Q4_K_M-imat** quant for up to 12288 context sizes. > [!WARNING] > **Use the provided presets.** > Compatible SillyTavern presets here (simple) or here (Virt's roleplay). > Use the latest version of KoboldCpp. !image/png",
"quick_links": [],
"benchmark_table_html": "",
"readme_markdown": "---\nlicense: apache-2.0\n---\n\n> [!TIP]\n> My upload speeds have been cooked and unstable lately. <br>\n> Realistically I'd need to move to get a better provider. <br>\n> If you **want** and you are able to, you can [**support various endeavors here (Ko-fi)**](https://ko-fi.com/Lewdiculous). <br>\n> I apologize for disrupting your experience.\n\n# #llama-3 #experimental #work-in-progress\n\nGGUF-IQ-Imatrix quants for @jeiku's [ResplendentAI/SOVL_Llama3_8B](https://huggingface.co/ResplendentAI/SOVL_Llama3_8B). <br> Give them some love!\n\n> [!IMPORTANT] \n> **Updated!**\n> These quants have been redone with the fixes from [llama.cpp/pull/6920](https://github.com/ggerganov/llama.cpp/pull/6920) in mind. <br>\n> Use **KoboldCpp version 1.64** or higher.\n\n> [!NOTE]\n> **Well...!** <br>\n> Turns out it was not just a hallucination and this model actually is pretty cool so **give it a chance!** <br>\n> For **8GB VRAM** GPUs, I recommend the **Q4_K_M-imat** quant for up to 12288 context sizes.\n\n> [!WARNING]\n> **Use the provided presets.** <br>\n> Compatible SillyTavern presets [here (simple)](https://huggingface.co/Lewdiculous/Model-Requests/tree/main/data/presets/cope-llama-3-0.1) or [here (Virt's roleplay)](https://huggingface.co/Virt-io/SillyTavern-Presets).\n> Use the latest version of KoboldCpp.\n\n",
"related_quantizations": []
},
"tags": [
"gguf",
"license:apache-2.0",
"endpoints_compatible",
"region:us"
],
"likes": 26,
"downloads": 121,
"gated": false,
"private": false,
"last_modified": "2024-05-04T14:30:30.000Z",
"created_at": "2024-04-25T03:46:59.000Z",
"pipeline_tag": "",
"library_name": ""
}
Source payload excerpt (from Hugging Face API)
{
"_id": "6629d23359854b02da0d49ea",
"id": "Lewdiculous/SOVL_Llama3_8B-GGUF-IQ-Imatrix",
"modelId": "Lewdiculous/SOVL_Llama3_8B-GGUF-IQ-Imatrix",
"sha": "d3e5bebffdfd15a339d4c579c2ca23f96545693e",
"createdAt": "2024-04-25T03:46:59.000Z",
"lastModified": "2024-05-04T14:30:30.000Z",
"author": "Lewdiculous",
"downloads": 121,
"likes": 26,
"gated": false,
"private": false,
"pipeline_tag": "",
"library_name": "",
"siblings_count": 15
}
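The `id` field in the payload above, combined with a file name from the repository table, is enough to build a direct download link. The sketch below uses the usual Hugging Face `resolve` URL pattern; the `main` revision is an assumption (the payload does not name a branch), and the helper `gguf_url` is hypothetical.

```python
from urllib.parse import quote


def gguf_url(repo_id: str, filename: str, revision: str = "main") -> str:
    """Build a direct-download URL for a file in a Hugging Face model repository.

    The default revision "main" is an ASSUMPTION; pass a commit hash
    (such as the "sha" field in the payload) to pin an exact version.
    """
    return (
        "https://huggingface.co/"
        f"{quote(repo_id, safe='/')}/resolve/{quote(revision, safe='')}/"
        f"{quote(filename, safe='/')}"
    )


if __name__ == "__main__":
    print(gguf_url(
        "Lewdiculous/SOVL_Llama3_8B-GGUF-IQ-Imatrix",
        "SOVL_Llama3_8B-Q6_K-imat.gguf",
    ))
```

Pinning `revision` to the commit hash keeps the link stable if the repository is updated later.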