GraySoft
Model Intelligence Sheet

neopolita/llama-3.1-storm-8b-gguf overview

GGUF quants for akjindal53244/Llama-3.1-Storm-8B, made using llama.cpp.
Terms of Use: Please check the original model.

Tags: gguf, endpoints_compatible, region:us, conversational
Downloads: 243
Likes: 0
Pipeline: not set
Library: not set
Visibility: Public
Access: Open

Repository Files & Downloads

7 files detected
Direct downloads for all repository files
File                            Type  Quantization  Size      Link
ggml-model-f16.gguf             GGUF  F16           14.97 GB  Download
llama-3.1-storm-8b_q2_k.gguf    GGUF  Q2_K          2.96 GB   Download
llama-3.1-storm-8b_q3_k_m.gguf  GGUF  Q3_K_M        3.74 GB   Download
llama-3.1-storm-8b_q4_k_m.gguf  GGUF  Q4_K_M        4.58 GB   Download
llama-3.1-storm-8b_q5_k_m.gguf  GGUF  Q5_K_M        5.34 GB   Download
llama-3.1-storm-8b_q6_k.gguf    GGUF  Q6_K          6.14 GB   Download
llama-3.1-storm-8b_q8_0.gguf    GGUF  Q8_0          7.95 GB   Download
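
Choosing a file from this table usually comes down to a size budget. The following is a minimal sketch, not part of the repository: it picks the largest listed file that fits a given budget and builds a download URL for it. The helper names `largest_file_within` and `download_url` are hypothetical, and the URL layout (`/resolve/<revision>/<file>`) is an assumption based on the `resolve/main` form used by the hero image URL in the metadata. File size on disk is only a rough proxy for memory needed at inference time.

```python
from typing import Optional

REPO_ID = "neopolita/llama-3.1-storm-8b-gguf"

# File sizes in GB, copied from the repository file table.
FILES = {
    "ggml-model-f16.gguf": 14.97,
    "llama-3.1-storm-8b_q2_k.gguf": 2.96,
    "llama-3.1-storm-8b_q3_k_m.gguf": 3.74,
    "llama-3.1-storm-8b_q4_k_m.gguf": 4.58,
    "llama-3.1-storm-8b_q5_k_m.gguf": 5.34,
    "llama-3.1-storm-8b_q6_k.gguf": 6.14,
    "llama-3.1-storm-8b_q8_0.gguf": 7.95,
}


def largest_file_within(budget_gb: float) -> Optional[str]:
    """Return the largest listed file whose size fits the budget, or None."""
    fitting = {name: size for name, size in FILES.items() if size <= budget_gb}
    if not fitting:
        return None
    return max(fitting, key=fitting.get)


def download_url(filename: str, revision: str = "main") -> str:
    """Build a direct URL for a file (assumed /resolve/<revision>/ layout)."""
    return f"https://huggingface.co/{REPO_ID}/resolve/{revision}/{filename}"


if __name__ == "__main__":
    choice = largest_file_within(6.0)
    if choice is not None:
        print(choice, download_url(choice))
```

With a 6 GB budget the sketch selects the Q5_K_M file, since the Q6_K file at 6.14 GB is just over the limit.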

Model Details

Model Slug: neopolita/llama-3.1-storm-8b-gguf
Author: neopolita
Pipeline Task: not set
Library: not set
Created: 2024-08-21
Last Modified: 2024-08-21
Gated: No
Private: No
HF SHA: c6883ab731b67e360128326a13d543db0e86f008
License: Unknown
Language: Unknown
Base Model: Unknown

Metadata Inspector

Normalized metadata (stored in metadata_json)
{
  "metadata": {},
  "card_data": {
    "frontmatter": {},
    "hero_image_url": "https://huggingface.co/neopolita/common/resolve/main/profile.png",
    "summary": "# GGUF quants for **akjindal53244/Llama-3.1-Storm-8B** using llama.cpp **Terms of Use**: Please check the **original model**",
    "quick_links": [],
    "benchmark_table_html": "",
    "readme_markdown": "---\n{}\n---\n# GGUF quants for [**akjindal53244/Llama-3.1-Storm-8B**](https://huggingface.co/akjindal53244/Llama-3.1-Storm-8B) using [llama.cpp](https://github.com/ggerganov/llama.cpp)\n\n**Terms of Use**: Please check the [**original model**](https://huggingface.co/akjindal53244/Llama-3.1-Storm-8B)\n\n<picture>\n<img alt=\"cthulhu\" src=\"https://huggingface.co/neopolita/common/resolve/main/profile.png\">\n</picture>\n\n## Quants\n\n* `q2_k`: Uses Q4_K for the attention.vw and feed_forward.w2 tensors, Q2_K for the other tensors.\n* `q3_k_s`: Uses Q3_K for all tensors\n* `q3_k_m`: Uses Q4_K for the attention.wv, attention.wo, and feed_forward.w2 tensors, else Q3_K\n* `q3_k_l`: Uses Q5_K for the attention.wv, attention.wo, and feed_forward.w2 tensors, else Q3_K\n* `q4_0`: Original quant method, 4-bit.\n* `q4_1`: Higher accuracy than q4_0 but not as high as q5_0. However has quicker inference than q5 models.\n* `q4_k_s`: Uses Q4_K for all tensors\n* `q4_k_m`: Uses Q6_K for half of the attention.wv and feed_forward.w2 tensors, else Q4_K\n* `q5_0`: Higher accuracy, higher resource usage and slower inference.\n* `q5_1`: Even higher accuracy, resource usage and slower inference.\n* `q5_k_s`:  Uses Q5_K for all tensors\n* `q5_k_m`: Uses Q6_K for half of the attention.wv and feed_forward.w2 tensors, else Q5_K\n* `q6_k`: Uses Q8_K for all tensors\n* `q8_0`: Almost indistinguishable from float16. High resource use and slow. Not recommended for most users.",
    "related_quantizations": []
  },
  "tags": [
    "gguf",
    "endpoints_compatible",
    "region:us",
    "conversational"
  ],
  "likes": 0,
  "downloads": 243,
  "gated": false,
  "private": false,
  "last_modified": "2024-08-21T22:35:07.000Z",
  "created_at": "2024-08-21T21:54:54.000Z",
  "pipeline_tag": "",
  "library_name": ""
}
Source payload excerpt (from Hugging Face API)
{
  "_id": "66c6622ee2fdf1d811bb07c3",
  "id": "neopolita/llama-3.1-storm-8b-gguf",
  "modelId": "neopolita/llama-3.1-storm-8b-gguf",
  "sha": "c6883ab731b67e360128326a13d543db0e86f008",
  "createdAt": "2024-08-21T21:54:54.000Z",
  "lastModified": "2024-08-21T22:35:07.000Z",
  "author": "neopolita",
  "downloads": 243,
  "likes": 0,
  "gated": false,
  "private": false,
  "pipeline_tag": "",
  "library_name": "",
  "siblings_count": 9
}
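
The normalized metadata and the source payload carry the same fields under different key styles (`created_at` versus `createdAt`, for example). As a rough sketch, assuming only the fields shown on this page, the two records can be checked for agreement like this. `KEY_MAP`, `mismatched_fields`, and `parse_ts` are hypothetical names introduced for illustration, not part of any Hugging Face API.

```python
from datetime import datetime

# Values copied from the two JSON records shown on this page.
normalized = {
    "likes": 0,
    "downloads": 243,
    "gated": False,
    "private": False,
    "last_modified": "2024-08-21T22:35:07.000Z",
    "created_at": "2024-08-21T21:54:54.000Z",
    "pipeline_tag": "",
    "library_name": "",
}
payload = {
    "createdAt": "2024-08-21T21:54:54.000Z",
    "lastModified": "2024-08-21T22:35:07.000Z",
    "downloads": 243,
    "likes": 0,
    "gated": False,
    "private": False,
    "pipeline_tag": "",
    "library_name": "",
}

# Normalized keys whose spelling differs in the source payload.
KEY_MAP = {"created_at": "createdAt", "last_modified": "lastModified"}


def mismatched_fields(norm: dict, src: dict) -> list:
    """List normalized keys whose value differs from the source payload."""
    mismatches = []
    for key, value in norm.items():
        source_key = KEY_MAP.get(key, key)
        if source_key in src and src[source_key] != value:
            mismatches.append(key)
    return mismatches


def parse_ts(ts: str) -> datetime:
    """Parse timestamps of the form 2024-08-21T21:54:54.000Z."""
    return datetime.strptime(ts, "%Y-%m-%dT%H:%M:%S.%fZ")


# Time between repository creation and the last modification.
elapsed = parse_ts(payload["lastModified"]) - parse_ts(payload["createdAt"])

if __name__ == "__main__":
    print(mismatched_fields(normalized, payload), elapsed)
```

For the values on this page the records agree, and the repository was last modified about 40 minutes after it was created.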