duyntnet/meta-llama-3.1-8b-instruct-imatrix-gguf Q5_1 GGUF - Free GGUF Download is indexed on GraySoft with repository links, GGUF quant files, and Hugging Face metadata. This page helps you pick a local model for guIDE or other runtimes. See related models in the same shard below.

Model Intelligence Sheet

duyntnet/meta-llama-3.1-8b-instruct-imatrix-gguf overview

Use with transformers Starting with transformers >= 4.43.0 onward, you can run conversational inference using the Transformers pipeline abstraction or by leveraging the Auto classes with the generate() function. Make sure to update your transformers installation via pip install --upgrade transformers. Note: You can also find detailed recipes on how to use the model locally, with torch.compile(), assisted generations, quantised and more at huggingface-llama-recipes ### Use with llama Please, follow the instructions in the repository To download Original checkpoints, see the example command below leveraging huggingface-cli:

transformersggufimatrixMeta-Llama-3.1-8B-Instructtext-generationenlicense:otherregion:usconversational

duyntnet/meta-llama-3.1-8b-instruct-imatrix-gguf visual

Downloads

1,508

Likes

Pipeline

text-generation

Library

transformers

Visibility

Public

Access

Open

Repository Files & Downloads

27 files detected

Direct downloads for all repository files

File	Type	Quantization	Size	Link
Meta-Llama-3.1-8B-Instruct-IQ1_M.gguf	GGUF	IQ1_M	2.01 GB	Download
Meta-Llama-3.1-8B-Instruct-IQ1_S.gguf	GGUF	IQ1_S	1.88 GB	Download
Meta-Llama-3.1-8B-Instruct-IQ2_M.gguf	GGUF	IQ2_M	2.75 GB	Download
Meta-Llama-3.1-8B-Instruct-IQ2_S.gguf	GGUF	IQ2_S	2.57 GB	Download
Meta-Llama-3.1-8B-Instruct-IQ2_XS.gguf	GGUF	IQ2_XS	2.43 GB	Download
Meta-Llama-3.1-8B-Instruct-IQ2_XXS.gguf	GGUF	IQ2_XXS	2.23 GB	Download
Meta-Llama-3.1-8B-Instruct-IQ3_M.gguf	GGUF	IQ3_M	3.52 GB	Download
Meta-Llama-3.1-8B-Instruct-IQ3_S.gguf	GGUF	IQ3_S	3.43 GB	Download
Meta-Llama-3.1-8B-Instruct-IQ3_XS.gguf	GGUF	IQ3_XS	3.28 GB	Download
Meta-Llama-3.1-8B-Instruct-IQ3_XXS.gguf	GGUF	IQ3_XXS	3.05 GB	Download
Meta-Llama-3.1-8B-Instruct-IQ4_NL.gguf	GGUF	IQ4_NL	4.36 GB	Download
Meta-Llama-3.1-8B-Instruct-IQ4_XS.gguf	GGUF	IQ4_XS	4.14 GB	Download
Meta-Llama-3.1-8B-Instruct-Q2_K.gguf	GGUF	Q2_K	2.96 GB	Download
Meta-Llama-3.1-8B-Instruct-Q2_K_S.gguf	GGUF	Q2_K_S	2.78 GB	Download
Meta-Llama-3.1-8B-Instruct-Q3_K_L.gguf	GGUF	Q3_K_L	4.03 GB	Download
Meta-Llama-3.1-8B-Instruct-Q3_K_M.gguf	GGUF	Q3_K_M	3.74 GB	Download
Meta-Llama-3.1-8B-Instruct-Q3_K_S.gguf	GGUF	Q3_K_S	3.41 GB	Download
Meta-Llama-3.1-8B-Instruct-Q4_0.gguf	GGUF	—	4.35 GB	Download
Meta-Llama-3.1-8B-Instruct-Q4_1.gguf	GGUF	—	4.78 GB	Download
Meta-Llama-3.1-8B-Instruct-Q4_K_M.gguf	GGUF	Q4_K_M	4.58 GB	Download
Meta-Llama-3.1-8B-Instruct-Q4_K_S.gguf	GGUF	Q4_K_S	4.37 GB	Download
Meta-Llama-3.1-8B-Instruct-Q5_0.gguf	GGUF	—	5.23 GB	Download
Meta-Llama-3.1-8B-Instruct-Q5_1.gguf	GGUF	—	5.65 GB	Download
Meta-Llama-3.1-8B-Instruct-Q5_K_M.gguf	GGUF	Q5_K_M	5.34 GB	Download
Meta-Llama-3.1-8B-Instruct-Q5_K_S.gguf	GGUF	Q5_K_S	5.21 GB	Download
Meta-Llama-3.1-8B-Instruct-Q6_K.gguf	GGUF	Q6_K	6.14 GB	Download
Meta-Llama-3.1-8B-Instruct-Q8_0.gguf	GGUF	—	7.95 GB	Download

Model Details Live

Model Slug

duyntnet/meta-llama-3.1-8b-instruct-imatrix-gguf

Author

duyntnet

Pipeline Task

text-generation

Library

transformers

Created

2024-07-28

Last Modified

2024-07-28

Gated

Private

HF SHA

a21df4c6162f378274bf8ef8953cdd24acbaa9f2

License

other

Language

Base Model

Unknown

Metadata Inspector

Normalized metadata (stored in metadata_json)

{
  "metadata": {},
  "card_data": {
    "license": "other",
    "language": [
      "en"
    ],
    "pipeline_tag": "text-generation",
    "inference": false,
    "tags": [
      "transformers",
      "gguf",
      "imatrix",
      "Meta-Llama-3.1-8B-Instruct"
    ],
    "frontmatter": {
      "license": "other",
      "language": [
        "en"
      ],
      "pipeline_tag": "text-generation",
      "inference": "false",
      "tags": [
        "transformers",
        "gguf",
        "imatrix",
        "Meta-Llama-3.1-8B-Instruct"
      ]
    },
    "hero_image_url": "",
    "summary": "### Use with transformers Starting with transformers >= 4.43.0 onward, you can run conversational inference using the Transformers pipeline abstraction or by leveraging the Auto classes with the generate() function. Make sure to update your transformers installation via pip install --upgrade transformers. ``python import transformers import torch model_id = \"meta-llama/Meta-Llama-3.1-8B-Instruct\" pipeline = transformers.pipeline( \"text-generation\", model=model_id, model_kwargs={\"torch_dtype\": torch.bfloat16}, device_map=\"auto\", ) messages = [ {\"role\": \"system\", \"content\": \"You are a pirate chatbot who always responds in pirate speak!\"}, {\"role\": \"user\", \"content\": \"Who are you?\"}, ] outputs = pipeline( messages, max_new_tokens=256, ) print(outputs[0][\"generated_text\"][-1]) ` Note: You can also find detailed recipes on how to use the model locally, with torch.compile(), assisted generations, quantised and more at huggingface-llama-recipes ### Use with llama Please, follow the instructions in the repository To download Original checkpoints, see the example command below leveraging huggingface-cli: ` huggingface-cli download meta-llama/Meta-Llama-3.1-8B-Instruct --include \"original/*\" --local-dir Meta-Llama-3.1-8B-Instruct ``",
    "quick_links": [],
    "benchmark_table_html": "",
    "readme_markdown": "---\nlicense: other\nlanguage:\n- en\npipeline_tag: text-generation\ninference: false\ntags:\n- transformers\n- gguf\n- imatrix\n- Meta-Llama-3.1-8B-Instruct\n---\nQuantizations of https://huggingface.co/meta-llama/Meta-Llama-3.1-8B-Instruct\n\n**Note**: Quantized using version b3482 of llama.cpp, which should resolve the rope scaling issue. You will need the latest llama.cpp to use these quants. \n\n### Inference Clients/UIs\n* [llama.cpp](https://github.com/ggerganov/llama.cpp)\n* [JanAI](https://github.com/janhq/jan)\n* [KoboldCPP](https://github.com/LostRuins/koboldcpp)\n* [text-generation-webui](https://github.com/oobabooga/text-generation-webui)\n* [ollama](https://github.com/ollama/ollama)\n* [GPT4All](https://github.com/nomic-ai/gpt4all)\n\n---\n\n# From original readme\n\n### Use with transformers\n\nStarting with `transformers >= 4.43.0` onward, you can run conversational inference using the Transformers `pipeline` abstraction or by leveraging the Auto classes with the `generate()` function.\n\nMake sure to update your transformers installation via `pip install --upgrade transformers`.\n\n```python\nimport transformers\nimport torch\n\nmodel_id = \"meta-llama/Meta-Llama-3.1-8B-Instruct\"\n\npipeline = transformers.pipeline(\n    \"text-generation\",\n    model=model_id,\n    model_kwargs={\"torch_dtype\": torch.bfloat16},\n    device_map=\"auto\",\n)\n\nmessages = [\n    {\"role\": \"system\", \"content\": \"You are a pirate chatbot who always responds in pirate speak!\"},\n    {\"role\": \"user\", \"content\": \"Who are you?\"},\n]\n\noutputs = pipeline(\n    messages,\n    max_new_tokens=256,\n)\nprint(outputs[0][\"generated_text\"][-1])\n```\n\nNote: You can also find detailed recipes on how to use the model locally, with `torch.compile()`, assisted generations, quantised and more at [`huggingface-llama-recipes`](https://github.com/huggingface/huggingface-llama-recipes)\n\n### Use with `llama`\n\nPlease, follow the instructions in the [repository](https://github.com/meta-llama/llama)\n\nTo download Original checkpoints, see the example command below leveraging `huggingface-cli`:\n\n```\nhuggingface-cli download meta-llama/Meta-Llama-3.1-8B-Instruct --include \"original/*\" --local-dir Meta-Llama-3.1-8B-Instruct\n```",
    "related_quantizations": []
  },
  "tags": [
    "transformers",
    "gguf",
    "imatrix",
    "Meta-Llama-3.1-8B-Instruct",
    "text-generation",
    "en",
    "license:other",
    "region:us",
    "conversational"
  ],
  "likes": 0,
  "downloads": 1508,
  "gated": false,
  "private": false,
  "last_modified": "2024-07-28T21:33:53.000Z",
  "created_at": "2024-07-28T16:40:06.000Z",
  "pipeline_tag": "text-generation",
  "library_name": "transformers"
}

Source payload excerpt (from Hugging Face API)

{
  "_id": "66a67466a435cd8f46251637",
  "id": "duyntnet/Meta-Llama-3.1-8B-Instruct-imatrix-GGUF",
  "modelId": "duyntnet/Meta-Llama-3.1-8B-Instruct-imatrix-GGUF",
  "sha": "a21df4c6162f378274bf8ef8953cdd24acbaa9f2",
  "createdAt": "2024-07-28T16:40:06.000Z",
  "lastModified": "2024-07-28T21:33:53.000Z",
  "author": "duyntnet",
  "downloads": 1508,
  "likes": 0,
  "gated": false,
  "private": false,
  "pipeline_tag": "text-generation",
  "library_name": "transformers",
  "siblings_count": 29
}

duyntnet/meta-llama-3.1-8b-instruct-imatrix-gguf overview

Repository Files & Downloads

Model Details Live

Metadata Inspector

More models in this shard