GraySoft
Projects Models About FAQ Contact Download guIDE →

duyntnet/deepseek-r1-distill-llama-8b-imatrix-gguf IQ1_M GGUF - Free GGUF Download is indexed on GraySoft with repository links, GGUF quant files, and Hugging Face metadata. This page helps you pick a local model for guIDE or other runtimes. See related models in the same shard below.

Model Intelligence Sheet

duyntnet/deepseek-r1-distill-llama-8b-imatrix-gguf overview

We introduce our first-generation reasoning models, DeepSeek-R1-Zero and DeepSeek-R1. DeepSeek-R1-Zero, a model trained via large-scale reinforcement learning (RL) without supervised fine-tuning (SFT) as a preliminary step, demonstrated remarkable performance on reasoning. With RL, DeepSeek-R1-Zero naturally emerged with numerous powerful and interesting reasoning behaviors. However, DeepSeek-R1-Zero encounters challenges such as endless repetition, poor readability, and language mixing. To address these issues and further enhance reasoning performance, we introduce DeepSeek-R1, which incorporates cold-start data before RL. DeepSeek-R1 achieves performance comparable to OpenAI-o1 across math, code, and reasoning tasks. To support the research community, we have open-sourced DeepSeek-R1-Zero, DeepSeek-R1, and six dense models distilled from DeepSeek-R1 based on Llama and Qwen. DeepSeek-R1-Distill-Qwen-32B outperforms OpenAI-o1-mini across various benchmarks, achieving new state-of-the-art results for dense models.

transformersggufimatrixDeepSeek-R1-Distill-Llama-8Btext-generationenlicense:otherregion:usconversational
duyntnet/deepseek-r1-distill-llama-8b-imatrix-gguf visual
Downloads
126
Likes
1
Pipeline
text-generation
Library
transformers
Visibility
Public
Access
Open

Repository Files & Downloads

27 files detected
Direct downloads for all repository files
FileTypeQuantizationSizeLink
DeepSeek-R1-Distill-Llama-8B-IQ1_M.gguf GGUF IQ1_M 2.01 GB Download
DeepSeek-R1-Distill-Llama-8B-IQ1_S.gguf GGUF IQ1_S 1.88 GB Download
DeepSeek-R1-Distill-Llama-8B-IQ2_M.gguf GGUF IQ2_M 2.75 GB Download
DeepSeek-R1-Distill-Llama-8B-IQ2_S.gguf GGUF IQ2_S 2.57 GB Download
DeepSeek-R1-Distill-Llama-8B-IQ2_XS.gguf GGUF IQ2_XS 2.43 GB Download
DeepSeek-R1-Distill-Llama-8B-IQ2_XXS.gguf GGUF IQ2_XXS 2.23 GB Download
DeepSeek-R1-Distill-Llama-8B-IQ3_M.gguf GGUF IQ3_M 3.52 GB Download
DeepSeek-R1-Distill-Llama-8B-IQ3_S.gguf GGUF IQ3_S 3.43 GB Download
DeepSeek-R1-Distill-Llama-8B-IQ3_XS.gguf GGUF IQ3_XS 3.28 GB Download
DeepSeek-R1-Distill-Llama-8B-IQ3_XXS.gguf GGUF IQ3_XXS 3.05 GB Download
DeepSeek-R1-Distill-Llama-8B-IQ4_NL.gguf GGUF IQ4_NL 4.36 GB Download
DeepSeek-R1-Distill-Llama-8B-IQ4_XS.gguf GGUF IQ4_XS 4.14 GB Download
DeepSeek-R1-Distill-Llama-8B-Q2_K.gguf GGUF Q2_K 2.96 GB Download
DeepSeek-R1-Distill-Llama-8B-Q2_K_S.gguf GGUF Q2_K_S 2.78 GB Download
DeepSeek-R1-Distill-Llama-8B-Q3_K_L.gguf GGUF Q3_K_L 4.03 GB Download
DeepSeek-R1-Distill-Llama-8B-Q3_K_M.gguf GGUF Q3_K_M 3.74 GB Download
DeepSeek-R1-Distill-Llama-8B-Q3_K_S.gguf GGUF Q3_K_S 3.41 GB Download
DeepSeek-R1-Distill-Llama-8B-Q4_0.gguf GGUF 4.35 GB Download
DeepSeek-R1-Distill-Llama-8B-Q4_1.gguf GGUF 4.78 GB Download
DeepSeek-R1-Distill-Llama-8B-Q4_K_M.gguf GGUF Q4_K_M 4.58 GB Download
DeepSeek-R1-Distill-Llama-8B-Q4_K_S.gguf GGUF Q4_K_S 4.37 GB Download
DeepSeek-R1-Distill-Llama-8B-Q5_0.gguf GGUF 5.23 GB Download
DeepSeek-R1-Distill-Llama-8B-Q5_1.gguf GGUF 5.65 GB Download
DeepSeek-R1-Distill-Llama-8B-Q5_K_M.gguf GGUF Q5_K_M 5.34 GB Download
DeepSeek-R1-Distill-Llama-8B-Q5_K_S.gguf GGUF Q5_K_S 5.21 GB Download
DeepSeek-R1-Distill-Llama-8B-Q6_K.gguf GGUF Q6_K 6.14 GB Download
DeepSeek-R1-Distill-Llama-8B-Q8_0.gguf GGUF 7.95 GB Download

Model Details Live

Model Slug
duyntnet/deepseek-r1-distill-llama-8b-imatrix-gguf
Author
duyntnet
Pipeline Task
text-generation
Library
transformers
Created
2025-03-08
Last Modified
2025-03-08
Gated
No
Private
No
HF SHA
d18a4874a782c97964d913a5b7f755f80cfbad77
License
other
Language
en
Base Model
Unknown

Metadata Inspector

Normalized metadata (stored in metadata_json)
{
  "metadata": {},
  "card_data": {
    "license": "other",
    "language": [
      "en"
    ],
    "pipeline_tag": "text-generation",
    "inference": false,
    "tags": [
      "transformers",
      "gguf",
      "imatrix",
      "DeepSeek-R1-Distill-Llama-8B"
    ],
    "frontmatter": {
      "license": "other",
      "language": [
        "en"
      ],
      "pipeline_tag": "text-generation",
      "inference": "false",
      "tags": [
        "transformers",
        "gguf",
        "imatrix",
        "DeepSeek-R1-Distill-Llama-8B"
      ]
    },
    "hero_image_url": "",
    "summary": "We introduce our first-generation reasoning models, DeepSeek-R1-Zero and DeepSeek-R1. DeepSeek-R1-Zero, a model trained via large-scale reinforcement learning (RL) without supervised fine-tuning (SFT) as a preliminary step, demonstrated remarkable performance on reasoning. With RL, DeepSeek-R1-Zero naturally emerged with numerous powerful and interesting reasoning behaviors. However, DeepSeek-R1-Zero encounters challenges such as endless repetition, poor readability, and language mixing. To address these issues and further enhance reasoning performance, we introduce DeepSeek-R1, which incorporates cold-start data before RL. DeepSeek-R1 achieves performance comparable to OpenAI-o1 across math, code, and reasoning tasks. To support the research community, we have open-sourced DeepSeek-R1-Zero, DeepSeek-R1, and six dense models distilled from DeepSeek-R1 based on Llama and Qwen. DeepSeek-R1-Distill-Qwen-32B outperforms OpenAI-o1-mini across various benchmarks, achieving new state-of-the-art results for dense models.",
    "quick_links": [],
    "benchmark_table_html": "",
    "readme_markdown": "---\nlicense: other\nlanguage:\n- en\npipeline_tag: text-generation\ninference: false\ntags:\n- transformers\n- gguf\n- imatrix\n- DeepSeek-R1-Distill-Llama-8B\n---\nQuantizations of https://huggingface.co/deepseek-ai/DeepSeek-R1-Distill-Llama-8B\n\n### Open source inference clients/UIs\n* [llama.cpp](https://github.com/ggerganov/llama.cpp)\n* [KoboldCPP](https://github.com/LostRuins/koboldcpp)\n* [ollama](https://github.com/ollama/ollama)\n* [text-generation-webui](https://github.com/oobabooga/text-generation-webui)\n* [jan](https://github.com/janhq/jan)\n* [GPT4All](https://github.com/nomic-ai/gpt4all)\n\n### Closed source inference clients/UIs\n* [LM Studio](https://lmstudio.ai/)\n* [Msty](https://msty.app/)\n* [Backyard AI](https://backyard.ai/)\n\n---\n\n# From original readme\n\nWe introduce our first-generation reasoning models, DeepSeek-R1-Zero and DeepSeek-R1. \nDeepSeek-R1-Zero, a model trained via large-scale reinforcement learning (RL) without supervised fine-tuning (SFT) as a preliminary step, demonstrated remarkable performance on reasoning.\nWith RL, DeepSeek-R1-Zero naturally emerged with numerous powerful and interesting reasoning behaviors.\nHowever, DeepSeek-R1-Zero encounters challenges such as endless repetition, poor readability, and language mixing. To address these issues and further enhance reasoning performance,\nwe introduce DeepSeek-R1, which incorporates cold-start data before RL.\nDeepSeek-R1 achieves performance comparable to OpenAI-o1 across math, code, and reasoning tasks. \nTo support the research community, we have open-sourced DeepSeek-R1-Zero, DeepSeek-R1, and six dense models distilled from DeepSeek-R1 based on Llama and Qwen. DeepSeek-R1-Distill-Qwen-32B outperforms OpenAI-o1-mini across various benchmarks, achieving new state-of-the-art results for dense models.\n\n\n## How to Run Locally\n\n### DeepSeek-R1 Models\n\nPlease visit [DeepSeek-V3](https://github.com/deepseek-ai/DeepSeek-V3) repo for more information about running DeepSeek-R1 locally.\n\n**NOTE: Hugging Face's Transformers has not been directly supported yet.**\n\n### DeepSeek-R1-Distill Models\n\nDeepSeek-R1-Distill models can be utilized in the same manner as Qwen or Llama models.\n\nFor instance, you can easily start a service using [vLLM](https://github.com/vllm-project/vllm):\n\n```shell\nvllm serve deepseek-ai/DeepSeek-R1-Distill-Qwen-32B --tensor-parallel-size 2 --max-model-len 32768 --enforce-eager\n```\n\nYou can also easily start a service using [SGLang](https://github.com/sgl-project/sglang)\n\n```bash\npython3 -m sglang.launch_server --model deepseek-ai/DeepSeek-R1-Distill-Qwen-32B --trust-remote-code --tp 2\n```\n\n### Usage Recommendations\n\n**We recommend adhering to the following configurations when utilizing the DeepSeek-R1 series models, including benchmarking, to achieve the expected performance:**\n\n1. Set the temperature within the range of 0.5-0.7 (0.6 is recommended) to prevent endless repetitions or incoherent outputs.\n2. **Avoid adding a system prompt; all instructions should be contained within the user prompt.**\n3. For mathematical problems, it is advisable to include a directive in your prompt such as: \"Please reason step by step, and put your final answer within \\boxed{}.\"\n4. When evaluating model performance, it is recommended to conduct multiple tests and average the results.\n\nAdditionally, we have observed that the DeepSeek-R1 series models tend to bypass thinking pattern (i.e., outputting \"\\<think\\>\\n\\n\\</think\\>\") when responding to certain queries, which can adversely affect the model's performance.\n**To ensure that the model engages in thorough reasoning, we recommend enforcing the model to initiate its response with \"\\<think\\>\\n\" at the beginning of every output.**",
    "related_quantizations": []
  },
  "tags": [
    "transformers",
    "gguf",
    "imatrix",
    "DeepSeek-R1-Distill-Llama-8B",
    "text-generation",
    "en",
    "license:other",
    "region:us",
    "conversational"
  ],
  "likes": 1,
  "downloads": 126,
  "gated": false,
  "private": false,
  "last_modified": "2025-03-08T12:12:54.000Z",
  "created_at": "2025-03-08T10:51:00.000Z",
  "pipeline_tag": "text-generation",
  "library_name": "transformers"
}
Source payload excerpt (from Hugging Face API)
{
  "_id": "67cc2114101af374ebf62387",
  "id": "duyntnet/DeepSeek-R1-Distill-Llama-8B-imatrix-GGUF",
  "modelId": "duyntnet/DeepSeek-R1-Distill-Llama-8B-imatrix-GGUF",
  "sha": "d18a4874a782c97964d913a5b7f755f80cfbad77",
  "createdAt": "2025-03-08T10:51:00.000Z",
  "lastModified": "2025-03-08T12:12:54.000Z",
  "author": "duyntnet",
  "downloads": 126,
  "likes": 1,
  "gated": false,
  "private": false,
  "pipeline_tag": "text-generation",
  "library_name": "transformers",
  "siblings_count": 29
}