duyntnet/Mistral-Small-24B-Instruct-2501-imatrix-GGUF (IQ3_XS and other GGUF quantizations, free to download) is indexed on GraySoft with repository links, GGUF quant files, and Hugging Face metadata. Use this page to pick a local model for guIDE or other runtimes.
duyntnet/mistral-small-24b-instruct-2501-imatrix-gguf overview
Mistral Small 3 (2501) sets a new benchmark in the "small" Large Language Models category below 70B, boasting 24B parameters and achieving state-of-the-art capabilities comparable to larger models. This model is an instruction-fine-tuned version of the base model Mistral-Small-24B-Base-2501. This repository hosts imatrix GGUF quantizations of mistralai/Mistral-Small-24B-Instruct-2501, and the description here is taken from the original README. Mistral Small can be deployed locally and is exceptionally "knowledge-dense", fitting in a single RTX 4090 or a 32GB RAM MacBook once quantized. Perfect for: fast-response conversational agents; low-latency function calling; subject matter experts via fine-tuning; and local inference for hobbyists and organizations handling sensitive data. For enterprises that need specialized capabilities (increased context, particular modalities, domain-specific knowledge, etc.), we will be releasing commercial models beyond what Mistral AI contributes to the community. This release demonstrates our commitment to open source, serving as a strong base model. Learn more about Mistral Small in our blog post. Model developer: Mistral AI Team
Repository Files & Downloads
| File | Type | Quantization | Size | Link |
|---|---|---|---|---|
| Mistral-Small-24B-Instruct-2501-IQ1_M.gguf | GGUF | IQ1_M | 5.36 GB | Download |
| Mistral-Small-24B-Instruct-2501-IQ1_S.gguf | GGUF | IQ1_S | 4.91 GB | Download |
| Mistral-Small-24B-Instruct-2501-IQ2_M.gguf | GGUF | IQ2_M | 7.56 GB | Download |
| Mistral-Small-24B-Instruct-2501-IQ2_S.gguf | GGUF | IQ2_S | 6.96 GB | Download |
| Mistral-Small-24B-Instruct-2501-IQ2_XS.gguf | GGUF | IQ2_XS | 6.71 GB | Download |
| Mistral-Small-24B-Instruct-2501-IQ2_XXS.gguf | GGUF | IQ2_XXS | 6.10 GB | Download |
| Mistral-Small-24B-Instruct-2501-IQ3_M.gguf | GGUF | IQ3_M | 9.92 GB | Download |
| Mistral-Small-24B-Instruct-2501-IQ3_S.gguf | GGUF | IQ3_S | 9.71 GB | Download |
| Mistral-Small-24B-Instruct-2501-IQ3_XS.gguf | GGUF | IQ3_XS | 9.23 GB | Download |
| Mistral-Small-24B-Instruct-2501-IQ3_XXS.gguf | GGUF | IQ3_XXS | 8.64 GB | Download |
| Mistral-Small-24B-Instruct-2501-IQ4_NL.gguf | GGUF | IQ4_NL | 12.54 GB | Download |
| Mistral-Small-24B-Instruct-2501-IQ4_XS.gguf | GGUF | IQ4_XS | 11.88 GB | Download |
| Mistral-Small-24B-Instruct-2501-Q2_K.gguf | GGUF | Q2_K | 8.28 GB | Download |
| Mistral-Small-24B-Instruct-2501-Q2_K_S.gguf | GGUF | Q2_K_S | 7.75 GB | Download |
| Mistral-Small-24B-Instruct-2501-Q3_K_L.gguf | GGUF | Q3_K_L | 11.55 GB | Download |
| Mistral-Small-24B-Instruct-2501-Q3_K_M.gguf | GGUF | Q3_K_M | 10.69 GB | Download |
| Mistral-Small-24B-Instruct-2501-Q3_K_S.gguf | GGUF | Q3_K_S | 9.69 GB | Download |
| Mistral-Small-24B-Instruct-2501-Q4_0.gguf | GGUF | Q4_0 | 12.57 GB | Download |
| Mistral-Small-24B-Instruct-2501-Q4_1.gguf | GGUF | Q4_1 | 13.85 GB | Download |
| Mistral-Small-24B-Instruct-2501-Q4_K_M.gguf | GGUF | Q4_K_M | 13.35 GB | Download |
| Mistral-Small-24B-Instruct-2501-Q4_K_S.gguf | GGUF | Q4_K_S | 12.62 GB | Download |
| Mistral-Small-24B-Instruct-2501-Q5_0.gguf | GGUF | Q5_0 | 15.23 GB | Download |
| Mistral-Small-24B-Instruct-2501-Q5_1.gguf | GGUF | Q5_1 | 16.52 GB | Download |
| Mistral-Small-24B-Instruct-2501-Q5_K_M.gguf | GGUF | Q5_K_M | 15.61 GB | Download |
| Mistral-Small-24B-Instruct-2501-Q5_K_S.gguf | GGUF | Q5_K_S | 15.18 GB | Download |
| Mistral-Small-24B-Instruct-2501-Q6_K.gguf | GGUF | Q6_K | 18.02 GB | Download |
| Mistral-Small-24B-Instruct-2501-Q8_0.gguf | GGUF | Q8_0 | 23.33 GB | Download |
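
As a rough aid for choosing among these files, the sketch below picks the largest quantization whose file size fits a given memory budget. The sizes are copied from the table (a subset, for brevity). The helper name `pick_quant` and the 2 GB default headroom for the KV cache and runtime buffers are illustrative assumptions, not measured requirements; actual memory use depends on context length and the runtime.

```python
# File sizes in GB, copied from the table above (subset shown for brevity).
QUANT_SIZES_GB = {
    "IQ2_M": 7.56,
    "IQ3_XS": 9.23,
    "IQ4_XS": 11.88,
    "Q4_K_M": 13.35,
    "Q5_K_M": 15.61,
    "Q6_K": 18.02,
    "Q8_0": 23.33,
}


def pick_quant(budget_gb, headroom_gb=2.0, sizes=QUANT_SIZES_GB):
    """Return the largest quantization that fits, or None if nothing fits.

    headroom_gb is an assumed allowance for KV cache and runtime buffers;
    it is a placeholder value, not a measured figure.
    """
    usable = budget_gb - headroom_gb
    fitting = {name: size for name, size in sizes.items() if size <= usable}
    if not fitting:
        return None
    # The largest file that fits is taken as the highest-fidelity option.
    return max(fitting, key=fitting.get)


if __name__ == "__main__":
    print(pick_quant(24))  # 24 GB budget (e.g. an RTX 4090) -> Q6_K
    print(pick_quant(12))  # 12 GB budget -> IQ3_XS
```

With the assumed headroom, a 24 GB budget leaves 22 GB usable, so Q8_0 (23.33 GB) is excluded and Q6_K is chosen. Lowering `headroom_gb` trades context capacity for a larger quantization.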
Model Details
Metadata Inspector
Normalized metadata (stored in metadata_json)
{
"metadata": {},
"card_data": {
"license": "other",
"language": [
"en"
],
"pipeline_tag": "text-generation",
"inference": false,
"tags": [
"transformers",
"gguf",
"imatrix",
"Mistral-Small-24B-Instruct-2501"
],
"frontmatter": {
"license": "other",
"language": [
"en"
],
"pipeline_tag": "text-generation",
"inference": "false",
"tags": [
"transformers",
"gguf",
"imatrix",
"Mistral-Small-24B-Instruct-2501"
]
},
"hero_image_url": "",
"summary": "Mistral Small 3 ( 2501 ) sets a new benchmark in the \"small\" Large Language Models category below 70B, boasting 24B parameters and achieving state-of-the-art capabilities comparable to larger models! This model is an instruction-fine-tuned version of the base model: Mistral-Small-24B-Base-2501. Mistral Small can be deployed locally and is exceptionally \"knowledge-dense\", fitting in a single RTX 4090 or a 32GB RAM MacBook once quantized. Perfect for: For enterprises that need specialized capabilities (increased context, particular modalities, domain specific knowledge, etc.), we will be releasing commercial models beyond what Mistral AI contributes to the community. This release demonstrates our commitment to open source, serving as a strong base model. Learn more about Mistral Small in our blog post. Model developper: Mistral AI Team",
"quick_links": [],
"benchmark_table_html": "",
"readme_markdown": "---\nlicense: other\nlanguage:\n- en\npipeline_tag: text-generation\ninference: false\ntags:\n- transformers\n- gguf\n- imatrix\n- Mistral-Small-24B-Instruct-2501\n---\nQuantizations of https://huggingface.co/mistralai/Mistral-Small-24B-Instruct-2501\n\n### Inference Clients/UIs\n* [llama.cpp](https://github.com/ggerganov/llama.cpp)\n* [KoboldCPP](https://github.com/LostRuins/koboldcpp)\n* [ollama](https://github.com/ollama/ollama)\n* [text-generation-webui](https://github.com/oobabooga/text-generation-webui)\n* [jan](https://github.com/janhq/jan)\n* [GPT4All](https://github.com/nomic-ai/gpt4all)\n---\n\n# From original readme\n\nMistral Small 3 ( 2501 ) sets a new benchmark in the \"small\" Large Language Models category below 70B, boasting 24B parameters and achieving state-of-the-art capabilities comparable to larger models! \nThis model is an instruction-fine-tuned version of the base model: [Mistral-Small-24B-Base-2501](https://huggingface.co/mistralai/Mistral-Small-24B-Base-2501).\n\nMistral Small can be deployed locally and is exceptionally \"knowledge-dense\", fitting in a single RTX 4090 or a 32GB RAM MacBook once quantized. \nPerfect for:\n- Fast response conversational agents.\n- Low latency function calling.\n- Subject matter experts via fine-tuning.\n- Local inference for hobbyists and organizations handling sensitive data.\n\nFor enterprises that need specialized capabilities (increased context, particular modalities, domain specific knowledge, etc.), we will be releasing commercial models beyond what Mistral AI contributes to the community.\n\nThis release demonstrates our commitment to open source, serving as a strong base model. 
\n\nLearn more about Mistral Small in our [blog post](https://mistral.ai/news/mistral-small-3/).\n\nModel developper: Mistral AI Team\n\n## Key Features\n- **Multilingual:** Supports dozens of languages, including English, French, German, Spanish, Italian, Chinese, Japanese, Korean, Portuguese, Dutch, and Polish.\n- **Agent-Centric:** Offers best-in-class agentic capabilities with native function calling and JSON outputting.\n- **Advanced Reasoning:** State-of-the-art conversational and reasoning capabilities.\n- **Apache 2.0 License:** Open license allowing usage and modification for both commercial and non-commercial purposes.\n- **Context Window:** A 32k context window.\n- **System Prompt:** Maintains strong adherence and support for system prompts.\n- **Tokenizer:** Utilizes a Tekken tokenizer with a 131k vocabulary size.\n\n### Basic Instruct Template (V7-Tekken)\n\n```\n<s>[SYSTEM_PROMPT]<system prompt>[/SYSTEM_PROMPT][INST]<user message>[/INST]<assistant response></s>[INST]<user message>[/INST]\n```\n*`<system_prompt>`, `<user message>` and `<assistant response>` are placeholders.*\n\n***Please make sure to use [mistral-common](https://github.com/mistralai/mistral-common) as the source of truth***\n\n## Usage\n\nThe model can be used with the following frameworks;\n- [`vllm`](https://github.com/vllm-project/vllm): See [here](#vllm)\n- [`transformers`](https://github.com/huggingface/transformers): See [here](#transformers)\n\n### vLLM\n\nWe recommend using this model with the [vLLM library](https://github.com/vllm-project/vllm)\nto implement production-ready inference pipelines.\n\n**Note 1**: We recommond using a relatively low temperature, such as `temperature=0.15`.\n\n**Note 2**: Make sure to add a system prompt to the model to best tailer it for your needs. 
If you want to use the model as a general assistant, we recommend the following \nsystem prompt:\n\n```\nsystem_prompt = \"\"\"You are Mistral Small 3, a Large Language Model (LLM) created by Mistral AI, a French startup headquartered in Paris.\nYour knowledge base was last updated on 2023-10-01. The current date is 2025-01-30.\nWhen you're not sure about some information, you say that you don't have the information and don't make up anything.\nIf the user's question is not clear, ambiguous, or does not provide enough context for you to accurately answer the question, you do not try to answer it right away and you rather ask the user to clarify their request (e.g. \\\"What are some good restaurants around me?\\\" => \\\"Where are you?\\\" or \\\"When is the next flight to Tokyo\\\" => \\\"Where do you travel from?\\\")\"\"\"\n```",
"related_quantizations": []
},
"tags": [
"transformers",
"gguf",
"imatrix",
"Mistral-Small-24B-Instruct-2501",
"text-generation",
"en",
"license:other",
"region:us",
"conversational"
],
"likes": 0,
"downloads": 462,
"gated": false,
"private": false,
"last_modified": "2025-02-04T01:50:22.000Z",
"created_at": "2025-02-03T17:43:07.000Z",
"pipeline_tag": "text-generation",
"library_name": "transformers"
}
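
The V7-Tekken instruct template quoted in the README above can be sketched as plain string assembly. This is only an illustration of the layout: the README names mistral-common as the source of truth, and many runtimes apply the chat template (and the leading `<s>` token) themselves. The function name `build_prompt` and the `(user, assistant)` tuple format are hypothetical.

```python
def build_prompt(system_prompt, turns):
    """Assemble a prompt string following the V7-Tekken layout.

    turns is a list of (user_message, assistant_response) pairs; use None
    as the assistant response for the final turn awaiting a reply.
    Illustrative only: prefer mistral-common or the runtime's own template.
    """
    parts = ["<s>"]
    if system_prompt:
        parts.append(f"[SYSTEM_PROMPT]{system_prompt}[/SYSTEM_PROMPT]")
    for user, assistant in turns:
        parts.append(f"[INST]{user}[/INST]")
        if assistant is not None:
            # A completed assistant turn is closed with the end-of-sequence token.
            parts.append(f"{assistant}</s>")
    return "".join(parts)


if __name__ == "__main__":
    prompt = build_prompt(
        "You are a helpful assistant.",
        [("Hello", "Hi, how can I help?"), ("What is GGUF?", None)],
    )
    print(prompt)
```

If the runtime already prepends the beginning-of-sequence token during tokenization, the literal `<s>` here would be duplicated, so check the runtime's behavior before sending a hand-built string.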
Source payload excerpt (from Hugging Face API)
{
"_id": "67a1002bf39a62235066f5c7",
"id": "duyntnet/Mistral-Small-24B-Instruct-2501-imatrix-GGUF",
"modelId": "duyntnet/Mistral-Small-24B-Instruct-2501-imatrix-GGUF",
"sha": "46c58d82295bad65f2e5289300395b31667b7151",
"createdAt": "2025-02-03T17:43:07.000Z",
"lastModified": "2025-02-04T01:50:22.000Z",
"author": "duyntnet",
"downloads": 462,
"likes": 0,
"gated": false,
"private": false,
"pipeline_tag": "text-generation",
"library_name": "transformers",
"siblings_count": 29
}
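
The `id` and `sha` fields in the payload are enough to build a direct file URL using the Hugging Face Hub's `resolve` path layout. The sketch below assumes the GGUF files sit at the repository root under the names listed in the file table; the helper name `gguf_download_url` is hypothetical.

```python
from urllib.parse import quote

REPO_ID = "duyntnet/Mistral-Small-24B-Instruct-2501-imatrix-GGUF"  # "id" from the payload
COMMIT_SHA = "46c58d82295bad65f2e5289300395b31667b7151"            # "sha" from the payload


def gguf_download_url(repo_id, filename, revision="main"):
    """Build a Hub file URL of the form /<repo_id>/resolve/<revision>/<filename>.

    Assumes the file lives at the repository root (an assumption based on
    the file table, not confirmed by the payload).
    """
    return (
        "https://huggingface.co/"
        f"{quote(repo_id)}/resolve/{quote(revision, safe='')}/{quote(filename)}"
    )


if __name__ == "__main__":
    name = "Mistral-Small-24B-Instruct-2501-IQ3_XS.gguf"
    print(gguf_download_url(REPO_ID, name))              # tracks the main branch
    print(gguf_download_url(REPO_ID, name, COMMIT_SHA))  # pinned to the indexed commit
```

Pinning `revision` to the commit hash keeps a download reproducible even if the repository is updated after the `lastModified` timestamp shown above.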