Model Intelligence Sheet
unsloth/qwen3-coder-30b-a3b-instruct-gguf overview
Comprehensive model page for unsloth/qwen3-coder-30b-a3b-instruct-gguf
Downloads: 147,452
Likes: 591
Pipeline: text-generation
Library: transformers
Visibility: Public
Access: Open
Repository Files & Downloads
28 files detected
Direct downloads for all repository files
| File | Type | Quantization | Size | Link |
|---|---|---|---|---|
| Qwen3-Coder-30B-A3B-Instruct-BF16-00001-of-00002.gguf | GGUF | BF16 | 46.24 GB | Download |
| Qwen3-Coder-30B-A3B-Instruct-BF16-00002-of-00002.gguf | GGUF | BF16 | 10.65 GB | Download |
| Qwen3-Coder-30B-A3B-Instruct-IQ4_NL.gguf | GGUF | IQ4_NL | 16.12 GB | Download |
| Qwen3-Coder-30B-A3B-Instruct-IQ4_XS.gguf | GGUF | IQ4_XS | 15.25 GB | Download |
| Qwen3-Coder-30B-A3B-Instruct-Q2_K.gguf | GGUF | Q2_K | 10.49 GB | Download |
| Qwen3-Coder-30B-A3B-Instruct-Q2_K_L.gguf | GGUF | Q2_K_L | 10.55 GB | Download |
| Qwen3-Coder-30B-A3B-Instruct-Q3_K_M.gguf | GGUF | Q3_K_M | 13.70 GB | Download |
| Qwen3-Coder-30B-A3B-Instruct-Q3_K_S.gguf | GGUF | Q3_K_S | 12.38 GB | Download |
| Qwen3-Coder-30B-A3B-Instruct-Q4_0.gguf | GGUF | Q4_0 | 16.19 GB | Download |
| Qwen3-Coder-30B-A3B-Instruct-Q4_1.gguf | GGUF | Q4_1 | 17.87 GB | Download |
| Qwen3-Coder-30B-A3B-Instruct-Q4_K_M.gguf | GGUF | Q4_K_M | 17.28 GB | Download |
| Qwen3-Coder-30B-A3B-Instruct-Q4_K_S.gguf | GGUF | Q4_K_S | 16.26 GB | Download |
| Qwen3-Coder-30B-A3B-Instruct-Q5_K_M.gguf | GGUF | Q5_K_M | 20.23 GB | Download |
| Qwen3-Coder-30B-A3B-Instruct-Q5_K_S.gguf | GGUF | Q5_K_S | 19.63 GB | Download |
| Qwen3-Coder-30B-A3B-Instruct-Q6_K.gguf | GGUF | Q6_K | 23.37 GB | Download |
| Qwen3-Coder-30B-A3B-Instruct-Q8_0.gguf | GGUF | Q8_0 | 30.25 GB | Download |
| Qwen3-Coder-30B-A3B-Instruct-UD-IQ1_M.gguf | GGUF | IQ1_M | 8.97 GB | Download |
| Qwen3-Coder-30B-A3B-Instruct-UD-IQ1_S.gguf | GGUF | IQ1_S | 8.30 GB | Download |
| Qwen3-Coder-30B-A3B-Instruct-UD-IQ2_M.gguf | GGUF | IQ2_M | 10.09 GB | Download |
| Qwen3-Coder-30B-A3B-Instruct-UD-IQ2_XXS.gguf | GGUF | IQ2_XXS | 9.62 GB | Download |
| Qwen3-Coder-30B-A3B-Instruct-UD-IQ3_XXS.gguf | GGUF | IQ3_XXS | 11.97 GB | Download |
| Qwen3-Coder-30B-A3B-Instruct-UD-Q2_K_XL.gguf | GGUF | Q2_K_XL | 10.98 GB | Download |
| Qwen3-Coder-30B-A3B-Instruct-UD-Q3_K_XL.gguf | GGUF | Q3_K_XL | 12.86 GB | Download |
| Qwen3-Coder-30B-A3B-Instruct-UD-Q4_K_XL.gguf | GGUF | Q4_K_XL | 16.45 GB | Download |
| Qwen3-Coder-30B-A3B-Instruct-UD-Q5_K_XL.gguf | GGUF | Q5_K_XL | 20.25 GB | Download |
| Qwen3-Coder-30B-A3B-Instruct-UD-Q6_K_XL.gguf | GGUF | Q6_K_XL | 24.53 GB | Download |
| Qwen3-Coder-30B-A3B-Instruct-UD-Q8_K_XL.gguf | GGUF | Q8_K_XL | 33.52 GB | Download |
| Qwen3-Coder-30B-A3B-Instruct-UD-TQ1_0.gguf | GGUF | TQ1_0 | 7.46 GB | Download |
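The table lists sizes but no usable link targets, so the following is a minimal sketch of how one might pick a file and build its download URL. It assumes the files sit on the repository's `main` branch and uses the Hub's `resolve` URL pattern. `SIZES_GB`, `pick_quant`, and `download_url` are hypothetical names introduced here, and the size map copies only a few single-file rows from the table.

```python
from typing import Optional

# Repository id as it appears in the API payload on this page.
REPO_ID = "unsloth/Qwen3-Coder-30B-A3B-Instruct-GGUF"

# Filename suffix -> file size in GB, copied from a subset of the table rows.
SIZES_GB = {
    "UD-TQ1_0": 7.46,
    "Q2_K": 10.49,
    "Q3_K_M": 13.70,
    "Q4_K_M": 17.28,
    "Q5_K_M": 20.23,
    "Q6_K": 23.37,
    "Q8_0": 30.25,
}


def pick_quant(budget_gb: float) -> Optional[str]:
    """Return the largest listed file that fits the budget, or None."""
    fitting = [(size, name) for name, size in SIZES_GB.items() if size <= budget_gb]
    return max(fitting)[1] if fitting else None


def download_url(quant: str) -> str:
    """Build a direct URL, assuming the file lives on the `main` branch."""
    filename = f"Qwen3-Coder-30B-A3B-Instruct-{quant}.gguf"
    return f"https://huggingface.co/{REPO_ID}/resolve/main/{filename}"


choice = pick_quant(18.0)
if choice is not None:
    print(choice, download_url(choice))
```

File size is only a lower bound on the memory needed at run time, since the context cache and runtime buffers come on top, so leave headroom when choosing a budget.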
Model Details
Metadata Inspector
Normalized metadata (stored in metadata_json)
{
"metadata": {},
"card_data": {
"tags": [
"unsloth",
"qwen3",
"qwen"
],
"base_model": [
"Qwen/Qwen3-Coder-30B-A3B-Instruct"
],
"library_name": "transformers",
"license": "apache-2.0",
"license_link": "https://huggingface.co/Qwen/Qwen3-Coder-30B-A3B-Instruct/blob/main/LICENSE",
"pipeline_tag": "text-generation",
"frontmatter": {
"tags": [
"unsloth",
"qwen3",
"qwen"
],
"base_model": [
"Qwen/Qwen3-Coder-30B-A3B-Instruct"
],
"library_name": "transformers",
"license": "apache-2.0",
"license_link": "https://huggingface.co/Qwen/Qwen3-Coder-30B-A3B-Instruct/blob/main/LICENSE",
"pipeline_tag": "text-generation"
},
"hero_image_url": "https://github.com/unslothai/unsloth/raw/main/images/unsloth%20new%20logo.png",
"summary": "",
"quick_links": [],
"benchmark_table_html": "",
"readme_markdown": "---\ntags:\n- unsloth\n- qwen3\n- qwen\nbase_model:\n- Qwen/Qwen3-Coder-30B-A3B-Instruct\nlibrary_name: transformers\nlicense: apache-2.0\nlicense_link: https://huggingface.co/Qwen/Qwen3-Coder-30B-A3B-Instruct/blob/main/LICENSE\npipeline_tag: text-generation\n---\n<div>\n <p style=\"margin-bottom: 0; margin-top: 0;\">\n <strong>See <a href=\"https://huggingface.co/collections/unsloth/qwen3-680edabfb790c8c34a242f95\">our collection</a> for all versions of Qwen3 including GGUF, 4-bit & 16-bit formats.</strong>\n </p>\n <p style=\"margin-bottom: 0;\">\n <em>Learn to run Qwen3-Coder correctly - <a href=\"https://docs.unsloth.ai/basics/qwen3-coder\">Read our Guide</a>.</em>\n </p>\n<p style=\"margin-top: 0;margin-bottom: 0;\">\n <em>See <a href=\"https://docs.unsloth.ai/basics/unsloth-dynamic-v2.0-gguf\">Unsloth Dynamic 2.0 GGUFs</a> for our quantization benchmarks.</em>\n </p>\n <div style=\"display: flex; gap: 5px; align-items: center; \">\n <a href=\"https://github.com/unslothai/unsloth/\">\n <img src=\"https://github.com/unslothai/unsloth/raw/main/images/unsloth%20new%20logo.png\" width=\"133\">\n </a>\n <a href=\"https://discord.gg/unsloth\">\n <img src=\"https://github.com/unslothai/unsloth/raw/main/images/Discord%20button.png\" width=\"173\">\n </a>\n <a href=\"https://docs.unsloth.ai/basics/qwen3-coder\">\n <img src=\"https://raw.githubusercontent.com/unslothai/unsloth/refs/heads/main/images/documentation%20green%20button.png\" width=\"143\">\n </a>\n </div>\n<h1 style=\"margin-top: 0rem;\">✨ Read our Qwen3-Coder Guide <a href=\"https://docs.unsloth.ai/basics/qwen3-coder\">here</a>!</h1>\n</div>\n\n- Fine-tune Qwen3 (14B) for free using our Google [Colab notebook](https://docs.unsloth.ai/get-started/unsloth-notebooks)!\n- Read our Blog about Qwen3 support: [unsloth.ai/blog/qwen3](https://unsloth.ai/blog/qwen3)\n- View the rest of our notebooks in our [docs here](https://docs.unsloth.ai/get-started/unsloth-notebooks).\n| Unsloth supports | 
Free Notebooks | Performance | Memory use |\n|-----------------|--------------------------------------------------------------------------------------------------------------------------|-------------|----------|\n| **Qwen3 (14B)** | [▶️ Start on Colab](https://docs.unsloth.ai/get-started/unsloth-notebooks) | 3x faster | 70% less |\n| **GRPO with Qwen3 (8B)** | [▶️ Start on Colab](https://docs.unsloth.ai/get-started/unsloth-notebooks) | 3x faster | 80% less |\n| **Llama-3.2 (3B)** | [▶️ Start on Colab](https://colab.research.google.com/github/unslothai/notebooks/blob/main/nb/Llama3.2_(1B_and_3B)-Conversational.ipynb) | 2.4x faster | 58% less |\n| **Llama-3.2 (11B vision)** | [▶️ Start on Colab](https://colab.research.google.com/github/unslothai/notebooks/blob/main/nb/Llama3.2_(11B)-Vision.ipynb) | 2x faster | 60% less |\n| **Qwen2.5 (7B)** | [▶️ Start on Colab](https://colab.research.google.com/github/unslothai/notebooks/blob/main/nb/Qwen2.5_(7B)-Alpaca.ipynb) | 2x faster | 60% less |\n\n# Qwen3-Coder-30B-A3B-Instruct\n<a href=\"https://chat.qwen.ai/\" target=\"_blank\" style=\"margin: 2px;\">\n <img alt=\"Chat\" src=\"https://img.shields.io/badge/%F0%9F%92%9C%EF%B8%8F%20Qwen%20Chat%20-536af5\" style=\"display: inline-block; vertical-align: middle;\"/>\n</a>\n\n## Highlights\n\n**Qwen3-Coder** is available in multiple sizes. Today, we're excited to introduce **Qwen3-Coder-30B-A3B-Instruct**. 
This streamlined model maintains impressive performance and efficiency, featuring the following key enhancements: \n\n- **Significant Performance** among open models on **Agentic Coding**, **Agentic Browser-Use**, and other foundational coding tasks.\n- **Long-context Capabilities** with native support for **256K** tokens, extendable up to **1M** tokens using YaRN, optimized for repository-scale understanding.\n- **Agentic Coding** support for most platforms, such as **Qwen Code** and **CLINE**, featuring a specially designed function call format.\n\n\n\n## Model Overview\n\n**Qwen3-Coder-30B-A3B-Instruct** has the following features:\n- Type: Causal Language Models\n- Training Stage: Pretraining & Post-training\n- Number of Parameters: 30.5B in total and 3.3B activated\n- Number of Layers: 48\n- Number of Attention Heads (GQA): 32 for Q and 4 for KV\n- Number of Experts: 128\n- Number of Activated Experts: 8\n- Context Length: **262,144 natively**. \n\n**NOTE: This model supports only non-thinking mode and does not generate ``<think></think>`` blocks in its output. Meanwhile, specifying `enable_thinking=False` is no longer required.**\n\nFor more details, including benchmark evaluation, hardware requirements, and inference performance, please refer to our [blog](https://qwenlm.github.io/blog/qwen3-coder/), [GitHub](https://github.com/QwenLM/Qwen3-Coder), and [Documentation](https://qwen.readthedocs.io/en/latest/).\n\n\n## Quickstart\n\nWe advise you to use the latest version of `transformers`.\n\nWith `transformers<4.51.0`, you will encounter the following error:\n```\nKeyError: 'qwen3_moe'\n```\n\nThe following contains a code snippet illustrating how to use the model to generate content based on given inputs. 
\n```python\nfrom transformers import AutoModelForCausalLM, AutoTokenizer\n\nmodel_name = \"Qwen/Qwen3-Coder-30B-A3B-Instruct\"\n\n# load the tokenizer and the model\ntokenizer = AutoTokenizer.from_pretrained(model_name)\nmodel = AutoModelForCausalLM.from_pretrained(\n model_name,\n torch_dtype=\"auto\",\n device_map=\"auto\"\n)\n\n# prepare the model input\nprompt = \"Write a quick sort algorithm.\"\nmessages = [\n {\"role\": \"user\", \"content\": prompt}\n]\ntext = tokenizer.apply_chat_template(\n messages,\n tokenize=False,\n add_generation_prompt=True,\n)\nmodel_inputs = tokenizer([text], return_tensors=\"pt\").to(model.device)\n\n# conduct text completion\ngenerated_ids = model.generate(\n **model_inputs,\n max_new_tokens=65536\n)\noutput_ids = generated_ids[0][len(model_inputs.input_ids[0]):].tolist() \n\ncontent = tokenizer.decode(output_ids, skip_special_tokens=True)\n\nprint(\"content:\", content)\n```\n\n**Note: If you encounter out-of-memory (OOM) issues, consider reducing the context length to a shorter value, such as `32,768`.**\n\nFor local use, applications such as Ollama, LMStudio, MLX-LM, llama.cpp, and KTransformers also support Qwen3.\n\n## Agentic Coding\n\nQwen3-Coder excels at tool calling. 
\n\nYou can define or use any tools, as in the following example.\n```python\nfrom openai import OpenAI\n\n# Your tool implementation; the argument name matches the schema below\ndef square_the_number(input_num: float) -> float:\n return input_num ** 2\n\n# Define Tools\ntools=[\n {\n \"type\":\"function\",\n \"function\":{\n \"name\": \"square_the_number\",\n \"description\": \"output the square of the number.\",\n \"parameters\": {\n \"type\": \"object\",\n \"required\": [\"input_num\"],\n \"properties\": {\n 'input_num': {\n 'type': 'number', \n 'description': 'input_num is a number that will be squared'\n }\n },\n }\n }\n }\n]\n\n# Define LLM\nclient = OpenAI(\n # Use a custom endpoint compatible with OpenAI API\n base_url='http://localhost:8000/v1', # api_base\n api_key=\"EMPTY\"\n)\n \nmessages = [{'role': 'user', 'content': 'square the number 1024'}]\n\ncompletion = client.chat.completions.create(\n messages=messages,\n model=\"Qwen3-Coder-30B-A3B-Instruct\",\n max_tokens=65536,\n tools=tools,\n)\n\n# The first choice carries the assistant message, including any tool calls\nprint(completion.choices[0])\n```\n\n## Best Practices\n\nTo achieve optimal performance, we recommend the following settings:\n\n1. **Sampling Parameters**:\n - We suggest using `temperature=0.7`, `top_p=0.8`, `top_k=20`, `repetition_penalty=1.05`.\n\n2. **Adequate Output Length**: We recommend using an output length of 65,536 tokens for most queries, which is adequate for instruct models.\n\n\n### Citation\n\nIf you find our work helpful, feel free to cite us.\n\n```\n@misc{qwen3technicalreport,\n title={Qwen3 Technical Report}, \n author={Qwen Team},\n year={2025},\n eprint={2505.09388},\n archivePrefix={arXiv},\n primaryClass={cs.CL},\n url={https://arxiv.org/abs/2505.09388}, \n}\n```\n",
"related_quantizations": []
},
"tags": [
"transformers",
"gguf",
"unsloth",
"qwen3",
"qwen",
"text-generation",
"arxiv:2505.09388",
"base_model:Qwen/Qwen3-Coder-30B-A3B-Instruct",
"base_model:quantized:Qwen/Qwen3-Coder-30B-A3B-Instruct",
"license:apache-2.0",
"endpoints_compatible",
"region:us",
"imatrix",
"conversational"
],
"likes": 591,
"downloads": 147452,
"gated": false,
"private": false,
"last_modified": "2026-01-30T06:29:38.000Z",
"created_at": "2025-07-31T10:27:38.000Z",
"pipeline_tag": "text-generation",
"library_name": "transformers"
}
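Several entries in the `tags` array above carry a `key:value` prefix (for example `license:apache-2.0`), while others are plain labels. As a minimal sketch, the snippet below separates the two kinds by splitting on the first colon. That splitting rule is an assumption made for illustration, and `split_tags` is a hypothetical helper, not part of any library.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

# Tag list copied from the normalized metadata above.
TAGS = [
    "transformers", "gguf", "unsloth", "qwen3", "qwen", "text-generation",
    "arxiv:2505.09388",
    "base_model:Qwen/Qwen3-Coder-30B-A3B-Instruct",
    "base_model:quantized:Qwen/Qwen3-Coder-30B-A3B-Instruct",
    "license:apache-2.0", "endpoints_compatible", "region:us",
    "imatrix", "conversational",
]


def split_tags(tags: List[str]) -> Tuple[List[str], Dict[str, List[str]]]:
    """Separate plain labels from `key:value` tags, splitting on the first colon."""
    plain: List[str] = []
    keyed: Dict[str, List[str]] = defaultdict(list)
    for tag in tags:
        key, sep, value = tag.partition(":")
        if sep:
            keyed[key].append(value)  # a key may repeat, e.g. base_model
        else:
            plain.append(tag)
    return plain, dict(keyed)


plain, keyed = split_tags(TAGS)
print(keyed["license"], keyed["base_model"])
```

Values are kept as lists because a key can repeat: `base_model` appears twice here, once bare and once with a `quantized:` qualifier that stays part of the value.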
Source payload excerpt (from Hugging Face API)
{
"_id": "688b451a53e70a07b0669a7c",
"id": "unsloth/Qwen3-Coder-30B-A3B-Instruct-GGUF",
"modelId": "unsloth/Qwen3-Coder-30B-A3B-Instruct-GGUF",
"sha": "b17cb02dd882d5b6ab62fc777ad2995f19668350",
"createdAt": "2025-07-31T10:27:38.000Z",
"lastModified": "2026-01-30T06:29:38.000Z",
"author": "unsloth",
"downloads": 147452,
"likes": 591,
"gated": false,
"private": false,
"pipeline_tag": "text-generation",
"library_name": "transformers",
"siblings_count": 33
}