leeminwaan/gpt-oss-4.2b-specialized-all-pruned-moe-only-4-experts-gguf (Q5_K_M and other GGUF quantizations) is indexed on GraySoft with repository links, GGUF quant files, and Hugging Face metadata. Use this page to pick a local model for guIDE or other runtimes. Related models in the same shard are listed below.
Model Intelligence Sheet
leeminwaan/gpt-oss-4.2b-specialized-all-pruned-moe-only-4-experts-gguf overview
This repository contains multiple quantized versions of the gpt-oss-4.2b-specialized-all-pruned-moe-only-4-experts model in GGUF format. It is intended for efficient inference on consumer hardware, making large model deployment more accessible.
| Field | Value |
|---|---|
| Downloads | 150 |
| Likes | 2 |
| Pipeline | text-generation |
| Library | — |
| Visibility | Public |
| Access | Open |
Repository Files & Downloads
14 files detected
Direct downloads for all repository files
| File | Type | Quantization | Size | Link |
|---|---|---|---|---|
| AmanPriyanshu_gpt-oss-4.2b-specialized-all-pruned-moe-only-4-experts-q2_k.gguf | GGUF | Q2_K | 2.45 GB | Download |
| AmanPriyanshu_gpt-oss-4.2b-specialized-all-pruned-moe-only-4-experts-q3_k_l.gguf | GGUF | Q3_K_L | 2.66 GB | Download |
| AmanPriyanshu_gpt-oss-4.2b-specialized-all-pruned-moe-only-4-experts-q3_k_m.gguf | GGUF | Q3_K_M | 2.58 GB | Download |
| AmanPriyanshu_gpt-oss-4.2b-specialized-all-pruned-moe-only-4-experts-q3_k_s.gguf | GGUF | Q3_K_S | 2.44 GB | Download |
| AmanPriyanshu_gpt-oss-4.2b-specialized-all-pruned-moe-only-4-experts-q4_0.gguf | GGUF | Q4_0 | 2.48 GB | Download |
| AmanPriyanshu_gpt-oss-4.2b-specialized-all-pruned-moe-only-4-experts-q4_1.gguf | GGUF | Q4_1 | 2.69 GB | Download |
| AmanPriyanshu_gpt-oss-4.2b-specialized-all-pruned-moe-only-4-experts-q4_k_m.gguf | GGUF | Q4_K_M | 3.01 GB | Download |
| AmanPriyanshu_gpt-oss-4.2b-specialized-all-pruned-moe-only-4-experts-q4_k_s.gguf | GGUF | Q4_K_S | 2.87 GB | Download |
| AmanPriyanshu_gpt-oss-4.2b-specialized-all-pruned-moe-only-4-experts-q5_0.gguf | GGUF | Q5_0 | 2.90 GB | Download |
| AmanPriyanshu_gpt-oss-4.2b-specialized-all-pruned-moe-only-4-experts-q5_1.gguf | GGUF | Q5_1 | 3.11 GB | Download |
| AmanPriyanshu_gpt-oss-4.2b-specialized-all-pruned-moe-only-4-experts-q5_k_m.gguf | GGUF | Q5_K_M | 3.21 GB | Download |
| AmanPriyanshu_gpt-oss-4.2b-specialized-all-pruned-moe-only-4-experts-q5_k_s.gguf | GGUF | Q5_K_S | 3.09 GB | Download |
| AmanPriyanshu_gpt-oss-4.2b-specialized-all-pruned-moe-only-4-experts-q6_k.gguf | GGUF | Q6_K | 4.09 GB | Download |
| AmanPriyanshu_gpt-oss-4.2b-specialized-all-pruned-moe-only-4-experts-q8_0.gguf | GGUF | Q8_0 | 4.16 GB | Download |
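The file sizes in the table give a rough lower bound on the memory each quantization needs. The sketch below is an illustration, not part of the repository: it picks the largest file that fits a size budget and builds its filename. The helper names (`pick_quant`, `quant_filename`, `download_quant`) and the budget parameter are hypothetical. The repository ID and filename pattern are taken from this page, and `hf_hub_download` is the `huggingface_hub` function the model card itself uses. File size is only a proxy here; actual memory use will be higher because the runtime also allocates context buffers.

```python
from typing import Optional

REPO_ID = "leeminwaan/gpt-oss-4.2b-specialized-all-pruned-moe-only-4-experts-GGUF"
FILE_PREFIX = "AmanPriyanshu_gpt-oss-4.2b-specialized-all-pruned-moe-only-4-experts-"

# File sizes in GB, copied from the table above.
SIZES_GB = {
    "Q2_K": 2.45, "Q3_K_S": 2.44, "Q3_K_M": 2.58, "Q3_K_L": 2.66,
    "Q4_0": 2.48, "Q4_1": 2.69, "Q4_K_S": 2.87, "Q4_K_M": 3.01,
    "Q5_0": 2.90, "Q5_1": 3.11, "Q5_K_S": 3.09, "Q5_K_M": 3.21,
    "Q6_K": 4.09, "Q8_0": 4.16,
}


def pick_quant(budget_gb: float) -> Optional[str]:
    """Return the quantization with the largest file that fits budget_gb, or None."""
    fitting = [quant for quant, size in SIZES_GB.items() if size <= budget_gb]
    return max(fitting, key=SIZES_GB.get) if fitting else None


def quant_filename(quant: str) -> str:
    """Build the repository filename for a quantization label such as 'Q4_K_M'."""
    return f"{FILE_PREFIX}{quant.lower()}.gguf"


def download_quant(quant: str) -> str:
    """Download one GGUF file and return its local path (needs network access)."""
    # Imported lazily so the selection helpers work without huggingface_hub installed.
    from huggingface_hub import hf_hub_download

    return hf_hub_download(repo_id=REPO_ID, filename=quant_filename(quant))
```

With these sizes, `pick_quant(3.0)` returns `"Q5_0"` (2.90 GB), and `pick_quant(2.0)` returns `None` because no file is that small.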
Model Details
Metadata Inspector
Normalized metadata (stored in metadata_json)
{
"metadata": {},
"card_data": {
"license": "apache-2.0",
"base_model": "gpt-oss-4.2b-specialized-all-pruned-moe-only-4-experts",
"pipeline_tag": "text-generation",
"frontmatter": {
"license": "apache-2.0",
"base_model": "gpt-oss-4.2b-specialized-all-pruned-moe-only-4-experts",
"pipeline_tag": "text-generation"
},
"hero_image_url": "",
"summary": "This repository contains multiple quantized versions of the gpt-oss-4.2b-specialized-all-pruned-moe-only-4-experts model in GGUF format. It is intended for efficient inference on consumer hardware, making large model deployment more accessible.",
"quick_links": [],
"benchmark_table_html": "",
"readme_markdown": "---\nlicense: apache-2.0\nbase_model: gpt-oss-4.2b-specialized-all-pruned-moe-only-4-experts\npipeline_tag: text-generation\n---\n\n# Model Card for gpt-oss-4.2b-specialized-all-pruned-moe-only-4-experts-GGUF\n\nThis repository contains multiple quantized versions of the gpt-oss-4.2b-specialized-all-pruned-moe-only-4-experts model in GGUF format. \nIt is intended for efficient inference on consumer hardware, making large model deployment more accessible. \n\n## Model Details\n\n### Model Description\n\n- **Developed by:** leeminwaan \n- **Funded by [optional]:** Independent project \n- **Shared by [optional]:** leeminwaan \n- **Model type:** Decoder-only transformer language model \n- **Language(s) (NLP):** English (primary), multilingual capabilities not benchmarked \n- **License:** Apache-2.0 \n\n### Model Sources\n\n- **Repository:** [Hugging Face Repo](https://huggingface.co/leeminwaan/gpt-oss-4.2b-specialized-all-pruned-moe-only-4-experts-GGUF) \n- **Paper [optional]:** Not available \n- **Demo [optional]:** To be released \n\n## How to Get Started with the Model\n\n```python\nfrom huggingface_hub import hf_hub_download\n\nmodel_path = hf_hub_download(\"leeminwaan/gpt-oss-4.2b-specialized-all-pruned-moe-only-4-experts-GGUF\", \"AmanPriyanshu_gpt-oss-4.2b-specialized-all-pruned-moe-only-4-experts-q4_k_m.gguf\")\nprint(\"Downloaded:\", model_path)\n```\n\nQuantized versions available:\n\n* Q2\\_K, Q3\\_K\\_S, Q3\\_K\\_M, Q3\\_K\\_L\n* Q4\\_0, Q4\\_1, Q4\\_K\\_S, Q4\\_K\\_M\n* Q5\\_0, Q5\\_1, Q5\\_K\\_S, Q5\\_K\\_M\n* Q6\\_K, Q8\\_0\n\n## Training Details\n\n### Training Data\n\n* Based on gpt-oss-4.2b-specialized-all-pruned-moe-only-4-experts pretraining corpus (public large-scale web text, open datasets).\n* No additional fine-tuning was performed for this release.\n\n### Training Procedure\n\n* Original gpt-oss-4.2b-specialized-all-pruned-moe-only-4-experts → quantized to GGUF formats.\n\n### Quantization Results\n\n| Quantization | Size (vs. FP16) | Speed | Quality | Recommended For |\n|--------------|-----------------|-----------|------------|--------------------------------------|\n| Q2_K | Smallest | Fastest | Low | Prototyping, minimal RAM/CPU |\n| Q3_K_S | Very Small | Very Fast | Low-Med | Lightweight devices, testing |\n| Q3_K_M | Small | Fast | Med | Lightweight, slightly better quality |\n| Q3_K_L | Small-Med | Fast | Med | Faster inference, fair quality |\n| Q4_0 | Medium | Fast | Good | General use, chats, low RAM |\n| Q4_1 | Medium | Fast | Good+ | Recommended, slightly better quality |\n| Q4_K_S | Medium | Fast | Good+ | Recommended, balanced |\n| Q4_K_M | Medium | Fast | Good++ | Recommended, best Q4 option |\n| Q5_0 | Larger | Moderate | Very Good | Chatbots, longer responses |\n| Q5_1 | Larger | Moderate | Very Good+ | More demanding tasks |\n| Q5_K_S | Larger | Moderate | Very Good+ | Advanced users, better accuracy |\n| Q5_K_M | Larger | Moderate | Excellent | Demanding tasks, high quality |\n| Q6_K | Large | Slower | Near FP16 | Power users, best quantized quality |\n| Q8_0 | Largest | Slowest | FP16-like | Maximum quality, high RAM/CPU |\n\n> **Note:** \n> - Lower quantization = smaller model, faster inference, but lower output quality. \n> - Q4_K_M is ideal for most users; Q6_K/Q8_0 offer the highest quality, best for advanced use. \n> - All quantizations are suitable for consumer hardware—select based on your quality/speed needs.\n\n\n## Technical Specifications\n\n#### Software\n\n* llama.cpp for quantization\n* Python 3.10, huggingface\\_hub\n\n## Citation\n\n**BibTeX:**\n\n```bibtex\n@misc{gpt-oss-4.2b-specialized-all-pruned-moe-only-4-experts-GGUF,\n title={gpt-oss-4.2b-specialized-all-pruned-moe-only-4-experts-GGUF Quantized Models},\n author={leeminwaan},\n year={2025},\n howpublished={\\url{https://huggingface.co/leeminwaan/gpt-oss-4.2b-specialized-all-pruned-moe-only-4-experts-GGUF}}\n}\n```\n\n**APA:**\n\n```\nleeminwaan. (2025). gpt-oss-4.2b-specialized-all-pruned-moe-only-4-experts-GGUF Quantized Models [Computer software]. Hugging Face. https://huggingface.co/leeminwaan/gpt-oss-4.2b-specialized-all-pruned-moe-only-4-experts-GGUF\n```\n\n## Glossary\n\n* **Quantization:** Reducing precision of weights to lower memory usage.\n* **GGUF:** Optimized format for llama.cpp inference.\n\n## More Information\n\n* This project is experimental.\n* Expect further updates and quantization benchmarks.\n\n## Model Card Authors\n\n* leeminwaan\n\n## Model Card Contact\n\n* Hugging Face: [leeminwaan](https://huggingface.co/leeminwaan)\n",
"related_quantizations": []
},
"tags": [
"gguf",
"text-generation",
"license:apache-2.0",
"endpoints_compatible",
"region:us",
"conversational"
],
"likes": 2,
"downloads": 150,
"gated": false,
"private": false,
"last_modified": "2025-09-01T12:09:37.000Z",
"created_at": "2025-09-01T12:02:57.000Z",
"pipeline_tag": "text-generation",
"library_name": ""
}
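In the metadata above, the `frontmatter` object repeats the key/value pairs found between the `---` markers at the top of `readme_markdown`. How GraySoft derives that field is not documented on this page. The following is a minimal sketch of one way it could be done, assuming flat `key: value` lines only; real model cards can contain nested YAML, which would need a proper YAML parser. The function name `parse_frontmatter` is hypothetical.

```python
def parse_frontmatter(readme: str) -> dict:
    """Extract flat key: value pairs from a leading '---' delimited block."""
    lines = readme.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}
    fields = {}
    for line in lines[1:]:
        if line.strip() == "---":
            return fields  # closing marker reached
        key, sep, value = line.partition(":")
        if sep:
            fields[key.strip()] = value.strip()
    return {}  # no closing marker: treat as no front matter


# Shortened copy of the README header shown in the metadata above.
readme = (
    "---\n"
    "license: apache-2.0\n"
    "base_model: gpt-oss-4.2b-specialized-all-pruned-moe-only-4-experts\n"
    "pipeline_tag: text-generation\n"
    "---\n\n# Model Card\n"
)
print(parse_frontmatter(readme))
```

For this header the result has the same three keys as the `frontmatter` object: `license`, `base_model`, and `pipeline_tag`.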
Source payload excerpt (from Hugging Face API)
{
"_id": "68b58b713eab91ae003ab36b",
"id": "leeminwaan/gpt-oss-4.2b-specialized-all-pruned-moe-only-4-experts-GGUF",
"modelId": "leeminwaan/gpt-oss-4.2b-specialized-all-pruned-moe-only-4-experts-GGUF",
"sha": "16687acc51dd81660ee2d2ae96ddf15424f9dc0b",
"createdAt": "2025-09-01T12:02:57.000Z",
"lastModified": "2025-09-01T12:09:37.000Z",
"author": "leeminwaan",
"downloads": 150,
"likes": 2,
"gated": false,
"private": false,
"pipeline_tag": "text-generation",
"library_name": "",
"siblings_count": 16
}
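The payload reports a `siblings_count` of 16, while the file table lists 14 GGUF files. The remaining two are presumably non-model files such as a README, though this page does not name them. The sketch below filters a repository file listing down to GGUF files. `gguf_files` and `list_remote_gguf` are hypothetical helpers, and the sample listing is illustrative rather than the repository's actual contents. `HfApi.list_repo_files` from `huggingface_hub` is one way to fetch a real listing.

```python
def gguf_files(filenames):
    """Keep only GGUF model files, sorted by name."""
    return sorted(name for name in filenames if name.lower().endswith(".gguf"))


def list_remote_gguf(repo_id):
    """Fetch a repository's file listing and keep the GGUF files (needs network)."""
    # Imported lazily so gguf_files works without huggingface_hub installed.
    from huggingface_hub import HfApi

    return gguf_files(HfApi().list_repo_files(repo_id))


# Illustrative listing, not fetched from the repository.
sample = ["README.md", ".gitattributes", "model-q4_k_m.gguf", "model-q8_0.gguf"]
print(gguf_files(sample))
```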