The matrixdose/mistral-nemo-2407-12b-thinking-claude-gemini-gpt5.2-uncensored-heretic-gguf repository (HERETIC, GGUF quantizations from Q2_K through Q8_0 plus F16) is indexed on GraySoft with repository links, GGUF quant files, and Hugging Face metadata. This page helps you pick a local model for guIDE or other runtimes. See related models in the same shard below.
matrixdose/mistral-nemo-2407-12b-thinking-claude-gemini-gpt5.2-uncensored-heretic-gguf overview
HERETIC is a reasoning-oriented variant of the Mistral-Nemo 2407 12B architecture distributed in GGUF format for efficient local inference. The model is intended for users who want a flexible conversational assistant capable of analytical reasoning, long-form explanations, and open-ended dialogue while running entirely on local hardware.

This repository provides quantized versions optimized for llama.cpp–based runtimes and other compatible inference tools.

# Model Details

- Model Name: Mistral-Nemo-2407-12B-Thinking-Claude-Gemini-GPT5.2-Uncensored-HERETIC
- Architecture: Mistral-Nemo (12B parameters)
- Format: GGUF
- Base Model: Mistral-Nemo-2407
- Distribution: Quantized builds for local inference
- Primary Capability: Instruction-following with extended reasoning and conversational flexibility

HERETIC focuses on encouraging multi-step reasoning and detailed responses while maintaining a natural conversational style.

# Intended Use

This model is designed primarily for local deployments and experimentation. Typical use cases include:

- Personal AI assistants
- Coding help and technical explanations
- Analytical reasoning tasks
- Brainstorming and creative writing
- Prompt engineering and LLM experimentation
- Offline or privacy-focused AI workflows

# Out-of-Scope Use

The model should not be relied upon for:

- Legal advice
- Medical advice
- Safety-critical decision making
- Automated moderation systems

Outputs may contain inaccuracies or biased information.

# Prompt Format

The model works best with structured role-based prompts. The example conversation template in the model card opens with a `<|system|>` block ("You are a helpful AI assistant."), follows it with a `<|user|>` block ("Explain how neural networks learn."), and ends with an open `<|assistant|>` marker for the model to complete. Some interfaces automatically apply a compatible chat template.

# Running the Model

This model uses the GGUF format, making it compatible with several local inference tools.
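The role-based prompt layout from the model card can be assembled programmatically when a runtime does not apply a chat template on its own. The following is a minimal sketch, not the author's tooling: the function name `build_prompt` is hypothetical, and the `<|system|>`, `<|user|>`, and `<|assistant|>` markers are taken from the README's example as-is. Whether the GGUF files embed a different chat template in their metadata is not stated here, so treat the exact markers as an assumption to verify against your runtime.

```python
def build_prompt(user_message: str,
                 system_message: str = "You are a helpful AI assistant.") -> str:
    """Assemble a single-turn prompt using the role markers shown in the model card.

    Assumption: the markers below are copied from the README example; the
    template a given runtime actually applies may differ.
    """
    parts = [
        "<|system|>",
        system_message,
        "",                 # blank line between role blocks, as in the README
        "<|user|>",
        user_message,
        "",
        "<|assistant|>",    # left open so the model writes the reply
        "",                 # trailing newline after the assistant marker
    ]
    return "\n".join(parts)


if __name__ == "__main__":
    print(build_prompt("Explain how neural networks learn."))
```

If your interface already applies a chat template, passing a pre-formatted prompt like this one would likely duplicate the role markers, so use one approach or the other.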
Repository Files & Downloads
| File | Type | Quantization | Size | Link |
|---|---|---|---|---|
| Mistral-Nemo-2407-12B-Thinking-Claude-Gemini-GPT5.2-Uncensored-HERETIC_F16.gguf | GGUF | F16 | 22.82 GB | Download |
| Mistral-Nemo-2407-12B-Thinking-Claude-Gemini-GPT5.2-Uncensored-HERETIC_Q2_k.gguf | GGUF | Q2_K | 4.46 GB | Download |
| Mistral-Nemo-2407-12B-Thinking-Claude-Gemini-GPT5.2-Uncensored-HERETIC_Q3_k_m.gguf | GGUF | Q3_K_M | 5.67 GB | Download |
| Mistral-Nemo-2407-12B-Thinking-Claude-Gemini-GPT5.2-Uncensored-HERETIC_Q4_k_m.gguf | GGUF | Q4_K_M | 6.96 GB | Download |
| Mistral-Nemo-2407-12B-Thinking-Claude-Gemini-GPT5.2-Uncensored-HERETIC_Q5_k_m.gguf | GGUF | Q5_K_M | 8.13 GB | Download |
| Mistral-Nemo-2407-12B-Thinking-Claude-Gemini-GPT5.2-Uncensored-HERETIC_Q6_k.gguf | GGUF | Q6_K | 9.37 GB | Download |
| Mistral-Nemo-2407-12B-Thinking-Claude-Gemini-GPT5.2-Uncensored-HERETIC_Q8_0.gguf | GGUF | Q8_0 | 12.13 GB | Download |
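One way to use the table above is to pick the largest quantization whose file fits in the memory you have available. The sketch below copies the file sizes from the table; the function name `largest_quant_that_fits` and the default `overhead_gb` of 1.5 GB (a rough allowance for context cache and runtime buffers) are assumptions for illustration, not figures from the repository. Actual memory use depends on context length and the runtime.

```python
from typing import Optional

# File sizes in GB, copied from the repository file table.
QUANT_SIZES_GB = {
    "F16": 22.82,
    "Q2_K": 4.46,
    "Q3_K_M": 5.67,
    "Q4_K_M": 6.96,
    "Q5_K_M": 8.13,
    "Q6_K": 9.37,
    "Q8_0": 12.13,
}


def largest_quant_that_fits(budget_gb: float,
                            overhead_gb: float = 1.5) -> Optional[str]:
    """Return the largest quantization whose file fits in the budget.

    overhead_gb is a hypothetical allowance for everything besides the
    weights; tune it for your context length and runtime.
    """
    usable = budget_gb - overhead_gb
    candidates = {q: s for q, s in QUANT_SIZES_GB.items() if s <= usable}
    if not candidates:
        return None  # nothing fits, even Q2_K
    # Larger file size is used here as a proxy for higher fidelity.
    return max(candidates, key=candidates.get)


if __name__ == "__main__":
    for budget in (8, 12, 16, 24):
        print(budget, "GB ->", largest_quant_that_fits(budget))
```

File size is only a proxy for output quality; the heuristic simply prefers the least aggressive quantization that fits.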
Model Details
Metadata Inspector
Normalized metadata (stored in metadata_json)
{
"metadata": {},
"card_data": {
"license": "other",
"language": [
"en"
],
"tags": [
"gguf",
"mistral",
"mistral-nemo",
"local-inference",
"reasoning",
"uncensored",
"text-generation"
],
"pipeline_tag": "text-generation",
"library_name": "llama.cpp",
"base_model": "mistral-nemo-2407",
"frontmatter": {},
"hero_image_url": "",
"summary": "HERETIC is a reasoning-oriented variant of the **Mistral-Nemo 2407 12B** architecture distributed in **GGUF format** for efficient local inference. The model is intended for users who want a flexible conversational assistant capable of analytical reasoning, long-form explanations, and open-ended dialogue while running entirely on local hardware. This repository provides quantized versions optimized for **llama.cpp–based runtimes** and other compatible inference tools. --- # Model Details **Model Name:** Mistral-Nemo-2407-12B-Thinking-Claude-Gemini-GPT5.2-Uncensored-HERETIC **Architecture:** Mistral-Nemo (12B parameters) **Format:** GGUF **Base Model:** Mistral-Nemo-2407 **Distribution:** Quantized builds for local inference **Primary Capability:** Instruction-following with extended reasoning and conversational flexibility HERETIC focuses on encouraging multi-step reasoning and detailed responses while maintaining a natural conversational style. --- # Intended Use This model is designed primarily for **local deployments** and experimentation. Typical use cases include: --- # Out-of-Scope Use The model should not be relied upon for: Outputs may contain inaccuracies or biased information. --- # Prompt Format The model works best with structured role-based prompts. Example conversation template: `` You are a helpful AI assistant. Explain how neural networks learn. `` Some interfaces automatically apply a compatible chat template. --- # Running the Model This model uses the **GGUF format**, making it compatible with several local inference tools.",
"quick_links": [],
"benchmark_table_html": "",
"readme_markdown": "---\r\nlicense: other\r\nlanguage:\r\n- en\r\ntags:\r\n- gguf\r\n- mistral\r\n- mistral-nemo\r\n- local-inference\r\n- reasoning\r\n- uncensored\r\n- text-generation\r\npipeline_tag: text-generation\r\nlibrary_name: llama.cpp\r\nbase_model: mistral-nemo-2407\r\n---\r\n\r\n# HERETIC – Mistral-Nemo 2407 12B Thinking (GGUF)\r\n\r\nHERETIC is a reasoning-oriented variant of the **Mistral-Nemo 2407 12B** architecture distributed in **GGUF format** for efficient local inference. \r\nThe model is intended for users who want a flexible conversational assistant capable of analytical reasoning, long-form explanations, and open-ended dialogue while running entirely on local hardware.\r\n\r\nThis repository provides quantized versions optimized for **llama.cpp–based runtimes** and other compatible inference tools.\r\n\r\n---\r\n\r\n# Model Details\r\n\r\n**Model Name:** Mistral-Nemo-2407-12B-Thinking-Claude-Gemini-GPT5.2-Uncensored-HERETIC \r\n**Architecture:** Mistral-Nemo (12B parameters) \r\n**Format:** GGUF \r\n**Base Model:** Mistral-Nemo-2407 \r\n**Distribution:** Quantized builds for local inference \r\n**Primary Capability:** Instruction-following with extended reasoning and conversational flexibility\r\n\r\nHERETIC focuses on encouraging multi-step reasoning and detailed responses while maintaining a natural conversational style.\r\n\r\n---\r\n\r\n# Intended Use\r\n\r\nThis model is designed primarily for **local deployments** and experimentation.\r\n\r\nTypical use cases include:\r\n\r\n- Personal AI assistants\r\n- Coding help and technical explanations\r\n- Analytical reasoning tasks\r\n- Brainstorming and creative writing\r\n- Prompt engineering and LLM experimentation\r\n- Offline or privacy-focused AI workflows\r\n\r\n---\r\n\r\n# Out-of-Scope Use\r\n\r\nThe model should not be relied upon for:\r\n\r\n- Legal advice\r\n- Medical advice\r\n- Safety-critical decision making\r\n- Automated moderation systems\r\n\r\nOutputs may contain 
inaccuracies or biased information.\r\n\r\n---\r\n\r\n# Prompt Format\r\n\r\nThe model works best with structured role-based prompts.\r\n\r\nExample conversation template:\r\n\r\n```\r\n\r\n<|system|>\r\nYou are a helpful AI assistant.\r\n\r\n<|user|>\r\nExplain how neural networks learn.\r\n\r\n<|assistant|>\r\n\r\n```\r\n\r\nSome interfaces automatically apply a compatible chat template.\r\n\r\n---\r\n\r\n# Running the Model\r\n\r\nThis model uses the **GGUF format**, making it compatible with several local inference tools.\r\n\r\n## llama.cpp\r\n\r\nExample command:\r\n\r\n```bash\r\n./llama.exe -m Mistral-Nemo-2407-12B-Thinking-Claude-Gemini-GPT5.2-Uncensored-HERETIC_Q4_K_M.gguf -p \"Explain quantum computing in simple terms.\"\r\n```\r\n\r\n---\r\n\r\n# Limitations\r\n\r\nLike most large language models:\r\n\r\n- The model can generate incorrect information.\r\n- It may hallucinate facts or citations.\r\n- Output quality depends heavily on prompt design.\r\n- Responses reflect biases present in training data.\r\n\r\nUsers should critically evaluate outputs before relying on them.\r\n\r\n---\r\n\r\n# Acknowledgements\r\n\r\nThis model builds on contributions from several open-source projects:\r\n\r\n- The **Mistral** research team for the underlying architecture\r\n- The **llama.cpp** ecosystem enabling efficient local inference\r\n- The **GGUF format** used for optimized model distribution\r\n- The open-source community that develops tools for local LLM deployment\r\n\r\n---\r\n\r\n# Disclaimer\r\n\r\nThis model is provided for **research, experimentation, and local use**.\r\nUsers are responsible for ensuring that deployments comply with applicable laws and the licensing terms of the underlying base model.\r\n",
"related_quantizations": []
},
"tags": [
"llama.cpp",
"gguf",
"mistral",
"mistral-nemo",
"local-inference",
"reasoning",
"uncensored",
"text-generation",
"en",
"license:other",
"endpoints_compatible",
"region:us",
"conversational"
],
"likes": 1,
"downloads": 1947,
"gated": false,
"private": false,
"last_modified": "2026-03-30T09:01:20.000Z",
"created_at": "2026-03-30T09:01:20.000Z",
"pipeline_tag": "text-generation",
"library_name": "llama.cpp"
}
Source payload excerpt (from Hugging Face API)
{
"_id": "69ca3be0bf9b00fc3ff2750d",
"id": "matrixdose/Mistral-Nemo-2407-12B-Thinking-Claude-Gemini-GPT5.2-Uncensored-HERETIC-GGUF",
"modelId": "matrixdose/Mistral-Nemo-2407-12B-Thinking-Claude-Gemini-GPT5.2-Uncensored-HERETIC-GGUF",
"sha": "6bd088a4d5b41e75b61bc3646c5f040a6ab33a0b",
"createdAt": "2026-03-30T09:01:20.000Z",
"lastModified": "2026-03-30T09:01:20.000Z",
"author": "matrixdose",
"downloads": 1947,
"likes": 1,
"gated": false,
"private": false,
"pipeline_tag": "text-generation",
"library_name": "llama.cpp",
"siblings_count": 9
}
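The `id` and `sha` fields in the payload above are enough to build a download URL pinned to the indexed revision for any file in the table. The sketch below assumes the usual Hugging Face `resolve/<revision>/<filename>` path layout; the helper name `gguf_download_url` is hypothetical, and the filename casing is taken from the file table as listed, so check it against the repository before relying on it.

```python
from urllib.parse import quote

# Values copied from the source payload excerpt.
REPO_ID = ("matrixdose/Mistral-Nemo-2407-12B-Thinking-Claude-Gemini-"
           "GPT5.2-Uncensored-HERETIC-GGUF")
REVISION_SHA = "6bd088a4d5b41e75b61bc3646c5f040a6ab33a0b"


def gguf_download_url(repo_id: str, filename: str, revision: str = "main") -> str:
    """Build a file URL, assuming the resolve/<revision>/<filename> layout."""
    return (
        "https://huggingface.co/"
        f"{quote(repo_id)}/resolve/{quote(revision)}/{quote(filename)}"
    )


if __name__ == "__main__":
    # Filename as listed in the file table (Q4_K_M build).
    name = ("Mistral-Nemo-2407-12B-Thinking-Claude-Gemini-"
            "GPT5.2-Uncensored-HERETIC_Q4_k_m.gguf")
    print(gguf_download_url(REPO_ID, name, revision=REVISION_SHA))
```

Pinning to the commit `sha` rather than `main` keeps the download reproducible if the repository is updated after indexing.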