GraySoft
Projects Models About FAQ Contact Download guIDE →
Model Intelligence Sheet

rcorvohan/qwen3.5-122b-a10b-opus-reasoning-gguf overview

The first Claude Opus-distilled reasoning fine-tune of Qwen3.5-122B at full scale. Enhanced multi-step reasoning, analytical depth, and uncensored output — trained where competitors can't reach. 122B total parameters, 10B active per token (Mixture-of-Experts). LoRA fine-tuned on 12,840 Claude Opus 4.6 reasoning traces. 7 quantization levels from Q2_K to BF16. ⚡ Forged on 8×H200 SXM5 | 1.1TB VRAM ---

ggufqwen3.5moeuncensoredreasoningfine-tunedopusclaudelora122b10b-activetext-generationenzhjakofrdeesptruardataset:nohurry/Opus-4.6-Reasoning-3000x-filtereddataset:Roman1111111/claude-opus-4.6-10000xdataset:Jackrong/Qwen3.5-reasoning-700xdataset:TeichAI/claude-4.5-opus-high-reasoning-250xbase_model:Qwen/Qwen3.5-122B-A10Bbase_model:adapter:Qwen/Qwen3.5-122B-A10Blicense:apache-2.0endpoints_compatible
rcorvohan/qwen3.5-122b-a10b-opus-reasoning-gguf visual
Downloads
2,207
Likes
2
Pipeline
text-generation
Library
Visibility
Public
Access
Open

Repository Files & Downloads

7 files detected
Direct downloads for all repository files
FileTypeQuantizationSizeLink
Qwen3.5-122B-A10B-Opus-Reasoning-BF16.gguf GGUF BF16 227.54 GB Download
Qwen3.5-122B-A10B-Opus-Reasoning-Q2_K.gguf GGUF Q2_K 41.81 GB Download
Qwen3.5-122B-A10B-Opus-Reasoning-Q3_K_M.gguf GGUF Q3_K_M 54.59 GB Download
Qwen3.5-122B-A10B-Opus-Reasoning-Q4_K_M.gguf GGUF Q4_K_M 69.12 GB Download
Qwen3.5-122B-A10B-Opus-Reasoning-Q5_K_M.gguf GGUF Q5_K_M 80.90 GB Download
Qwen3.5-122B-A10B-Opus-Reasoning-Q6_K.gguf GGUF Q6_K 93.42 GB Download
Qwen3.5-122B-A10B-Opus-Reasoning-Q8_0.gguf GGUF 120.95 GB Download

Model Details Live

Model Slug
rcorvohan/qwen3.5-122b-a10b-opus-reasoning-gguf
Author
rcorvohan
Pipeline Task
text-generation
Library
Created
2026-04-01
Last Modified
2026-04-01
Gated
No
Private
No
HF SHA
a8ee863129b037d5a740640197af2826a68d5a2c
License
apache-2.0
Language
en, zh, ja, ko, fr, de, es, pt, ru, ar
Base Model
Qwen/Qwen3.5-122B-A10B

Metadata Inspector

Normalized metadata (stored in metadata_json)
{
  "metadata": {},
  "card_data": {
    "license": "apache-2.0",
    "base_model": "Qwen/Qwen3.5-122B-A10B",
    "tags": [
      "qwen3.5",
      "moe",
      "gguf",
      "uncensored",
      "reasoning",
      "fine-tuned",
      "opus",
      "claude",
      "lora",
      "122b",
      "10b-active"
    ],
    "model_type": "qwen3_5_moe",
    "quantized_by": "timteh673",
    "datasets": [
      "nohurry/Opus-4.6-Reasoning-3000x-filtered",
      "Roman1111111/claude-opus-4.6-10000x",
      "Jackrong/Qwen3.5-reasoning-700x",
      "TeichAI/claude-4.5-opus-high-reasoning-250x"
    ],
    "language": [
      "en",
      "zh",
      "ja",
      "ko",
      "fr",
      "de",
      "es",
      "pt",
      "ru",
      "ar"
    ],
    "pipeline_tag": "text-generation",
    "frontmatter": {
      "license": "apache-2.0",
      "base_model": "Qwen/Qwen3.5-122B-A10B",
      "tags": [
        "qwen3.5",
        "moe",
        "gguf",
        "uncensored",
        "reasoning",
        "fine-tuned",
        "opus",
        "claude",
        "lora",
        "122b",
        "10b-active"
      ],
      "model_type": "qwen3_5_moe",
      "quantized_by": "timteh673",
      "datasets": [
        "nohurry/Opus-4.6-Reasoning-3000x-filtered",
        "Roman1111111/claude-opus-4.6-10000x",
        "Jackrong/Qwen3.5-reasoning-700x",
        "TeichAI/claude-4.5-opus-high-reasoning-250x"
      ],
      "language": [
        "en",
        "zh",
        "ja",
        "ko",
        "fr",
        "de",
        "es",
        "pt",
        "ru",
        "ar"
      ],
      "pipeline_tag": "text-generation"
    },
    "hero_image_url": "https://cdn.buymeacoffee.com/buttons/v2/default-yellow.png",
    "summary": "**The first Claude Opus-distilled reasoning fine-tune of Qwen3.5-122B at full scale.** Enhanced multi-step reasoning, analytical depth, and uncensored output — trained where competitors can't reach. 122B total parameters, **10B active per token** (Mixture-of-Experts). LoRA fine-tuned on 12,840 Claude Opus 4.6 reasoning traces. 7 quantization levels from Q2_K to BF16. ⚡ **Forged on 8×H200 SXM5 | 1.1TB VRAM** ---",
    "quick_links": [],
    "benchmark_table_html": "",
    "readme_markdown": "---\nlicense: apache-2.0\nbase_model: Qwen/Qwen3.5-122B-A10B\ntags:\n  - qwen3.5\n  - moe\n  - gguf\n  - uncensored\n  - reasoning\n  - fine-tuned\n  - opus\n  - claude\n  - lora\n  - 122b\n  - 10b-active\nmodel_type: qwen3_5_moe\nquantized_by: timteh673\ndatasets:\n  - nohurry/Opus-4.6-Reasoning-3000x-filtered\n  - Roman1111111/claude-opus-4.6-10000x\n  - Jackrong/Qwen3.5-reasoning-700x\n  - TeichAI/claude-4.5-opus-high-reasoning-250x\nlanguage:\n  - en\n  - zh\n  - ja\n  - ko\n  - fr\n  - de\n  - es\n  - pt\n  - ru\n  - ar\npipeline_tag: text-generation\n---\n\n# Qwen3.5-122B-A10B-Opus-Reasoning-GGUF\n\n**The first Claude Opus-distilled reasoning fine-tune of Qwen3.5-122B at full scale.** Enhanced multi-step reasoning, analytical depth, and uncensored output — trained where competitors can't reach.\n\n122B total parameters, **10B active per token** (Mixture-of-Experts). LoRA fine-tuned on 12,840 Claude Opus 4.6 reasoning traces. 7 quantization levels from Q2_K to BF16.\n\n⚡ **Forged on 8×H200 SXM5 | 1.1TB VRAM**\n\n---\n\n## Why This Model\n\n| | Base Qwen3.5-122B | Jackrong (27B) | **TIMTEH (this)** |\n|---|---|---|---|\n| Scale | 122B/10B active | 27B dense | **122B/10B active** |\n| Training data | Base alignment | Opus distillation | **Opus distillation** |\n| Reasoning quality | Standard | Enhanced (small scale) | **Enhanced (full MoE scale)** |\n| Uncensored | ❌ | ✅ | **✅** |\n| Hardware required to train | Any | Consumer GPU | **8×H200 (1.1TB VRAM)** |\n\n**Nobody else has fine-tuned Qwen3.5-122B on Opus reasoning data.** Jackrong stopped at 27B because they don't have the hardware. We do.\n\n---\n\n## Quantizations\n\n| Quant | File | Size | BPW | RAM Required | Use Case |\n|-------|------|------|-----|-------------|----------|\n| **BF16** | `...-BF16.gguf` | 228 GB | 16.0 | ~235 GB | Maximum quality, reference |\n| **Q8_0** | `...-Q8_0.gguf` | 121 GB | 8.5 | ~125 GB | Near-lossless, high-VRAM setups |\n| **Q6_K** | `...-Q6_K.gguf` | 94 GB | 6.6 | ~98 GB | Excellent quality |\n| **Q5_K_M** | `...-Q5_K_M.gguf` | 81 GB | 5.7 | ~85 GB | Great balance |\n| **Q4_K_M** | `...-Q4_K_M.gguf` | 70 GB | 4.9 | ~74 GB | ⭐ **Recommended** — best quality/size |\n| **Q3_K_M** | `...-Q3_K_M.gguf` | 55 GB | 3.9 | ~58 GB | Fits 2×48GB GPUs |\n| **Q2_K** | `...-Q2_K.gguf` | 42 GB | 2.9 | ~45 GB | Single 48GB GPU |\n\n---\n\n## Training Details\n\n| Parameter | Value |\n|-----------|-------|\n| **Base Model** | [Qwen/Qwen3.5-122B-A10B](https://huggingface.co/Qwen/Qwen3.5-122B-A10B) |\n| **Method** | LoRA (r=64, alpha=128, dropout=0.05) |\n| **Trainable Parameters** | 66.8M / 122.1B (0.05%) |\n| **Training Samples** | 12,840 |\n| **Epochs** | 2 |\n| **Steps** | 1,266 |\n| **Final Avg Loss** | 0.1502 |\n| **Training Time** | 6 hours 34 minutes |\n| **Hardware** | 8× NVIDIA H200 SXM5 (141GB HBM3e each, NVLink 478 GB/s) |\n| **Precision** | BF16 (full, no quantized training) |\n| **Effective Batch Size** | 64 |\n| **Learning Rate** | Cosine schedule, peak 2e-4 |\n| **Max Sequence Length** | 4096 |\n\n### Training Datasets\n\n| Dataset | Samples | Source |\n|---------|---------|--------|\n| opus-10000x | 9,633 | Claude Opus 4.6 reasoning traces (10K filtered) |\n| opus-3000x | 2,326 | Claude Opus 4.6 reasoning traces (3K filtered) |\n| reasoning-700x | 633 | Qwen3.5 reasoning samples |\n| high-reasoning-250x | 250 | High-quality Opus reasoning (curated) |\n\n---\n\n## Architecture\n\n- **Type:** Qwen3_5MoeForCausalLM (Mixture-of-Experts)\n- **Total Parameters:** 122.1B\n- **Active Parameters:** ~10B per token\n- **Hidden Size:** 3,072\n- **Layers:** 48\n- **Attention Heads:** 32 (GQA)\n- **Experts:** 256 routed + shared expert, 10 active per token\n- **Context Length:** 131,072 tokens (default), extensible to 262K\n- **Vocab Size:** 248,320\n- **Thinking Mode:** Supports `<think>` tags for explicit chain-of-thought\n- **License:** Apache 2.0\n\n---\n\n## Usage\n\n### llama.cpp\n\n```bash\n# Recommended: Q4_K_M\n./llama-cli -m Qwen3.5-122B-A10B-Opus-Reasoning-Q4_K_M.gguf \\\n  -p \"Analyze the following problem step by step:\" \\\n  -n 2048 --temp 0.7 --top-p 0.9\n\n# Server mode\n./llama-server -m Qwen3.5-122B-A10B-Opus-Reasoning-Q4_K_M.gguf \\\n  --port 8080 --host 0.0.0.0 -c 65536\n```\n\n### Ollama\n\n```bash\nollama run timteh673/Qwen3.5-122B-A10B-Opus-Reasoning\n```\n\n### LM Studio\n\nDownload the GGUF file and load in LM Studio. Supports thinking/non-thinking modes via `enable_thinking` in chat template.\n\n### Open WebUI / SillyTavern\n\nPoint your backend to a llama.cpp server running any quant. Full OpenAI-compatible API at `/v1/chat/completions`.\n\n### Recommended Settings\n\n| Setting | Value | Notes |\n|---------|-------|-------|\n| Temperature | 0.6–0.7 | Reasoning tasks |\n| Temperature | 0.8–1.0 | Creative tasks |\n| Top-P | 0.9 | |\n| Min-P | 0.05 | Good alternative to Top-P |\n| Context | 32K+ | Supports up to 131K |\n| Thinking | Enabled | Use `enable_thinking=True` for best results |\n\n---\n\n## What's Different From Base\n\n- **Enhanced reasoning chains** — trained on 12,840 Opus-quality multi-step analytical traces\n- **Better instruction following** — deeper engagement with complex prompts\n- **Uncensored** — no refusal training, responds to all prompts\n- **MoE efficiency** — only 10B params active per token despite 122B total\n- **Thinking mode** — native `<think>` tag support for explicit chain-of-thought\n\n---\n\n## Pipeline\n\n```\nQwen3.5-122B-A10B (base)\n  → LoRA fine-tune (r=64, 12,840 Opus traces, 8×H200, 6.5h)\n  → Merge adapter into base weights\n  → Convert to BF16 GGUF (llama.cpp, 879 tensors)\n  → Quantize: Q8_0, Q6_K, Q5_K_M, Q4_K_M, Q3_K_M, Q2_K\n```\n\nAll steps executed natively in BF16 — no quantized training, no optimization hacks. When you have 1.1TB VRAM, you use it.\n\n---\n\n## Model Provenance\n\n- **Base:** [Qwen/Qwen3.5-122B-A10B](https://huggingface.co/Qwen/Qwen3.5-122B-A10B) (Apache 2.0)\n- **Training Framework:** transformers + PEFT + TRL (raw, no wrappers)\n- **Quantization:** llama.cpp (build 8c60b8a)\n- **Hardware:** 8×NVIDIA H200 SXM5 (IBM Cloud, 1.1TB VRAM total)\n\n---\n\n## Also From TIMTEH\n\n| Model | Status | Description |\n|-------|--------|-------------|\n| [Qwen3.5-397B-A17B-Uncensored-GGUF](https://huggingface.co/timteh673/Qwen3.5-397B-A17B-Uncensored-GGUF) | ✅ Live | Abliterated 397B MoE — 7 quants |\n| [Mistral-Small-4-119B-Uncensored-GGUF](https://huggingface.co/timteh673/Mistral-Small-4-119B-Uncensored-GGUF) | ✅ Live | First TIMTEH release — 7 quants |\n| [Nemotron-3-Super-120B-A12B-Uncensored-GGUF](https://huggingface.co/timteh673/Nemotron-3-Super-120B-A12B-Uncensored-GGUF) | ✅ Live | Benchmarked — 7 quants |\n| Qwen3.5-397B Opus-Reasoning | 🔥 Training | Stage 2 fine-tune (same technique, 397B scale) |\n\n---\n\n## ⚠️ Disclaimer\n\nThis model has been fine-tuned on uncensored reasoning data. It may generate content that is harmful, offensive, or inappropriate. Users are solely responsible for ensuring their use complies with applicable laws and ethical standards. Intended for research, testing, and controlled environments.\n\n---\n\n## ☕ Support This Work\n\nRunning 8×H200 GPUs isn't free. Every donation directly funds more open-weight model releases, better abliteration techniques, and pushing the frontier of what's possible with open models.\n\n<a href=\"https://buymeacoffee.com/timteh\" target=\"_blank\">\n  <img src=\"https://cdn.buymeacoffee.com/buttons/v2/default-yellow.png\" alt=\"Buy Me A Coffee\" width=\"217\" height=\"60\">\n</a>\n\n<p align=\"center\">\n  <img src=\"bmac-qr.png\" alt=\"Buy Me a Coffee QR Code\" width=\"250\">\n</p>\n\n### 💎 Crypto Donations\n\n| Currency | Address |\n|----------|---------|\n| **BTC** | `bc1p4q7vpwucvww2y3x4nhps4y4vekye8uwm9re5a0kx8l6u5nky5ucszm2qhh` |\n| **ETH** | `0xe5Aa16E53b141D42458ABeEDb00a157c3Fea2108` |\n| **SOL** | `9CXwjG1mm9uLkxRevdMQiF61cr6TNHSiWtFRHmUEgzkG` |\n\n---\n\n## 🏢 Enterprise & Custom Models\n\n**Need a custom 120B+ model aligned to your proprietary data?** TIMTEH provides bespoke enterprise fine-tuning, abliteration, and deployment on 8×H200 SXM5.\n\n- Custom fine-tuning on your data (up to 400B+ parameters)\n- Private CARE abliteration (Phase 2 technique)\n- Deployment architecture consulting (tensor parallelism, speculative decoding)\n- Bespoke distillation datasets\n\n**📧 Contact:** [tim@timlex.co](mailto:tim@timlex.co)\n\n---\n\n*Part of the TIMTEH Cognitive Preservation Foundry — surgical capability preservation at scale.*\n⚡ Forged on 8×NVIDIA H200 SXM5 | 1.1TB VRAM\n",
    "related_quantizations": []
  },
  "tags": [
    "gguf",
    "qwen3.5",
    "moe",
    "uncensored",
    "reasoning",
    "fine-tuned",
    "opus",
    "claude",
    "lora",
    "122b",
    "10b-active",
    "text-generation",
    "en",
    "zh",
    "ja",
    "ko",
    "fr",
    "de",
    "es",
    "pt",
    "ru",
    "ar",
    "dataset:nohurry/Opus-4.6-Reasoning-3000x-filtered",
    "dataset:Roman1111111/claude-opus-4.6-10000x",
    "dataset:Jackrong/Qwen3.5-reasoning-700x",
    "dataset:TeichAI/claude-4.5-opus-high-reasoning-250x",
    "base_model:Qwen/Qwen3.5-122B-A10B",
    "base_model:adapter:Qwen/Qwen3.5-122B-A10B",
    "license:apache-2.0",
    "endpoints_compatible",
    "region:us",
    "conversational"
  ],
  "likes": 2,
  "downloads": 2207,
  "gated": false,
  "private": false,
  "last_modified": "2026-04-01T16:03:29.000Z",
  "created_at": "2026-04-01T16:03:29.000Z",
  "pipeline_tag": "text-generation",
  "library_name": ""
}
Source payload excerpt (from Hugging Face API)
{
  "_id": "69cd41d1e963168f8d874e57",
  "id": "rcorvohan/Qwen3.5-122B-A10B-Opus-Reasoning-GGUF",
  "modelId": "rcorvohan/Qwen3.5-122B-A10B-Opus-Reasoning-GGUF",
  "sha": "a8ee863129b037d5a740640197af2826a68d5a2c",
  "createdAt": "2026-04-01T16:03:29.000Z",
  "lastModified": "2026-04-01T16:03:29.000Z",
  "author": "rcorvohan",
  "downloads": 2207,
  "likes": 2,
  "gated": false,
  "private": false,
  "pipeline_tag": "text-generation",
  "library_name": "",
  "siblings_count": 10
}