Model Intelligence Sheet

rcorvohan/qwen3.5-122b-a10b-opus-reasoning-gguf overview

The first Claude Opus-distilled reasoning fine-tune of Qwen3.5-122B at full scale. Enhanced multi-step reasoning, analytical depth, and uncensored output, trained where competitors can't reach. 122B total parameters, 10B active per token (Mixture-of-Experts). LoRA fine-tuned on 12,840 reasoning traces, most of them from Claude Opus 4.6. 7 quantization levels from Q2_K to BF16. ⚡ Forged on 8×H200 SXM5 | 1.1TB VRAM

Tags: gguf, qwen3.5, moe, uncensored, reasoning, fine-tuned, opus, claude, lora, 122b, 10b-active, text-generation, en, zh, ja, ko, fr, de, es, pt, ru, ar, dataset:nohurry/Opus-4.6-Reasoning-3000x-filtered, dataset:Roman1111111/claude-opus-4.6-10000x, dataset:Jackrong/Qwen3.5-reasoning-700x, dataset:TeichAI/claude-4.5-opus-high-reasoning-250x, base_model:Qwen/Qwen3.5-122B-A10B, base_model:adapter:Qwen/Qwen3.5-122B-A10B, license:apache-2.0, endpoints_compatible


Downloads

2,207

Likes

2

Pipeline

text-generation

Library

—

Visibility

Public

Access

Open

Repository Files & Downloads

7 files detected

Direct downloads for all repository files

File	Type	Quantization	Size	Link
Qwen3.5-122B-A10B-Opus-Reasoning-BF16.gguf	GGUF	BF16	227.54 GB	Download
Qwen3.5-122B-A10B-Opus-Reasoning-Q2_K.gguf	GGUF	Q2_K	41.81 GB	Download
Qwen3.5-122B-A10B-Opus-Reasoning-Q3_K_M.gguf	GGUF	Q3_K_M	54.59 GB	Download
Qwen3.5-122B-A10B-Opus-Reasoning-Q4_K_M.gguf	GGUF	Q4_K_M	69.12 GB	Download
Qwen3.5-122B-A10B-Opus-Reasoning-Q5_K_M.gguf	GGUF	Q5_K_M	80.90 GB	Download
Qwen3.5-122B-A10B-Opus-Reasoning-Q6_K.gguf	GGUF	Q6_K	93.42 GB	Download
Qwen3.5-122B-A10B-Opus-Reasoning-Q8_0.gguf	GGUF	Q8_0	120.95 GB	Download
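
Choosing among the seven files mostly comes down to available memory. The sketch below picks the largest quantization whose file fits a given budget. The sizes are copied from the table above. The helper names (`pick_quant`, `filename_for`) and the default 5 GB headroom for KV cache and runtime buffers are illustrative assumptions, not figures published for this repository.

```python
# File sizes in GB, copied from the repository file table above.
QUANT_SIZES_GB = {
    "BF16": 227.54,
    "Q8_0": 120.95,
    "Q6_K": 93.42,
    "Q5_K_M": 80.90,
    "Q4_K_M": 69.12,
    "Q3_K_M": 54.59,
    "Q2_K": 41.81,
}


def pick_quant(memory_budget_gb, headroom_gb=5.0):
    """Return the largest quantization whose file fits the budget, or None.

    headroom_gb is an assumed allowance for KV cache and runtime buffers;
    tune it for your context length and backend.
    """
    usable = memory_budget_gb - headroom_gb
    fitting = {q: size for q, size in QUANT_SIZES_GB.items() if size <= usable}
    if not fitting:
        return None
    # The largest file that fits is taken as the highest-fidelity option.
    return max(fitting, key=fitting.get)


def filename_for(quant):
    """Build the GGUF file name used in the table for a quantization label."""
    return f"Qwen3.5-122B-A10B-Opus-Reasoning-{quant}.gguf"


if __name__ == "__main__":
    for budget in (48, 96, 192):
        quant = pick_quant(budget)
        print(budget, "GB ->", filename_for(quant) if quant else "no file fits")
```

Treat the result as a starting point: long contexts need more headroom than the default assumed here, so a smaller file may be the safer choice in practice.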

Model Details

Model Slug

rcorvohan/qwen3.5-122b-a10b-opus-reasoning-gguf

Author

rcorvohan

Pipeline Task

text-generation

Library

—

Created

2026-04-01

Last Modified

2026-04-01

Gated

No

Private

No

HF SHA

a8ee863129b037d5a740640197af2826a68d5a2c

License

apache-2.0

Language

en, zh, ja, ko, fr, de, es, pt, ru, ar

Base Model

Qwen/Qwen3.5-122B-A10B

Metadata Inspector

Normalized metadata (stored in metadata_json)

{
  "metadata": {},
  "card_data": {
    "license": "apache-2.0",
    "base_model": "Qwen/Qwen3.5-122B-A10B",
    "tags": [
      "qwen3.5",
      "moe",
      "gguf",
      "uncensored",
      "reasoning",
      "fine-tuned",
      "opus",
      "claude",
      "lora",
      "122b",
      "10b-active"
    ],
    "model_type": "qwen3_5_moe",
    "quantized_by": "timteh673",
    "datasets": [
      "nohurry/Opus-4.6-Reasoning-3000x-filtered",
      "Roman1111111/claude-opus-4.6-10000x",
      "Jackrong/Qwen3.5-reasoning-700x",
      "TeichAI/claude-4.5-opus-high-reasoning-250x"
    ],
    "language": [
      "en",
      "zh",
      "ja",
      "ko",
      "fr",
      "de",
      "es",
      "pt",
      "ru",
      "ar"
    ],
    "pipeline_tag": "text-generation",
    "frontmatter": {
      "license": "apache-2.0",
      "base_model": "Qwen/Qwen3.5-122B-A10B",
      "tags": [
        "qwen3.5",
        "moe",
        "gguf",
        "uncensored",
        "reasoning",
        "fine-tuned",
        "opus",
        "claude",
        "lora",
        "122b",
        "10b-active"
      ],
      "model_type": "qwen3_5_moe",
      "quantized_by": "timteh673",
      "datasets": [
        "nohurry/Opus-4.6-Reasoning-3000x-filtered",
        "Roman1111111/claude-opus-4.6-10000x",
        "Jackrong/Qwen3.5-reasoning-700x",
        "TeichAI/claude-4.5-opus-high-reasoning-250x"
      ],
      "language": [
        "en",
        "zh",
        "ja",
        "ko",
        "fr",
        "de",
        "es",
        "pt",
        "ru",
        "ar"
      ],
      "pipeline_tag": "text-generation"
    },
    "hero_image_url": "https://cdn.buymeacoffee.com/buttons/v2/default-yellow.png",
    "summary": "**The first Claude Opus-distilled reasoning fine-tune of Qwen3.5-122B at full scale.** Enhanced multi-step reasoning, analytical depth, and uncensored output — trained where competitors can't reach. 122B total parameters, **10B active per token** (Mixture-of-Experts). LoRA fine-tuned on 12,840 Claude Opus 4.6 reasoning traces. 7 quantization levels from Q2_K to BF16. ⚡ **Forged on 8×H200 SXM5 | 1.1TB VRAM** ---",
    "quick_links": [],
    "benchmark_table_html": "",
    "readme_markdown": "---\nlicense: apache-2.0\nbase_model: Qwen/Qwen3.5-122B-A10B\ntags:\n  - qwen3.5\n  - moe\n  - gguf\n  - uncensored\n  - reasoning\n  - fine-tuned\n  - opus\n  - claude\n  - lora\n  - 122b\n  - 10b-active\nmodel_type: qwen3_5_moe\nquantized_by: timteh673\ndatasets:\n  - nohurry/Opus-4.6-Reasoning-3000x-filtered\n  - Roman1111111/claude-opus-4.6-10000x\n  - Jackrong/Qwen3.5-reasoning-700x\n  - TeichAI/claude-4.5-opus-high-reasoning-250x\nlanguage:\n  - en\n  - zh\n  - ja\n  - ko\n  - fr\n  - de\n  - es\n  - pt\n  - ru\n  - ar\npipeline_tag: text-generation\n---\n\n# Qwen3.5-122B-A10B-Opus-Reasoning-GGUF\n\n**The first Claude Opus-distilled reasoning fine-tune of Qwen3.5-122B at full scale.** Enhanced multi-step reasoning, analytical depth, and uncensored output — trained where competitors can't reach.\n\n122B total parameters, **10B active per token** (Mixture-of-Experts). LoRA fine-tuned on 12,840 Claude Opus 4.6 reasoning traces. 7 quantization levels from Q2_K to BF16.\n\n⚡ **Forged on 8×H200 SXM5 | 1.1TB VRAM**\n\n---\n\n## Why This Model\n\n| | Base Qwen3.5-122B | Jackrong (27B) | **TIMTEH (this)** |\n|---|---|---|---|\n| Scale | 122B/10B active | 27B dense | **122B/10B active** |\n| Training data | Base alignment | Opus distillation | **Opus distillation** |\n| Reasoning quality | Standard | Enhanced (small scale) | **Enhanced (full MoE scale)** |\n| Uncensored | ❌ | ✅ | **✅** |\n| Hardware required to train | Any | Consumer GPU | **8×H200 (1.1TB VRAM)** |\n\n**Nobody else has fine-tuned Qwen3.5-122B on Opus reasoning data.** Jackrong stopped at 27B because they don't have the hardware. 
We do.\n\n---\n\n## Quantizations\n\n| Quant | File | Size | BPW | RAM Required | Use Case |\n|-------|------|------|-----|-------------|----------|\n| **BF16** | `...-BF16.gguf` | 228 GB | 16.0 | ~235 GB | Maximum quality, reference |\n| **Q8_0** | `...-Q8_0.gguf` | 121 GB | 8.5 | ~125 GB | Near-lossless, high-VRAM setups |\n| **Q6_K** | `...-Q6_K.gguf` | 94 GB | 6.6 | ~98 GB | Excellent quality |\n| **Q5_K_M** | `...-Q5_K_M.gguf` | 81 GB | 5.7 | ~85 GB | Great balance |\n| **Q4_K_M** | `...-Q4_K_M.gguf` | 70 GB | 4.9 | ~74 GB | ⭐ **Recommended** — best quality/size |\n| **Q3_K_M** | `...-Q3_K_M.gguf` | 55 GB | 3.9 | ~58 GB | Fits 2×48GB GPUs |\n| **Q2_K** | `...-Q2_K.gguf` | 42 GB | 2.9 | ~45 GB | Single 48GB GPU |\n\n---\n\n## Training Details\n\n| Parameter | Value |\n|-----------|-------|\n| **Base Model** | [Qwen/Qwen3.5-122B-A10B](https://huggingface.co/Qwen/Qwen3.5-122B-A10B) |\n| **Method** | LoRA (r=64, alpha=128, dropout=0.05) |\n| **Trainable Parameters** | 66.8M / 122.1B (0.05%) |\n| **Training Samples** | 12,840 |\n| **Epochs** | 2 |\n| **Steps** | 1,266 |\n| **Final Avg Loss** | 0.1502 |\n| **Training Time** | 6 hours 34 minutes |\n| **Hardware** | 8× NVIDIA H200 SXM5 (141GB HBM3e each, NVLink 478 GB/s) |\n| **Precision** | BF16 (full, no quantized training) |\n| **Effective Batch Size** | 64 |\n| **Learning Rate** | Cosine schedule, peak 2e-4 |\n| **Max Sequence Length** | 4096 |\n\n### Training Datasets\n\n| Dataset | Samples | Source |\n|---------|---------|--------|\n| opus-10000x | 9,633 | Claude Opus 4.6 reasoning traces (10K filtered) |\n| opus-3000x | 2,326 | Claude Opus 4.6 reasoning traces (3K filtered) |\n| reasoning-700x | 633 | Qwen3.5 reasoning samples |\n| high-reasoning-250x | 250 | High-quality Opus reasoning (curated) |\n\n---\n\n## Architecture\n\n- **Type:** Qwen3_5MoeForCausalLM (Mixture-of-Experts)\n- **Total Parameters:** 122.1B\n- **Active Parameters:** ~10B per token\n- **Hidden Size:** 3,072\n- **Layers:** 48\n- **Attention 
Heads:** 32 (GQA)\n- **Experts:** 256 routed + shared expert, 10 active per token\n- **Context Length:** 131,072 tokens (default), extensible to 262K\n- **Vocab Size:** 248,320\n- **Thinking Mode:** Supports `<think>` tags for explicit chain-of-thought\n- **License:** Apache 2.0\n\n---\n\n## Usage\n\n### llama.cpp\n\n```bash\n# Recommended: Q4_K_M\n./llama-cli -m Qwen3.5-122B-A10B-Opus-Reasoning-Q4_K_M.gguf \\\n  -p \"Analyze the following problem step by step:\" \\\n  -n 2048 --temp 0.7 --top-p 0.9\n\n# Server mode\n./llama-server -m Qwen3.5-122B-A10B-Opus-Reasoning-Q4_K_M.gguf \\\n  --port 8080 --host 0.0.0.0 -c 65536\n```\n\n### Ollama\n\n```bash\nollama run timteh673/Qwen3.5-122B-A10B-Opus-Reasoning\n```\n\n### LM Studio\n\nDownload the GGUF file and load in LM Studio. Supports thinking/non-thinking modes via `enable_thinking` in chat template.\n\n### Open WebUI / SillyTavern\n\nPoint your backend to a llama.cpp server running any quant. Full OpenAI-compatible API at `/v1/chat/completions`.\n\n### Recommended Settings\n\n| Setting | Value | Notes |\n|---------|-------|-------|\n| Temperature | 0.6–0.7 | Reasoning tasks |\n| Temperature | 0.8–1.0 | Creative tasks |\n| Top-P | 0.9 | |\n| Min-P | 0.05 | Good alternative to Top-P |\n| Context | 32K+ | Supports up to 131K |\n| Thinking | Enabled | Use `enable_thinking=True` for best results |\n\n---\n\n## What's Different From Base\n\n- **Enhanced reasoning chains** — trained on 12,840 Opus-quality multi-step analytical traces\n- **Better instruction following** — deeper engagement with complex prompts\n- **Uncensored** — no refusal training, responds to all prompts\n- **MoE efficiency** — only 10B params active per token despite 122B total\n- **Thinking mode** — native `<think>` tag support for explicit chain-of-thought\n\n---\n\n## Pipeline\n\n```\nQwen3.5-122B-A10B (base)\n  → LoRA fine-tune (r=64, 12,840 Opus traces, 8×H200, 6.5h)\n  → Merge adapter into base weights\n  → Convert to BF16 GGUF (llama.cpp, 879 
tensors)\n  → Quantize: Q8_0, Q6_K, Q5_K_M, Q4_K_M, Q3_K_M, Q2_K\n```\n\nAll steps executed natively in BF16 — no quantized training, no optimization hacks. When you have 1.1TB VRAM, you use it.\n\n---\n\n## Model Provenance\n\n- **Base:** [Qwen/Qwen3.5-122B-A10B](https://huggingface.co/Qwen/Qwen3.5-122B-A10B) (Apache 2.0)\n- **Training Framework:** transformers + PEFT + TRL (raw, no wrappers)\n- **Quantization:** llama.cpp (build 8c60b8a)\n- **Hardware:** 8×NVIDIA H200 SXM5 (IBM Cloud, 1.1TB VRAM total)\n\n---\n\n## Also From TIMTEH\n\n| Model | Status | Description |\n|-------|--------|-------------|\n| [Qwen3.5-397B-A17B-Uncensored-GGUF](https://huggingface.co/timteh673/Qwen3.5-397B-A17B-Uncensored-GGUF) | ✅ Live | Abliterated 397B MoE — 7 quants |\n| [Mistral-Small-4-119B-Uncensored-GGUF](https://huggingface.co/timteh673/Mistral-Small-4-119B-Uncensored-GGUF) | ✅ Live | First TIMTEH release — 7 quants |\n| [Nemotron-3-Super-120B-A12B-Uncensored-GGUF](https://huggingface.co/timteh673/Nemotron-3-Super-120B-A12B-Uncensored-GGUF) | ✅ Live | Benchmarked — 7 quants |\n| Qwen3.5-397B Opus-Reasoning | 🔥 Training | Stage 2 fine-tune (same technique, 397B scale) |\n\n---\n\n## ⚠️ Disclaimer\n\nThis model has been fine-tuned on uncensored reasoning data. It may generate content that is harmful, offensive, or inappropriate. Users are solely responsible for ensuring their use complies with applicable laws and ethical standards. Intended for research, testing, and controlled environments.\n\n---\n\n## ☕ Support This Work\n\nRunning 8×H200 GPUs isn't free. 
Every donation directly funds more open-weight model releases, better abliteration techniques, and pushing the frontier of what's possible with open models.\n\n<a href=\"https://buymeacoffee.com/timteh\" target=\"_blank\">\n  <img src=\"https://cdn.buymeacoffee.com/buttons/v2/default-yellow.png\" alt=\"Buy Me A Coffee\" width=\"217\" height=\"60\">\n</a>\n\n<p align=\"center\">\n  <img src=\"bmac-qr.png\" alt=\"Buy Me a Coffee QR Code\" width=\"250\">\n</p>\n\n### 💎 Crypto Donations\n\n| Currency | Address |\n|----------|---------|\n| **BTC** | `bc1p4q7vpwucvww2y3x4nhps4y4vekye8uwm9re5a0kx8l6u5nky5ucszm2qhh` |\n| **ETH** | `0xe5Aa16E53b141D42458ABeEDb00a157c3Fea2108` |\n| **SOL** | `9CXwjG1mm9uLkxRevdMQiF61cr6TNHSiWtFRHmUEgzkG` |\n\n---\n\n## 🏢 Enterprise & Custom Models\n\n**Need a custom 120B+ model aligned to your proprietary data?** TIMTEH provides bespoke enterprise fine-tuning, abliteration, and deployment on 8×H200 SXM5.\n\n- Custom fine-tuning on your data (up to 400B+ parameters)\n- Private CARE abliteration (Phase 2 technique)\n- Deployment architecture consulting (tensor parallelism, speculative decoding)\n- Bespoke distillation datasets\n\n**📧 Contact:** [tim@timlex.co](mailto:tim@timlex.co)\n\n---\n\n*Part of the TIMTEH Cognitive Preservation Foundry — surgical capability preservation at scale.*\n⚡ Forged on 8×NVIDIA H200 SXM5 | 1.1TB VRAM\n",
    "related_quantizations": []
  },
  "tags": [
    "gguf",
    "qwen3.5",
    "moe",
    "uncensored",
    "reasoning",
    "fine-tuned",
    "opus",
    "claude",
    "lora",
    "122b",
    "10b-active",
    "text-generation",
    "en",
    "zh",
    "ja",
    "ko",
    "fr",
    "de",
    "es",
    "pt",
    "ru",
    "ar",
    "dataset:nohurry/Opus-4.6-Reasoning-3000x-filtered",
    "dataset:Roman1111111/claude-opus-4.6-10000x",
    "dataset:Jackrong/Qwen3.5-reasoning-700x",
    "dataset:TeichAI/claude-4.5-opus-high-reasoning-250x",
    "base_model:Qwen/Qwen3.5-122B-A10B",
    "base_model:adapter:Qwen/Qwen3.5-122B-A10B",
    "license:apache-2.0",
    "endpoints_compatible",
    "region:us",
    "conversational"
  ],
  "likes": 2,
  "downloads": 2207,
  "gated": false,
  "private": false,
  "last_modified": "2026-04-01T16:03:29.000Z",
  "created_at": "2026-04-01T16:03:29.000Z",
  "pipeline_tag": "text-generation",
  "library_name": ""
}
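
The flat `tags` array in the metadata above mixes free-form labels with prefixed entries such as `dataset:`, `base_model:` and `license:`. A minimal sketch of separating them follows. The function name `split_hub_tags` and the output keys are hypothetical, and the prefix handling is inferred only from the tags shown in this sheet.

```python
def split_hub_tags(tags):
    """Group a flat tag list by the 'prefix:value' pattern seen above."""
    grouped = {"datasets": [], "base_models": [], "license": None, "other": []}
    for tag in tags:
        if tag.startswith("dataset:"):
            grouped["datasets"].append(tag[len("dataset:"):])
        elif tag.startswith("base_model:"):
            # "base_model:adapter:<id>" repeats the base id with a relation
            # marker, so keep only the final segment and skip duplicates.
            model_id = tag.split(":")[-1]
            if model_id not in grouped["base_models"]:
                grouped["base_models"].append(model_id)
        elif tag.startswith("license:"):
            grouped["license"] = tag[len("license:"):]
        else:
            grouped["other"].append(tag)
    return grouped


if __name__ == "__main__":
    sample = [
        "gguf",
        "dataset:Jackrong/Qwen3.5-reasoning-700x",
        "base_model:Qwen/Qwen3.5-122B-A10B",
        "base_model:adapter:Qwen/Qwen3.5-122B-A10B",
        "license:apache-2.0",
    ]
    print(split_hub_tags(sample))
```

Language codes and task labels such as `en` or `text-generation` carry no prefix, so this sketch leaves them under `other` rather than guessing their meaning.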

Source payload excerpt (from Hugging Face API)

{
  "_id": "69cd41d1e963168f8d874e57",
  "id": "rcorvohan/Qwen3.5-122B-A10B-Opus-Reasoning-GGUF",
  "modelId": "rcorvohan/Qwen3.5-122B-A10B-Opus-Reasoning-GGUF",
  "sha": "a8ee863129b037d5a740640197af2826a68d5a2c",
  "createdAt": "2026-04-01T16:03:29.000Z",
  "lastModified": "2026-04-01T16:03:29.000Z",
  "author": "rcorvohan",
  "downloads": 2207,
  "likes": 2,
  "gated": false,
  "private": false,
  "pipeline_tag": "text-generation",
  "library_name": "",
  "siblings_count": 10
}