Model Intelligence Sheet
noctrex/Qwen3.5-35B-A3B-Claude-4.6-Opus-Reasoning-Distilled-MXFP4_MOE-GGUF overview
These are quantizations of the model Jackrong/Qwen3.5-35B-A3B-Claude-4.6-Opus-Reasoning-Distilled. The mmproj files are the same as the ones from unsloth. Read the unsloth guide to set up the model's recommended settings: Qwen3.5 - How to Run Locally Guide. The mainline standard is to use MXFP4 for the MoE tensors and Q8 for the rest, so I created two new variants in which the other tensors are BF16 or F16 instead of Q8. The order of preference is BF16, then F16. On some architectures BF16 will be slower, but it is the highest quality: essentially, the non-MoE tensors are copied over from the original model unquantized.
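The stated preference order (BF16, then F16) can be sketched as a small lookup. This is only an illustration: the `pick_file` helper and its `avoid` parameter are hypothetical, and falling back to the standard MXFP4 + Q8 file when both 16-bit variants are excluded is an assumption, not something the description states.

```python
# File names are taken from the repository file list; the labels are shorthand
# for the type used for the non-MoE tensors.
PREFIX = "Qwen3.5-35B-A3B-Claude-4.6-Opus-Reasoning-Distilled-MXFP4_MOE"
VARIANTS = {
    "BF16": f"{PREFIX}_BF16.gguf",
    "F16": f"{PREFIX}_F16.gguf",
    "Q8": f"{PREFIX}.gguf",  # standard variant; using it as a fallback is an assumption
}

# Preference order from the description: BF16 first, then F16.
PREFERENCE = ["BF16", "F16", "Q8"]


def pick_file(avoid=frozenset()):
    """Return the first file in preference order whose label is not in `avoid`.

    `avoid` is a hypothetical knob, e.g. {"BF16"} on hardware where BF16 is slow.
    """
    for label in PREFERENCE:
        if label not in avoid:
            return VARIANTS[label]
    raise ValueError("every variant was excluded")


if __name__ == "__main__":
    print(pick_file())          # BF16 variant
    print(pick_file({"BF16"}))  # F16 variant
```

Whether BF16 is actually slower on a given machine is best measured directly rather than guessed from this sketch.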
Downloads
10,175
Likes
7
Pipeline
image-text-to-text
Library
—
Visibility
Public
Access
Open
Repository Files & Downloads
5 files detected
Direct downloads for all repository files
| File | Type | Quantization | Size | Link |
|---|---|---|---|---|
| Qwen3.5-35B-A3B-Claude-4.6-Opus-Reasoning-Distilled-MXFP4_MOE.gguf | GGUF | MXFP4 (MoE) + Q8 | 18.88 GB | Download |
| Qwen3.5-35B-A3B-Claude-4.6-Opus-Reasoning-Distilled-MXFP4_MOE_BF16.gguf | GGUF | MXFP4 (MoE) + BF16 | 20.55 GB | Download |
| Qwen3.5-35B-A3B-Claude-4.6-Opus-Reasoning-Distilled-MXFP4_MOE_F16.gguf | GGUF | MXFP4 (MoE) + F16 | 20.55 GB | Download |
| mmproj-BF16.gguf | GGUF | BF16 | 861.00 MB | Download |
| mmproj-F32.gguf | GGUF | F32 | 1.66 GB | Download |
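Since image input needs one model file plus one mmproj file, the combined download size is simple arithmetic over the table. A minimal sketch follows, assuming the listed sizes are decimal units (1 GB = 1000 MB); the `total_download_gb` helper is hypothetical, and the result is a download size only, not a memory estimate.

```python
# Sizes copied from the file table, in GB.
PREFIX = "Qwen3.5-35B-A3B-Claude-4.6-Opus-Reasoning-Distilled-MXFP4_MOE"
MODEL_GB = {
    f"{PREFIX}.gguf": 18.88,
    f"{PREFIX}_BF16.gguf": 20.55,
    f"{PREFIX}_F16.gguf": 20.55,
}
MMPROJ_GB = {
    "mmproj-BF16.gguf": 0.861,  # listed as 861.00 MB; decimal conversion is an assumption
    "mmproj-F32.gguf": 1.66,
}


def total_download_gb(model_file, mmproj_file=None):
    """Sum the listed size of one model file and, optionally, one mmproj file."""
    total = MODEL_GB[model_file]
    if mmproj_file is not None:
        total += MMPROJ_GB[mmproj_file]
    return round(total, 3)


if __name__ == "__main__":
    # BF16 variant plus the F32 projector: 20.55 + 1.66
    print(total_download_gb(f"{PREFIX}_BF16.gguf", "mmproj-F32.gguf"))
```

For example, the BF16 variant with the F32 projector comes to about 22.21 GB, and the standard variant with the BF16 projector to about 19.74 GB.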
Model Details
Metadata Inspector
Normalized metadata (stored in metadata_json)
{
"metadata": {},
"card_data": {
"pipeline_tag": "image-text-to-text",
"base_model": [
"Jackrong/Qwen3.5-35B-A3B-Claude-4.6-Opus-Reasoning-Distilled"
],
"frontmatter": {
"pipeline_tag": "image-text-to-text",
"base_model": [
"Jackrong/Qwen3.5-35B-A3B-Claude-4.6-Opus-Reasoning-Distilled"
]
},
"hero_image_url": "",
"summary": "These are quantizations of the model Jackrong/Qwen3.5-35B-A3B-Claude-4.6-Opus-Reasoning-Distilled The mmproj files are the same from unsloth. Read the guide from unsloth in order to set up the model's recommended settings: Qwen3.5 - How to Run Locally Guide The mainline standard is to use MXFP4 for the MoE tensors, and Q8 for the rest. So I created 2 new variants, where the other tensors are either BF16 or FP16 instead of Q8. The order of preference is BF16, then F16. On some architectures BF16 will be slower, but its the highest quality, essentialy its the original tensors from the model copied over unquantized.",
"quick_links": [],
"benchmark_table_html": "",
"readme_markdown": "---\npipeline_tag: image-text-to-text\nbase_model:\n- Jackrong/Qwen3.5-35B-A3B-Claude-4.6-Opus-Reasoning-Distilled\n---\nThese are quantizations of the model [Jackrong/Qwen3.5-35B-A3B-Claude-4.6-Opus-Reasoning-Distilled](https://huggingface.co/Jackrong/Qwen3.5-35B-A3B-Claude-4.6-Opus-Reasoning-Distilled)\n\n- Download the latest [llama.cpp](https://github.com/ggml-org/llama.cpp) to use these quantizations. \n- For the `mmproj` file, the F32 version is recommended for best results. \nThe mmproj files are the same from unsloth.\n\nRead the guide from unsloth in order to set up the model's recommended settings: \n[Qwen3.5 - How to Run Locally Guide](https://unsloth.ai/docs/models/qwen3.5)\n\nThe mainline standard is to use MXFP4 for the MoE tensors, and Q8 for the rest. \nSo I created 2 new variants, where the other tensors are either BF16 or FP16 instead of Q8. \nThe order of preference is BF16, then F16. \nOn some architectures BF16 will be slower, but its the highest quality, essentialy its the original tensors from the model copied over unquantized.\n",
"related_quantizations": []
},
"tags": [
"gguf",
"image-text-to-text",
"base_model:Jackrong/Qwen3.5-35B-A3B-Claude-4.6-Opus-Reasoning-Distilled",
"base_model:quantized:Jackrong/Qwen3.5-35B-A3B-Claude-4.6-Opus-Reasoning-Distilled",
"endpoints_compatible",
"region:us",
"conversational"
],
"likes": 7,
"downloads": 10175,
"gated": false,
"private": false,
"last_modified": "2026-03-17T10:08:19.000Z",
"created_at": "2026-03-14T18:30:32.000Z",
"pipeline_tag": "image-text-to-text",
"library_name": ""
}
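The `tags` array above encodes the base-model relationship twice, once as `base_model:<id>` and once as `base_model:quantized:<id>`. A minimal sketch of how those tags could be split apart follows; the `base_model_relations` helper and the `"base"` label for the plain form are hypothetical names chosen for illustration.

```python
def base_model_relations(tags):
    """Extract (relation, model_id) pairs from Hub-style base_model tags.

    'base_model:org/name'           -> ("base", "org/name")
    'base_model:quantized:org/name' -> ("quantized", "org/name")
    """
    prefix = "base_model:"
    relations = []
    for tag in tags:
        if not tag.startswith(prefix):
            continue  # unrelated tag such as "gguf" or "region:us"
        rest = tag[len(prefix):]
        relation, sep, model_id = rest.partition(":")
        if sep:
            relations.append((relation, model_id))
        else:
            relations.append(("base", rest))
    return relations


if __name__ == "__main__":
    tags = [
        "gguf",
        "image-text-to-text",
        "base_model:Jackrong/Qwen3.5-35B-A3B-Claude-4.6-Opus-Reasoning-Distilled",
        "base_model:quantized:Jackrong/Qwen3.5-35B-A3B-Claude-4.6-Opus-Reasoning-Distilled",
        "region:us",
    ]
    for relation, model_id in base_model_relations(tags):
        print(relation, model_id)
```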
Source payload excerpt (from Hugging Face API)
{
"_id": "69b5a9482b0587383a1dd79a",
"id": "noctrex/Qwen3.5-35B-A3B-Claude-4.6-Opus-Reasoning-Distilled-MXFP4_MOE-GGUF",
"modelId": "noctrex/Qwen3.5-35B-A3B-Claude-4.6-Opus-Reasoning-Distilled-MXFP4_MOE-GGUF",
"sha": "3c14d9fe668ede879817ecfa83ccbe2d146dd8a3",
"createdAt": "2026-03-14T18:30:32.000Z",
"lastModified": "2026-03-17T10:08:19.000Z",
"author": "noctrex",
"downloads": 10175,
"likes": 7,
"gated": false,
"private": false,
"pipeline_tag": "image-text-to-text",
"library_name": "",
"siblings_count": 7
}
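The `id` and `sha` fields in the payload are enough to build a download URL pinned to this exact revision. The sketch below assumes the Hugging Face `resolve` URL layout (`/<repo_id>/resolve/<revision>/<filename>`); the `resolve_url` helper is a hypothetical name, and the code only constructs the string without contacting the network.

```python
from urllib.parse import quote

# Values copied from the API payload excerpt.
REPO_ID = "noctrex/Qwen3.5-35B-A3B-Claude-4.6-Opus-Reasoning-Distilled-MXFP4_MOE-GGUF"
SHA = "3c14d9fe668ede879817ecfa83ccbe2d146dd8a3"


def resolve_url(repo_id, filename, revision="main"):
    """Build a file URL for a repository at a given revision (branch or commit sha)."""
    return (
        f"https://huggingface.co/{repo_id}"
        f"/resolve/{quote(revision, safe='')}/{quote(filename)}"
    )


if __name__ == "__main__":
    # Pinning to the commit sha keeps the download stable if the repo is updated later.
    print(resolve_url(REPO_ID, "mmproj-F32.gguf", SHA))
```

Pinning to the commit `sha` rather than `main` is a design choice for reproducibility: the payload shows the repository was modified after creation, so `main` may point at different files over time.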