Model Intelligence Sheet
ngxson/home-cook-mistral-small-omni-24b-2507-gguf overview
This is a multimodal model created by merging Mistral Small 2506 (with vision capabilities) and Voxtral 2507 (with audio capabilities) using a modified version of the mergekit tool. For detailed merging instructions, refer to the sections below.
Downloads: 6,577
Likes: 27
Pipeline: any-to-any
Library: not specified
Visibility: Public
Access: Open
Repository Files & Downloads
4 files detected
Model Details
Metadata Inspector
Normalized metadata (stored in metadata_json)
{
"metadata": {},
"card_data": {
"base_model": [
"mistralai/Voxtral-Small-24B-2507",
"mistralai/Mistral-Small-3.2-24B-Instruct-2506"
],
"license": "apache-2.0",
"pipeline_tag": "any-to-any",
"frontmatter": {
"base_model": [
"mistralai/Voxtral-Small-24B-2507",
"mistralai/Mistral-Small-3.2-24B-Instruct-2506"
],
"license": "apache-2.0",
"pipeline_tag": "any-to-any"
},
"hero_image_url": "https://cdn-uploads.huggingface.co/production/uploads/63ca214abedad7e2bf1d1517/-0Su33gQArUjp5gSco8pD.png",
"summary": "This is a multimodal model created by merging Mistral Small 2506 (with vision capabilities) and Voxtral 2507 (with audio capabilities) using a modified version of the mergekit tool. For detailed merging instructions, refer to the sections below.",
"quick_links": [],
"benchmark_table_html": "",
"readme_markdown": "---\nbase_model:\n- mistralai/Voxtral-Small-24B-2507\n- mistralai/Mistral-Small-3.2-24B-Instruct-2506\nlicense: apache-2.0\npipeline_tag: any-to-any\n---\n\n# Home-cooked Mistral Small Omni\n\nThis is a multimodal model created by merging Mistral Small 2506 (with vision capabilities) and Voxtral 2507 (with audio capabilities) using a modified version of the `mergekit` tool.\n\nFor detailed merging instructions, refer to the sections below.\n\n<img width=300 src=\"https://cdn-uploads.huggingface.co/production/uploads/63ca214abedad7e2bf1d1517/-0Su33gQArUjp5gSco8pD.png\" />\n\n## License and Attribution\n\nThis model is a merged derivative work combining Mistral Small 2506 and Voxtral 2507, both originally released by Mistral AI under the Apache 2.0 license. The merged model is also distributed under the Apache 2.0 license, and the full license text, along with original copyright notices, is included in this repository. I have no affiliation, sponsorship, or formal relationship with Mistral AI. This project is an independent effort to combine the vision and audio capabilities of the two models.\n\n## Steps to reproduce\n\n### Merge text model\n\nInstall `mergekit` from this version: https://github.com/arcee-ai/mergekit/tree/0027c5c51471fa891d438eccda5455ebe55b536e\n\nModify the `mergekit` source code: open the file `mergekit/merge_methods/generalized_task_arithmetic.py` and add the three lines marked `ADD THIS`:\n\n```py\n    # Normalize the vectors to get the directions and angles\n    v0 = normalize(v0, eps)\n    v1 = normalize(v1, eps)\n\n    if v0.shape != v1.shape:  # ADD THIS\n        res = np.array([0.0])  # ADD THIS\n        return maybe_torch(res, is_torch)  # ADD THIS\n\n    # Dot product with the normalized vectors (can't use np.dot in W)\n    dot = np.sum(v0 * v1)\n\n    # If absolute value of dot product is almost 1, vectors are ~colinear, so use lerp\n    if np.abs(dot) > DOT_THRESHOLD:\n        res = lerp(t, v0_copy, v1_copy)\n        return maybe_torch(res, is_torch)\n```\n\nPrepare the YAML merge config (saved here as `mistral_o.yaml`):\n\n```yaml\nname: mistral-omni\nmerge_method: slerp\nmodels:\n  - model: ../models/Voxtral-Small-24B-2507\n  - model: ../models/Mistral-Small-3.2-24B-Instruct-2506\nbase_model: ../models/Mistral-Small-3.2-24B-Instruct-2506\nparameters:\n  t:\n    - filter: self_attn\n      value: [0.1, 0.3, 0.5, 0.3, 0.1, 0]\n    - filter: mlp\n      value: [0.1, 0.3, 0.5, 0.3, 0.1, 0]\n    - value: 0.5 # fallback for rest of tensors\ndtype: bfloat16\n```\n\nMerge it:\n\n```sh\nmergekit-yaml mistral_o.yaml ../models/mistral_o\n```\n\nGo to the `mistral_o` output directory, then download `tekken.json` from Voxtral and place it there: https://huggingface.co/mistralai/Voxtral-Small-24B-2507/blob/main/tekken.json\n\nFinally, use llama.cpp's `convert_hf_to_gguf.py` to convert the merged model to GGUF as usual.\n\n### Merge mmproj models\n\nDownload these mmproj files:\n- Audio: https://huggingface.co/ggml-org/Voxtral-Mini-3B-2507-GGUF/blob/main/mmproj-Voxtral-Mini-3B-2507-Q8_0.gguf\n- Vision: https://huggingface.co/unsloth/Mistral-Small-3.2-24B-Instruct-2506-GGUF/blob/main/mmproj-F16.gguf\n\nRename them to `audio.gguf` and `vision.gguf` respectively.\n\nThen run [merge_mmproj_models.py](https://huggingface.co/ngxson/Home-Cook-Mistral-Small-Omni-24B-2507-GGUF/blob/main/merge_mmproj_models.py) from this repo. The output file will be `mmproj-model.gguf`.",
"related_quantizations": []
},
"tags": [
"gguf",
"any-to-any",
"base_model:mistralai/Mistral-Small-3.2-24B-Instruct-2506",
"base_model:quantized:mistralai/Mistral-Small-3.2-24B-Instruct-2506",
"license:apache-2.0",
"endpoints_compatible",
"region:us",
"conversational"
],
"likes": 27,
"downloads": 6577,
"gated": false,
"private": false,
"last_modified": "2025-07-28T20:37:39.000Z",
"created_at": "2025-07-28T19:50:10.000Z",
"pipeline_tag": "any-to-any",
"library_name": ""
}
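Note on the SLERP patch quoted in `readme_markdown`: the three `ADD THIS` lines make the interpolation return a one-element zero tensor whenever the two input tensors have different shapes, instead of failing on the element-wise product. The following is a minimal sketch of that control flow, not mergekit's actual implementation. It uses plain Python lists in place of NumPy or torch tensors (so "shape" becomes list length), and the `DOT_THRESHOLD` and `eps` values are assumptions chosen for illustration.

```python
import math

DOT_THRESHOLD = 0.9995  # assumed value, for illustration only


def lerp(t, a, b):
    """Plain linear interpolation between two equal-length vectors."""
    return [(1 - t) * x + t * y for x, y in zip(a, b)]


def normalize(v, eps):
    """Scale v to unit length; leave it unchanged if its norm is tiny."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v] if norm > eps else list(v)


def slerp(t, v0, v1, eps=1e-8):
    """Spherical interpolation with the shape guard from the README patch."""
    u0 = normalize(v0, eps)
    u1 = normalize(v1, eps)

    # Guard mirroring the "ADD THIS" lines: mismatched shapes cannot be
    # multiplied element-wise, so return a one-element zero placeholder.
    if len(u0) != len(u1):
        return [0.0]

    dot = sum(x * y for x, y in zip(u0, u1))

    # Nearly colinear vectors: fall back to linear interpolation.
    if abs(dot) > DOT_THRESHOLD:
        return lerp(t, v0, v1)

    theta_0 = math.acos(dot)   # angle between the two directions
    theta_t = theta_0 * t      # angle at interpolation point t
    sin_theta_0 = math.sin(theta_0)
    s0 = math.sin(theta_0 - theta_t) / sin_theta_0
    s1 = math.sin(theta_t) / sin_theta_0
    return [s0 * x + s1 * y for x, y in zip(v0, v1)]
```

With this sketch, `slerp(0.5, [1.0, 0.0], [0.0, 1.0])` lands halfway along the arc between the two unit vectors, while `slerp(0.5, [1.0, 2.0, 3.0], [1.0, 2.0])` takes the guard branch and returns `[0.0]`.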
Source payload excerpt (from Hugging Face API)
{
"_id": "6887d472a7cd6d60807185fc",
"id": "ngxson/Home-Cook-Mistral-Small-Omni-24B-2507-GGUF",
"modelId": "ngxson/Home-Cook-Mistral-Small-Omni-24B-2507-GGUF",
"sha": "3b5cd372ffc468c20a4b979a387e8079451ba495",
"createdAt": "2025-07-28T19:50:10.000Z",
"lastModified": "2025-07-28T20:37:39.000Z",
"author": "ngxson",
"downloads": 6577,
"likes": 27,
"gated": false,
"private": false,
"pipeline_tag": "any-to-any",
"library_name": "",
"siblings_count": 7
}
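Note on the merge config quoted in `readme_markdown`: the `t` parameter for `self_attn` and `mlp` tensors is given as a six-element list, which mergekit documents as a gradient interpolated across layers. The sketch below shows one way such a list could expand into per-layer values, assuming piecewise-linear interpolation over relative layer depth. The helper names `gradient_value` and `per_layer_t` and the layer count of 40 are hypothetical and used only for illustration; check mergekit itself for the exact behavior.

```python
T_GRADIENT = [0.1, 0.3, 0.5, 0.3, 0.1, 0]  # list taken from the merge config
NUM_LAYERS = 40  # assumed layer count, for illustration only


def gradient_value(values, rel_depth):
    """Linearly interpolate `values` at rel_depth in [0, 1] (hypothetical helper)."""
    scaled = rel_depth * (len(values) - 1)
    i0 = int(scaled)                       # anchor point at or below rel_depth
    i1 = min(len(values) - 1, i0 + 1)      # next anchor point, clamped
    frac = scaled - i0
    return (1 - frac) * values[i0] + frac * values[i1]


def per_layer_t(values, num_layers):
    """Expand a gradient list into one interpolation weight per layer."""
    denom = max(num_layers - 1, 1)  # avoid division by zero for a single layer
    return [gradient_value(values, i / denom) for i in range(num_layers)]


if __name__ == "__main__":
    for layer, t in enumerate(per_layer_t(T_GRADIENT, NUM_LAYERS)):
        print(f"layer {layer:2d}: t = {t:.3f}")
```

Under these assumptions, `t` starts at 0.1 in the first layer, peaks at 0.5 around 40% depth, and falls to 0 in the last layer, so the merged attention and MLP weights stay closest to the base model at both ends of the stack. Tensors matching neither filter use the fixed fallback of 0.5.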