jinaai/jina-embeddings-v5-text-nano-clustering-gguf is indexed on GraySoft with repository links, GGUF quantization files (including IQ1_S), and Hugging Face metadata. Use this page to pick a quantization for guIDE or another local runtime. See related models in the same shard below.
Model Intelligence Sheet
jinaai/jina-embeddings-v5-text-nano-clustering-gguf overview
GGUF quantizations of jina-embeddings-v5-text-nano-clustering, produced with llama.cpp. This is a 239M-parameter multilingual embedding model quantized for efficient inference. Links: Elastic Inference Service | ArXiv | Blog. The authors highly recommend first reading their blog post on optimizing GGUFs for decoder-only embedding models, which covers technical details and a customized llama.cpp build.
| Field | Value |
|---|---|
| Downloads | 177 |
| Likes | 0 |
| Pipeline | feature-extraction |
| Library | llama.cpp |
| Visibility | Public |
| Access | Open |
Repository Files & Downloads
14 files detected
Direct downloads for all repository files
| File | Type | Quantization | Size | Link |
|---|---|---|---|---|
| v5-nano-clustering-F16.gguf | GGUF | F16 | 411.41 MB | Download |
| v5-nano-clustering-IQ1_M.gguf | GGUF | IQ1_M | 96.99 MB | Download |
| v5-nano-clustering-IQ1_S.gguf | GGUF | IQ1_S | 94.83 MB | Download |
| v5-nano-clustering-IQ2_M.gguf | GGUF | IQ2_M | 108.44 MB | Download |
| v5-nano-clustering-IQ2_XXS.gguf | GGUF | IQ2_XXS | 100.60 MB | Download |
| v5-nano-clustering-IQ4_NL.gguf | GGUF | IQ4_NL | 145.34 MB | Download |
| v5-nano-clustering-IQ4_XS.gguf | GGUF | IQ4_XS | 141.97 MB | Download |
| v5-nano-clustering-Q2_K.gguf | GGUF | Q2_K | 124.15 MB | Download |
| v5-nano-clustering-Q3_K_M.gguf | GGUF | Q3_K_M | 136.52 MB | Download |
| v5-nano-clustering-Q4_K_M.gguf | GGUF | Q4_K_M | 149.70 MB | Download |
| v5-nano-clustering-Q5_K_M.gguf | GGUF | Q5_K_M | 161.09 MB | Download |
| v5-nano-clustering-Q5_K_S.gguf | GGUF | Q5_K_S | 158.84 MB | Download |
| v5-nano-clustering-Q6_K.gguf | GGUF | Q6_K | 173.19 MB | Download |
| v5-nano-clustering-Q8_0.gguf | GGUF | Q8_0 | 222.10 MB | Download |
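
To make the size trade-off in the table concrete, here is a minimal Python sketch that picks the largest file fitting a given size budget. The sizes are copied from the table above. The helper name `pick_quant` is hypothetical, and treating file size as a rough proxy for fidelity is an assumption for illustration, not a quality ranking published by the repository.

```python
from typing import Optional

# File sizes in MB, copied from the repository file table.
QUANT_SIZES_MB = {
    "F16": 411.41,
    "IQ1_M": 96.99,
    "IQ1_S": 94.83,
    "IQ2_M": 108.44,
    "IQ2_XXS": 100.60,
    "IQ4_NL": 145.34,
    "IQ4_XS": 141.97,
    "Q2_K": 124.15,
    "Q3_K_M": 136.52,
    "Q4_K_M": 149.70,
    "Q5_K_M": 161.09,
    "Q5_K_S": 158.84,
    "Q6_K": 173.19,
    "Q8_0": 222.10,
}


def pick_quant(budget_mb: float) -> Optional[str]:
    """Return the filename of the largest quantization within the budget.

    Hypothetical helper: assumes a larger file generally preserves more
    of the original model, which is a heuristic and not a measured result.
    """
    fitting = {q: size for q, size in QUANT_SIZES_MB.items() if size <= budget_mb}
    if not fitting:
        return None  # nothing in the table fits the budget
    best = max(fitting, key=fitting.get)
    return f"v5-nano-clustering-{best}.gguf"
```

For example, `pick_quant(150)` selects `v5-nano-clustering-Q4_K_M.gguf`, while a budget below the smallest file returns `None`. Whether the very low-bit files (IQ1_S, IQ1_M) are accurate enough for a given clustering workload should be checked on your own data.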
Model Details
Metadata Inspector
Normalized metadata (stored in metadata_json)
{
"metadata": {},
"card_data": {
"pipeline_tag": "feature-extraction",
"tags": [
"gguf",
"embedding",
"eurobert",
"llama-cpp",
"jina-embeddings-v5"
],
"language": [
"multilingual"
],
"base_model": "jinaai/jina-embeddings-v5-text-nano",
"base_model_relation": "quantized",
"inference": false,
"license": "cc-by-nc-4.0",
"library_name": "llama.cpp",
"frontmatter": {
"pipeline_tag": "feature-extraction",
"tags": [
"gguf",
"embedding",
"eurobert",
"llama-cpp",
"jina-embeddings-v5"
],
"language": [
"multilingual"
],
"base_model": "jinaai/jina-embeddings-v5-text-nano",
"base_model_relation": "quantized",
"inference": "false",
"license": "cc-by-nc-4.0",
"library_name": "llama.cpp"
},
"hero_image_url": "https://jina-ai-gmbh.ghost.io/content/images/2026/02/v5_architecture_1771470917.png",
"summary": "GGUF quantizations of jina-embeddings-v5-text-nano-clustering using llama.cpp. A 239M parameter multilingual embedding model quantized for efficient inference. Elastic Inference Service | ArXiv | Blog > [!IMPORTANT] > We highly recommend to first read this blog post for more technical details and customized llama.cpp build.",
"quick_links": [],
"benchmark_table_html": "",
"readme_markdown": "---\npipeline_tag: feature-extraction\ntags:\n- gguf\n- embedding\n- eurobert\n- llama-cpp\n- jina-embeddings-v5\nlanguage:\n- multilingual\nbase_model: jinaai/jina-embeddings-v5-text-nano\nbase_model_relation: quantized\ninference: false\nlicense: cc-by-nc-4.0\nlibrary_name: llama.cpp\n---\n\n# jina-embeddings-v5-text-nano-clustering-GGUF\n\nGGUF quantizations of [jina-embeddings-v5-text-nano-clustering](https://huggingface.co/jinaai/jina-embeddings-v5-text-nano) using llama.cpp. A 239M parameter multilingual embedding model quantized for efficient inference.\n\n[Elastic Inference Service](https://www.elastic.co/docs/explore-analyze/elastic-inference/eis) | [ArXiv](https://arxiv.org/abs/2602.15547) | [Blog](https://jina.ai/news/jina-embeddings-v5-text-distilling-4b-quality-into-sub-1b-multilingual-embeddings)\n\n> [!IMPORTANT]\n> We highly recommend to first read [this blog post for more technical details and customized llama.cpp build](https://jina.ai/news/optimizing-ggufs-for-decoder-only-embedding-models).\n\n## Overview\n\n<p align=\"center\">\n<img src=\"https://jina-ai-gmbh.ghost.io/content/images/2026/02/v5_architecture_1771470917.png\" alt=\"jina-embeddings-v5-text Architecture\" width=\"600px\">\n</p>\n\n`jina-embeddings-v5-text-nano-clustering` is a task-specific embedding model for **clustering**, part of the [jina-embeddings-v5-text](https://huggingface.co/jinaai/jina-embeddings-v5-text-nano) model family.\n| Feature | Value |\n| --- | --- |\n| Parameters | 239M |\n| Task | `clustering` |\n| Embedding Dimension | 768 |\n| Matryoshka Dimensions | 32, 64, 128, 256, 512, 768 |\n| Pooling Strategy | Last-token pooling |\n| Base Model | [jina-embeddings-v5-text-nano](https://huggingface.co/jinaai/jina-embeddings-v5-text-nano) |\n\n<p align=\"center\">\n<img src=\"https://jina-ai-gmbh.ghost.io/content/images/2026/02/v5_mmteb-4.png\" alt=\"MMTEB Multilingual Benchmark\" width=\"500px\">\n</p>\n\n<p align=\"center\">\n<img 
src=\"https://jina-ai-gmbh.ghost.io/content/images/2026/02/v5_mteb_en-4.png\" alt=\"MTEB English Benchmark\" width=\"500px\">\n</p>\n\n<p align=\"center\">\n<img src=\"https://jina-ai-gmbh.ghost.io/content/images/2026/02/v5_retrieval-4.png\" alt=\"Retrieval Benchmark Results\" width=\"500px\">\n</p>\n\n\n## Usage with llama.cpp\n\n<details open>\n <summary>via <a href=\"https://www.elastic.co/docs/explore-analyze/elastic-inference/eis\">Elastic Inference Service</a></summary>\n\nThe fastest way to use v5-text in production. Elastic Inference Service (EIS) provides managed embedding inference with built-in scaling, so you can generate embeddings directly within your Elastic deployment.\n\n```bash\nPUT _inference/text_embedding/jina-v5\n{\n \"service\": \"elastic\",\n \"service_settings\": {\n \"model_id\": \"jina-embeddings-v5-text-nano\"\n }\n}\n```\n\nSee the [Elastic Inference Service documentation](https://www.elastic.co/docs/explore-analyze/elastic-inference/eis) for setup details.\n\n</details>\n\n```bash\n# Build llama.cpp (upstream)\ngit clone https://github.com/ggml-org/llama.cpp\ncd llama.cpp && cmake -B build && cmake --build build --config Release\n\n# Run embedding\n./build/bin/llama-embedding -m jina-embeddings-v5-text-nano-clustering-Q8_0.gguf \\\n --pooling last -p \"Your text here\"\n```\n\n## License\n\nCC-BY-NC-4.0. For commercial use, please [contact us](https://jina.ai/contact-sales).\n",
"related_quantizations": []
},
"tags": [
"llama.cpp",
"gguf",
"embedding",
"eurobert",
"llama-cpp",
"jina-embeddings-v5",
"feature-extraction",
"multilingual",
"arxiv:2602.15547",
"base_model:jinaai/jina-embeddings-v5-text-nano",
"base_model:quantized:jinaai/jina-embeddings-v5-text-nano",
"license:cc-by-nc-4.0",
"region:eu"
],
"likes": 0,
"downloads": 177,
"gated": false,
"private": false,
"last_modified": "2026-02-27T12:23:33.000Z",
"created_at": "2026-02-18T16:16:49.000Z",
"pipeline_tag": "feature-extraction",
"library_name": "llama.cpp"
}
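
The README stored in the metadata above lists Matryoshka dimensions of 32, 64, 128, 256, 512, and 768. A common way to use such dimensions is to keep the first k components of the full 768-dimensional vector and re-normalize it to unit length. The sketch below illustrates that, assuming you already have an embedding as a sequence of floats (for example, parsed from `llama-embedding` output). The function names `truncate_embedding` and `cosine` are hypothetical and are not part of llama.cpp or the model repository.

```python
import math
from typing import List, Sequence

# Matryoshka dimensions listed in the model README.
MATRYOSHKA_DIMS = (32, 64, 128, 256, 512, 768)


def truncate_embedding(vec: Sequence[float], dim: int) -> List[float]:
    """Keep the first `dim` components and L2-normalize the result."""
    if dim not in MATRYOSHKA_DIMS:
        raise ValueError(f"dim must be one of {MATRYOSHKA_DIMS}")
    if len(vec) < dim:
        raise ValueError("embedding is shorter than the requested dimension")
    head = list(vec[:dim])
    norm = math.sqrt(sum(x * x for x in head))
    if norm == 0.0:
        return head  # all-zero vector: nothing to normalize
    return [x / norm for x in head]


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity of two unit-length vectors (a plain dot product)."""
    return sum(x * y for x, y in zip(a, b))
```

Because the truncated vectors are unit length, a dot product is enough to compare them, which keeps downstream clustering code simple. Smaller dimensions reduce memory and compute at some cost in accuracy; how much is acceptable depends on the workload.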
Source payload excerpt (from Hugging Face API)
{
"_id": "6995e5f1aa3c4d5606365283",
"id": "jinaai/jina-embeddings-v5-text-nano-clustering-GGUF",
"modelId": "jinaai/jina-embeddings-v5-text-nano-clustering-GGUF",
"sha": "bc43da70719b9e941bc2094d8157e3b0ac762c17",
"createdAt": "2026-02-18T16:16:49.000Z",
"lastModified": "2026-02-27T12:23:33.000Z",
"author": "jinaai",
"downloads": 177,
"likes": 0,
"gated": false,
"private": false,
"pipeline_tag": "feature-extraction",
"library_name": "llama.cpp",
"siblings_count": 16
}