The IQ1_S GGUF quantization of jinaai/jina-embeddings-v5-text-nano-clustering-gguf is indexed on GraySoft as a free download, together with repository links, the other GGUF quant files, and Hugging Face metadata. Use this page to pick a local model for guIDE or other runtimes. Related models in the same shard are listed below.

Model Intelligence Sheet

jinaai/jina-embeddings-v5-text-nano-clustering-gguf overview

GGUF quantizations of jina-embeddings-v5-text-nano-clustering, produced with llama.cpp. A 239M-parameter multilingual embedding model quantized for efficient inference. Links: Elastic Inference Service | ArXiv | Blog. We highly recommend reading this blog post first for more technical details and the customized llama.cpp build: https://jina.ai/news/optimizing-ggufs-for-decoder-only-embedding-models

Tags: llama.cpp, gguf, embedding, eurobert, llama-cpp, jina-embeddings-v5, feature-extraction, multilingual, arxiv:2602.15547, base_model:jinaai/jina-embeddings-v5-text-nano, base_model:quantized:jinaai/jina-embeddings-v5-text-nano, license:cc-by-nc-4.0, region:eu

Downloads

177

Likes

0

Pipeline

feature-extraction

Library

llama.cpp

Visibility

Public

Access

Open

Repository Files & Downloads

14 files detected

Direct downloads for all repository files

File	Type	Quantization	Size	Link
v5-nano-clustering-F16.gguf	GGUF	F16	411.41 MB	Download
v5-nano-clustering-IQ1_M.gguf	GGUF	IQ1_M	96.99 MB	Download
v5-nano-clustering-IQ1_S.gguf	GGUF	IQ1_S	94.83 MB	Download
v5-nano-clustering-IQ2_M.gguf	GGUF	IQ2_M	108.44 MB	Download
v5-nano-clustering-IQ2_XXS.gguf	GGUF	IQ2_XXS	100.60 MB	Download
v5-nano-clustering-IQ4_NL.gguf	GGUF	IQ4_NL	145.34 MB	Download
v5-nano-clustering-IQ4_XS.gguf	GGUF	IQ4_XS	141.97 MB	Download
v5-nano-clustering-Q2_K.gguf	GGUF	Q2_K	124.15 MB	Download
v5-nano-clustering-Q3_K_M.gguf	GGUF	Q3_K_M	136.52 MB	Download
v5-nano-clustering-Q4_K_M.gguf	GGUF	Q4_K_M	149.70 MB	Download
v5-nano-clustering-Q5_K_M.gguf	GGUF	Q5_K_M	161.09 MB	Download
v5-nano-clustering-Q5_K_S.gguf	GGUF	Q5_K_S	158.84 MB	Download
v5-nano-clustering-Q6_K.gguf	GGUF	Q6_K	173.19 MB	Download
v5-nano-clustering-Q8_0.gguf	GGUF	Q8_0	222.10 MB	Download
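
One way to choose among the files above is to take the largest quantization that fits a size budget. The sketch below is only illustrative: the sizes are copied from the table, file size is used as a rough proxy for precision (a heuristic, not a measured quality ranking), and the helper names `largest_quant_under` and `download_url` are hypothetical. The URL follows the usual Hugging Face `resolve` pattern, which is assumed here rather than taken from this page.

```python
# File sizes in MB, copied from the table above.
QUANT_SIZES_MB = {
    "F16": 411.41,
    "IQ1_M": 96.99,
    "IQ1_S": 94.83,
    "IQ2_M": 108.44,
    "IQ2_XXS": 100.60,
    "IQ4_NL": 145.34,
    "IQ4_XS": 141.97,
    "Q2_K": 124.15,
    "Q3_K_M": 136.52,
    "Q4_K_M": 149.70,
    "Q5_K_M": 161.09,
    "Q5_K_S": 158.84,
    "Q6_K": 173.19,
    "Q8_0": 222.10,
}

# Repository id as reported in the Hugging Face payload on this page.
REPO_ID = "jinaai/jina-embeddings-v5-text-nano-clustering-GGUF"


def largest_quant_under(budget_mb):
    """Return the quantization with the largest file that fits the budget, or None."""
    fitting = {name: size for name, size in QUANT_SIZES_MB.items() if size <= budget_mb}
    if not fitting:
        return None
    return max(fitting, key=fitting.get)


def download_url(quant, revision="main"):
    """Build a download URL (assumed 'resolve' pattern; file names follow the table)."""
    filename = f"v5-nano-clustering-{quant}.gguf"
    return f"https://huggingface.co/{REPO_ID}/resolve/{revision}/{filename}"


if __name__ == "__main__":
    choice = largest_quant_under(150)
    print(choice, download_url(choice))
```

With a 150 MB budget this selects Q4_K_M (149.70 MB); a budget below 94.83 MB returns `None` because no file in the table is that small.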

Model Details Live

Model Slug

jinaai/jina-embeddings-v5-text-nano-clustering-gguf

Author

jinaai

Pipeline Task

feature-extraction

Library

llama.cpp

Created

2026-02-18

Last Modified

2026-02-27

Gated

No

Private

No

HF SHA

bc43da70719b9e941bc2094d8157e3b0ac762c17

License

cc-by-nc-4.0

Language

multilingual

Base Model

jinaai/jina-embeddings-v5-text-nano

Metadata Inspector

Normalized metadata (stored in metadata_json)

{
  "metadata": {},
  "card_data": {
    "pipeline_tag": "feature-extraction",
    "tags": [
      "gguf",
      "embedding",
      "eurobert",
      "llama-cpp",
      "jina-embeddings-v5"
    ],
    "language": [
      "multilingual"
    ],
    "base_model": "jinaai/jina-embeddings-v5-text-nano",
    "base_model_relation": "quantized",
    "inference": false,
    "license": "cc-by-nc-4.0",
    "library_name": "llama.cpp",
    "frontmatter": {
      "pipeline_tag": "feature-extraction",
      "tags": [
        "gguf",
        "embedding",
        "eurobert",
        "llama-cpp",
        "jina-embeddings-v5"
      ],
      "language": [
        "multilingual"
      ],
      "base_model": "jinaai/jina-embeddings-v5-text-nano",
      "base_model_relation": "quantized",
      "inference": "false",
      "license": "cc-by-nc-4.0",
      "library_name": "llama.cpp"
    },
    "hero_image_url": "https://jina-ai-gmbh.ghost.io/content/images/2026/02/v5_architecture_1771470917.png",
    "summary": "GGUF quantizations of jina-embeddings-v5-text-nano-clustering using llama.cpp. A 239M parameter multilingual embedding model quantized for efficient inference. Elastic Inference Service | ArXiv | Blog > [!IMPORTANT] > We highly recommend to first read this blog post for more technical details and customized llama.cpp build.",
    "quick_links": [],
    "benchmark_table_html": "",
    "readme_markdown": "---\npipeline_tag: feature-extraction\ntags:\n- gguf\n- embedding\n- eurobert\n- llama-cpp\n- jina-embeddings-v5\nlanguage:\n- multilingual\nbase_model: jinaai/jina-embeddings-v5-text-nano\nbase_model_relation: quantized\ninference: false\nlicense: cc-by-nc-4.0\nlibrary_name: llama.cpp\n---\n\n# jina-embeddings-v5-text-nano-clustering-GGUF\n\nGGUF quantizations of [jina-embeddings-v5-text-nano-clustering](https://huggingface.co/jinaai/jina-embeddings-v5-text-nano) using llama.cpp. A 239M parameter multilingual embedding model quantized for efficient inference.\n\n[Elastic Inference Service](https://www.elastic.co/docs/explore-analyze/elastic-inference/eis) | [ArXiv](https://arxiv.org/abs/2602.15547) | [Blog](https://jina.ai/news/jina-embeddings-v5-text-distilling-4b-quality-into-sub-1b-multilingual-embeddings)\n\n> [!IMPORTANT]\n> We highly recommend to first read [this blog post for more technical details and customized llama.cpp build](https://jina.ai/news/optimizing-ggufs-for-decoder-only-embedding-models).\n\n## Overview\n\n<p align=\"center\">\n<img src=\"https://jina-ai-gmbh.ghost.io/content/images/2026/02/v5_architecture_1771470917.png\" alt=\"jina-embeddings-v5-text Architecture\" width=\"600px\">\n</p>\n\n`jina-embeddings-v5-text-nano-clustering` is a task-specific embedding model for **clustering**, part of the [jina-embeddings-v5-text](https://huggingface.co/jinaai/jina-embeddings-v5-text-nano) model family.\n| Feature | Value |\n| --- | --- |\n| Parameters | 239M |\n| Task | `clustering` |\n| Embedding Dimension | 768 |\n| Matryoshka Dimensions | 32, 64, 128, 256, 512, 768 |\n| Pooling Strategy | Last-token pooling |\n| Base Model | [jina-embeddings-v5-text-nano](https://huggingface.co/jinaai/jina-embeddings-v5-text-nano) |\n\n<p align=\"center\">\n<img src=\"https://jina-ai-gmbh.ghost.io/content/images/2026/02/v5_mmteb-4.png\" alt=\"MMTEB Multilingual Benchmark\" width=\"500px\">\n</p>\n\n<p align=\"center\">\n<img 
src=\"https://jina-ai-gmbh.ghost.io/content/images/2026/02/v5_mteb_en-4.png\" alt=\"MTEB English Benchmark\" width=\"500px\">\n</p>\n\n<p align=\"center\">\n<img src=\"https://jina-ai-gmbh.ghost.io/content/images/2026/02/v5_retrieval-4.png\" alt=\"Retrieval Benchmark Results\" width=\"500px\">\n</p>\n\n\n## Usage with llama.cpp\n\n<details open>\n  <summary>via <a href=\"https://www.elastic.co/docs/explore-analyze/elastic-inference/eis\">Elastic Inference Service</a></summary>\n\nThe fastest way to use v5-text in production. Elastic Inference Service (EIS) provides managed embedding inference with built-in scaling, so you can generate embeddings directly within your Elastic deployment.\n\n```bash\nPUT _inference/text_embedding/jina-v5\n{\n  \"service\": \"elastic\",\n  \"service_settings\": {\n    \"model_id\": \"jina-embeddings-v5-text-nano\"\n  }\n}\n```\n\nSee the [Elastic Inference Service documentation](https://www.elastic.co/docs/explore-analyze/elastic-inference/eis) for setup details.\n\n</details>\n\n```bash\n# Build llama.cpp (upstream)\ngit clone https://github.com/ggml-org/llama.cpp\ncd llama.cpp && cmake -B build && cmake --build build --config Release\n\n# Run embedding\n./build/bin/llama-embedding -m jina-embeddings-v5-text-nano-clustering-Q8_0.gguf \\\n  --pooling last -p \"Your text here\"\n```\n\n## License\n\nCC-BY-NC-4.0. For commercial use, please [contact us](https://jina.ai/contact-sales).\n",
    "related_quantizations": []
  },
  "tags": [
    "llama.cpp",
    "gguf",
    "embedding",
    "eurobert",
    "llama-cpp",
    "jina-embeddings-v5",
    "feature-extraction",
    "multilingual",
    "arxiv:2602.15547",
    "base_model:jinaai/jina-embeddings-v5-text-nano",
    "base_model:quantized:jinaai/jina-embeddings-v5-text-nano",
    "license:cc-by-nc-4.0",
    "region:eu"
  ],
  "likes": 0,
  "downloads": 177,
  "gated": false,
  "private": false,
  "last_modified": "2026-02-27T12:23:33.000Z",
  "created_at": "2026-02-18T16:16:49.000Z",
  "pipeline_tag": "feature-extraction",
  "library_name": "llama.cpp"
}
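
The readme stored in the metadata above lists an embedding dimension of 768 and Matryoshka dimensions 32, 64, 128, 256, 512, and 768. A common way to use Matryoshka embeddings is to keep the first `dim` components and re-normalize to unit length. The sketch below assumes that convention applies to this model; the readme does not spell out the procedure, and the function names are hypothetical.

```python
import math

# Matryoshka dimensions as listed in the readme.
MATRYOSHKA_DIMS = (32, 64, 128, 256, 512, 768)


def truncate_embedding(vec, dim):
    """Keep the first `dim` components and L2-normalize the result.

    Assumes the common Matryoshka convention (prefix truncation followed
    by re-normalization); not confirmed by the model card.
    """
    if dim not in MATRYOSHKA_DIMS:
        raise ValueError(f"dim must be one of {MATRYOSHKA_DIMS}")
    if len(vec) < dim:
        raise ValueError("embedding is shorter than the requested dimension")
    head = list(vec[:dim])
    norm = math.sqrt(sum(x * x for x in head))
    if norm == 0.0:
        return head  # an all-zero prefix cannot be normalized
    return [x / norm for x in head]


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


if __name__ == "__main__":
    # Placeholder vector standing in for a real 768-dimensional embedding.
    full = [3.0, 4.0] + [0.0] * 766
    small = truncate_embedding(full, 64)
    print(len(small), small[:2])
```

Smaller dimensions reduce storage and speed up similarity search, usually at some cost in clustering quality; the trade-off for this model would need to be measured on your own data.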

Source payload excerpt (from Hugging Face API)

{
  "_id": "6995e5f1aa3c4d5606365283",
  "id": "jinaai/jina-embeddings-v5-text-nano-clustering-GGUF",
  "modelId": "jinaai/jina-embeddings-v5-text-nano-clustering-GGUF",
  "sha": "bc43da70719b9e941bc2094d8157e3b0ac762c17",
  "createdAt": "2026-02-18T16:16:49.000Z",
  "lastModified": "2026-02-27T12:23:33.000Z",
  "author": "jinaai",
  "downloads": 177,
  "likes": 0,
  "gated": false,
  "private": false,
  "pipeline_tag": "feature-extraction",
  "library_name": "llama.cpp",
  "siblings_count": 16
}

More models in this shard