GraySoft

hiratagoh/nvidia-nemotron-nano-9b-v2-japanese-gguf is indexed on GraySoft with repository links, GGUF quant files (including IQ4_NL), and Hugging Face metadata. Use this page to pick a local model for guIDE or other runtimes. Related models in the same shard are listed below.

Model Intelligence Sheet

hiratagoh/nvidia-nemotron-nano-9b-v2-japanese-gguf overview


Tags: gguf, text-generation, ja, en, dataset:TFMC/imatrix-dataset-for-japanese-llm, base_model:nvidia/NVIDIA-Nemotron-Nano-9B-v2-Japanese, base_model:quantized:nvidia/NVIDIA-Nemotron-Nano-9B-v2-Japanese, license:other, endpoints_compatible, region:us, conversational
Downloads: 115
Likes: 0
Pipeline: text-generation
Library: (not specified)
Visibility: Public
Access: Open

Repository Files & Downloads

7 files detected
Direct downloads for all repository files
File Type Quantization Size Link
NVIDIA-Nemotron-Nano-9B-v2-Japanese-BF16.gguf GGUF BF16 16.57 GB Download
NVIDIA-Nemotron-Nano-9B-v2-Japanese-IQ4_NL.gguf GGUF IQ4_NL 4.94 GB Download
NVIDIA-Nemotron-Nano-9B-v2-Japanese-IQ4_XS.gguf GGUF IQ4_XS 4.91 GB Download
NVIDIA-Nemotron-Nano-9B-v2-Japanese-Q4_K_M.gguf GGUF Q4_K_M 6.08 GB Download
NVIDIA-Nemotron-Nano-9B-v2-Japanese-Q5_K_M.gguf GGUF Q5_K_M 6.58 GB Download
NVIDIA-Nemotron-Nano-9B-v2-Japanese-Q6_K.gguf GGUF Q6_K 8.51 GB Download
NVIDIA-Nemotron-Nano-9B-v2-Japanese-Q8_0.gguf GGUF Q8_0 8.81 GB Download
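
The file sizes listed above can guide which quantization to download. The following is a minimal Python sketch, not part of the repository: `pick_file` and the `headroom_gb` default are hypothetical, and the rule "largest file that fits is the best choice" is an assumption that roughly matches the README's advice to prefer BF16 or Q8_0 when the environment allows. Actual memory use also depends on context length and runtime overhead, which this sketch only approximates with a fixed headroom.

```python
from typing import Optional

# File sizes in GB, copied from the repository file table on this page.
FILES = {
    "NVIDIA-Nemotron-Nano-9B-v2-Japanese-BF16.gguf": 16.57,
    "NVIDIA-Nemotron-Nano-9B-v2-Japanese-IQ4_NL.gguf": 4.94,
    "NVIDIA-Nemotron-Nano-9B-v2-Japanese-IQ4_XS.gguf": 4.91,
    "NVIDIA-Nemotron-Nano-9B-v2-Japanese-Q4_K_M.gguf": 6.08,
    "NVIDIA-Nemotron-Nano-9B-v2-Japanese-Q5_K_M.gguf": 6.58,
    "NVIDIA-Nemotron-Nano-9B-v2-Japanese-Q6_K.gguf": 8.51,
    "NVIDIA-Nemotron-Nano-9B-v2-Japanese-Q8_0.gguf": 8.81,
}


def pick_file(budget_gb: float, headroom_gb: float = 1.5) -> Optional[str]:
    """Return the largest listed file that fits in budget_gb minus headroom_gb.

    headroom_gb is a hypothetical allowance for KV cache and runtime overhead.
    Returns None when no listed file fits.
    """
    limit = budget_gb - headroom_gb
    fitting = {name: size for name, size in FILES.items() if size <= limit}
    if not fitting:
        return None
    # Assumption: a larger file means less aggressive quantization.
    return max(fitting, key=fitting.get)


if __name__ == "__main__":
    for budget in (24, 12, 8, 4):
        print(budget, "GB ->", pick_file(budget))
```

With a 12 GB budget and the default headroom, the sketch selects the Q8_0 file; with 8 GB it falls back to Q4_K_M.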

Model Details

Model Slug: hiratagoh/nvidia-nemotron-nano-9b-v2-japanese-gguf
Author: hiratagoh
Pipeline Task: text-generation
Library: (not specified)
Created: 2026-02-21
Last Modified: 2026-02-25
Gated: No
Private: No
HF SHA: 4b20eb1b0e07295b64b2dd06320aae40d49ebfac
License: other
Language: ja, en
Base Model: nvidia/NVIDIA-Nemotron-Nano-9B-v2-Japanese

Metadata Inspector

Normalized metadata (stored in metadata_json)
{
  "metadata": {},
  "card_data": {
    "license": "other",
    "license_name": "nvidia-nemotron-open-model-license",
    "license_link": "https://www.nvidia.com/en-us/agreements/enterprise-software/nvidia-nemotron-open-model-license/",
    "base_model": "nvidia/NVIDIA-Nemotron-Nano-9B-v2-Japanese",
    "datasets": [
      "TFMC/imatrix-dataset-for-japanese-llm"
    ],
    "track_downloads": true,
    "language": [
      "ja",
      "en"
    ],
    "pipeline_tag": "text-generation",
    "frontmatter": {
      "license": "other",
      "license_name": "nvidia-nemotron-open-model-license",
      "license_link": ">-",
      "base_model": "nvidia/NVIDIA-Nemotron-Nano-9B-v2-Japanese",
      "datasets": [
        "TFMC/imatrix-dataset-for-japanese-llm"
      ],
      "track_downloads": "true",
      "language": [
        "ja",
        "en"
      ],
      "pipeline_tag": "text-generation"
    },
    "hero_image_url": "",
    "summary": "",
    "quick_links": [],
    "benchmark_table_html": "",
    "readme_markdown": "---\nlicense: other\nlicense_name: nvidia-nemotron-open-model-license\nlicense_link: >-\n  https://www.nvidia.com/en-us/agreements/enterprise-software/nvidia-nemotron-open-model-license/\nbase_model: nvidia/NVIDIA-Nemotron-Nano-9B-v2-Japanese\ndatasets:\n- TFMC/imatrix-dataset-for-japanese-llm\ntrack_downloads: true\nlanguage:\n- ja\n- en\npipeline_tag: text-generation\n---\n\n# NVIDIA-Nemotron-Nano-9B-v2-Japanese-GGUF\n\n## GGUF conversion and quantization\n\n[nvidia/NVIDIA-Nemotron-Nano-9B-v2-Japanese](https://huggingface.co/nvidia/NVIDIA-Nemotron-Nano-9B-v2-Japanese) was converted to GGUF format with `convert_hf_to_gguf.py` from\n[llama.cpp](https://github.com/ggml-org/llama.cpp.git) and quantized with `llama-quantize`.\n\nBecause the original model is lightweight, using BF16 or Q8_0 is recommended if your runtime environment allows.\n\n## iMatrix generation\n\nThe iMatrix was generated with `llama-imatrix`, using `c4_en_ja_imatrix.txt` from\n[TFMC/imatrix-dataset-for-japanese-llm](https://huggingface.co/datasets/TFMC/imatrix-dataset-for-japanese-llm/tree/main)\nas the calibration data.\n\n## IQ4_XS quantization\n\nDuring **IQ4_XS quantization**, `llama-quantize` logs messages such as\n```\nllama_model_quantize_impl : tensor cols 4480 x 131072 are not divisible by 256, required for iq4_xs - using fallback quantization iq4_nl\n```\nand **most of the layers quantized to 4 bits end up as IQ4_NL**. The file is nominally labeled IQ4_XS, but its contents are almost entirely IQ4_NL.\n",
    "related_quantizations": []
  },
  "tags": [
    "gguf",
    "text-generation",
    "ja",
    "en",
    "dataset:TFMC/imatrix-dataset-for-japanese-llm",
    "base_model:nvidia/NVIDIA-Nemotron-Nano-9B-v2-Japanese",
    "base_model:quantized:nvidia/NVIDIA-Nemotron-Nano-9B-v2-Japanese",
    "license:other",
    "endpoints_compatible",
    "region:us",
    "conversational"
  ],
  "likes": 0,
  "downloads": 115,
  "gated": false,
  "private": false,
  "last_modified": "2026-02-25T20:28:39.000Z",
  "created_at": "2026-02-21T07:21:41.000Z",
  "pipeline_tag": "text-generation",
  "library_name": ""
}
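
The `llama-quantize` log line quoted in the README above reports that a tensor with 4480 columns cannot use IQ4_XS because 4480 is not divisible by 256, so IQ4_NL is used instead. The Python sketch below illustrates only that divisibility check; it is not llama.cpp code. The function name `effective_type` is hypothetical, and the IQ4_NL block size of 32 is an assumption that the source does not state.

```python
IQ4_XS_BLOCK = 256  # divisor named in the quoted log message
IQ4_NL_BLOCK = 32   # assumed IQ4_NL block size (not stated in the source)


def effective_type(n_cols: int, requested: str = "iq4_xs") -> str:
    """Return the quantization type a tensor would end up with.

    Illustrates the fallback described in the log: IQ4_XS requires the
    column count to be divisible by 256; otherwise IQ4_NL is used.
    """
    if requested == "iq4_xs" and n_cols % IQ4_XS_BLOCK != 0:
        if n_cols % IQ4_NL_BLOCK != 0:
            # The fallback itself would not fit under the assumed block size.
            raise ValueError(f"{n_cols} columns fit neither iq4_xs nor iq4_nl")
        return "iq4_nl"
    return requested


if __name__ == "__main__":
    # 4480 = 17 * 256 + 128, so the IQ4_XS requirement fails for this tensor.
    print(4480 % IQ4_XS_BLOCK, effective_type(4480))
    print(4096 % IQ4_XS_BLOCK, effective_type(4096))
```

This is consistent with the file table, where the IQ4_XS and IQ4_NL files differ in size by only about 0.03 GB.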
Source payload excerpt (from Hugging Face API)
{
  "_id": "69995d05abe59a025b8c9d6b",
  "id": "hiratagoh/NVIDIA-Nemotron-Nano-9B-v2-Japanese-GGUF",
  "modelId": "hiratagoh/NVIDIA-Nemotron-Nano-9B-v2-Japanese-GGUF",
  "sha": "4b20eb1b0e07295b64b2dd06320aae40d49ebfac",
  "createdAt": "2026-02-21T07:21:41.000Z",
  "lastModified": "2026-02-25T20:28:39.000Z",
  "author": "hiratagoh",
  "downloads": 115,
  "likes": 0,
  "gated": false,
  "private": false,
  "pipeline_tag": "text-generation",
  "library_name": "",
  "siblings_count": 10
}