NVIDIA-Nemotron-3-Nano-30B-A3B-GGUF

云碩科技 · xCloudinfo　·　系列：社群量化 · Community GGUF

nvidia/NVIDIA-Nemotron-3-Nano-30B-A3B 的 GGUF（llama.cpp / Ollama） 量化版本（30B 總參、A3B≈3B 活躍 MoE），供地端部署。

> 各量化等級見 Files 分頁。

用法

llama-server -m NVIDIA-Nemotron-3-Nano-30B-A3B-<quant>.gguf -c 4096 -ngl 99

---

由云碩科技 xCloudinfo 重新量化、散布。

Question 2

What license applies to xCloudinfo/NVIDIA-Nemotron-3-Nano-30B-A3B-GGUF?

Accepted Answer

License: other. Verify terms on Hugging Face before commercial use.

Question 3

How do I run xCloudinfo/NVIDIA-Nemotron-3-Nano-30B-A3B-GGUF locally?

Accepted Answer

Download a GGUF file from this page and load it in guIDE or llama.cpp. Pipeline task: text-generation.

Question 4

How much VRAM or disk space does xCloudinfo/NVIDIA-Nemotron-3-Nano-30B-A3B-GGUF need?

Accepted Answer

Runs locally from ~22.83 GB disk (24 GB VRAM class GPUs with llama.cpp / guIDE).

File	Type	Quantization	Size	Link
NVIDIA-Nemotron-3-Nano-30B-A3B-Q4_K_M.gguf	GGUF	Q4_K_M	22.83 GB	Download
NVIDIA-Nemotron-3-Nano-30B-A3B-Q6_K.gguf	GGUF	Q6_K	31.21 GB	Download
NVIDIA-Nemotron-3-Nano-30B-A3B-Q8_0.gguf	GGUF	Q8_0	31.28 GB	Download
NVIDIA-Nemotron-3-Nano-30B-A3B-f16.gguf	GGUF	F16	58.84 GB	Download

Model ID	xCloudinfo/NVIDIA-Nemotron-3-Nano-30B-A3B-GGUF
Author	xCloudinfo
Pipeline	text-generation
License	other
Base model	nvidia/NVIDIA-Nemotron-3-Nano-30B-A3B-BF16
Last modified	2026-06-14T02:05:34.000Z