Question 1

What is avar6/NVIDIA-Nemotron-3-Ultra-550B-A55B-Base-gguf?

Accepted Answer

---
base_model:
- nvidia/NVIDIA-Nemotron-3-Ultra-550B-A55B-Base-BF16
tags:
- gguf
- optimized
- mixed-gguf
---

llama.cpp mainline-compatible GGUF quants of nvidia/NVIDIA-Nemotron-3-Ultra-550B-A55B-Base-BF16. Note that this is NOT the instruct-tuned model; it is NVIDIA's base checkpoint.

These quants are made using the same scheme as Aes Sedai's optimized quants for Nemotron Ultra instruct. He graciously provided the imatrix and commands used for these, though as of this commit (10 June 2026) llama.cpp still needs to be patched in order to make Nemotron Ultra GGUFs.

With chat completions, this model produces some artifacts and stray strings in the chat, but it responds well enough to turn-based chatting. Using text completions with the instruct JSON below, the model actually responds normally. Tested in SillyTavern without thinking.

{ "input_sequence": "<|im_start|>user", "output_sequen…

Question 2

What license applies to avar6/NVIDIA-Nemotron-3-Ultra-550B-A55B-Base-gguf?

Accepted Answer

License: See model card. Verify terms on Hugging Face before commercial use.

Question 3

How do I run avar6/NVIDIA-Nemotron-3-Ultra-550B-A55B-Base-gguf locally?

Accepted Answer

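Download the GGUF files from this page and load them in guIDE or llama.cpp. The IQ4_XS quant is split into six shards, so download all six into the same folder and point the loader at the first one (the -00001-of-00006 file); llama.cpp picks up the remaining shards from there. Pipeline task: text-generation.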
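The file names in the listing on this page follow a numbered shard pattern, so the full set of paths and a matching llama.cpp command line can be generated rather than typed by hand. The sketch below is only an illustration under stated assumptions: `model_dir`, the prompt, and the 128-token limit are hypothetical values, and it assumes a `llama-cli` binary from a llama.cpp build is on your PATH. It only builds the argument list and does not download or run anything.

```python
# Names taken from the file listing on this page.
QUANT = "IQ4_XS"
STEM = "NVIDIA-Nemotron-3-Ultra-550B-A55B-Base-BF16-IQ4_XS.gguf"
N_SHARDS = 6


def shard_paths(n_shards: int = N_SHARDS) -> list[str]:
    """Return the repo-relative path of every shard, in order."""
    return [
        f"{QUANT}/{STEM}-{i:05d}-of-{n_shards:05d}.gguf"
        for i in range(1, n_shards + 1)
    ]


def llama_cli_argv(model_dir: str, prompt: str, n_predict: int = 128) -> list[str]:
    """Build an argv list for llama-cli, pointing -m at the first shard.

    model_dir is a hypothetical local folder that mirrors the repo layout.
    """
    first_shard = f"{model_dir}/{shard_paths()[0]}"
    return ["llama-cli", "-m", first_shard, "-p", prompt, "-n", str(n_predict)]


if __name__ == "__main__":
    for path in shard_paths():
        print(path)
    print(" ".join(llama_cli_argv("/models/nemotron", "Once upon a time")))
```

Because this is a base checkpoint, a plain text-completion prompt like the one above is likely a better fit than a chat-style request.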

Question 4

How much VRAM or disk space does avar6/NVIDIA-Nemotron-3-Ultra-550B-A55B-Base-gguf need?

Accepted Answer

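The only quant listed, IQ4_XS, is split into six shards totaling about 258.61 GB on disk, and all six are required. The ~31.71 GB figure is the size of the last shard alone, not of the whole model. Expect to need combined VRAM and system RAM on the order of the total file size, plus room for context, to run it with llama.cpp / guIDE.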

File	Type	Quantization	Size	Link
IQ4_XS/NVIDIA-Nemotron-3-Ultra-550B-A55B-Base-BF16-IQ4_XS.gguf-00001-of-00006.gguf	GGUF	IQ4_XS	44.83 GB	Download
IQ4_XS/NVIDIA-Nemotron-3-Ultra-550B-A55B-Base-BF16-IQ4_XS.gguf-00002-of-00006.gguf	GGUF	IQ4_XS	45.59 GB	Download
IQ4_XS/NVIDIA-Nemotron-3-Ultra-550B-A55B-Base-BF16-IQ4_XS.gguf-00003-of-00006.gguf	GGUF	IQ4_XS	45.50 GB	Download
IQ4_XS/NVIDIA-Nemotron-3-Ultra-550B-A55B-Base-BF16-IQ4_XS.gguf-00004-of-00006.gguf	GGUF	IQ4_XS	45.59 GB	Download
IQ4_XS/NVIDIA-Nemotron-3-Ultra-550B-A55B-Base-BF16-IQ4_XS.gguf-00005-of-00006.gguf	GGUF	IQ4_XS	45.39 GB	Download
IQ4_XS/NVIDIA-Nemotron-3-Ultra-550B-A55B-Base-BF16-IQ4_XS.gguf-00006-of-00006.gguf	GGUF	IQ4_XS	31.71 GB	Download
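
The per-shard sizes in the table above can be added up to estimate the full download. A minimal sketch, using only the six sizes listed in the table (the variable names are illustrative):

```python
# Shard sizes in GB, copied from the table above (shards 1 through 6).
shard_sizes_gb = [44.83, 45.59, 45.50, 45.59, 45.39, 31.71]

# Round to two decimals to match the precision of the listed sizes.
total_gb = round(sum(shard_sizes_gb), 2)

print(f"{len(shard_sizes_gb)} shards, {total_gb} GB total")
```

This counts the weight files only; the KV cache and other runtime buffers come on top of it.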

Model ID	avar6/NVIDIA-Nemotron-3-Ultra-550B-A55B-Base-gguf
Author	avar6
Pipeline	—
License	—
Base model	nvidia/NVIDIA-Nemotron-3-Ultra-550B-A55B-Base-BF16
Last modified	2026-06-20T18:08:34.000Z
