GraySoft

NVIDIA cloud models

18 models tracked via Artificial Analysis. Compare cloud performance, then find local GGUF versions in the GraySoft model catalog.

| Model | Intelligence | Speed (tok/s) |
|---|---|---|
| Nemotron 3 Ultra 550B A55B (Reasoning) | 47.7 | 171.82 |
| NVIDIA Nemotron 3 Super 120B A12B (Reasoning) | 36 | 217.675 |
| Nemotron Cascade 2 30B A3B | 28.4 | 0 |
| NVIDIA Nemotron 3 Nano 30B A3B (Reasoning) | 24.3 | 115.627 |
| Nemotron 3 Nano Omni 30B A3B Reasoning | 21.4 | 302.289 |
| Llama Nemotron Super 49B v1.5 (Reasoning) | 18.7 | 50.346 |
| Llama 3.3 Nemotron Super 49B v1 (Reasoning) | 18.5 | 0 |
| Llama 3.1 Nemotron Ultra 253B v1 (Reasoning) | 15 | 51.481 |
| NVIDIA Nemotron Nano 12B v2 VL (Reasoning) | 14.9 | 288.725 |
| NVIDIA Nemotron Nano 9B V2 (Reasoning) | 14.8 | 114.324 |
| NVIDIA Nemotron 3 Nano 4B | 14.7 | 0 |
| Llama Nemotron Super 49B v1.5 (Non-reasoning) | 14.6 | 48.537 |
| Llama 3.1 Nemotron Nano 4B v1.1 (Reasoning) | 14.4 | 0 |
| Llama 3.3 Nemotron Super 49B v1 (Non-reasoning) | 14.3 | 0 |
| Llama 3.1 Nemotron Instruct 70B | 13.4 | 305.943 |
| NVIDIA Nemotron 3 Nano 30B A3B (Non-reasoning) | 13.2 | 56.801 |
| NVIDIA Nemotron Nano 9B V2 (Non-reasoning) | 13.2 | 152.58 |
| NVIDIA Nemotron Nano 12B v2 VL (Non-reasoning) | 10.1 | 217.512 |

Run models locally with guIDE

Download guIDE — the AI-native code editor with local LLM inference and 69 built-in tools.
