Llama 3.1 Nemotron Ultra 253B v1 (Reasoning) vs GPT-5.2 (xhigh)

NVIDIA vs OpenAI — side-by-side benchmark comparison

| Metric | Llama 3.1 Nemotron Ultra 253B v1 (Reasoning) | GPT-5.2 (xhigh) |
| --- | --- | --- |
| Intelligence Index | 15.0 | 51.3 |
| Coding Index | 13.1 | 48.7 |
| Math Index | 63.7 | 99.0 |
| Output speed (tok/s) | 52.3 | 77.2 |
| Blended price ($/1M tokens) | $0.90 | $4.81 |
| Time to first token (s) | 0.76 | 64.17 |
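
| Benchmark | Llama 3.1 Nemotron Ultra 253B v1 (Reasoning) | GPT-5.2 (xhigh) |
| --- | --- | --- |
| AIME | 74.7% | n/a |
| AIME 25 | 63.7% | 99.0% |
| Artificial Analysis Coding Index | 13.10 | 48.70 |
| Artificial Analysis Intelligence Index | 15.00 | 51.30 |
| Artificial Analysis Math Index | 63.70 | 99.00 |
| GPQA | 72.8% | 90.3% |
| HLE | 8.1% | 35.4% |
| IFBench | 38.2% | 75.4% |
| LCR | 7.3% | 72.7% |
| LiveCodeBench | 64.1% | 88.9% |
| MATH 500 | 95.2% | n/a |
| MMLU Pro | 82.5% | 87.4% |
| SciCode | 34.7% | 52.1% |
| tau2 | 11.4% | 84.8% |
| TerminalBench Hard | 2.3% | 47.0% |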

Benchmark data from Artificial Analysis.