Llama 3.1 Nemotron Ultra 253B v1 (Reasoning) vs o4-mini (high)

NVIDIA vs OpenAI — side-by-side benchmark comparison

| Metric | Llama 3.1 Nemotron Ultra 253B v1 (Reasoning) | o4-mini (high) |
| --- | --- | --- |
| Intelligence Index | 15.0 | 33.1 |
| Coding Index | 13.1 | 25.6 |
| Math Index | 63.7 | 90.7 |
| Output speed (tok/s) | 52.3 | 160.5 |
| Blended price ($/1M tokens) | $0.90 | $1.93 |
| Time to first token (s) | 0.76 | 23.07 |
| AIME | 74.7% | 94.0% |
| AIME 25 | 63.7% | 90.7% |
| Artificial Analysis Coding Index | 13.10 | 25.60 |
| Artificial Analysis Intelligence Index | 15.00 | 33.10 |
| Artificial Analysis Math Index | 63.70 | 90.70 |
| GPQA | 72.8% | 78.4% |
| HLE | 8.1% | 17.5% |
| IFBench | 38.2% | 68.7% |
| LCR | 7.3% | 55.0% |
| LiveCodeBench | 64.1% | 85.9% |
| MATH 500 | 95.2% | 98.9% |
| MMLU Pro | 82.5% | 83.2% |
| SciCode | 34.7% | 46.5% |
| tau2 | 11.4% | 55.6% |
| TerminalBench Hard | 2.3% | 15.2% |
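
To show where the two models differ most, the following is a minimal sketch that ranks the percentage benchmarks by point gap and roughly estimates end-to-end response time from the speed and latency figures. The names `SCORES`, `gaps`, and `response_time` are illustrative and not part of any Artificial Analysis tooling. The timing formula (time to first token plus output tokens divided by output speed) is a simplifying assumption that treats output speed as constant.

```python
# Percentage scores copied from the comparison: (Nemotron Ultra, o4-mini).
SCORES = {
    "aime": (74.7, 94.0),
    "aime 25": (63.7, 90.7),
    "gpqa": (72.8, 78.4),
    "hle": (8.1, 17.5),
    "ifbench": (38.2, 68.7),
    "lcr": (7.3, 55.0),
    "livecodebench": (64.1, 85.9),
    "math 500": (95.2, 98.9),
    "mmlu pro": (82.5, 83.2),
    "scicode": (34.7, 46.5),
    "tau2": (11.4, 55.6),
    "terminalbench hard": (2.3, 15.2),
}


def gaps(scores):
    """Return (benchmark, o4-mini minus Nemotron) pairs, largest gap first."""
    rows = [(name, round(b - a, 1)) for name, (a, b) in scores.items()]
    return sorted(rows, key=lambda row: row[1], reverse=True)


def response_time(ttft_s, tokens_per_s, n_tokens):
    """Assumed estimate: time to first token plus steady-state generation time."""
    return ttft_s + n_tokens / tokens_per_s


if __name__ == "__main__":
    for name, gap in gaps(SCORES):
        print(f"{name:<20} {gap:+.1f} points")
    # Estimated seconds to deliver 1,000 output tokens under the assumption above.
    print(f"Nemotron Ultra: {response_time(0.76, 52.3, 1000):.2f} s")
    print(f"o4-mini (high): {response_time(23.07, 160.5, 1000):.2f} s")
```

Under this assumption, the lower time to first token favors Nemotron Ultra for short outputs, while the higher output speed favors o4-mini (high) as outputs get longer.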

Benchmark data from Artificial Analysis.