Llama 3.1 Nemotron Ultra 253B v1 (Reasoning) vs GPT-4o (May '24)

NVIDIA vs OpenAI — side-by-side benchmark comparison

Metric: Llama 3.1 Nemotron Ultra 253B v1 (Reasoning) / GPT-4o (May '24)
Intelligence Index: 15.0 / 14.5
Coding Index: 13.1 / 24.2
Math Index: 63.7
Output speed (tok/s): 52.3 / 111.8
Blended price ($/1M tokens): $0.90 / $7.50
Time to first token (s): 0.76 / 0.61
AIME: 74.7% / 11.0%
AIME 25: 63.7%
Artificial Analysis Coding Index: 13.10 / 24.20
Artificial Analysis Intelligence Index: 15.00 / 14.50
Artificial Analysis Math Index: 63.70
GPQA: 72.8% / 52.6%
HLE: 8.1% / 2.8%
IFBench: 38.2%
LCR: 7.3%
LiveCodeBench: 64.1% / 33.4%
MATH-500: 95.2% / 79.1%
MMLU-Pro: 82.5% / 74.0%
SciCode: 34.7% / 30.9%
tau2: 11.4%
Terminal-Bench Hard: 2.3%
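
The rows that report two values can be turned into simple ratios to make the gaps easier to read. The sketch below is illustrative only: it assumes that the first value in each row belongs to Llama 3.1 Nemotron Ultra 253B v1 (Reasoning) and the second to GPT-4o (May '24), following the header order, and it covers a handful of rows copied from above. The `rows` dictionary and the `ratio` helper are hypothetical names introduced for this example.

```python
# Values copied from the two-valued rows above, in header order
# (assumed): (Nemotron Ultra 253B v1 Reasoning, GPT-4o May '24).
rows = {
    "Output speed (tok/s)": (52.3, 111.8),
    "Blended price ($/1M)": (0.90, 7.50),
    "Time to first token (s)": (0.76, 0.61),
    "GPQA (%)": (72.8, 52.6),
    "LiveCodeBench (%)": (64.1, 33.4),
}


def ratio(first: float, second: float) -> float:
    """Return first / second, rounded to two decimal places."""
    return round(first / second, 2)


# Ratio of the first model's value to the second model's value per row.
ratios = {name: ratio(a, b) for name, (a, b) in rows.items()}

for name, r in ratios.items():
    print(f"{name}: {r}x")
```

How to read a ratio depends on the metric: for price and time to first token, a value below 1 favors the first model, while for output speed and benchmark scores a value above 1 does.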

Benchmark data from Artificial Analysis.