Grok-1 vs Llama 3.1 Nemotron Ultra 253B v1 (Reasoning)

xAI vs NVIDIA — side-by-side benchmark comparison

Metric	Grok-1	Llama 3.1 Nemotron Ultra 253B v1 (Reasoning)
Intelligence Index	11.7	15.0
Coding Index	—	13.1
Math Index	—	63.7
Output speed (tok/s)	—	52.3
Blended price ($/1M tokens)	—	$0.90
Time to first token (s)	—	0.76
AIME	—	74.7%
AIME 2025	—	63.7%
GPQA	—	72.8%
HLE	—	8.1%
IFBench	—	38.2%
LCR	—	7.3%
LiveCodeBench	—	64.1%
MATH-500	—	95.2%
MMLU-Pro	—	82.5%
SciCode	—	34.7%
tau2-Bench	—	11.4%
Terminal-Bench Hard	—	2.3%

Benchmark data from Artificial Analysis.