
Llama 3.1 Nemotron Ultra 253B v1 (Reasoning) vs Hermes 4 - Llama-3.1 70B (Reasoning)

NVIDIA vs Nous Research — side-by-side benchmark comparison

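| Metric | Llama 3.1 Nemotron Ultra 253B v1 (Reasoning) | Hermes 4 - Llama-3.1 70B (Reasoning) |
| --- | --- | --- |
| Intelligence Index | 15.0 | 16.0 |
| Coding Index | 13.1 | 14.4 |
| Math Index | 63.7 | 68.7 |
| Output speed (tok/s) | 52.3 | 92.8 |
| Blended price ($/1M) | $0.90 | $0.20 |
| Time to first token (s) | 0.76 | 0.64 |
| AIME 25 | 63.7% | 68.7% |
| Artificial Analysis Coding Index | 13.10 | 14.40 |
| Artificial Analysis Intelligence Index | 15.00 | 16.00 |
| Artificial Analysis Math Index | 63.70 | 68.70 |
| GPQA | 72.8% | 69.9% |
| HLE | 8.1% | 7.9% |
| IFBench | 38.2% | 31.3% |
| LCR | 7.3% | 6.7% |
| LiveCodeBench | 64.1% | 65.3% |
| MMLU Pro | 82.5% | 81.1% |
| SciCode | 34.7% | 34.1% |
| tau2 | 11.4% | 22.5% |
| TerminalBench Hard | 2.3% | 4.5% |

Two benchmarks are listed with a single value, and the source does not indicate which model it belongs to: AIME 74.7%, MATH 500 95.2%.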

Benchmark data from Artificial Analysis.
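
To read the figures above at a glance, a minimal Python sketch can compute the per-benchmark gap and the speed and price ratios. The values are copied from the comparison. The names `SCORES`, `deltas`, and `tally` are hypothetical helpers introduced only for this illustration, and only benchmarks with a value for both models are included.

```python
# Scores copied from the comparison above, as
# (Nemotron Ultra 253B v1, Hermes 4 70B), in percent.
# Benchmarks listed with a single value are left out.
SCORES = {
    "aime 25": (63.7, 68.7),
    "gpqa": (72.8, 69.9),
    "hle": (8.1, 7.9),
    "ifbench": (38.2, 31.3),
    "lcr": (7.3, 6.7),
    "livecodebench": (64.1, 65.3),
    "mmlu pro": (82.5, 81.1),
    "scicode": (34.7, 34.1),
    "tau2": (11.4, 22.5),
    "terminalbench hard": (2.3, 4.5),
}


def deltas(scores):
    """Return {benchmark: Hermes minus Nemotron} in percentage points."""
    return {name: round(h - n, 1) for name, (n, h) in scores.items()}


def tally(scores):
    """Count the benchmarks on which each model scores higher."""
    d = deltas(scores)
    return {
        "nemotron": sum(v < 0 for v in d.values()),
        "hermes": sum(v > 0 for v in d.values()),
    }


# Ratios from the speed and price rows (Hermes relative to Nemotron).
speed_ratio = round(92.8 / 52.3, 2)   # output tokens per second
price_ratio = round(0.90 / 0.20, 1)   # Nemotron price divided by Hermes price

if __name__ == "__main__":
    for name, gap in sorted(deltas(SCORES).items(), key=lambda kv: kv[1]):
        print(f"{name:20s} {gap:+.1f} pp")
    print(tally(SCORES))
    print(f"speed ratio: {speed_ratio}x, price ratio: {price_ratio}x")
```

A positive gap means Hermes 4 scores higher on that benchmark. A raw count of wins treats every benchmark as equally important, so it is a rough summary rather than a ranking.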