Llama 3.1 Nemotron Ultra 253B v1 (Reasoning) vs DeepHermes 3 - Mistral 24B Preview (Non-reasoning)

NVIDIA vs Nous Research — side-by-side benchmark comparison

| Metric | Llama 3.1 Nemotron Ultra 253B v1 (Reasoning) | DeepHermes 3 - Mistral 24B Preview (Non-reasoning) |
| --- | --- | --- |
| Intelligence Index | 15.0 | 10.9 |
| Coding Index | 13.1 | |
| Math Index | 63.7 | |
| Output speed (tok/s) | 52.3 | 0.0 |
| Blended price ($/1M) | $0.90 | $0.00 |
| Time to first token (s) | 0.76 | 0.00 |
| AIME | 74.7% | 4.7% |
| AIME 25 | 63.7% | |
| Artificial Analysis Coding Index | 13.10 | |
| Artificial Analysis Intelligence Index | 15.00 | 10.90 |
| Artificial Analysis Math Index | 63.70 | |
| GPQA | 72.8% | 38.2% |
| HLE | 8.1% | 3.9% |
| IFBench | 38.2% | |
| LCR | 7.3% | |
| LiveCodeBench | 64.1% | 19.5% |
| MATH 500 | 95.2% | 59.5% |
| MMLU Pro | 82.5% | 58.0% |
| SciCode | 34.7% | 22.8% |
| tau2 | 11.4% | |
| TerminalBench Hard | 2.3% | |

Benchmark data from Artificial Analysis.