Hermes 4 - Llama-3.1 405B (Reasoning) vs DeepSeek V3.2 Speciale

Nous Research vs DeepSeek: side-by-side benchmark comparison

	Hermes 4 - Llama-3.1 405B (Reasoning)	DeepSeek V3.2 Speciale
Intelligence Index	18.6	29.4
Coding Index	16.0	37.9
Math Index	69.7	96.7
Output speed (tok/s)	38.6	—
Blended price ($/1M tokens)	$1.50	—
Time to first token (s)	0.79s	—
AIME	—	—
AIME 2025	69.7%	96.7%
GPQA	72.7%	87.1%
HLE	10.3%	26.1%
IFBench	32.7%	63.9%
LCR	20.7%	59.3%
LiveCodeBench	68.6%	89.6%
MATH-500	—	—
MMLU-Pro	82.9%	86.3%
SciCode	25.2%	44.0%
τ²-Bench	22.2%	0.0%
Terminal-Bench Hard	11.4%	34.8%
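
To read the table as per-benchmark gaps, a minimal Python sketch follows. It uses a handful of rows copied from the table above, keyed by lowercased benchmark name. The `scores` dictionary and the `gaps` helper are illustrative names, not part of any Artificial Analysis tooling, and rows with missing values are left out.

```python
# Scores (percent) copied from the table above as (Hermes 4 405B, DeepSeek V3.2 Speciale).
# Only a subset of rows is included; rows with missing values are omitted.
scores = {
    "gpqa": (72.7, 87.1),
    "hle": (10.3, 26.1),
    "ifbench": (32.7, 63.9),
    "livecodebench": (68.6, 89.6),
    "mmlu pro": (82.9, 86.3),
    "scicode": (25.2, 44.0),
}


def gaps(table):
    """Return {benchmark: DeepSeek score minus Hermes score} in percentage points."""
    return {
        name: round(deepseek - hermes, 1)
        for name, (hermes, deepseek) in table.items()
    }


# Print the benchmarks from the largest gap to the smallest.
for name, gap in sorted(gaps(scores).items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name:15s} {gap:+.1f} pp")
```

Sorting by gap rather than by raw score makes it easier to see where the two models differ most (IFBench in this subset) and where they are close (MMLU-Pro).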

Benchmark data from Artificial Analysis.