Magistral Medium 1.2 vs Hermes 4 - Llama-3.1 405B (Non-reasoning)

Mistral vs Nous Research: side-by-side benchmark comparison

Metric	Magistral Medium 1.2	Hermes 4 - Llama-3.1 405B (Non-reasoning)
Intelligence Index	27.1	17.6
Coding Index	21.7	18.1
Math Index	82.0	15.3
Output speed (tok/s)	39.4	40.8
Blended price ($/1M tokens)	2.75	1.50
Time to first token (s)	0.54	0.73
AIME	n/a	n/a
AIME 2025	82.0%	15.3%
GPQA	73.9%	53.6%
HLE	9.6%	4.2%
IFBench	43.0%	34.8%
AA-LCR	51.3%	20.0%
LiveCodeBench	75.0%	54.6%
MATH-500	n/a	n/a
MMLU-Pro	81.5%	72.9%
SciCode	39.2%	34.6%
tau2-Bench	52.0%	26.6%
Terminal-Bench Hard	12.9%	9.8%

Benchmark data from Artificial Analysis.