Ministral 3 3B vs Hermes 4 - Llama-3.1 70B (Reasoning)

Mistral vs Nous Research: side-by-side benchmark comparison

Metric	Ministral 3 3B	Hermes 4 - Llama-3.1 70B (Reasoning)
Intelligence Index	11.2	16.0
Coding Index	4.8	14.4
Math Index	22.0	68.7
Output speed (tok/s)	200.8	92.8
Blended price (USD per 1M tokens)	0.10	0.20
Time to first token (s)	0.34	0.64
AIME	n/a	n/a
AIME 2025	22.0%	68.7%
GPQA	35.8%	69.9%
HLE	5.3%	7.9%
IFBench	26.8%	31.3%
LCR	11.7%	6.7%
LiveCodeBench	24.7%	65.3%
MATH-500	n/a	n/a
MMLU-Pro	52.4%	81.1%
SciCode	14.4%	34.1%
τ²-Bench	24.9%	22.5%
Terminal-Bench Hard	0.0%	4.5%

Benchmark data from Artificial Analysis.