Gemma 4 26B A4B (Reasoning) vs DeepHermes 3 - Mistral 24B Preview (Non-reasoning)

Google vs Nous Research: side-by-side benchmark comparison

Metric	Gemma 4 26B A4B (Reasoning)	DeepHermes 3 - Mistral 24B Preview (Non-reasoning)
Intelligence Index	31.2	10.9
Coding Index	22.4	—
Math Index	—	—
Output speed (tok/s)	0.0	0.0
Blended price ($/1M tokens)	$0.20	$0.00
Time to first token (s)	0.00	0.00
AIME	—	4.7%
AIME 2025	—	—
GPQA	79.2%	38.2%
HLE	18.3%	3.9%
IFBench	72.4%	—
LCR	55.7%	—
LiveCodeBench	—	19.5%
MATH-500	—	59.5%
MMLU-Pro	—	58.0%
SciCode	40.0%	22.8%
τ²-Bench	43.6%	—
Terminal-Bench Hard	13.6%	—

Benchmark data from Artificial Analysis.