Gemma 4 31B (Reasoning) vs Claude 4.5 Haiku (Reasoning)

Google vs Anthropic — side-by-side benchmark comparison

Metric	Gemma 4 31B (Reasoning)	Claude 4.5 Haiku (Reasoning)
Intelligence Index	39.2	37.1
Coding Index	38.7	32.6
Math Index	—	83.7
Output speed (tok/s)	35.3	142.2
Blended price ($/1M tokens)	$0.00	$2.19
Time to first token (s)	1.00	10.48
AIME	—	—
AIME 2025	—	83.7%
GPQA	85.7%	67.2%
HLE	22.7%	9.7%
IFBench	75.6%	54.3%
LCR	62.0%	70.3%
LiveCodeBench	—	61.5%
MATH-500	—	—
MMLU-Pro	—	76.0%
SciCode	43.4%	43.3%
τ²-Bench	59.9%	54.7%
Terminal-Bench Hard	36.4%	27.3%

Benchmark data from Artificial Analysis.