Gemma 4 E4B (Reasoning) vs Claude Opus 4.8 (Adaptive Reasoning, Max Effort)

Google vs Anthropic — side-by-side benchmark comparison

Metric	Gemma 4 E4B (Reasoning)	Claude Opus 4.8 (Adaptive Reasoning, Max Effort)
Intelligence Index	18.8	61.4
Coding Index	13.7	56.7
Math Index	—	—
Output speed (tok/s)	—	66.9
Blended price ($/1M)	$0.00	$10.94
Time to first token (s)	—	7.91s
AIME	—	—
AIME 2025	—	—
GPQA	57.6%	92.0%
HLE	3.7%	45.7%
IFBench	44.2%	62.2%
LCR	30.7%	67.7%
LiveCodeBench	—	—
MATH-500	—	—
MMLU-Pro	—	—
SciCode	24.4%	53.5%
Tau2	20.8%	94.4%
Terminal-Bench Hard	8.3%	58.3%
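
As a rough illustration of how to read the table, the sketch below computes the percentage-point gap (Claude Opus 4.8 minus Gemma 4 E4B) for each benchmark where both models have a score. The values are copied from the table; the dictionary keys and the `gaps` helper are illustrative names chosen here, not part of any Artificial Analysis tooling.

```python
# Percent scores copied from the table: (Gemma 4 E4B, Claude Opus 4.8).
# Rows where either model has no score are left out.
# Keys are shorthand labels chosen for this sketch.
scores = {
    "gpqa": (57.6, 92.0),
    "hle": (3.7, 45.7),
    "ifbench": (44.2, 62.2),
    "lcr": (30.7, 67.7),
    "scicode": (24.4, 53.5),
    "tau2": (20.8, 94.4),
    "terminalbench_hard": (8.3, 58.3),
}


def gaps(table):
    """Return {benchmark: Claude score minus Gemma score} in percentage points."""
    return {
        name: round(claude - gemma, 1)
        for name, (gemma, claude) in table.items()
    }


point_gaps = gaps(scores)

# Print the benchmarks from the widest gap to the narrowest.
for name, gap in sorted(point_gaps.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name:<20}{gap:>6.1f} pp")
```

On these figures the widest gap is on tau2 (73.6 points) and the narrowest is on ifbench (18.0 points).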

Benchmark data from Artificial Analysis.