Gemini 2.5 Flash Preview (Sep '25) (Reasoning) vs Claude 4.1 Opus (Non-reasoning)

Google vs Anthropic — side-by-side benchmark comparison

Metric	Gemini 2.5 Flash Preview (Sep '25) (Reasoning)	Claude 4.1 Opus (Non-reasoning)
Intelligence Index	31.1	36.0
Coding Index	24.6	—
Math Index	78.3	—
Output speed (tok/s)	—	44.7
Blended price ($/1M tokens)	—	$32.81
Time to first token (s)	—	1.63
AIME	—	—
AIME 2025	78.3%	—
Artificial Analysis Coding Index	24.6	—
Artificial Analysis Intelligence Index	31.1	36.0
Artificial Analysis Math Index	78.3	—
GPQA	79.3%	—
HLE	12.7%	—
IFBench	52.3%	—
LCR	64.3%	—
LiveCodeBench	71.3%	—
MATH-500	—	—
MMLU-Pro	84.2%	—
SciCode	40.5%	—
tau2	45.6%	—
Terminal-Bench Hard	16.7%	—

Benchmark data from Artificial Analysis.