Gemini 2.5 Flash Preview (Sep '25) (Reasoning) vs Claude 4.1 Opus (Non-reasoning)
Google vs Anthropic — side-by-side benchmark comparison
| Metric | Gemini 2.5 Flash Preview (Sep '25) (Reasoning) | Claude 4.1 Opus (Non-reasoning) |
|---|---|---|
| Intelligence Index | 31.1 | 36.0 |
| Coding Index | 24.6 | — |
| Math Index | 78.3 | — |
| Output speed (tok/s) | 0.0 | 44.7 |
| Blended price ($/1M) | $0.00 | $32.81 |
| Time to first token (s) | 0.00s | 1.63s |
| AIME | — | — |
| AIME 2025 | 78.3% | — |
| GPQA | 79.3% | — |
| HLE | 12.7% | — |
| IFBench | 52.3% | — |
| LCR | 64.3% | — |
| LiveCodeBench | 71.3% | — |
| MATH-500 | — | — |
| MMLU-Pro | 84.2% | — |
| SciCode | 40.5% | — |
| τ²-Bench | 45.6% | — |
| Terminal-Bench Hard | 16.7% | — |
Benchmark data from Artificial Analysis.