GPT-5.1 Codex (high) vs Gemma 3 4B Instruct
OpenAI vs Google — side-by-side benchmark comparison
| Metric | GPT-5.1 Codex (high) | Gemma 3 4B Instruct |
|---|---|---|
| Intelligence Index | 43.1 | 6.3 |
| Coding Index | 36.6 | 2.9 |
| Math Index | 95.7 | 12.7 |
| Output speed (tok/s) | 182.1 | 0.0 |
| Blended price (USD per 1M tokens) | $3.44 | $0.05 |
| Time to first token (s) | 5.42 | 0.00 |
| AIME | — | 6.3% |
| AIME 2025 | 95.7% | 12.7% |
| GPQA | 86.0% | 29.1% |
| HLE | 23.4% | 5.2% |
| IFBench | 70.0% | 28.3% |
| LCR | 67.3% | 5.7% |
| LiveCodeBench | 84.9% | 11.2% |
| MATH-500 | — | 76.6% |
| MMLU-Pro | 86.0% | 41.7% |
| SciCode | 40.2% | 7.3% |
| τ²-Bench | 83.0% | 5.0% |
| Terminal-Bench Hard | 34.8% | 0.8% |
Benchmark data from Artificial Analysis.