GPT-4o (March 2025, chatgpt-4o-latest) vs DeepSeek R1 Distill Llama 70B

OpenAI vs DeepSeek: side-by-side benchmark comparison

Metric	GPT-4o (March 2025, chatgpt-4o-latest)	DeepSeek R1 Distill Llama 70B
Intelligence Index	18.6	16.0
Coding Index	—	11.4
Math Index	25.7	53.7
Output speed (tok/s)	—	46.8
Blended price ($/1M tokens)	—	$0.79
Time to first token (s)	—	0.33s
AIME	32.7%	67.0%
AIME 2025	25.7%	53.7%
GPQA	65.5%	40.2%
HLE	5.0%	6.1%
IFBench	—	27.6%
AA-LCR	—	11.0%
LiveCodeBench	42.5%	26.6%
MATH-500	89.3%	93.5%
MMLU-Pro	80.3%	79.5%
SciCode	36.6%	31.3%
τ²-Bench	—	21.9%
Terminal-Bench Hard	—	1.5%

Benchmark data from Artificial Analysis.