GPT-4o (May '24) vs DeepSeek R1 Distill Qwen 14B

OpenAI vs DeepSeek: side-by-side benchmark comparison

Metric	GPT-4o (May '24)	DeepSeek R1 Distill Qwen 14B
Intelligence Index	14.5	15.8
Coding Index	24.2	—
Math Index	—	55.7
Output speed (tok/s)	111.8	—
Blended price ($/1M tokens)	7.50	—
Time to first token (s)	0.61	—
AIME	11.0%	66.7%
AIME 2025	—	55.7%
GPQA	52.6%	48.4%
HLE (Humanity's Last Exam)	2.8%	4.4%
IFBench	—	22.1%
LCR	—	7.0%
LiveCodeBench	33.4%	37.6%
MATH-500	79.1%	94.9%
MMLU-Pro	74.0%	74.0%
SciCode	30.9%	23.9%
tau2-Bench	—	—
Terminal-Bench Hard	—	—

Benchmark data from Artificial Analysis.