GPT-3.5 Turbo vs DeepSeek R1 Distill Llama 70B

OpenAI vs DeepSeek: side-by-side benchmark comparison

Metric	GPT-3.5 Turbo	DeepSeek R1 Distill Llama 70B
Intelligence Index	9.0	16.0
Coding Index	10.7	11.4
Math Index	—	53.7
Output speed (tok/s)	116.9	46.8
Blended price ($/1M tokens)	$0.75	$0.79
Time to first token (s)	0.56	0.33
AIME	—	67.0%
AIME 2025	—	53.7%
GPQA	29.7%	40.2%
HLE	—	6.1%
IFBench	—	27.6%
LCR	—	11.0%
LiveCodeBench	—	26.6%
MATH-500	44.1%	93.5%
MMLU-Pro	46.2%	79.5%
SciCode	—	31.3%
τ²-Bench	—	21.9%
Terminal-Bench Hard	—	1.5%

Benchmark data from Artificial Analysis.