DeepSeek V4 Flash (Reasoning, Max Effort) vs Claude 3.5 Sonnet (June '24)

DeepSeek vs Anthropic — side-by-side benchmark comparison

Metric	DeepSeek V4 Flash (Reasoning, Max Effort)	Claude 3.5 Sonnet (June '24)
Intelligence Index	46.5	14.2
Coding Index	38.7	26.0
Math Index	—	—
Output speed (tok/s)	119.3	—
Blended price ($/1M tokens)	$0.17	$6.56
Time to first token (s)	0.86s	—
AIME	—	9.7%
AIME 2025	—	—
GPQA	89.4%	56.0%
HLE	32.1%	3.7%
IFBench	79.2%	—
LCR	63.0%	—
LiveCodeBench	—	—
MATH-500	—	69.5%
MMLU-Pro	—	75.1%
SciCode	44.9%	31.6%
τ²-Bench	95.0%	—
Terminal-Bench Hard	35.6%	—

Benchmark data from Artificial Analysis.