Claude Sonnet 4.6 (Non-reasoning, High Effort) vs DeepSeek R1 Distill Qwen 1.5B

Anthropic vs DeepSeek: side-by-side benchmark comparison

Metric	Claude Sonnet 4.6 (Non-reasoning, High Effort)	DeepSeek R1 Distill Qwen 1.5B
Intelligence Index	44.4	9.1
Coding Index	46.4	—
Math Index	—	22.0
Output speed (tok/s)	55.2	0.0
Blended price ($/1M tokens)	$6.56	$0.00
Time to first token (s)	1.07	0.00
AIME	—	17.7%
AIME 2025	—	22.0%
GPQA	79.9%	9.8%
HLE	13.2%	3.3%
IFBench	41.2%	13.2%
LCR	57.7%	0.3%
LiveCodeBench	—	7.0%
MATH-500	—	68.7%
MMLU-Pro	—	26.9%
SciCode	46.9%	6.6%
τ²-Bench	79.5%	—
Terminal-Bench Hard	46.2%	—

Benchmark data from Artificial Analysis.