Claude Opus 4.8 (Adaptive Reasoning, Max Effort) vs DeepSeek R1 Distill Qwen 32B

Anthropic vs DeepSeek: side-by-side benchmark comparison

Metric	Claude Opus 4.8 (Adaptive Reasoning, Max Effort)	DeepSeek R1 Distill Qwen 32B
Intelligence Index	61.4	17.2
Coding Index	56.7	—
Math Index	—	63.0
Output speed (tok/s)	66.9	—
Blended price (USD per 1M tokens)	10.94	—
Time to first token (s)	7.91	—
AIME	—	68.7%
AIME 2025	—	63.0%
Artificial Analysis Coding Index	56.70	—
Artificial Analysis Intelligence Index	61.40	17.20
Artificial Analysis Math Index	—	63.00
GPQA	92.0%	61.5%
HLE	45.7%	5.5%
IFBench	62.2%	22.9%
LCR	67.7%	9.7%
LiveCodeBench	—	27.0%
MATH-500	—	94.1%
MMLU-Pro	—	73.9%
SciCode	53.5%	37.6%
τ²-Bench	94.4%	—
Terminal-Bench Hard	58.3%	—

Benchmark data from Artificial Analysis.