Qwen3.5 397B A17B (Reasoning) vs DeepSeek LLM 67B Chat (V1)

Alibaba vs DeepSeek — side-by-side benchmark comparison

Metric	Qwen3.5 397B A17B (Reasoning)	DeepSeek LLM 67B Chat (V1)
Intelligence Index	45.0	8.4
Coding Index	41.3	—
Math Index	—	—
Output speed (tok/s)	52.1	—
Blended price ($/1M tokens)	$1.35	—
Time to first token (s)	1.81	—
AIME	—	—
AIME 2025	—	—
GPQA	89.3%	—
HLE	27.3%	—
IFBench	78.8%	—
LCR	65.7%	—
LiveCodeBench	—	—
MATH-500	—	—
MMLU-Pro	—	—
SciCode	42.0%	—
τ²-Bench	95.6%	—
Terminal-Bench Hard	40.9%	—

Benchmark data from Artificial Analysis.