GPT-4.1 vs Qwen3.5 27B (Non-reasoning)

OpenAI vs Alibaba — side-by-side benchmark comparison

Metric	GPT-4.1	Qwen3.5 27B (Non-reasoning)
Intelligence Index	26.3	37.2
Coding Index	21.8	33.4
Math Index	34.7	—
Output speed (tok/s)	137.8	95.3
Blended price ($/1M tokens)	$3.50	$0.88
Time to first token (s)	0.58	1.40
AIME	43.7%	—
AIME 2025	34.7%	—
GPQA	66.6%	84.2%
HLE	4.6%	13.2%
IFBench	43.0%	46.9%
LCR	61.0%	55.7%
LiveCodeBench	45.7%	—
MATH-500	91.3%	—
MMLU-Pro	80.6%	—
SciCode	38.1%	36.7%
tau2	47.1%	87.1%
Terminal-Bench Hard	13.6%	31.8%

Benchmark data from Artificial Analysis.
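
The speed, latency, and price rows are easier to weigh when combined into one estimate. The sketch below is an illustration rather than anything published by Artificial Analysis: it assumes a response arrives after the time to first token plus the output length divided by the output speed, and it assumes a 500-token response. The names `MODELS`, `estimated_response_time`, `summarize`, and `price_ratio` are hypothetical; the numbers are copied from the table.

```python
# Figures copied from the comparison table; everything else is an assumption.
MODELS = {
    "GPT-4.1": {"ttft_s": 0.58, "tok_per_s": 137.8, "price_per_1m": 3.50},
    "Qwen3.5 27B (Non-reasoning)": {"ttft_s": 1.40, "tok_per_s": 95.3, "price_per_1m": 0.88},
}


def estimated_response_time(ttft_s, tok_per_s, output_tokens):
    """Rough latency model (assumed): first-token delay plus steady generation time."""
    return ttft_s + output_tokens / tok_per_s


def summarize(output_tokens=500):
    """Return the estimated seconds per response for each model, rounded to 2 places."""
    return {
        name: round(
            estimated_response_time(m["ttft_s"], m["tok_per_s"], output_tokens), 2
        )
        for name, m in MODELS.items()
    }


def price_ratio(numerator, denominator):
    """Ratio of blended prices between two models listed in MODELS."""
    return MODELS[numerator]["price_per_1m"] / MODELS[denominator]["price_per_1m"]


if __name__ == "__main__":
    for name, seconds in summarize().items():
        print(f"{name}: ~{seconds} s for a 500-token response")
    ratio = price_ratio("GPT-4.1", "Qwen3.5 27B (Non-reasoning)")
    print(f"GPT-4.1 blended price is ~{ratio:.1f}x that of Qwen3.5 27B")
```

Under these assumptions, GPT-4.1 finishes a 500-token response in roughly 4.2 s against roughly 6.6 s for Qwen3.5 27B, while costing about four times as much per blended million tokens. Real latency depends on prompt length, provider, and load, so treat the result as a back-of-the-envelope comparison only.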