Grok 4.20 0309 (Non-reasoning) vs Qwen3.5 27B (Non-reasoning)

xAI vs Alibaba — side-by-side benchmark comparison

Metric	Grok 4.20 0309 (Non-reasoning)	Qwen3.5 27B (Non-reasoning)
Intelligence Index	29.7	37.2
Coding Index	25.4	33.4
Math Index	—	—
Output speed (tok/s)	202.6	95.3
Blended price ($/1M tokens)	$3.00	$0.88
Time to first token (s)	0.50	1.40
AIME	—	—
AIME 2025	—	—
GPQA	78.5%	84.2%
HLE (Humanity's Last Exam)	22.5%	13.2%
IFBench	47.8%	46.9%
LCR (long-context reasoning)	18.0%	55.7%
LiveCodeBench	—	—
MATH-500	—	—
MMLU-Pro	—	—
SciCode	32.2%	36.7%
τ²-Bench	69.6%	87.1%
Terminal-Bench Hard	22.0%	31.8%
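
To read the table at a glance, the sketch below tallies which model leads on each row that has a reported value. The numbers are copied from the table; the `higher_is_better` flags are an assumption of this sketch (price and time to first token are treated as lower-is-better, everything else as higher-is-better), and the names `ROWS`, `leader`, and `summarize` are illustrative only.

```python
# Values copied from the comparison table; rows with no reported value are omitted.
# Each entry: metric -> (grok_value, qwen_value, higher_is_better)
# The higher_is_better flags are an assumption of this sketch, not part of the source data.
ROWS = {
    "Intelligence Index": (29.7, 37.2, True),
    "Coding Index": (25.4, 33.4, True),
    "Output speed (tok/s)": (202.6, 95.3, True),
    "Blended price ($/1M)": (3.00, 0.88, False),
    "Time to first token (s)": (0.50, 1.40, False),
    "GPQA": (78.5, 84.2, True),
    "HLE": (22.5, 13.2, True),
    "IFBench": (47.8, 46.9, True),
    "LCR": (18.0, 55.7, True),
    "SciCode": (32.2, 36.7, True),
    "tau2": (69.6, 87.1, True),
    "Terminal-Bench Hard": (22.0, 31.8, True),
}


def leader(grok, qwen, higher_is_better):
    """Return which model leads on one metric, or 'tie' if the values are equal."""
    if grok == qwen:
        return "tie"
    grok_leads = (grok > qwen) == higher_is_better
    return "Grok" if grok_leads else "Qwen"


def summarize(rows):
    """Map each metric to (leading model, Grok value minus Qwen value)."""
    return {
        metric: (leader(g, q, hib), round(g - q, 2))
        for metric, (g, q, hib) in rows.items()
    }


if __name__ == "__main__":
    for metric, (who, delta) in summarize(ROWS).items():
        print(f"{metric:26s} leader={who:5s} grok-qwen={delta:+.2f}")
```

Under these assumptions, Grok 4.20 0309 leads on 4 of the 12 reported rows (output speed, time to first token, HLE, IFBench) and Qwen3.5 27B leads on the other 8.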

Benchmark data from Artificial Analysis.