Grok 4.20 0309 v2 (Non-reasoning) vs Qwen3 30B A3B 2507 Instruct

xAI vs Alibaba — side-by-side benchmark comparison

Metric	Grok 4.20 0309 v2 (Non-reasoning)	Qwen3 30B A3B 2507 Instruct
Intelligence Index	29.0	15.0
Coding Index	22.0	14.2
Math Index	—	66.3
Output speed (tokens/s)	175.2	102.1
Blended price ($/1M tokens)	$3.00	$0.21
Time to first token (s)	0.47	0.98
AIME	—	72.7%
AIME 2025	—	66.3%
GPQA	77.6%	65.9%
HLE (Humanity's Last Exam)	24.2%	6.8%
IFBench	49.3%	33.1%
LCR	17.3%	22.7%
LiveCodeBench	—	51.5%
MATH-500	—	97.5%
MMLU-Pro	—	77.7%
SciCode	32.8%	30.4%
τ²-Bench	59.9%	10.2%
Terminal-Bench Hard	16.7%	6.1%

Benchmark data from Artificial Analysis.
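
To put the two columns on a common scale, the rows can be turned into Grok-to-Qwen ratios. The following is a minimal sketch, not part of the original comparison: the names `TABLE`, `parse_cell`, `parse_table`, and `ratios` are hypothetical, the row labels are shortened for illustration, and only a handful of rows are copied in. It assumes the table is tab-separated and that "—" marks a missing value.

```python
# Illustrative only: a few rows copied from the table, tab-separated.
# Labels are shortened; "—" marks a value the source does not report.
TABLE = "\n".join([
    "Intelligence Index\t29.0\t15.0",
    "Coding Index\t22.0\t14.2",
    "Math Index\t—\t66.3",
    "Output speed\t175.2\t102.1",
    "Blended price\t$3.00\t$0.21",
    "Time to first token\t0.47s\t0.98s",
])


def parse_cell(cell):
    """Convert a cell such as '$3.00', '77.6%', or '0.47s' to a float.

    Returns None for an empty cell or the missing-value marker.
    """
    cell = cell.strip()
    if cell in ("", "—"):
        return None
    # Drop a leading '$' and a trailing '%' or 's' unit suffix.
    return float(cell.strip("$%s"))


def parse_table(text):
    """Map each metric label to a (grok, qwen) tuple of parsed values."""
    rows = {}
    for line in text.splitlines():
        label, grok, qwen = line.split("\t")
        rows[label] = (parse_cell(grok), parse_cell(qwen))
    return rows


def ratios(rows):
    """Grok / Qwen ratio for every metric where both values are present."""
    return {
        label: grok / qwen
        for label, (grok, qwen) in rows.items()
        if grok is not None and qwen is not None
    }


if __name__ == "__main__":
    for label, value in ratios(parse_table(TABLE)).items():
        print(f"{label}: {value:.2f}x")
```

A ratio above 1 means the Grok value is larger, which is favorable for the index and speed rows but unfavorable for price and time to first token, where lower is better. Rows with a missing value are skipped rather than treated as zero.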