Grok 4 vs Qwen3 30B A3B 2507 (Reasoning)

xAI vs Alibaba — side-by-side benchmark comparison

Metric	Grok 4	Qwen3 30B A3B 2507 (Reasoning)
Intelligence Index	41.5	22.4
Coding Index	40.5	14.6
Math Index	92.7	56.3
Output speed (tok/s)	n/a	155.3
Blended price ($/1M)	$11.00	$0.67
Time to first token (s)	n/a	1.02s
AIME	94.3%	90.7%
AIME 2025	92.7%	56.3%
GPQA	87.7%	70.7%
HLE	23.9%	9.8%
IFBench	53.7%	50.7%
LCR	68.0%	59.0%
LiveCodeBench	81.9%	70.7%
MATH-500	99.0%	97.6%
MMLU-Pro	86.6%	80.5%
SciCode	45.7%	33.3%
τ²-Bench	74.9%	28.1%
Terminal-Bench Hard	37.9%	5.3%

Benchmark data from Artificial Analysis.
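
For readers who want to work with these figures programmatically, the following is a minimal sketch of how the tab-separated rows above could be parsed and compared. It copies only four rows from the table. The helper names (`parse_value`, `parse_table`, `gaps`, `index_points_per_dollar`) are hypothetical, and "index points per dollar" is an illustrative ratio assumed here, not a metric published by Artificial Analysis.

```python
# Illustrative only: four rows copied from the table above.
# Columns: metric, Grok 4, Qwen3 30B A3B 2507 (Reasoning).
ROWS = """\
Intelligence Index\t41.5\t22.4
Coding Index\t40.5\t14.6
Math Index\t92.7\t56.3
Blended price ($/1M)\t$11.00\t$0.67
"""


def parse_value(cell: str) -> float:
    """Strip the '$', '%' and 's' decorations the table uses, then convert."""
    return float(cell.strip().lstrip("$").rstrip("%s"))


def parse_table(text: str) -> dict:
    """Map each metric name to a (grok, qwen) pair of floats."""
    table = {}
    for line in text.splitlines():
        if not line.strip():
            continue  # skip blank lines
        metric, grok, qwen = line.split("\t")
        table[metric] = (parse_value(grok), parse_value(qwen))
    return table


def gaps(table: dict) -> dict:
    """Grok 4 minus Qwen3 for each metric (index points or dollars)."""
    return {m: round(g - q, 2) for m, (g, q) in table.items()}


def index_points_per_dollar(table: dict) -> tuple:
    """Hypothetical derived ratio: Intelligence Index / blended price."""
    grok_idx, qwen_idx = table["Intelligence Index"]
    grok_price, qwen_price = table["Blended price ($/1M)"]
    return round(grok_idx / grok_price, 2), round(qwen_idx / qwen_price, 2)


if __name__ == "__main__":
    table = parse_table(ROWS)
    print(gaps(table))
    print(index_points_per_dollar(table))
```

Under this assumed ratio, Grok 4 works out to about 3.77 Intelligence Index points per blended dollar and Qwen3 30B A3B 2507 to about 33.43. Grok 4 leads on every score in the table, while the smaller model is far cheaper per index point.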