
Grok 4.20 0309 (Non-reasoning) vs Qwen3 235B A22B 2507 (Reasoning)

xAI vs Alibaba: side-by-side benchmark comparison

Model: Grok 4.20 0309 (Non-reasoning) | Qwen3 235B A22B 2507 (Reasoning)
Intelligence Index: 29.7 | 29.5
Coding Index: 25.4 | 23.2
Math Index: 91.0
Output speed (tok/s): 202.6 | 62.5
Blended price ($/1M): $3.00 | $0.84
Time to first token (s): 0.50 | 1.21
AIME: 94.0%
AIME 25: 91.0%
Artificial Analysis Coding Index: 25.40 | 23.20
Artificial Analysis Intelligence Index: 29.70 | 29.50
Artificial Analysis Math Index: 91.00
GPQA: 78.5% | 79.0%
HLE: 22.5% | 15.0%
IFBench: 47.8% | 51.2%
LCR: 18.0% | 67.0%
LiveCodeBench: 78.8%
MATH 500: 98.4%
MMLU Pro: 84.3%
SciCode: 32.2% | 42.4%
tau2: 69.6% | 53.2%
TerminalBench Hard: 22.0% | 13.6%

Benchmark data from Artificial Analysis.
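
The rows that list a value for both models can be tallied to see which model leads on each metric. The following is a minimal Python sketch, not part of the original comparison: the `ROWS` list, the `leader` and `tally` helpers, and the short labels "Grok" and "Qwen3" are hypothetical names, and the higher-is-better flags are an assumption (price and time to first token are treated as lower-is-better, everything else as higher-is-better). Rows with only one listed value are left out because there is nothing to compare.

```python
# Rows from the comparison where both models have a value.
# Format: (metric, grok_value, qwen_value, higher_is_better)
# The higher_is_better flags are an assumption, not stated in the source.
ROWS = [
    ("Intelligence Index", 29.7, 29.5, True),
    ("Coding Index", 25.4, 23.2, True),
    ("Output speed (tok/s)", 202.6, 62.5, True),
    ("Blended price ($/1M)", 3.00, 0.84, False),
    ("Time to first token (s)", 0.50, 1.21, False),
    ("GPQA", 78.5, 79.0, True),
    ("HLE", 22.5, 15.0, True),
    ("IFBench", 47.8, 51.2, True),
    ("LCR", 18.0, 67.0, True),
    ("SciCode", 32.2, 42.4, True),
    ("tau2", 69.6, 53.2, True),
    ("TerminalBench Hard", 22.0, 13.6, True),
]


def leader(grok, qwen, higher_is_better):
    """Return which model is ahead on one metric, or 'tie'."""
    if grok == qwen:
        return "tie"
    # Grok leads if it is larger on a higher-is-better metric,
    # or smaller on a lower-is-better metric.
    grok_ahead = (grok > qwen) == higher_is_better
    return "Grok" if grok_ahead else "Qwen3"


def tally(rows):
    """Count how many metrics each model leads."""
    counts = {"Grok": 0, "Qwen3": 0, "tie": 0}
    for _metric, grok, qwen, higher_is_better in rows:
        counts[leader(grok, qwen, higher_is_better)] += 1
    return counts


if __name__ == "__main__":
    for metric, grok, qwen, hib in ROWS:
        print(f"{metric}: {leader(grok, qwen, hib)}")
    print(tally(ROWS))
```

A simple count like this weights every metric equally, so it is a rough summary only; the gap on a single row (for example LCR at 18.0% vs 67.0%) may matter more for a given workload than the overall tally.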