
Qwen3.5 4B (Non-reasoning) vs Grok 3 mini Reasoning (high)

Alibaba vs xAI — side-by-side benchmark comparison

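| Metric | Qwen3.5 4B (Non-reasoning) | Grok 3 mini Reasoning (high) |
| --- | --- | --- |
| Intelligence Index | 22.6 | 32.1 |
| Coding Index | 13.7 | 25.2 |
| Output speed (tok/s) | 210.0 | 56.8 |
| Blended price ($/1M tokens) | $0.06 | $0.35 |
| Time to first token (s) | 0.23 | 0.42 |
| artificial analysis coding index | 13.70 | 25.20 |
| artificial analysis intelligence index | 22.60 | 32.10 |
| gpqa | 71.2% | 79.1% |
| hle | 7.5% | 11.1% |
| ifbench | 33.3% | 45.9% |
| lcr | 28.3% | 50.3% |
| scicode | 18.3% | 40.6% |
| tau2 | 87.7% | 90.4% |
| terminalbench hard | 11.4% | 17.4% |

Metrics with a single reported value:

| Metric | Value |
| --- | --- |
| Math Index | 84.7 |
| artificial analysis math index | 84.70 |
| aime | 93.3% |
| aime 25 | 84.7% |
| livecodebench | 69.6% |
| math 500 | 99.2% |
| mmlu pro | 82.8% |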
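
The speed, latency, and price figures above can be turned into a rough per-response estimate. The sketch below is illustrative only: the `estimate` helper and the `MODELS` dictionary are hypothetical names, the latency model (time to first token plus tokens divided by output speed) is an assumption, and the blended price is applied to output tokens alone as a simplification. For a reasoning model, any hidden reasoning tokens would add to both numbers.

```python
# Figures copied from the comparison table; everything else is an assumption.
MODELS = {
    "Qwen3.5 4B (Non-reasoning)": {"tok_per_s": 210.0, "ttft_s": 0.23, "usd_per_1m": 0.06},
    "Grok 3 mini Reasoning (high)": {"tok_per_s": 56.8, "ttft_s": 0.42, "usd_per_1m": 0.35},
}


def estimate(model: str, output_tokens: int) -> tuple[float, float]:
    """Return (seconds, usd) for one response of `output_tokens` tokens.

    Assumed model (not from the source):
      seconds = time to first token + output_tokens / output speed
      usd     = output_tokens / 1,000,000 * blended price
    """
    m = MODELS[model]
    seconds = m["ttft_s"] + output_tokens / m["tok_per_s"]
    usd = output_tokens / 1_000_000 * m["usd_per_1m"]
    return seconds, usd


if __name__ == "__main__":
    for name in MODELS:
        seconds, usd = estimate(name, 1000)
        print(f"{name}: ~{seconds:.1f} s, ~${usd:.6f} per 1,000 output tokens")
```

Under these assumptions the comparison reduces to two ratios: output speed (210.0 vs 56.8 tok/s) dominates the latency estimate for long responses, while the blended price ($0.06 vs $0.35) scales cost linearly with token count.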

Benchmark data from Artificial Analysis.