Qwen3.5 0.8B (Non-reasoning) vs Qwen3 VL 4B (Reasoning)

Side-by-side benchmark comparison of two Alibaba models

Metric	Qwen3.5 0.8B (Non-reasoning)	Qwen3 VL 4B (Reasoning)
Intelligence Index	9.9	13.7
Coding Index	1.0	6.7
Math Index	—	25.7
Output speed (tok/s)	96.3	—
Blended price ($/1M tokens)	$0.02	$0.00
Time to first token (s)	0.26	—
aime	—	—
aime 25	—	25.7%
artificial analysis coding index	1.00	6.70
artificial analysis intelligence index	9.90	13.70
artificial analysis math index	—	25.70
gpqa	23.6%	49.4%
hle	4.9%	4.4%
ifbench	21.6%	36.6%
lcr	6.7%	21.3%
livecodebench	—	32.0%
math 500	—	—
mmlu pro	—	70.0%
scicode	2.9%	17.1%
tau2	65.2%	15.5%
terminalbench hard	0.0%	1.5%
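
The gaps between the two models are easier to read as percentage-point differences. The following is a minimal sketch, not part of the original comparison: the dictionary layout and the helper name `point_gaps` are illustrative assumptions, and the numbers are copied from the percentage rows of the table above. Rows where either model has no score are skipped rather than treated as zero.

```python
# Scores copied from the table above as (Qwen3.5 0.8B, Qwen3 VL 4B).
# None marks a missing entry in the source table.
scores = {
    "gpqa": (23.6, 49.4),
    "hle": (4.9, 4.4),
    "ifbench": (21.6, 36.6),
    "lcr": (6.7, 21.3),
    "livecodebench": (None, 32.0),
    "scicode": (2.9, 17.1),
    "tau2": (65.2, 15.5),
    "terminalbench hard": (0.0, 1.5),
}


def point_gaps(table):
    """Return {benchmark: second - first} in percentage points.

    Rows with a missing score on either side are left out.
    """
    return {
        name: round(b - a, 1)
        for name, (a, b) in table.items()
        if a is not None and b is not None
    }


gaps = point_gaps(scores)

# Print the largest advantage for Qwen3 VL 4B first.
for name, gap in sorted(gaps.items(), key=lambda kv: kv[1], reverse=True):
    leader = "Qwen3 VL 4B" if gap > 0 else "Qwen3.5 0.8B"
    print(f"{name:20s} {gap:+6.1f} pts  ({leader} ahead)")
```

A positive gap means Qwen3 VL 4B scores higher; a negative gap means Qwen3.5 0.8B does, which in this table happens only for hle and tau2.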

Benchmark data from Artificial Analysis.