Claude 4.5 Haiku (Reasoning) vs Qwen3 4B 2507 Instruct

Anthropic vs Alibaba: side-by-side benchmark comparison

Metric	Claude 4.5 Haiku (Reasoning)	Qwen3 4B 2507 Instruct
Intelligence Index	37.1	12.9
Coding Index	32.6	9.0
Math Index	83.7	52.3
Output speed (tok/s)	142.2	0.0
Blended price ($/1M tokens)	2.19	0.00
Time to first token (s)	10.48	0.00
AIME	—	—
AIME 2025	83.7%	52.3%
GPQA	67.2%	51.7%
HLE	9.7%	4.7%
IFBench	54.3%	33.5%
LCR	70.3%	7.3%
LiveCodeBench	61.5%	37.7%
MATH-500	—	—
MMLU-Pro	76.0%	67.2%
SciCode	43.3%	18.1%
tau2	54.7%	26.6%
Terminal-Bench Hard	27.3%	4.5%

Benchmark data from Artificial Analysis.
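
The per-benchmark gaps are easier to compare as percentage-point differences. The sketch below is only an illustration: the scores are transcribed by hand from the table, and the names `scores` and `point_gaps` are hypothetical helpers, not part of any Artificial Analysis tooling. Rows with no reported score are left out.

```python
# Scores in percent, transcribed from the table above as
# (Claude 4.5 Haiku (Reasoning), Qwen3 4B 2507 Instruct).
# Rows with no reported score (AIME, MATH-500) are omitted.
scores = {
    "AIME 2025": (83.7, 52.3),
    "GPQA": (67.2, 51.7),
    "HLE": (9.7, 4.7),
    "IFBench": (54.3, 33.5),
    "LCR": (70.3, 7.3),
    "LiveCodeBench": (61.5, 37.7),
    "MMLU-Pro": (76.0, 67.2),
    "SciCode": (43.3, 18.1),
    "tau2": (54.7, 26.6),
    "Terminal-Bench Hard": (27.3, 4.5),
}


def point_gaps(table):
    """Return {benchmark: first score minus second score}, in percentage points."""
    # Round to one decimal place to hide floating-point noise.
    return {name: round(a - b, 1) for name, (a, b) in table.items()}


gaps = point_gaps(scores)
largest = max(gaps, key=gaps.get)   # benchmark with the widest gap
smallest = min(gaps, key=gaps.get)  # benchmark with the narrowest gap

# Print benchmarks from widest to narrowest gap.
for name, gap in sorted(gaps.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name:20s} {gap:5.1f}")
```

On these figures the widest gap is on LCR (63.0 points) and the narrowest on HLE (5.0 points). Percentage-point differences say nothing about relative difficulty across benchmarks, so the ordering is best read as a rough guide.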