Qwen3.5 4B (Non-reasoning) vs GPT-5 (medium)

Alibaba vs OpenAI — side-by-side benchmark comparison

Metric	Qwen3.5 4B (Non-reasoning)	GPT-5 (medium)
Intelligence Index	22.6	42.0
Coding Index	13.7	38.9
Math Index	—	91.7
Output speed (tok/s)	210.0	86.4
Blended price ($/1M)	$0.06	$3.44
Time to first token (s)	0.23	37.15
AIME	—	91.7%
AIME 2025	—	91.7%
GPQA	71.2%	84.2%
HLE (Humanity's Last Exam)	7.5%	23.5%
IFBench	33.3%	70.6%
AA-LCR	28.3%	72.8%
LiveCodeBench	—	70.3%
MATH-500	—	99.1%
MMLU-Pro	—	86.7%
SciCode	18.3%	41.1%
τ²-Bench	87.7%	86.5%
Terminal-Bench Hard	11.4%	37.9%

Benchmark data from Artificial Analysis.
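
The speed, price, and latency rows are easier to compare as ratios. The following is a minimal sketch that takes the three values straight from the table and divides the GPT-5 (medium) figure by the Qwen3.5 4B figure. The dictionary layout and the `ratio` helper are illustrative names, not part of any Artificial Analysis tooling.

```python
# Values copied from the comparison table: (Qwen3.5 4B, GPT-5 medium).
# The METRICS name and the ratio() helper are hypothetical, for illustration only.
METRICS = {
    "Output speed (tok/s)": (210.0, 86.4),
    "Blended price ($/1M)": (0.06, 3.44),
    "Time to first token (s)": (0.23, 37.15),
}


def ratio(metric: str) -> float:
    """Return the GPT-5 (medium) value divided by the Qwen3.5 4B value."""
    qwen, gpt5 = METRICS[metric]
    return gpt5 / qwen


if __name__ == "__main__":
    for name in METRICS:
        print(f"{name}: GPT-5 (medium) / Qwen3.5 4B = {ratio(name):.2f}x")
```

On these figures GPT-5 (medium) costs roughly 57 times as much per million tokens, generates output at about 0.4 times the speed, and takes roughly 160 times as long to return its first token, while scoring higher on every benchmark listed for both models except τ²-Bench.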