gpt-oss-20B (low) vs Qwen3.5 9B (Non-reasoning)

OpenAI vs Alibaba — side-by-side benchmark comparison

	gpt-oss-20B (low)	Qwen3.5 9B (Non-reasoning)
Intelligence Index	20.8	27.3
Coding Index	14.4	21.3
Math Index	62.3	—
Output speed (tok/s)	273.0	—
Blended price ($/1M tokens)	$0.10	—
Time to first token (s)	0.50	—
AIME	—	—
AIME 2025	62.3%	—
GPQA	61.1%	78.6%
HLE	5.1%	8.6%
IFBench	57.8%	37.8%
LCR	31.0%	38.0%
LiveCodeBench	65.2%	—
MATH-500	—	—
MMLU-Pro	71.8%	—
SciCode	34.0%	27.7%
Tau2	50.3%	85.1%
Terminal-Bench Hard	4.5%	18.2%
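
For readers who want the head-to-head gaps rather than the raw scores, the following is a minimal sketch that computes the percentage-point difference for each benchmark where both models report a value. The scores are copied from the table; the names `SCORES`, `deltas`, and `leader_counts` are illustrative and not part of any Artificial Analysis tooling.

```python
# Scores in percent, copied from the table: (gpt-oss-20B low, Qwen3.5 9B non-reasoning).
# Rows where either model has no score ("—") are left out on purpose.
SCORES = {
    "gpqa": (61.1, 78.6),
    "hle": (5.1, 8.6),
    "ifbench": (57.8, 37.8),
    "lcr": (31.0, 38.0),
    "scicode": (34.0, 27.7),
    "tau2": (50.3, 85.1),
    "terminalbench hard": (4.5, 18.2),
}


def deltas(scores):
    """Return {benchmark: Qwen minus gpt-oss} in percentage points."""
    return {name: round(qwen - gpt, 1) for name, (gpt, qwen) in scores.items()}


def leader_counts(scores):
    """Return (benchmarks led by gpt-oss, benchmarks led by Qwen)."""
    gpt_wins = sum(1 for gpt, qwen in scores.values() if gpt > qwen)
    qwen_wins = sum(1 for gpt, qwen in scores.values() if qwen > gpt)
    return gpt_wins, qwen_wins


if __name__ == "__main__":
    for name, diff in deltas(SCORES).items():
        print(f"{name}: {diff:+.1f} pp")
    print("leads (gpt-oss, Qwen):", leader_counts(SCORES))
```

A positive value means Qwen3.5 9B scores higher on that benchmark. On the seven shared benchmarks, gpt-oss-20B (low) leads on two (ifbench and scicode) and Qwen3.5 9B leads on five.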

Benchmark data from Artificial Analysis.