gpt-oss-120b (high) vs Qwen3.5 397B A17B (Non-reasoning)

OpenAI vs Alibaba: side-by-side benchmark comparison

Metric	gpt-oss-120b (high)	Qwen3.5 397B A17B (Non-reasoning)
Intelligence Index	33.3	40.1
Coding Index	28.6	37.4
Math Index	93.4	—
Output speed (tokens/s)	356.8	53.5
Blended price ($ per 1M tokens)	$0.26	$1.35
Time to first token (s)	0.51	1.85
AIME	—	—
AIME 2025	93.4%	—
GPQA	78.2%	86.1%
HLE	18.5%	18.8%
IFBench	69.0%	51.6%
LCR	50.7%	58.0%
LiveCodeBench	87.8%	—
MATH-500	—	—
MMLU-Pro	80.8%	—
SciCode	38.9%	41.1%
τ²-Bench	65.8%	83.9%
Terminal-Bench Hard	23.5%	35.6%

Benchmark data from Artificial Analysis.
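
The speed and price rows are easier to compare as ratios. The sketch below is only an illustration: it copies the output speed and blended price figures from the table, and the variable and function names are hypothetical, not part of any Artificial Analysis tooling.

```python
# Figures copied from the comparison table; all names here are illustrative.
GPT = "gpt-oss-120b (high)"
QWEN = "Qwen3.5 397B A17B (Non-reasoning)"

output_speed_tok_s = {GPT: 356.8, QWEN: 53.5}   # tokens per second
blended_price_usd = {GPT: 0.26, QWEN: 1.35}     # dollars per 1M tokens


def ratio(values, numerator, denominator):
    """Return values[numerator] / values[denominator]."""
    return values[numerator] / values[denominator]


# How many times faster gpt-oss-120b generates output than Qwen3.5.
speed_ratio = ratio(output_speed_tok_s, GPT, QWEN)

# How many times more expensive Qwen3.5 is per 1M blended tokens.
price_ratio = ratio(blended_price_usd, QWEN, GPT)

print(f"Output speed: {GPT} is {speed_ratio:.1f}x faster")
print(f"Blended price: {QWEN} is {price_ratio:.1f}x more expensive")
```

These ratios cover throughput and cost only. On the quality indices in the table, Qwen3.5 397B A17B scores higher on both the Intelligence Index and the Coding Index.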