Claude Opus 4.7 (Non-reasoning, High Effort) vs Qwen3 4B 2507 Instruct

Anthropic vs Alibaba — side-by-side benchmark comparison

Metric	Claude Opus 4.7 (Non-reasoning, High Effort)	Qwen3 4B 2507 Instruct
Intelligence Index	51.8	12.9
Coding Index	53.1	9.0
Math Index	—	52.3
Output speed (tok/s)	47.8	—
Blended price ($/1M tokens)	$10.94	$0.00
Time to first token (s)	1.04	—
AIME	—	—
AIME 2025	—	52.3%
GPQA	88.5%	51.7%
HLE	31.2%	4.7%
IFBench	43.6%	33.5%
LCR	67.0%	7.3%
LiveCodeBench	—	37.7%
MATH-500	—	—
MMLU-Pro	—	67.2%
SciCode	50.1%	18.1%
Tau2-Bench	74.0%	26.6%
Terminal-Bench Hard	54.5%	4.5%

Benchmark data from Artificial Analysis.
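
The per-benchmark gap between the two models can be read off the table with simple subtraction. The following is a minimal sketch, not part of the original comparison: the `SCORES` dictionary and the `score_gaps` helper are hypothetical names introduced here for illustration, the values are copied by hand from the table, and rows where either model has no score (shown as "—") are skipped rather than treated as zero.

```python
# Scores copied from the table as (Claude Opus 4.7, Qwen3 4B 2507 Instruct), in percent.
# None marks a cell with no reported score. Keys are lowercased benchmark names.
# SCORES and score_gaps are illustrative names, not part of any published API.
SCORES = {
    "gpqa": (88.5, 51.7),
    "hle": (31.2, 4.7),
    "ifbench": (43.6, 33.5),
    "lcr": (67.0, 7.3),
    "livecodebench": (None, 37.7),
    "mmlu-pro": (None, 67.2),
    "scicode": (50.1, 18.1),
    "tau2": (74.0, 26.6),
    "terminal-bench hard": (54.5, 4.5),
}


def score_gaps(scores):
    """Return {benchmark: first score minus second score, in percentage points}.

    Only benchmarks where both models have a score are included.
    """
    return {
        name: round(first - second, 1)
        for name, (first, second) in scores.items()
        if first is not None and second is not None
    }


if __name__ == "__main__":
    # Print the gaps from largest to smallest.
    for name, gap in sorted(score_gaps(SCORES).items(), key=lambda kv: -kv[1]):
        print(f"{name:20s} {gap:+.1f} pp")
```

Skipping incomplete rows keeps the comparison limited to benchmarks both models were actually evaluated on, so a missing score is never read as a score of zero.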