Claude Sonnet 4.6 (Non-reasoning, High Effort) vs Qwen3 VL 235B A22B (Reasoning)

Anthropic vs Alibaba — side-by-side benchmark comparison

Metric                        Claude Sonnet 4.6 (Non-reasoning, High Effort)   Qwen3 VL 235B A22B (Reasoning)
Intelligence Index            44.4                                             27.6
Coding Index                  46.4                                             20.9
Output speed (tok/s)          55.2                                             35.6
Blended price ($/1M tokens)   $6.56                                            $2.17
Time to first token (s)       1.07                                             5.14

Math Index: 88.3 (single value listed; model not indicated)
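Benchmark                               Claude Sonnet 4.6 (Non-reasoning, High Effort)   Qwen3 VL 235B A22B (Reasoning)
AIME                                    -                                                -
Artificial Analysis Coding Index        46.40                                            20.90
Artificial Analysis Intelligence Index  44.40                                            27.60
GPQA                                    79.9%                                            77.2%
HLE                                     13.2%                                            10.1%
IFBench                                 41.2%                                            56.5%
LCR                                     57.7%                                            58.7%
MATH 500                                -                                                -
SciCode                                 46.9%                                            39.9%
tau2                                    79.5%                                            54.1%
TerminalBench Hard                      46.2%                                            11.4%

Single value listed (model not indicated):
AIME 25: 88.3%
Artificial Analysis Math Index: 88.30
LiveCodeBench: 64.6%
MMLU Pro: 83.6%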

Benchmark data from Artificial Analysis.