Llama 3.1 Instruct 405B vs Qwen3 235B A22B (Reasoning)

Meta vs Alibaba — side-by-side benchmark comparison

Metric	Llama 3.1 Instruct 405B	Qwen3 235B A22B (Reasoning)
Intelligence Index	17.4	19.8
Coding Index	14.5	17.4
Math Index	3.0	82.0
Output speed (tok/s)	37.5	58.3
Blended price ($/1M tokens)	$3.69	$2.63
Time to first token (s)	0.63	1.37
AIME	21.3%	84.0%
AIME 2025	3.0%	82.0%
GPQA	51.5%	70.0%
Humanity's Last Exam (HLE)	4.2%	11.7%
IFBench	39.0%	38.7%
LCR	24.3%	0.0%
LiveCodeBench	30.5%	62.2%
MATH-500	70.3%	93.0%
MMLU-Pro	73.2%	82.8%
SciCode	29.9%	39.9%
τ²-Bench	19.0%	24.0%
Terminal-Bench Hard	6.8%	6.1%
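
For readers who want to work with these numbers programmatically, the following is a minimal sketch of how one might tabulate which model leads on each metric. It uses a handful of rows copied from the table above. The names `ROWS`, `LOWER_IS_BETTER`, `leader`, and `summarize` are hypothetical, and the assumption that lower is better for price and time to first token (and higher is better for everything else) is ours, not something the source states.

```python
# Values copied from the comparison table:
# (Llama 3.1 Instruct 405B, Qwen3 235B A22B Reasoning).
ROWS = {
    "Output speed (tok/s)": (37.5, 58.3),
    "Blended price ($/1M)": (3.69, 2.63),
    "Time to first token (s)": (0.63, 1.37),
    "GPQA": (51.5, 70.0),
    "LiveCodeBench": (30.5, 62.2),
    "Terminal-Bench Hard": (6.8, 6.1),
}

# Assumption: cost and latency are "lower is better"; all other rows are "higher is better".
LOWER_IS_BETTER = {"Blended price ($/1M)", "Time to first token (s)"}


def leader(metric, llama, qwen):
    """Return which model leads on a metric, or 'tie' if the values are equal."""
    if llama == qwen:
        return "tie"
    if metric in LOWER_IS_BETTER:
        llama_wins = llama < qwen
    else:
        llama_wins = llama > qwen
    return "Llama" if llama_wins else "Qwen3"


def summarize(rows):
    """Map each metric to (leader, Qwen3 value minus Llama value)."""
    return {
        metric: (leader(metric, llama, qwen), round(qwen - llama, 2))
        for metric, (llama, qwen) in rows.items()
    }


if __name__ == "__main__":
    for metric, (who, delta) in summarize(ROWS).items():
        print(f"{metric}: {who} leads (Qwen3 - Llama = {delta:+})")
```

The signed difference is kept alongside the leader so that small gaps (such as Terminal-Bench Hard) are not read as equivalent to large ones (such as LiveCodeBench).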

Benchmark data from Artificial Analysis.