Llama 3.1 Instruct 70B vs Qwen3 VL 235B A22B (Reasoning)

Meta vs Alibaba — side-by-side benchmark comparison

Metric	Llama 3.1 Instruct 70B	Qwen3 VL 235B A22B (Reasoning)
Intelligence Index	12.5	27.6
Coding Index	10.9	20.9
Math Index	4.0	88.3
Output speed (tok/s)	35.3	35.6
Blended price ($/1M tokens)	$0.56	$2.17
Time to first token (s)	0.54	5.14
AIME	17.3%	—
AIME 2025	4.0%	88.3%
GPQA	40.9%	77.2%
HLE	4.6%	10.1%
IFBench	34.4%	56.5%
LCR	6.3%	58.7%
LiveCodeBench	23.2%	64.6%
MATH-500	64.9%	—
MMLU-Pro	67.6%	83.6%
SciCode	26.7%	39.9%
τ²-Bench	15.2%	54.1%
Terminal-Bench Hard	3.0%	11.4%

Benchmark data from Artificial Analysis.