Grok 4.3 (high) vs Qwen2.5 Instruct 72B

xAI vs Alibaba — side-by-side benchmark comparison

Metric	Grok 4.3 (high)	Qwen2.5 Instruct 72B
Intelligence Index	53.2	15.6
Coding Index	41.0	11.9
Math Index	—	14.0
Output speed (tok/s)	161.0	55.4
Blended price ($/1M tokens)	$1.56	$0.37
Time to first token (s)	22.30	1.23
AIME	—	16.0%
AIME 2025	—	14.0%
GPQA	90.1%	49.1%
Humanity's Last Exam (HLE)	35.0%	4.2%
IFBench	81.3%	36.9%
AA-LCR	64.3%	20.3%
LiveCodeBench	—	27.6%
MATH-500	—	85.8%
MMLU-Pro	—	72.0%
SciCode	47.3%	26.7%
Tau2-Bench	97.7%	34.5%
Terminal-Bench Hard	37.9%	4.5%

Benchmark data from Artificial Analysis.