Granite 4.0 H Small vs Qwen3 VL 235B A22B (Reasoning)

IBM vs Alibaba — side-by-side benchmark comparison

Metric	Granite 4.0 H Small	Qwen3 VL 235B A22B (Reasoning)
Intelligence Index	10.8	27.6
Coding Index	8.5	20.9
Math Index	13.7	88.3
Output speed (tok/s)	401.2	35.6
Blended price ($/1M tokens)	0.11	2.17
Time to first token (s)	8.75	5.14
AIME	—	—
AIME 2025	13.7%	88.3%
GPQA	41.6%	77.2%
HLE	3.7%	10.1%
IFBench	31.5%	56.5%
AA-LCR	9.0%	58.7%
LiveCodeBench	25.1%	64.6%
MATH-500	—	—
MMLU-Pro	62.4%	83.6%
SciCode	20.9%	39.9%
τ²-Bench	17.3%	54.1%
Terminal-Bench Hard	2.3%	11.4%
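
The headline rows are easier to compare as ratios. The Python sketch below is illustrative only: it copies the six summary figures from the table and divides the Qwen3 value by the Granite value for each. The dictionary `summary` and the helper `ratio` are hypothetical names introduced here, not part of any Artificial Analysis tooling.

```python
# Summary figures copied from the table above.
# Each tuple is (Granite 4.0 H Small, Qwen3 VL 235B A22B Reasoning).
summary = {
    "Intelligence Index": (10.8, 27.6),
    "Coding Index": (8.5, 20.9),
    "Math Index": (13.7, 88.3),
    "Output speed (tok/s)": (401.2, 35.6),
    "Blended price (USD per 1M tokens)": (0.11, 2.17),
    "Time to first token (s)": (8.75, 5.14),
}


def ratio(metric: str) -> float:
    """Return the Qwen3 value divided by the Granite value for one metric."""
    granite, qwen = summary[metric]
    return qwen / granite


if __name__ == "__main__":
    for metric in summary:
        print(f"{metric}: Qwen3 / Granite = {ratio(metric):.2f}x")
```

A ratio above 1 means the Qwen3 figure is larger. Whether that is better depends on the row: higher is better for the index scores and output speed, and lower is better for price and time to first token.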

Benchmark data from Artificial Analysis.