Claude Sonnet 4.6 (Non-reasoning, High Effort) vs Granite 3.3 8B (Non-reasoning)

Anthropic vs IBM: side-by-side benchmark comparison

Metric	Claude Sonnet 4.6 (Non-reasoning, High Effort)	Granite 3.3 8B (Non-reasoning)
Intelligence Index	44.4	7.0
Coding Index	46.4	3.4
Math Index	—	6.7
Output speed (tok/s)	55.2	453.9
Blended price ($/1M tokens)	$6.56	$0.09
Time to first token (s)	1.07	21.19
AIME	—	4.7%
AIME 2025	—	6.7%
GPQA	79.9%	33.8%
HLE	13.2%	4.2%
IFBench	41.2%	22.4%
LCR	57.7%	4.3%
LiveCodeBench	—	12.7%
MATH-500	—	66.5%
MMLU-Pro	—	46.8%
SciCode	46.9%	10.1%
Tau2-Bench	79.5%	10.5%
Terminal-Bench Hard	46.2%	0.0%

Benchmark data from Artificial Analysis.
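
The output speed and time-to-first-token rows pull in opposite directions, so it can help to combine them into a rough end-to-end estimate. The sketch below assumes a simple model that is not part of the source data: total response time equals time to first token plus output length divided by output speed. The function names `estimated_latency` and `break_even_tokens` are hypothetical, and real latency depends on prompt length, load, and provider.

```python
# Figures copied from the table above. The latency model is an assumption:
# total time = time to first token + output tokens / output speed.
MODELS = {
    "Claude Sonnet 4.6": {"ttft_s": 1.07, "tok_per_s": 55.2},
    "Granite 3.3 8B": {"ttft_s": 21.19, "tok_per_s": 453.9},
}


def estimated_latency(ttft_s, tok_per_s, n_tokens):
    """Estimated seconds to receive n_tokens under the assumed model."""
    return ttft_s + n_tokens / tok_per_s


def break_even_tokens(a, b):
    """Output length at which models a and b take equally long.

    Solves a.ttft + n / a.speed == b.ttft + n / b.speed for n.
    """
    return (b["ttft_s"] - a["ttft_s"]) / (1 / a["tok_per_s"] - 1 / b["tok_per_s"])


for name, m in MODELS.items():
    seconds = estimated_latency(m["ttft_s"], m["tok_per_s"], 500)
    print(f"{name}: {seconds:.1f} s for a 500-token response")

n = break_even_tokens(MODELS["Claude Sonnet 4.6"], MODELS["Granite 3.3 8B"])
print(f"Break-even output length: {n:.0f} tokens")
```

Under this assumed model, a 500-token response takes roughly 10.1 s from Claude Sonnet 4.6 and roughly 22.3 s from Granite 3.3 8B, and Granite's higher throughput only pays off for outputs longer than about 1,264 tokens.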