Olmo 3.1 32B Think vs Llama 3.2 Instruct 3B

Allen Institute for AI vs Meta — side-by-side benchmark comparison

Metric	Olmo 3.1 32B Think	Llama 3.2 Instruct 3B
Intelligence Index	13.9	9.7
Coding Index	9.8	—
Math Index	77.3	3.3
Output speed (tok/s)	0.0	52.7
Blended price ($/1M tokens)	$0.00	$0.15
Time to first token (s)	0.00s	0.65s
AIME	—	6.7%
AIME 2025	77.3%	3.3%
GPQA	59.1%	25.5%
HLE	6.0%	5.2%
IFBench	66.0%	26.2%
LCR	0.0%	2.0%
LiveCodeBench	69.5%	8.3%
MATH-500	—	48.9%
MMLU-Pro	76.3%	34.7%
SciCode	29.3%	5.2%
Tau2-Bench	0.0%	21.1%
Terminal-Bench Hard	0.0%	—

Benchmark data from Artificial Analysis.