
Olmo 3 7B Think vs Hermes 4 - Llama-3.1 405B (Reasoning)

Allen Institute for AI vs Nous Research — side-by-side benchmark comparison

| Metric | Olmo 3 7B Think | Hermes 4 - Llama-3.1 405B (Reasoning) |
| --- | --- | --- |
| Intelligence Index | 9.4 | 18.6 |
| Coding Index | 7.6 | 16.0 |
| Math Index | 70.7 | 69.7 |
| Output speed (tok/s) | 0.0 | 38.6 |
| Blended price ($/1M) | $0.00 | $1.50 |
| Time to first token (s) | 0.00s | 0.79s |
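| Benchmark | Olmo 3 7B Think | Hermes 4 - Llama-3.1 405B (Reasoning) |
| --- | --- | --- |
| aime | | |
| aime 25 | 70.7% | 69.7% |
| artificial analysis coding index | 7.60 | 16.00 |
| artificial analysis intelligence index | 9.40 | 18.60 |
| artificial analysis math index | 70.70 | 69.70 |
| gpqa | 51.6% | 72.7% |
| hle | 5.7% | 10.3% |
| ifbench | 41.5% | 32.7% |
| lcr | 0.0% | 20.7% |
| livecodebench | 61.7% | 68.6% |
| math 500 | | |
| mmlu pro | 65.5% | 82.9% |
| scicode | 21.2% | 25.2% |
| tau2 | 0.0% | 22.2% |
| terminalbench hard | 0.8% | 11.4% |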
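
To read the per-benchmark gaps at a glance, the figures above can be turned into percentage-point differences. The following is a minimal sketch, not part of the original comparison: the names `SCORES`, `gaps`, and `leaders` are hypothetical, and the rows with no value or with a 0.0% score (`aime`, `math 500`, `lcr`, `tau2`) are left out on the assumption that they may reflect missing data rather than measured results.

```python
# Scores in percent, copied from the figures above.
# Tuple order: (Olmo 3 7B Think, Hermes 4 - Llama-3.1 405B (Reasoning)).
# Rows with no value or a 0.0% score are omitted (assumed possibly missing data).
SCORES = {
    "aime 25": (70.7, 69.7),
    "gpqa": (51.6, 72.7),
    "hle": (5.7, 10.3),
    "ifbench": (41.5, 32.7),
    "livecodebench": (61.7, 68.6),
    "mmlu pro": (65.5, 82.9),
    "scicode": (21.2, 25.2),
    "terminalbench hard": (0.8, 11.4),
}


def gaps(scores):
    """Return {benchmark: Hermes minus Olmo} in percentage points, rounded to 0.1."""
    return {name: round(hermes - olmo, 1) for name, (olmo, hermes) in scores.items()}


def leaders(scores):
    """Return {benchmark: 'Olmo' | 'Hermes' | 'tie'} based on the higher score."""
    result = {}
    for name, (olmo, hermes) in scores.items():
        if olmo > hermes:
            result[name] = "Olmo"
        elif hermes > olmo:
            result[name] = "Hermes"
        else:
            result[name] = "tie"
    return result


if __name__ == "__main__":
    lead = leaders(SCORES)
    for name, gap in gaps(SCORES).items():
        print(f"{name:20s} {gap:+6.1f} pp  (leader: {lead[name]})")
```

A positive gap means the Hermes model scored higher on that benchmark; a negative gap means the Olmo model did.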

Benchmark data from Artificial Analysis.