Hermes 4 - Llama-3.1 405B (Reasoning) vs GPT-4o (May '24)

Nous Research vs OpenAI — side-by-side benchmark comparison

Metric: Hermes 4 - Llama-3.1 405B (Reasoning) vs GPT-4o (May '24)
Intelligence Index: 18.6 vs 14.5
Coding Index: 16.0 vs 24.2
Math Index: 69.7
Output speed (tok/s): 38.6 vs 111.8
Blended price ($/1M): $1.50 vs $7.50
Time to first token (s): 0.79 vs 0.61
aime: 11.0%
aime 25: 69.7%
artificial analysis coding index: 16.00 vs 24.20
artificial analysis intelligence index: 18.60 vs 14.50
artificial analysis math index: 69.70
gpqa: 72.7% vs 52.6%
hle: 10.3% vs 2.8%
ifbench: 32.7%
lcr: 20.7%
livecodebench: 68.6% vs 33.4%
math 500: 79.1%
mmlu pro: 82.9% vs 74.0%
scicode: 25.2% vs 30.9%
tau2: 22.2%
terminalbench hard: 11.4%
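
To make the gaps in the two-value rows above easier to read, here is a minimal Python sketch that computes the absolute difference and the ratio for each metric. The names `rows` and `compare` are illustrative, not part of any Artificial Analysis tooling, and the sketch assumes the first value in each row belongs to Hermes 4 and the second to GPT-4o, following the order of the model names. Rows that list only one value are left out because there is nothing to compare.

```python
# Values copied from the rows that list a figure for both models.
# Tuple order is an assumption: (Hermes 4 405B Reasoning, GPT-4o May '24).
rows = {
    "Intelligence Index": (18.6, 14.5),
    "Coding Index": (16.0, 24.2),
    "Output speed (tok/s)": (38.6, 111.8),
    "Blended price ($/1M)": (1.50, 7.50),
    "Time to first token (s)": (0.79, 0.61),
    "gpqa (%)": (72.7, 52.6),
    "hle (%)": (10.3, 2.8),
    "livecodebench (%)": (68.6, 33.4),
    "mmlu pro (%)": (82.9, 74.0),
    "scicode (%)": (25.2, 30.9),
}


def compare(table):
    """Return {metric: (difference, ratio)} with both computed as Hermes relative to GPT-4o."""
    return {
        name: (round(hermes - gpt4o, 2), round(hermes / gpt4o, 2))
        for name, (hermes, gpt4o) in table.items()
    }


for name, (diff, ratio) in compare(rows).items():
    print(f"{name}: difference {diff:+}, ratio {ratio}x")
```

A ratio above 1 means the Hermes figure is larger, which is not always better: for blended price and time to first token, lower is better.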

Benchmark data from Artificial Analysis.