Hermes 4 - Llama-3.1 405B (Non-reasoning) vs GPT-4o (Aug '24)

Nous Research vs OpenAI — side-by-side benchmark comparison

Metric: Hermes 4 - Llama-3.1 405B (Non-reasoning) / GPT-4o (Aug '24)
Intelligence Index: 17.6 / 18.6
Coding Index: 18.1 / 16.6
Math Index: 15.3
Output speed (tok/s): 40.8 / 117.5
Blended price ($/1M): $1.50 / $4.38
Time to first token (s): 0.73 / 0.60
AIME: 11.7%
AIME 25: 15.3%
Artificial Analysis Coding Index: 18.10 / 16.60
Artificial Analysis Intelligence Index: 17.60 / 18.60
Artificial Analysis Math Index: 15.30
GPQA: 53.6% / 52.1%
HLE: 4.2% / 2.9%
IFBench: 34.8% / 36.0%
LCR: 20.0% / 35.0%
LiveCodeBench: 54.6% / 31.7%
MATH 500: 79.5%
MMLU Pro: 72.9%
SciCode: 34.6% / 33.1%
tau2: 26.6% / 28.9%
TerminalBench Hard: 9.8% / 8.3%

Benchmark data from Artificial Analysis.
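
The speed, price, and latency rows are easier to compare as ratios. The following is a minimal Python sketch that does this arithmetic. It assumes the first value in each row belongs to Hermes 4 and the second to GPT-4o, following the order of the column header. The names `ROWS`, `ratio`, and `summarize` are illustrative and do not come from Artificial Analysis.

```python
# Two-value rows copied from the comparison above.
# Assumption: tuple order is (Hermes 4 405B, GPT-4o Aug '24), as in the header.
ROWS = {
    "Output speed (tok/s)": (40.8, 117.5),
    "Blended price ($/1M)": (1.50, 4.38),
    "Time to first token (s)": (0.73, 0.60),
}


def ratio(first: float, second: float) -> float:
    """Return second / first, rounded to two decimal places."""
    return round(second / first, 2)


def summarize(rows: dict) -> dict:
    """Map each metric name to the GPT-4o / Hermes 4 ratio."""
    return {name: ratio(a, b) for name, (a, b) in rows.items()}


if __name__ == "__main__":
    for name, r in summarize(ROWS).items():
        print(f"{name}: GPT-4o / Hermes 4 = {r}x")
```

On these figures, GPT-4o (Aug '24) generates output about 2.88 times as fast and costs about 2.92 times as much per 1M tokens, while its time to first token is about 0.82 times that of Hermes 4.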