Hermes 4 - Llama-3.1 70B (Reasoning) vs GPT-3.5 Turbo

Nous Research vs OpenAI — side-by-side benchmark comparison

Values are listed as Hermes 4 - Llama-3.1 70B (Reasoning) / GPT-3.5 Turbo. A row with a single value has data for only one of the two models.

Intelligence Index: 16.0 / 9.0
Coding Index: 14.4 / 10.7
Math Index: 68.7
Output speed (tok/s): 92.8 / 116.9
Blended price ($/1M tokens): $0.20 / $0.75
Time to first token (s): 0.64 / 0.56

aime
aime 25: 68.7%
artificial analysis coding index: 14.40 / 10.70
artificial analysis intelligence index: 16.00 / 9.00
artificial analysis math index: 68.70
gpqa: 69.9% / 29.7%
hle: 7.9%
ifbench: 31.3%
lcr: 6.7%
livecodebench: 65.3%
math 500: 44.1%
mmlu pro: 81.1% / 46.2%
scicode: 34.1%
tau2: 22.5%
terminalbench hard: 4.5%

Benchmark data from Artificial Analysis.
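
The summary figures are easier to compare as ratios. The following is a minimal sketch, not part of the original page: it copies the two-model summary values from above into a dictionary and divides the Hermes 4 value by the GPT-3.5 Turbo value. The names `summary` and `ratio` are hypothetical, chosen for illustration only, and Math Index is left out because the page gives a single value for it.

```python
# Summary values copied from the comparison above.
# Tuple order: (Hermes 4 - Llama-3.1 70B (Reasoning), GPT-3.5 Turbo).
summary = {
    "Intelligence Index": (16.0, 9.0),
    "Coding Index": (14.4, 10.7),
    "Output speed (tok/s)": (92.8, 116.9),
    "Blended price ($/1M)": (0.20, 0.75),
    "Time to first token (s)": (0.64, 0.56),
}


def ratio(metric: str) -> float:
    """Return the Hermes 4 value divided by the GPT-3.5 Turbo value."""
    hermes, gpt35 = summary[metric]
    return hermes / gpt35


if __name__ == "__main__":
    for metric in summary:
        print(f"{metric}: Hermes 4 / GPT-3.5 Turbo = {ratio(metric):.2f}")
```

A ratio above 1 means Hermes 4 has the larger value. Whether larger is better depends on the metric: higher is better for the indices and output speed, lower is better for price and time to first token.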