Hermes 4 - Llama-3.1 70B (Reasoning) vs Mistral Large 2 (Jul '24)

Nous Research vs Mistral — side-by-side benchmark comparison

Hermes 4 - Llama-3.1 70B (Reasoning) vs Mistral Large 2 (Jul '24)
Intelligence Index: 16.0 vs 13.0
Coding Index: 14.4
Math Index: 68.7 vs 0.0
Output speed (tok/s): 92.8 vs 0.0
Blended price ($/1M): $0.20 vs $3.00
Time to first token (s): 0.64s vs 0.00s
AIME: 9.3%
AIME 25: 68.7% vs 0.0%
Artificial Analysis Coding Index: 14.40
Artificial Analysis Intelligence Index: 16.00 vs 13.00
Artificial Analysis Math Index: 68.70 vs 0.0%
GPQA: 69.9% vs 47.2%
HLE: 7.9% vs 3.2%
IFBench: 31.3% vs 31.6%
LCR: 6.7% vs 1.7%
LiveCodeBench: 65.3% vs 26.7%
MATH 500: 71.4%
MMLU Pro: 81.1% vs 68.3%
SciCode: 34.1% vs 27.1%
tau2: 22.5% vs 33.0%
TerminalBench Hard: 4.5%

Benchmark data from Artificial Analysis.