Hermes 4 - Llama-3.1 405B (Reasoning) vs Qwen2.5 Instruct 32B

Nous Research vs Alibaba — side-by-side benchmark comparison

Hermes 4 - Llama-3.1 405B (Reasoning) | Qwen2.5 Instruct 32B
Intelligence Index: 18.6 | 13.2
Coding Index: 16.0
Math Index: 69.7
Output speed (tok/s): 38.6 | 0.0
Blended price ($/1M): $1.50 | $0.00
Time to first token (s): 0.79s | 0.00s
AIME: 11.0%
AIME 25: 69.7%
Artificial Analysis Coding Index: 16.00
Artificial Analysis Intelligence Index: 18.60 | 13.20
Artificial Analysis Math Index: 69.70
GPQA: 72.7% | 46.6%
HLE: 10.3% | 3.8%
IFBench: 32.7%
LCR: 20.7%
LiveCodeBench: 68.6% | 24.8%
MATH 500: 80.5%
MMLU Pro: 82.9% | 69.7%
SciCode: 25.2% | 22.9%
tau2: 22.2%
TerminalBench Hard: 11.4%

Benchmark data from Artificial Analysis.