Hermes 4 - Llama-3.1 70B (Non-reasoning) vs Qwen3.5 0.8B (Non-reasoning)

Nous Research vs Alibaba — side-by-side benchmark comparison

Metric: Hermes 4 - Llama-3.1 70B (Non-reasoning) | Qwen3.5 0.8B (Non-reasoning)
Intelligence Index: 12.6 | 9.9
Coding Index: 9.2 | 1.0
Math Index: 11.3
Output speed (tok/s): 94.3 | 96.3
Blended price ($/1M): $0.20 | $0.02
Time to first token (s): 0.61 | 0.26

AIME: n/a
AIME 25: 11.3%
Artificial Analysis Coding Index: 9.20 | 1.00
Artificial Analysis Intelligence Index: 12.60 | 9.90
Artificial Analysis Math Index: 11.30
GPQA: 49.1% | 23.6%
HLE: 3.6% | 4.9%
IFBench: 29.0% | 21.6%
LCR: 2.0% | 6.7%
LiveCodeBench: 26.9%
MATH-500: n/a
MMLU-Pro: 66.4%
SciCode: 27.7% | 2.9%
tau2: 21.6% | 65.2%
Terminal-Bench Hard: 0.0% | 0.0%

Benchmark data from Artificial Analysis.