Hermes 4 - Llama-3.1 70B (Non-reasoning) vs Qwen3 32B (Non-reasoning)

Nous Research vs Alibaba — side-by-side benchmark comparison

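Metric | Hermes 4 - Llama-3.1 70B (Non-reasoning) | Qwen3 32B (Non-reasoning)
Intelligence Index | 12.6 | 14.5
Coding Index | 9.2 (one model only)
Math Index | 11.3 | 19.7
Output speed (tok/s) | 94.3 | 94.2
Blended price ($/1M tokens) | $0.20 | $0.26
Time to first token (s) | 0.61 | 1.12
AIME | 30.3% (one model only)
AIME 2025 | 11.3% | 19.7%
Artificial Analysis Coding Index | 9.20 (one model only)
Artificial Analysis Intelligence Index | 12.60 | 14.50
Artificial Analysis Math Index | 11.30 | 19.70
GPQA | 49.1% | 53.5%
HLE | 3.6% | 4.3%
IFBench | 29.0% | 31.5%
LCR | 2.0% | 0.0%
LiveCodeBench | 26.9% | 28.8%
MATH-500 | 86.9% (one model only)
MMLU-Pro | 66.4% | 72.7%
SciCode | 27.7% | 28.0%
tau2 | 21.6% (one model only)
Terminal-Bench Hard | 0.0% (one model only)

Rows marked "one model only" list a single score; this text does not preserve which model's column the score belongs to.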

Benchmark data from Artificial Analysis.