Hermes 4 - Llama-3.1 405B (Reasoning) vs Qwen3 VL 32B (Reasoning)

Nous Research vs Alibaba — side-by-side benchmark comparison

Metric                                  Hermes 4 - Llama-3.1 405B (Reasoning)  Qwen3 VL 32B (Reasoning)
Intelligence Index                      18.6                                   24.7
Coding Index                            16.0                                   14.5
Math Index                              69.7                                   84.7
Output speed (tok/s)                    38.6                                   96.3
Blended price ($/1M tokens)             $1.50                                  $2.63
Time to first token (s)                 0.79s                                  1.12s

Benchmark                               Hermes 4 - Llama-3.1 405B (Reasoning)  Qwen3 VL 32B (Reasoning)
aime                                    -                                      -
aime 25                                 69.7%                                  84.7%
artificial analysis coding index        16.00                                  14.50
artificial analysis intelligence index  18.60                                  24.70
artificial analysis math index          69.70                                  84.70
gpqa                                    72.7%                                  73.3%
hle                                     10.3%                                  9.6%
ifbench                                 32.7%                                  59.4%
lcr                                     20.7%                                  55.3%
livecodebench                           68.6%                                  73.8%
math 500                                -                                      -
mmlu pro                                82.9%                                  81.8%
scicode                                 25.2%                                  28.5%
tau2                                    22.2%                                  45.6%
terminalbench hard                      11.4%                                  7.6%
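
The per-benchmark figures above can be tallied programmatically. The following is a minimal sketch, not part of the original comparison: the `scores` dictionary and the helper names `leaders` and `gaps` are hypothetical, and the values are copied by hand from the percentage rows above (rows without data, `aime` and `math 500`, are left out).

```python
# Scores copied from the comparison above, in percent:
# (Hermes 4 - Llama-3.1 405B, Qwen3 VL 32B). Names below are illustrative.
scores = {
    "aime 25": (69.7, 84.7),
    "gpqa": (72.7, 73.3),
    "hle": (10.3, 9.6),
    "ifbench": (32.7, 59.4),
    "lcr": (20.7, 55.3),
    "livecodebench": (68.6, 73.8),
    "mmlu pro": (82.9, 81.8),
    "scicode": (25.2, 28.5),
    "tau2": (22.2, 45.6),
    "terminalbench hard": (11.4, 7.6),
}


def leaders(table):
    """Return (hermes_wins, qwen_wins) as lists of benchmark names."""
    hermes = [name for name, (h, q) in table.items() if h > q]
    qwen = [name for name, (h, q) in table.items() if q > h]
    return hermes, qwen


def gaps(table):
    """Percentage-point gap (Qwen minus Hermes), rounded to one decimal."""
    return {name: round(q - h, 1) for name, (h, q) in table.items()}


hermes_wins, qwen_wins = leaders(scores)
largest = max(gaps(scores).items(), key=lambda kv: abs(kv[1]))

print("Hermes leads on:", hermes_wins)
print("Qwen leads on:", qwen_wins)
print("Largest gap (benchmark, points):", largest)
```

On these figures, Hermes 4 leads on hle, mmlu pro, and terminalbench hard, Qwen3 VL leads on the other seven, and the widest gap is on lcr.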

Benchmark data from Artificial Analysis.