Hermes 4 - Llama-3.1 70B (Non-reasoning) vs Qwen3 VL 30B A3B (Reasoning)

Nous Research vs Alibaba — side-by-side benchmark comparison

Metric                                    Hermes 4 - Llama-3.1 70B (Non-reasoning)    Qwen3 VL 30B A3B (Reasoning)
Intelligence Index                        12.6                                        19.7
Coding Index                              9.2                                         13.1
Math Index                                11.3                                        82.3
Output speed (tok/s)                      94.3                                        123.6
Blended price ($/1M)                      $0.20                                       $0.34
Time to first token (s)                   0.61s                                       1.09s
aime
aime 25                                   11.3%                                       82.3%
artificial analysis coding index          9.20                                        13.10
artificial analysis intelligence index    12.60                                       19.70
artificial analysis math index            11.30                                       82.30
gpqa                                      49.1%                                       72.0%
hle                                       3.6%                                        8.7%
ifbench                                   29.0%                                       45.1%
lcr                                       2.0%                                        40.7%
livecodebench                             26.9%                                       69.7%
math 500
mmlu pro                                  66.4%                                       80.7%
scicode                                   27.7%                                       28.8%
tau2                                      21.6%                                       19.9%
terminalbench hard                        0.0%                                        5.3%
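
To make the size of each gap easier to read, the percentage-scored rows above can be turned into per-benchmark differences. The following is a minimal sketch, not part of the original comparison: the names `SCORES`, `score_gaps`, and `hermes_leads` are hypothetical, the values are copied by hand from the table, and the rows without values (aime, math 500) are left out.

```python
# Percentage-scored benchmarks copied by hand from the table above.
# Mapping: benchmark -> (Hermes 4 - Llama-3.1 70B, Qwen3 VL 30B A3B).
# Rows with no values in the source (aime, math 500) are omitted.
SCORES = {
    "aime 25": (11.3, 82.3),
    "gpqa": (49.1, 72.0),
    "hle": (3.6, 8.7),
    "ifbench": (29.0, 45.1),
    "lcr": (2.0, 40.7),
    "livecodebench": (26.9, 69.7),
    "mmlu pro": (66.4, 80.7),
    "scicode": (27.7, 28.8),
    "tau2": (21.6, 19.9),
    "terminalbench hard": (0.0, 5.3),
}


def score_gaps(scores):
    """Return (benchmark, gap) pairs sorted by gap, largest first.

    The gap is the Qwen3 score minus the Hermes score, in percentage
    points, rounded to one decimal to hide floating-point noise.
    """
    gaps = [
        (name, round(qwen - hermes, 1))
        for name, (hermes, qwen) in scores.items()
    ]
    return sorted(gaps, key=lambda item: item[1], reverse=True)


def hermes_leads(scores):
    """Return the benchmarks on which the Hermes score is strictly higher."""
    return [name for name, (hermes, qwen) in scores.items() if hermes > qwen]


if __name__ == "__main__":
    for name, gap in score_gaps(SCORES):
        print(f"{name:<20}{gap:+6.1f} pp")
    print("Hermes leads on:", ", ".join(hermes_leads(SCORES)))
```

A positive gap means Qwen3 VL 30B A3B scored higher on that benchmark. The sketch covers only accuracy-style rows; speed, price, and time to first token use different units and are best compared directly in the table.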

Benchmark data from Artificial Analysis.