Hermes 4 - Llama-3.1 405B (Non-reasoning) vs Qwen3 VL 30B A3B (Reasoning)

Nous Research vs Alibaba — side-by-side benchmark comparison

Metric                        Hermes 4 - Llama-3.1 405B (Non-reasoning)   Qwen3 VL 30B A3B (Reasoning)
Intelligence Index            17.6                                        19.7
Coding Index                  18.1                                        13.1
Math Index                    15.3                                        82.3
Output speed (tok/s)          40.8                                        123.6
Blended price ($/1M tokens)   $1.50                                       $0.34
Time to first token (s)       0.73                                        1.09

Benchmark                                 Hermes 4 - Llama-3.1 405B (Non-reasoning)   Qwen3 VL 30B A3B (Reasoning)
aime
aime 25                                   15.3%                                       82.3%
artificial analysis coding index          18.10                                       13.10
artificial analysis intelligence index    17.60                                       19.70
artificial analysis math index            15.30                                       82.30
gpqa                                      53.6%                                       72.0%
hle                                       4.2%                                        8.7%
ifbench                                   34.8%                                       45.1%
lcr                                       20.0%                                       40.7%
livecodebench                             54.6%                                       69.7%
math 500
mmlu pro                                  72.9%                                       80.7%
scicode                                   34.6%                                       28.8%
tau2                                      26.6%                                       19.9%
terminalbench hard                        9.8%                                        5.3%

Benchmark data from Artificial Analysis.