Hermes 3 - Llama-3.1 70B vs Qwen3 VL 4B (Reasoning)

Nous Research vs Alibaba — side-by-side benchmark comparison

Metric                                    Hermes 3 - Llama-3.1 70B    Qwen3 VL 4B (Reasoning)
Intelligence Index                        10.6                        13.7
Output speed (tok/s)                      33.2                        0.0
Blended price ($/1M)                      $0.30                       $0.00
Time to first token (s)                   0.38                        0.00
artificial analysis intelligence index    10.60                       13.70
gpqa                                      40.1%                       49.4%
hle                                       4.1%                        4.4%
livecodebench                             18.8%                       32.0%
mmlu pro                                  57.1%                       70.0%
scicode                                   23.1%                       17.1%

Scores listed for only one of the two models (the source does not indicate which):

Metric                                    Value
Coding Index                              6.7
Math Index                                25.7
aime                                      2.3%
aime 25                                   25.7%
artificial analysis coding index          6.70
artificial analysis math index            25.70
ifbench                                   36.6%
lcr                                       21.3%
math 500                                  53.8%
tau2                                      15.5%
terminalbench hard                        1.5%

Benchmark data from Artificial Analysis.