Hermes 3 - Llama-3.1 70B vs Qwen3.5 27B (Non-reasoning)

Nous Research vs Alibaba — side-by-side benchmark comparison

Metric	Hermes 3 - Llama-3.1 70B	Qwen3.5 27B (Non-reasoning)
Intelligence Index	10.6	37.2
Coding Index	—	33.4
Math Index	—	—
Output speed (tok/s)	33.2	95.3
Blended price ($/1M tokens)	0.30	0.88
Time to first token (s)	0.38	1.40
AIME	2.3%	—
AIME 25	—	—
GPQA	40.1%	84.2%
HLE	4.1%	13.2%
IFBench	—	46.9%
LCR	—	55.7%
LiveCodeBench	18.8%	—
MATH-500	53.8%	—
MMLU-Pro	57.1%	—
SciCode	23.1%	36.7%
τ²-Bench	—	87.1%
Terminal-Bench Hard	—	31.8%
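
The two models share only a handful of rows, so a quick way to read the table is to compute the Qwen-to-Hermes ratio wherever both columns have a value. The snippet below is a minimal sketch of that arithmetic: the numbers are copied from the table above, while the `ROWS` dictionary and the `ratios` helper are illustrative names that do not come from Artificial Analysis.

```python
# Values copied from the comparison table as (Hermes 3, Qwen3.5 27B).
# None marks an entry the table leaves blank.
# ROWS and ratios() are hypothetical names used only for this illustration.
ROWS = {
    "Intelligence Index": (10.6, 37.2),
    "Output speed (tok/s)": (33.2, 95.3),
    "Blended price ($/1M)": (0.30, 0.88),
    "Time to first token (s)": (0.38, 1.40),
    "GPQA (%)": (40.1, 84.2),
    "HLE (%)": (4.1, 13.2),
    "SciCode (%)": (23.1, 36.7),
    "AIME (%)": (2.3, None),
}


def ratios(rows):
    """Return the Qwen/Hermes ratio for each row where both values exist."""
    return {
        name: round(qwen / hermes, 2)
        for name, (hermes, qwen) in rows.items()
        if hermes is not None and qwen is not None
    }


for name, ratio in ratios(ROWS).items():
    print(f"{name}: {ratio}x")
```

A ratio above 1 favors Qwen3.5 27B for scores and output speed, but favors Hermes 3 for price and time to first token, where lower is better.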

Benchmark data from Artificial Analysis.