Claude Sonnet 4.6 (Non-reasoning, Low Effort) vs Hermes 4 - Llama-3.1 405B (Non-reasoning)

Anthropic vs Nous Research — side-by-side benchmark comparison

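| Metric | Claude Sonnet 4.6 (Non-reasoning, Low Effort) | Hermes 4 - Llama-3.1 405B (Non-reasoning) |
| --- | --- | --- |
| Intelligence Index | 42.6 | 17.6 |
| Coding Index | 43.0 | 18.1 |
| Output speed (tok/s) | 54.9 | 40.8 |
| Blended price ($/1M) | $6.56 | $1.50 |
| Time to first token (s) | 1.13 | 0.73 |
| artificial analysis coding index | 43.00 | 18.10 |
| artificial analysis intelligence index | 42.60 | 17.60 |
| gpqa | 79.7% | 53.6% |
| hle | 10.8% | 4.2% |
| ifbench | 42.4% | 34.8% |
| lcr | 58.7% | 20.0% |
| scicode | 44.1% | 34.6% |
| tau2 | 78.9% | 26.6% |
| terminalbench hard | 42.4% | 9.8% |

Reported for one model only (the source layout does not preserve which column):

Math Index: 15.3
aime 25: 15.3%
artificial analysis math index: 15.30
livecodebench: 54.6%
mmlu pro: 72.9%

No value reported for either model: aime, math 500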

Benchmark data from Artificial Analysis.