Claude Opus 4.7 (Non-reasoning, High Effort) vs Hermes 4 - Llama-3.1 405B (Reasoning)

Anthropic vs Nous Research — side-by-side benchmark comparison

Metric: Claude Opus 4.7 (Non-reasoning, High Effort) | Hermes 4 - Llama-3.1 405B (Reasoning)

Intelligence Index: 51.8 | 18.6
Coding Index: 53.1 | 16.0
Math Index: 69.7 (one model only)
Output speed (tok/s): 47.8 | 38.6
Blended price ($/1M): $10.94 | $1.50
Time to first token (s): 1.04 | 0.79

aime: n/a
aime 25: 69.7% (one model only)
artificial analysis coding index: 53.10 | 16.00
artificial analysis intelligence index: 51.80 | 18.60
artificial analysis math index: 69.70 (one model only)
gpqa: 88.5% | 72.7%
hle: 31.2% | 10.3%
ifbench: 43.6% | 32.7%
lcr: 67.0% | 20.7%
livecodebench: 68.6% (one model only)
math 500: n/a
mmlu pro: 82.9% (one model only)
scicode: 50.1% | 25.2%
tau2: 74.0% | 22.2%
terminalbench hard: 54.5% | 11.4%
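
To make the size of each difference easier to read, the paired figures can be turned into percentage-point gaps and a price ratio. The following is a minimal sketch, not part of the original comparison. It assumes the first value in each row belongs to Claude Opus 4.7 and the second to Hermes 4 405B (the order of the column headers), and it only uses rows where both models have a score. The names `scores`, `gaps`, and `price_ratio` are hypothetical helpers introduced for illustration.

```python
# Paired benchmark scores (percent) copied from the rows above.
# Assumption: tuple order is (Claude Opus 4.7, Hermes 4 405B), following the header order.
scores = {
    "gpqa": (88.5, 72.7),
    "hle": (31.2, 10.3),
    "ifbench": (43.6, 32.7),
    "lcr": (67.0, 20.7),
    "scicode": (50.1, 25.2),
    "tau2": (74.0, 22.2),
    "terminalbench hard": (54.5, 11.4),
}


def gaps(pairs):
    """Return {benchmark: first minus second, in percentage points, rounded to 1 decimal}."""
    return {name: round(a - b, 1) for name, (a, b) in pairs.items()}


def price_ratio(first, second):
    """Return how many times more expensive the first blended price is than the second."""
    return first / second


if __name__ == "__main__":
    # List benchmarks from the largest gap to the smallest.
    for name, gap in sorted(gaps(scores).items(), key=lambda kv: kv[1], reverse=True):
        print(f"{name:20s} {gap:+.1f} pts")
    # Blended prices in $/1M tokens, taken from the summary rows.
    print(f"blended price ratio: {price_ratio(10.94, 1.50):.2f}x")
```

Rows with a single value are left out on purpose, since the source does not say which model that value belongs to.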

Benchmark data from Artificial Analysis.