
Hermes 4 - Llama-3.1 405B (Non-reasoning) vs DeepSeek V3.2 Exp (Reasoning)

Nous Research vs DeepSeek — side-by-side benchmark comparison

| Metric | Hermes 4 - Llama-3.1 405B (Non-reasoning) | DeepSeek V3.2 Exp (Reasoning) |
| --- | --- | --- |
| Intelligence Index | 17.6 | 32.9 |
| Coding Index | 18.1 | 33.3 |
| Math Index | 15.3 | 87.7 |
| Output speed (tok/s) | 40.8 | 0.0 |
| Blended price ($/1M) | $1.50 | $0.31 |
| Time to first token (s) | 0.73s | 0.00s |
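
| Benchmark | Hermes 4 - Llama-3.1 405B (Non-reasoning) | DeepSeek V3.2 Exp (Reasoning) |
| --- | --- | --- |
| aime | | |
| aime 25 | 15.3% | 87.7% |
| artificial analysis coding index | 18.10 | 33.30 |
| artificial analysis intelligence index | 17.60 | 32.90 |
| artificial analysis math index | 15.30 | 87.70 |
| gpqa | 53.6% | 79.7% |
| hle | 4.2% | 13.8% |
| ifbench | 34.8% | 54.1% |
| lcr | 20.0% | 69.0% |
| livecodebench | 54.6% | 78.9% |
| math 500 | | |
| mmlu pro | 72.9% | 85.0% |
| scicode | 34.6% | 37.7% |
| tau2 | 26.6% | 33.9% |
| terminalbench hard | 9.8% | 31.1% |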
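
To see where the two models differ most, the percentage benchmarks listed above can be turned into per-benchmark gaps. The following is a minimal sketch, not part of the original comparison: the `SCORES` dictionary and the `score_gaps` helper are hypothetical names introduced for illustration, and the numbers are copied by hand from the scores above. Benchmarks without reported values (aime, math 500) are left out.

```python
# Scores copied by hand from the comparison above, in percent:
# (Hermes 4 - Llama-3.1 405B, DeepSeek V3.2 Exp).
# SCORES and score_gaps are illustrative names, not from the source page.
SCORES = {
    "aime 25": (15.3, 87.7),
    "gpqa": (53.6, 79.7),
    "hle": (4.2, 13.8),
    "ifbench": (34.8, 54.1),
    "lcr": (20.0, 69.0),
    "livecodebench": (54.6, 78.9),
    "mmlu pro": (72.9, 85.0),
    "scicode": (34.6, 37.7),
    "tau2": (26.6, 33.9),
    "terminalbench hard": (9.8, 31.1),
}


def score_gaps(scores):
    """Return (benchmark, gap) pairs sorted by gap, largest first.

    The gap is the DeepSeek score minus the Hermes score,
    in percentage points, rounded to one decimal place.
    """
    gaps = [(name, round(b - a, 1)) for name, (a, b) in scores.items()]
    return sorted(gaps, key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    for name, gap in score_gaps(SCORES):
        print(f"{name:<20} {gap:+.1f} pp")
```

Sorting by gap puts aime 25 first (72.4 points) and scicode last (3.1 points), which matches a direct reading of the scores.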

Benchmark data from Artificial Analysis.