DeepHermes 3 - Llama-3.1 8B Preview (Non-reasoning) vs Claude 2.1

Nous Research vs Anthropic — side-by-side benchmark comparison

Metric                                  DeepHermes 3 - Llama-3.1 8B Preview (Non-reasoning)  Claude 2.1
Intelligence Index                      7.6                                                  9.3
Coding Index                            14.0 (single value listed; model not indicated)
Math Index                              -                                                    -
Output speed (tok/s)                    0.0                                                  0.0
Blended price ($/1M tokens)             $0.00                                                $0.00
Time to first token (s)                 0.00s                                                0.00s
AIME                                    0.0%                                                 3.3%
AIME 25                                 -                                                    -
Artificial Analysis Coding Index        14.00 (single value listed; model not indicated)
Artificial Analysis Intelligence Index  7.60                                                 9.30
Artificial Analysis Math Index          -                                                    -
GPQA                                    27.0%                                                31.9%
HLE                                     4.3%                                                 4.2%
IFBench                                 -                                                    -
LCR                                     -                                                    -
LiveCodeBench                           8.5%                                                 19.5%
MATH 500                                21.8%                                                37.4%
MMLU Pro                                36.5%                                                49.5%
SciCode                                 9.1%                                                 18.4%
tau2                                    -                                                    -
TerminalBench Hard                      -                                                    -

Benchmark data from Artificial Analysis.