Grok 4.20 0309 v2 (Non-reasoning) vs Llama 3.1 Tulu3 405B

xAI vs Allen Institute for AI — side-by-side benchmark comparison

Metric: Grok 4.20 0309 v2 (Non-reasoning) | Llama 3.1 Tulu3 405B
Intelligence Index: 29.0 | 14.1
Coding Index: 22.0
Math Index
Output speed (tok/s): 175.2 | 0.0
Blended price ($/1M): $3.00 | $0.00
Time to first token (s): 0.47s | 0.00s
aime: 13.3%
aime 25
artificial analysis coding index: 22.00
artificial analysis intelligence index: 29.00 | 14.10
artificial analysis math index
gpqa: 77.6% | 51.6%
hle: 24.2% | 3.5%
ifbench: 49.3%
lcr: 17.3%
livecodebench: 29.1%
math 500: 77.8%
mmlu pro: 71.6%
scicode: 32.8% | 30.2%
tau2: 59.9%
terminalbench hard: 16.7%

Benchmark data from Artificial Analysis.