Llama 3.1 Nemotron Ultra 253B v1 (Reasoning) vs Qwen3.5 0.8B (Non-reasoning)

NVIDIA vs Alibaba — side-by-side benchmark comparison

| Metric | Llama 3.1 Nemotron Ultra 253B v1 (Reasoning) | Qwen3.5 0.8B (Non-reasoning) |
| --- | --- | --- |
| Intelligence Index | 15.0 | 9.9 |
| Coding Index | 13.1 | 1.0 |
| Math Index | 63.7 | n/a |
| Output speed (tok/s) | 52.3 | 96.3 |
| Blended price ($/1M tokens) | $0.90 | $0.02 |
| Time to first token (s) | 0.76 | 0.26 |
| AIME | 74.7% | n/a |
| AIME 25 | 63.7% | n/a |
| Artificial Analysis Coding Index | 13.10 | 1.00 |
| Artificial Analysis Intelligence Index | 15.00 | 9.90 |
| Artificial Analysis Math Index | 63.70 | n/a |
| GPQA | 72.8% | 23.6% |
| HLE | 8.1% | 4.9% |
| IFBench | 38.2% | 21.6% |
| LCR | 7.3% | 6.7% |
| LiveCodeBench | 64.1% | n/a |
| MATH 500 | 95.2% | n/a |
| MMLU Pro | 82.5% | n/a |
| SciCode | 34.7% | 2.9% |
| Tau2 | 11.4% | 65.2% |
| TerminalBench Hard | 2.3% | 0.0% |

Benchmark data from Artificial Analysis.