
Llama 3.1 Nemotron Ultra 253B v1 (Reasoning) vs Qwen3 235B A22B 2507 Instruct

NVIDIA vs Alibaba — side-by-side benchmark comparison

| Metric | Llama 3.1 Nemotron Ultra 253B v1 (Reasoning) | Qwen3 235B A22B 2507 Instruct |
| --- | --- | --- |
| Intelligence Index | 15.0 | 25.0 |
| Coding Index | 13.1 | 22.1 |
| Math Index | 63.7 | 71.7 |
| Output speed (tok/s) | 52.3 | 57.0 |
| Blended price ($/1M tokens) | $0.90 | $0.36 |
| Time to first token (s) | 0.76 | 1.34 |
| AIME | 74.7% | 71.7% |
| AIME 25 | 63.7% | 71.7% |
| Artificial Analysis Coding Index | 13.10 | 22.10 |
| Artificial Analysis Intelligence Index | 15.00 | 25.00 |
| Artificial Analysis Math Index | 63.70 | 71.70 |
| GPQA | 72.8% | 75.3% |
| HLE | 8.1% | 10.6% |
| IFBench | 38.2% | 46.1% |
| LCR | 7.3% | 31.2% |
| LiveCodeBench | 64.1% | 52.4% |
| MATH 500 | 95.2% | 98.0% |
| MMLU Pro | 82.5% | 82.8% |
| SciCode | 34.7% | 36.0% |
| tau2 | 11.4% | 33.3% |
| TerminalBench Hard | 2.3% | 15.2% |
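
To read the comparison at a glance, the rows above can be tallied by which model leads each one. The following is a minimal sketch, not part of the source page: the `ROWS` dictionary, the `leader` and `tally` helpers, and the short model labels are hypothetical names introduced for illustration, and the "higher is better" flags are assumptions (price and time to first token are treated as lower-is-better). The three "Artificial Analysis ... index" rows are omitted because they repeat the index rows at the top.

```python
# Values transcribed from the comparison above as (Nemotron Ultra, Qwen3 235B, higher_is_better).
# The higher_is_better flag is an assumption of this sketch, not stated by the source.
ROWS = {
    "Intelligence Index": (15.0, 25.0, True),
    "Coding Index": (13.1, 22.1, True),
    "Math Index": (63.7, 71.7, True),
    "Output speed (tok/s)": (52.3, 57.0, True),
    "Blended price ($/1M)": (0.90, 0.36, False),   # cheaper is better
    "Time to first token (s)": (0.76, 1.34, False),  # faster is better
    "aime": (74.7, 71.7, True),
    "aime 25": (63.7, 71.7, True),
    "gpqa": (72.8, 75.3, True),
    "hle": (8.1, 10.6, True),
    "ifbench": (38.2, 46.1, True),
    "lcr": (7.3, 31.2, True),
    "livecodebench": (64.1, 52.4, True),
    "math 500": (95.2, 98.0, True),
    "mmlu pro": (82.5, 82.8, True),
    "scicode": (34.7, 36.0, True),
    "tau2": (11.4, 33.3, True),
    "terminalbench hard": (2.3, 15.2, True),
}


def leader(nemotron: float, qwen: float, higher_is_better: bool) -> str:
    """Return which model leads on one row, or 'tie' if the values are equal."""
    if nemotron == qwen:
        return "tie"
    nemotron_leads = (nemotron > qwen) == higher_is_better
    return "Nemotron" if nemotron_leads else "Qwen3"


def tally(rows: dict) -> dict:
    """Count how many rows each model leads."""
    counts = {"Nemotron": 0, "Qwen3": 0, "tie": 0}
    for nemotron, qwen, higher_is_better in rows.values():
        counts[leader(nemotron, qwen, higher_is_better)] += 1
    return counts


if __name__ == "__main__":
    for name, (nemotron, qwen, hib) in ROWS.items():
        print(f"{name:28s} {leader(nemotron, qwen, hib)}")
    print(tally(ROWS))
```

A simple win count like this weights every row equally, so it says nothing about the size of each gap; it is only a quick orientation aid.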

Benchmark data from Artificial Analysis.