Llama 3.1 Nemotron Nano 4B v1.1 (Reasoning) vs Qwen3 4B 2507 (Reasoning)

NVIDIA vs Alibaba — side-by-side benchmark comparison

Metric	Llama 3.1 Nemotron Nano 4B v1.1 (Reasoning)	Qwen3 4B 2507 (Reasoning)
Intelligence Index	14.4	18.2
Coding Index	—	9.5
Math Index	50.0	82.7
Output speed (tok/s)	0.0	0.0
Blended price (USD per 1M tokens)	0.00	0.00
Time to first token (s)	0.00	0.00
AIME	70.7%	—
AIME 2025	50.0%	82.7%
GPQA	40.8%	66.7%
HLE	5.1%	5.9%
IFBench	25.5%	49.8%
LCR	0.0%	37.7%
LiveCodeBench	49.3%	64.1%
MATH-500	94.7%	—
MMLU-Pro	55.6%	74.3%
SciCode	10.1%	25.6%
τ²-Bench	11.7%	25.4%
Terminal-Bench Hard	—	1.5%
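
The per-benchmark gap between the two models can be read off the table by subtraction. The following is a minimal Python sketch of that arithmetic, not part of the original comparison. The `scores` dictionary and the `point_gaps` helper are hypothetical names introduced here for illustration. The values are copied from the rows above, and a `None` entry stands for a cell shown as "—". Rows where either model has no reported score are skipped rather than treated as zero.

```python
# Scores copied from the table above, as (Nemotron Nano 4B v1.1, Qwen3 4B 2507).
# None marks a cell shown as "—" in the table (no reported result).
# The dictionary and function names are illustrative, not from the source.
scores = {
    "aime": (70.7, None),
    "aime 25": (50.0, 82.7),
    "gpqa": (40.8, 66.7),
    "hle": (5.1, 5.9),
    "ifbench": (25.5, 49.8),
    "lcr": (0.0, 37.7),
    "livecodebench": (49.3, 64.1),
    "math 500": (94.7, None),
    "mmlu pro": (55.6, 74.3),
    "scicode": (10.1, 25.6),
    "tau2": (11.7, 25.4),
    "terminalbench hard": (None, 1.5),
}


def point_gaps(table):
    """Return {benchmark: Qwen3 score minus Nemotron score} in percentage points.

    Only rows where both models have a reported score are included.
    """
    return {
        name: round(qwen - nemotron, 1)
        for name, (nemotron, qwen) in table.items()
        if nemotron is not None and qwen is not None
    }


gaps = point_gaps(scores)

# Print the shared benchmarks, largest gap first.
for name, gap in sorted(gaps.items(), key=lambda item: item[1], reverse=True):
    print(f"{name:<15} {gap:+.1f} pts")
```

On the nine benchmarks that both models report, every gap is positive, meaning Qwen3 4B 2507 scores higher on each. The largest gap is on lcr (37.7 points) and the smallest is on hle (0.8 points).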

Benchmark data from Artificial Analysis.