Exaone 4.0 1.2B (Reasoning) vs Qwen3 235B A22B (Non-reasoning)

LG AI Research vs Alibaba — side-by-side benchmark comparison

Metric	Exaone 4.0 1.2B (Reasoning)	Qwen3 235B A22B (Non-reasoning)
Intelligence Index	8.3	17.0
Coding Index	3.1	14.0
Math Index	50.3	23.7
Output speed (tok/s)	—	60.8
Blended price ($/1M tokens)	—	$0.79
Time to first token (s)	—	1.30
AIME	—	32.7%
AIME 2025	50.3%	23.7%
GPQA	51.5%	61.3%
HLE	5.8%	4.7%
IFBench	23.0%	36.6%
LCR	0.0%	0.0%
LiveCodeBench	51.6%	34.3%
MATH-500	—	90.2%
MMLU-Pro	58.8%	76.2%
SciCode	9.3%	29.9%
Tau2	16.4%	27.2%
Terminal-Bench Hard	0.0%	6.1%

Benchmark data from Artificial Analysis.