Best AI Models for Product Managers

Frontier models ranked by the PM Index — our product-management-weighted blend of Artificial Analysis benchmarks — next to price and context window. Use the slider to re-rank by quality or value for your budget.

Updated June 15, 2026

Which AI model should I use?

Pick your task and budget for a data-backed recommendation.

Recommended for writing PRDs

Claude Opus 4.8

PM Index 64.2 · $10.00/1M tok

See the full ranking for Writing PRDs →
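One simple way to read a recommendation like this: take the highest PM Index among models whose blended price fits the budget. The sketch below assumes that rule; the page does not state how the recommender actually works, so `recommend` and its tie-breaking are hypothetical. The rows are copied from the ranking table on this page.

```python
from typing import NamedTuple, Optional


class Model(NamedTuple):
    name: str
    pm_index: float
    price: Optional[float]  # blended $/1M tokens; None where the table shows no price


# A handful of rows from the ranking table (not the full list).
MODELS = [
    Model("Claude Fable 5", 67.8, 20.00),
    Model("Claude Opus 4.8", 64.2, 10.00),
    Model("GPT-5.5", 62.9, 11.25),
    Model("GPT-5.4", 59.1, 5.63),
    Model("Qwen3.7 Max", 58.0, 3.75),
    Model("MiniMax-M3", 56.4, 0.52),
]


def recommend(models, budget):
    """Assumed rule: best PM Index among models priced at or under the budget."""
    affordable = [m for m in models if m.price is not None and m.price <= budget]
    # Returns None when nothing fits the budget.
    return max(affordable, key=lambda m: m.pm_index, default=None)


print(recommend(MODELS, budget=10.00).name)  # Claude Opus 4.8
print(recommend(MODELS, budget=1.00).name)   # MiniMax-M3
```

With a $10.00 budget this rule lands on the same model shown above, but a task-specific ranking could weight the benchmarks differently and pick another model.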

Quality vs price

Higher = smarter (PM Index), left = cheaper. Ringed models are the best value: no other model beats them on both quality and price.

Labs shown: Anthropic, OpenAI, Alibaba, Google, MiniMax, Xiaomi, Kimi, xAI, DeepSeek, Z AI
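The "best value" set described here is what is usually called a Pareto frontier: a model stays on it unless some other model is at least as cheap and at least as good, and strictly better on one of the two. Below is a minimal sketch of that check, using a small subset of rows from the ranking table. The function name `value_frontier` is hypothetical, and the chart's exact rule may differ.

```python
def value_frontier(points):
    """Return names of models that no other model beats on both price and PM Index.

    points: list of (name, price, pm_index) tuples.
    """
    frontier = []
    for name, price, score in points:
        # Dominated: another model is no more expensive, no worse,
        # and strictly better on at least one axis.
        dominated = any(
            p <= price and s >= score and (p < price or s > score)
            for _, p, s in points
        )
        if not dominated:
            frontier.append(name)
    return frontier


# (name, blended $/1M tokens, PM Index), copied from the table on this page.
POINTS = [
    ("MiMo-V2.5", 0.18, 51.6),
    ("DeepSeek V4 Flash", 0.18, 48.7),
    ("MiniMax-M3", 0.52, 56.4),
    ("Kimi K2.6", 1.71, 55.6),
    ("Gemini 3.5 Flash", 3.38, 57.3),
    ("Claude Opus 4.8", 10.00, 64.2),
    ("GPT-5.5", 11.25, 62.9),
]

print(value_frontier(POINTS))
# ['MiMo-V2.5', 'MiniMax-M3', 'Gemini 3.5 Flash', 'Claude Opus 4.8']
```

This result holds only for the seven rows listed; running the same check over the full table would add or remove models from the frontier.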

Optimize ranking for: Best quality or Best value (the table below is sorted by Value)

#	Model and lab	PM Index	AA Intelligence	Blended $/1M tok	Value
1	Muse Spark Meta	53.7	52.2	—	10.0
2	MiniMax-M3 MiniMax	56.4	54.7	$0.52	10.0
3	MiMo-V2.5-Pro Xiaomi	55.7	53.8	$0.54	9.9
4	Qwen3.7 Plus Alibaba	55.0	53.3	$0.59	9.8
5	DeepSeek V4 Pro DeepSeek	54.2	51.5	$0.54	9.8
6	MiMo-V2.5 Xiaomi	51.6	49.0	$0.18	9.8
7	GLM-5-Turbo Z AI	49.7	46.8	—	9.6
8	MiniMax-M2.7 MiniMax	51.2	49.6	$0.52	9.5
9	DeepSeek V4 Flash DeepSeek	48.7	46.5	$0.18	9.5
10	Kimi K2.6 Kimi	55.6	53.9	$1.71	9.2
11	Grok 4.3 xAI	54.5	53.2	$1.56	9.2
12	Qwen3.6 Plus Alibaba	51.6	50.0	$1.13	9.2
13	Hy3-preview Tencent	44.1	41.9	$0.20	9.1
14	Step 3.7 Flash StepFun	45.4	42.6	$0.44	9.1
15	GLM-5 Z AI	51.9	49.8	$1.55	9.0
16	MiMo-V2-Pro Xiaomi	51.1	49.2	$1.50	9.0
17	DeepSeek V3.2 DeepSeek	43.4	41.7	$0.34	8.9
18	MiMo-V2-Flash Xiaomi	42.2	41.5	$0.15	8.9
19	Nemotron 3 Ultra 550B A55B NVIDIA	48.6	47.7	$1.18	8.9
20	Kimi K2.5 Kimi	48.5	46.8	$1.19	8.9
21	MiniMax-M2.5 MiniMax	44.2	41.9	$0.52	8.9
22	GPT-5.4 mini OpenAI	51.2	48.9	$1.69	8.9
23	GLM-5.1 Z AI	53.7	51.4	$2.15	8.8
24	Qwen3.6 27B Alibaba	48.3	45.8	$1.35	8.8
25	Grok 4.1 Fast xAI	40.0	38.6	—	8.8
26	Gemini 3 Flash Preview Google	46.7	46.4	$1.13	8.8
27	Step 3.5 Flash 2603 StepFun	40.1	38.5	$0.15	8.8
28	Step 3.5 Flash StepFun	40.0	37.8	$0.15	8.7
29	GLM-4.7 Z AI	44.1	42.1	$1.00	8.6
30	Command A+ Cohere	37.2	37.2	$0.00	8.6
31	MiniMax-M2.1 MiniMax	40.3	39.4	$0.52	8.6
32	Gemini 3.5 Flash Google	57.3	55.3	$3.38	8.4
33	Qwen3.6 Max Preview Alibaba	53.7	51.8	$2.92	8.4
34	Kimi K2 Thinking Kimi	41.7	40.9	$1.07	8.4
35	MiniMax-M2 MiniMax	37.7	36.1	$0.52	8.3
36	Qwen3.7 Max Alibaba	58.0	56.6	$3.75	8.3
37	NVIDIA Nemotron 3 Super 120B A12B NVIDIA	36.4	36.0	$0.41	8.3
38	K-EXAONE LG AI Research	32.8	32.1	—	8.2
39	ERNIE 5.0 Thinking Preview Baidu	31.2	29.1	—	8.0
40	EXAONE 4.5 33B LG AI Research	30.7	30.2	—	8.0
41	DeepSeek V3.2 Exp DeepSeek	32.1	32.9	$0.31	8.0
42	Grok 4.20 0309 v2 xAI	49.3	49.3	$3.00	8.0
43	Grok 4.20 0309 xAI	48.4	48.5	$3.00	7.9
44	Gemini 3.1 Pro Preview Google	57.4	57.2	$4.50	7.8
45	Nova 2.0 Lite Amazon	34.0	34.5	$0.85	7.8
46	GLM-4.6 Z AI	34.0	32.5	$0.96	7.8
47	Nemotron Cascade 2 30B A3B NVIDIA	27.7	28.4	—	7.7
48	GPT-5.1 OpenAI	48.1	47.7	$3.44	7.6
49	Solar Pro 3 Upstage	26.4	25.9	—	7.6
50	Mistral Small 4 Mistral	27.1	27.8	$0.26	7.5
51	Kimi K2 0905 Kimi	31.8	30.9	$1.07	7.5
52	Sonar Reasoning Pro Perplexity	24.6	24.6	—	7.4
53	GPT-5.4 OpenAI	59.1	56.8	$5.63	7.3
54	Mistral Medium 3.5 Mistral	41.6	39.2	$3.00	7.3
55	NVIDIA Nemotron 3 Nano 30B A3B NVIDIA	22.7	24.3	$0.10	7.2
56	GPT-5.2 OpenAI	52.8	51.3	$4.81	7.2
57	DeepSeek V3.1 Terminus DeepSeek	33.1	33.9	$1.92	7.1
58	Solar Open 100B Upstage	21.1	21.7	—	7.1
59	Gemini 3 Pro Preview Google	48.9	48.4	$4.50	7.1
60	Kimi K2 Kimi	25.5	26.3	$1.03	7.0
61	Solar Pro 2 (Preview) Upstage	18.8	18.8	—	6.9
62	Mistral Large 3 Mistral	22.6	22.8	$0.75	6.9
63	Sonar Reasoning Perplexity	17.9	17.9	—	6.8
64	Mistral Medium 3.1 Mistral	21.8	21.3	$0.80	6.8
65	Claude Sonnet 4.6 Anthropic	53.9	51.7	$6.00	6.7
66	Nova 2.0 Pro Preview Amazon	37.4	35.7	$3.44	6.7
67	Llama Nemotron Super 49B v1.5 NVIDIA	16.5	18.7	$0.18	6.7
68	Sonar Perplexity	15.5	15.5	—	6.6
69	Sonar Pro Perplexity	15.2	15.2	—	6.6
70	EXAONE 4.0 32B LG AI Research	15.0	16.7	—	6.6
71	Granite 4.1 30B IBM	14.1	14.7	—	6.5
72	Solar Pro 2 Upstage	13.9	14.9	—	6.5
73	Llama 4 Maverick Meta	15.9	18.4	$0.47	6.4
74	Gemini 2.5 Pro Google	34.0	34.6	$3.44	6.4
75	R1 1776 Perplexity	12.0	12.0	—	6.3
76	Granite 4.1 8B IBM	11.6	12.4	$0.06	6.3
77	Solar Mini Upstage	11.9	11.9	$0.15	6.3
78	Nova Lite Amazon	10.6	12.7	$0.10	6.2
79	Llama 4 Scout Meta	11.2	13.5	$0.29	6.1
80	Llama 3.3 Instruct 70B Meta	13.0	14.5	$0.61	6.1
81	Granite 4.0 H Small IBM	9.6	10.8	$0.11	6.1
82	ERNIE 4.5 300B A47B Baidu	12.0	15.0	$0.49	6.1
83	Phi-3 Mini Instruct 3.8B Microsoft	9.2	10.1	—	6.1
84	Magistral Medium 1.2 Mistral	26.0	27.1	$2.75	6.0
85	Jamba Reasoning 3B AI21 Labs	8.0	9.6	—	6.0
86	Granite 4.1 3B IBM	7.8	8.5	—	6.0
87	Phi-4 Microsoft	8.4	10.4	$0.22	5.9
88	Exaone 4.0 1.2B LG AI Research	7.2	8.3	—	5.9
89	Granite 4.0 H 1B IBM	7.2	8.0	—	5.9
90	Phi-4 Mini Instruct Microsoft	6.8	8.4	$0.00	5.9
91	Jamba 1.7 Mini AI21 Labs	6.8	8.1	—	5.9
92	Tiny Aya Global Cohere	3.4	4.7	$0.00	5.6
93	Nova Pro Amazon	11.5	13.5	$1.40	5.5
94	Claude Opus 4.8 Anthropic	64.2	61.4	$10.00	5.3
95	Claude Opus 4.7 Anthropic	59.6	57.3	$10.00	4.9
96	Claude Opus 4.6 Anthropic	55.4	52.9	$10.00	4.5
97	Llama 3.1 Instruct 405B Meta	14.9	17.4	$3.69	4.5
98	GPT-5.5 OpenAI	62.9	60.2	$11.25	4.5
99	Jamba 1.5 Large AI21 Labs	10.7	10.7	$3.50	4.3
100	Jamba 1.6 Large AI21 Labs	10.6	10.6	$3.50	4.3
101	Jamba 1.7 Large AI21 Labs	9.3	10.9	$3.50	4.1
102	Nova Premier Amazon	18.0	19.0	$5.00	4.1
103	Grok 4 xAI	41.4	41.5	$11.00	2.7
104	Command-R+ Cohere	8.3	8.3	$6.00	2.7
105	Claude Fable 5 Anthropic	67.8	64.9	$20.00	0.0

Best AI model by PM use case

The right model depends on the job. These rankings reweight the benchmarks for specific product-management tasks.

How the PM Index works

Generic AI leaderboards optimize for coding or competitive math. The PM Index reweights Artificial Analysis benchmarks for the work product managers actually do — emphasizing general reasoning and multi-step competence, and de-emphasizing pure coding. It sits on the same 0–100 scale as the underlying indices so you can compare directly. Price shown is the AA-convention blended rate (3 parts input to 1 part output) per 1M tokens.
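The two calculations described above can be sketched in a few lines. The 3:1 blend comes straight from the text. The benchmark names, weights, scores, and prices below are illustrative assumptions only, since the page does not publish the actual PM Index weights.

```python
def blended_price(input_price, output_price):
    """Blend per-1M-token prices at 3 parts input to 1 part output."""
    return (3 * input_price + output_price) / 4


def weighted_index(scores, weights):
    """Weighted average of benchmark scores.

    Dividing by the total weight keeps the result on the same 0-100
    scale as the underlying scores.
    """
    total = sum(weights.values())
    return sum(scores[name] * w for name, w in weights.items()) / total


# Illustrative prices: $5 per 1M input tokens, $25 per 1M output tokens.
print(blended_price(5.00, 25.00))  # 10.0

# Hypothetical weights that favor reasoning over coding.
weights = {"reasoning": 5, "multi_step": 3, "coding": 2}
scores = {"reasoning": 60, "multi_step": 50, "coding": 70}
print(weighted_index(scores, weights))  # 59.0
```

Raising the `coding` weight in this example would pull the result toward 70, which is the kind of shift a PM-oriented reweighting is meant to avoid.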