46 models in this collection.
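The size, VRAM, and RAM figures in this listing follow a fairly regular pattern for the Q4_K_M entries: roughly 0.5 GB of file size per billion parameters, with listed VRAM about 15% above file size. A minimal sketch of that observed heuristic (the constants are read off this listing, not an official formula; lower-bit quants such as Q2_K and Q3_K_M deviate from it):

```python
def estimated_file_gb(params_b: float, gb_per_billion: float = 0.5) -> float:
    """Rough Q4_K_M file size; ~0.5 GB per billion parameters matches
    entries like GPT-OSS 120B (60 GB) and Qwen3 32B (16 GB)."""
    return params_b * gb_per_billion

def estimated_vram_gb(file_gb: float, overhead: float = 1.15) -> float:
    """The listed VRAM figures run ~15% above file size (60 -> 69,
    16 -> 18.4), presumably leaving room for KV cache and activations."""
    return round(file_gb * overhead, 1)

print(estimated_vram_gb(estimated_file_gb(120)))  # GPT-OSS 120B: 69.0
```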

Kimi K2.5

Kimi · 1000B

500 GB
chat · reasoning · tool-use · moe · +2
VRAM: 575 GB
RAM: 750 GB
Quant: Q4_K_M

GLM 5

GLM · 744B

372 GB
chat · coding · reasoning · tool-use · +2
VRAM: 427.8 GB
RAM: 558 GB
Quant: Q4_K_M

DeepSeek V3.2

DeepSeek · 685B

342.5 GB
chat · coding · reasoning · tool-use · +2
VRAM: 393.9 GB
RAM: 514 GB
Quant: Q4_K_M

DeepSeek R1 671B

DeepSeek · 671B

240 GB
reasoning · moe · frontier · chat
VRAM: 245 GB
RAM: 260 GB
Quant: Q2_K

GLM 4.7

GLM · 355B

177.5 GB
chat · coding · reasoning · tool-use · +2
VRAM: 204.1 GB
RAM: 267 GB
Quant: Q4_K_M

MiMo V2 Flash

MiMo · 309B

154.5 GB
chat · coding · reasoning · tool-use · +2
VRAM: 177.7 GB
RAM: 232 GB
Quant: Q4_K_M

Llama 3.1 Nemotron Ultra 253B

Nemotron · 253B

126.5 GB
chat · reasoning · tool-use · frontier
VRAM: 145.5 GB
RAM: 190 GB
Quant: Q4_K_M

Qwen3 235B A22B

Qwen · 235B

117.5 GB
chat · reasoning · tool-use · moe · +1
VRAM: 135.1 GB
RAM: 177 GB
Quant: Q4_K_M

MiniMax M2.5

MiniMax · 230B

115 GB
chat · coding · reasoning · tool-use · +2
VRAM: 132.3 GB
RAM: 173 GB
Quant: Q4_K_M

Step 3.5 Flash

Step · 196B

98 GB
chat · coding · reasoning · tool-use · +2
VRAM: 112.7 GB
RAM: 147 GB
Quant: Q4_K_M

Devstral 2 123B

Mistral · 123B

61.5 GB
coding · tool-use · reasoning · frontier
VRAM: 70.7 GB
RAM: 93 GB
Quant: Q4_K_M

GPT-OSS 120B

GPT-OSS · 120B

60 GB
chat · coding · reasoning · tool-use · +2
VRAM: 69 GB
RAM: 90 GB
Quant: Q4_K_M

Qwen3 Coder Next 80B A3B

Qwen · 80B

40 GB
coding · tool-use · reasoning · moe · +2
VRAM: 46 GB
RAM: 60 GB
Quant: Q4_K_M

DeepSeek R1 Distill Llama 70B

DeepSeek · 70B

33 GB

Best for: Enterprise-grade local pilots

reasoning · distill · chat
VRAM: 35 GB
RAM: 40 GB
Quant: Q3_K_M

Qwen3.5 35B A3B

Qwen · 35B

17.5 GB

Best for: RTX 5090 and 4090 class systems

chat · reasoning · tool-use · moe · +3
VRAM: 20.1 GB
RAM: 27 GB
Quant: Q4_K_M

Qwen3.6 35B A3B

Qwen · 35B

17.5 GB
chat · coding · reasoning · tool-use · +2
VRAM: 20.1 GB
RAM: 27 GB
Quant: Q4_K_M

DeepSeek R1 Distill Qwen 32B

DeepSeek · 32B

15 GB

Best for: High-value reasoning tasks

reasoning · distill · chat
VRAM: 17 GB
RAM: 20 GB
Quant: Q3_K_M

Qwen3 32B

Qwen · 32B

16 GB
chat · general · reasoning · multilingual
VRAM: 18.4 GB
RAM: 24 GB
Quant: Q4_K_M

QwQ 32B

Qwen · 32B

15 GB

Best for: Expert analytical users

reasoning · chat
VRAM: 17 GB
RAM: 20 GB
Quant: Q3_K_M

Gemma 4 31B

Gemma · 31B

14.5 GB
chat · general · multimodal · reasoning · +1
VRAM: 16.5 GB
RAM: 20 GB
Quant: Q3_K_M

Qwen3 30B A3B

Qwen · 30B

15 GB
chat · reasoning · tool-use · moe
VRAM: 17.3 GB
RAM: 23 GB
Quant: Q4_K_M

Qwen3 Coder 30B A3B

Qwen · 30B

15 GB
coding · tool-use · reasoning · moe · +1
VRAM: 17.3 GB
RAM: 23 GB
Quant: Q4_K_M

Qwen3.5 27B

Qwen · 27B

13.5 GB

Best for: High-capacity local APIs

chat · general · reasoning · tool-use · +2
VRAM: 15.5 GB
RAM: 21 GB
Quant: Q4_K_M

Gemma 4 26B A4B

Gemma · 26B

13.3 GB
chat · general · multimodal · reasoning · +2
VRAM: 15 GB
RAM: 18 GB
Quant: Q3_K_M

Devstral Small 2 24B

Mistral · 24B

12 GB
coding · tool-use · reasoning
VRAM: 13.8 GB
RAM: 18 GB
Quant: Q4_K_M

Mistral Small 24B

Mistral · 24B

14 GB
chat · general · reasoning
VRAM: 16 GB
RAM: 20 GB
Quant: Q4_K_M

Mistral Small 3.1 24B

Mistral · 24B

12.6 GB
chat · general · reasoning · multimodal · +2
VRAM: 14 GB
RAM: 20 GB
Quant: Q4_K_M

GPT-OSS 20B

GPT-OSS · 20B

10 GB
chat · coding · reasoning · tool-use · +1
VRAM: 11.5 GB
RAM: 15 GB
Quant: Q4_K_M

DeepSeek R1 Distill Qwen 14B

DeepSeek · 14B

8.7 GB

Best for: Reasoning-intensive workflows

reasoning · distill · chat
VRAM: 10 GB
RAM: 12 GB
Quant: Q4_K_M

Phi-4 14B

Phi · 14B

8.2 GB

Best for: Analysis-heavy assistants

chat · general · reasoning · coding
VRAM: 9.5 GB
RAM: 12 GB
Quant: Q4_K_M

Phi-4 Reasoning 14B

Phi · 14B

7 GB
chat · reasoning · coding
VRAM: 8 GB
RAM: 11 GB
Quant: Q4_K_M

Phi-4 Reasoning Plus 14B

Phi · 14B

7 GB
chat · reasoning · coding · frontier
VRAM: 8 GB
RAM: 11 GB
Quant: Q4_K_M

Qwen3 14B

Qwen · 14B

7 GB
chat · general · reasoning · multilingual
VRAM: 8 GB
RAM: 11 GB
Quant: Q4_K_M

Qwen3.5 9B

Qwen · 9B

4.5 GB

Best for: Upgraded general local assistant

chat · general · reasoning · tool-use · +2
VRAM: 5.2 GB
RAM: 7 GB
Quant: Q4_K_M

DeepSeek R1 Distill Llama 8B

DeepSeek · 8B

4.9 GB
reasoning · distill · chat
VRAM: 5.5 GB
RAM: 8 GB
Quant: Q4_K_M

Qwen3 8B

Qwen · 8B

4 GB
chat · general · reasoning · multilingual
VRAM: 4.6 GB
RAM: 6 GB
Quant: Q4_K_M

DeepSeek R1 Distill Qwen 7B

DeepSeek · 7B

4.7 GB

Best for: Reasoning tasks

reasoning · distill · chat
VRAM: 5.5 GB
RAM: 8 GB
Quant: Q4_K_M

Gemma 4 E4B

Gemma · 4.5B

4.1 GB

Best for: On-device multimodal assistants

chat · small · multimodal · reasoning · +1
VRAM: 5 GB
RAM: 6 GB
Quant: Q4_K_M

Qwen3 4B

Qwen · 4B

2 GB
chat · small · reasoning
VRAM: 2.3 GB
RAM: 3 GB
Quant: Q4_K_M

Qwen3.5 4B

Qwen · 4B

2 GB
chat · small · reasoning · tool-use · +2
VRAM: 2.3 GB
RAM: 3 GB
Quant: Q4_K_M

Phi-4 Mini 3.8B

Phi · 3.8B

2.3 GB

Best for: Low-VRAM devices

chat · small · reasoning · coding
VRAM: 3 GB
RAM: 4 GB
Quant: Q4_K_M

SmolLM3 3B

SmolLM · 3B

1.5 GB
chat · small · reasoning · tool-use · +1
VRAM: 1.7 GB
RAM: 3 GB
Quant: Q4_K_M

Gemma 4 E2B

Gemma · 2.3B

2.7 GB
chat · small · edge · multimodal · +2
VRAM: 3.5 GB
RAM: 4 GB
Quant: Q4_K_M

Qwen3 1.7B

Qwen · 1.7B

0.9 GB
chat · small · edge · reasoning
VRAM: 1 GB
RAM: 2 GB
Quant: Q4_K_M

DeepSeek R1 Distill Qwen 1.5B

DeepSeek · 1.5B

1 GB
reasoning · distill · small · chat
VRAM: 2 GB
RAM: 4 GB
Quant: Q4_K_M

Qwen3 0.6B

Qwen · 0.6B

0.3 GB
chat · tiny · edge · reasoning
VRAM: 0.3 GB
RAM: 1 GB
Quant: Q4_K_M
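The VRAM and RAM figures above can be used to filter the collection for a given machine. A minimal sketch, with a few entries transcribed from the listing; the field names are illustrative (not an official schema), and treating the two figures as alternative requirements — VRAM for a fully GPU-resident run, RAM for a CPU/offloaded run — is an assumption about how the listing is meant to be read:

```python
from dataclasses import dataclass

@dataclass
class ModelEntry:
    # One card from the collection above; illustrative schema.
    name: str
    family: str
    params_b: float  # parameter count, billions
    size_gb: float   # quantized file size
    vram_gb: float   # listed VRAM requirement
    ram_gb: float    # listed RAM requirement
    quant: str

CATALOG = [
    ModelEntry("Qwen3 0.6B", "Qwen", 0.6, 0.3, 0.3, 1, "Q4_K_M"),
    ModelEntry("GPT-OSS 20B", "GPT-OSS", 20, 10, 11.5, 15, "Q4_K_M"),
    ModelEntry("Qwen3 32B", "Qwen", 32, 16, 18.4, 24, "Q4_K_M"),
    ModelEntry("DeepSeek R1 671B", "DeepSeek", 671, 240, 245, 260, "Q2_K"),
]

def fits(entry: ModelEntry, vram_gb: float, ram_gb: float) -> bool:
    """True if the entry fits either fully in VRAM or in system RAM."""
    return entry.vram_gb <= vram_gb or entry.ram_gb <= ram_gb

# Example: a 24 GB GPU with 32 GB of system RAM.
runnable = [m.name for m in CATALOG if fits(m, 24, 32)]
print(runnable)
```

With those figures, the 671B model drops out while everything up to the 32B class remains runnable on a 24 GB card.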