46 models in this collection.
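The size, VRAM, and RAM figures in this listing follow a fairly regular pattern for the Q4_K_M entries: roughly 0.5 GB of file size per billion parameters, with listed VRAM about 15% above file size. A minimal sketch of that observed heuristic (the constants are read off this listing, not an official formula; lower-bit quants such as Q2_K and Q3_K_M deviate from it):

```python
def estimated_file_gb(params_b: float, gb_per_billion: float = 0.5) -> float:
    """Rough Q4_K_M file size; ~0.5 GB per billion parameters matches
    entries like GPT-OSS 120B (60 GB) and Qwen3 32B (16 GB)."""
    return params_b * gb_per_billion

def estimated_vram_gb(file_gb: float, overhead: float = 1.15) -> float:
    """The listed VRAM figures run ~15% above file size (60 -> 69,
    16 -> 18.4), presumably leaving room for KV cache and activations."""
    return round(file_gb * overhead, 1)

print(estimated_vram_gb(estimated_file_gb(120)))  # GPT-OSS 120B: 69.0
```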

Kimi K2.5

Kimi · 1000B

500 GB
chat · reasoning · tool-use · moe · +2
VRAM: 575 GB
RAM: 750 GB
Quant: Q4_K_M

GLM 5

GLM · 744B

372 GB
chat · coding · reasoning · tool-use · +2
VRAM: 427.8 GB
RAM: 558 GB
Quant: Q4_K_M

DeepSeek V3.2

DeepSeek · 685B

342.5 GB
chat · coding · reasoning · tool-use · +2
VRAM: 393.9 GB
RAM: 514 GB
Quant: Q4_K_M

DeepSeek R1 671B

DeepSeek · 671B

240 GB
reasoning · moe · frontier · chat
VRAM: 245 GB
RAM: 260 GB
Quant: Q2_K

GLM 4.7

GLM · 355B

177.5 GB
chat · coding · reasoning · tool-use · +2
VRAM: 204.1 GB
RAM: 267 GB
Quant: Q4_K_M

MiMo V2 Flash

MiMo · 309B

154.5 GB
chat · coding · reasoning · tool-use · +2
VRAM: 177.7 GB
RAM: 232 GB
Quant: Q4_K_M

Llama 3.1 Nemotron Ultra 253B

Nemotron · 253B

126.5 GB
chat · reasoning · tool-use · frontier
VRAM: 145.5 GB
RAM: 190 GB
Quant: Q4_K_M

Qwen3 235B A22B

Qwen · 235B

117.5 GB
chat · reasoning · tool-use · moe · +1
VRAM: 135.1 GB
RAM: 177 GB
Quant: Q4_K_M

MiniMax M2.5

MiniMax · 230B

115 GB
chat · coding · reasoning · tool-use · +2
VRAM: 132.3 GB
RAM: 173 GB
Quant: Q4_K_M

Step 3.5 Flash

Step · 196B

98 GB
chat · coding · reasoning · tool-use · +2
VRAM: 112.7 GB
RAM: 147 GB
Quant: Q4_K_M

Devstral 2 123B

Mistral · 123B

61.5 GB
coding · tool-use · reasoning · frontier
VRAM: 70.7 GB
RAM: 93 GB
Quant: Q4_K_M

GPT-OSS 120B

GPT-OSS · 120B

60 GB
chat · coding · reasoning · tool-use · +2
VRAM: 69 GB
RAM: 90 GB
Quant: Q4_K_M

Qwen3 Coder Next 80B A3B

Qwen · 80B

40 GB
coding · tool-use · reasoning · moe · +2
VRAM: 46 GB
RAM: 60 GB
Quant: Q4_K_M

DeepSeek R1 Distill Llama 70B

DeepSeek · 70B

33 GB

Best for: Enterprise-grade local pilots

reasoning · distill · chat
VRAM: 35 GB
RAM: 40 GB
Quant: Q3_K_M

Qwen3.5 35B A3B

Qwen · 35B

17.5 GB

Best for: RTX 5090 and 4090 class systems

chat · reasoning · tool-use · moe · +3
VRAM: 20.1 GB
RAM: 27 GB
Quant: Q4_K_M

Qwen3.6 35B A3B

Qwen · 35B

17.5 GB
chat · coding · reasoning · tool-use · +2
VRAM: 20.1 GB
RAM: 27 GB
Quant: Q4_K_M

DeepSeek R1 Distill Qwen 32B

DeepSeek · 32B

15 GB

Best for: High-value reasoning tasks

reasoning · distill · chat
VRAM: 17 GB
RAM: 20 GB
Quant: Q3_K_M

Qwen3 32B

Qwen · 32B

16 GB
chat · general · reasoning · multilingual
VRAM: 18.4 GB
RAM: 24 GB
Quant: Q4_K_M

QwQ 32B

Qwen · 32B

15 GB

Best for: Expert analytical users

reasoning · chat
VRAM: 17 GB
RAM: 20 GB
Quant: Q3_K_M

Gemma 4 31B

Gemma · 31B

14.5 GB
chat · general · multimodal · reasoning · +1
VRAM: 16.5 GB
RAM: 20 GB
Quant: Q3_K_M

Qwen3 30B A3B

Qwen · 30B

15 GB
chat · reasoning · tool-use · moe
VRAM: 17.3 GB
RAM: 23 GB
Quant: Q4_K_M

Qwen3 Coder 30B A3B

Qwen · 30B

15 GB
coding · tool-use · reasoning · moe · +1
VRAM: 17.3 GB
RAM: 23 GB
Quant: Q4_K_M

Qwen3.5 27B

Qwen · 27B

13.5 GB

Best for: High-capacity local APIs

chat · general · reasoning · tool-use · +2
VRAM: 15.5 GB
RAM: 21 GB
Quant: Q4_K_M

Gemma 4 26B A4B

Gemma · 26B

13.3 GB
chat · general · multimodal · reasoning · +2
VRAM: 15 GB
RAM: 18 GB
Quant: Q3_K_M

Devstral Small 2 24B

Mistral · 24B

12 GB
coding · tool-use · reasoning
VRAM: 13.8 GB
RAM: 18 GB
Quant: Q4_K_M

Mistral Small 24B

Mistral · 24B

14 GB
chat · general · reasoning
VRAM: 16 GB
RAM: 20 GB
Quant: Q4_K_M

Mistral Small 3.1 24B

Mistral · 24B

12.6 GB
chat · general · reasoning · multimodal · +2
VRAM: 14 GB
RAM: 20 GB
Quant: Q4_K_M

GPT-OSS 20B

GPT-OSS · 20B

10 GB
chat · coding · reasoning · tool-use · +1
VRAM: 11.5 GB
RAM: 15 GB
Quant: Q4_K_M

DeepSeek R1 Distill Qwen 14B

DeepSeek · 14B

8.7 GB

Best for: Reasoning-intensive workflows

reasoning · distill · chat
VRAM: 10 GB
RAM: 12 GB
Quant: Q4_K_M

Phi-4 14B

Phi · 14B

8.2 GB

Best for: Analysis-heavy assistants

chat · general · reasoning · coding
VRAM: 9.5 GB
RAM: 12 GB
Quant: Q4_K_M

Phi-4 Reasoning 14B

Phi · 14B

7 GB
chat · reasoning · coding
VRAM: 8 GB
RAM: 11 GB
Quant: Q4_K_M

Phi-4 Reasoning Plus 14B

Phi · 14B

7 GB
chat · reasoning · coding · frontier
VRAM: 8 GB
RAM: 11 GB
Quant: Q4_K_M

Qwen3 14B

Qwen · 14B

7 GB
chat · general · reasoning · multilingual
VRAM: 8 GB
RAM: 11 GB
Quant: Q4_K_M

Qwen3.5 9B

Qwen · 9B

4.5 GB

Best for: Upgraded general local assistant

chat · general · reasoning · tool-use · +2
VRAM: 5.2 GB
RAM: 7 GB
Quant: Q4_K_M

DeepSeek R1 Distill Llama 8B

DeepSeek · 8B

4.9 GB
reasoning · distill · chat
VRAM: 5.5 GB
RAM: 8 GB
Quant: Q4_K_M

Qwen3 8B

Qwen · 8B

4 GB
chat · general · reasoning · multilingual
VRAM: 4.6 GB
RAM: 6 GB
Quant: Q4_K_M

DeepSeek R1 Distill Qwen 7B

DeepSeek · 7B

4.7 GB

Best for: Reasoning tasks

reasoning · distill · chat
VRAM: 5.5 GB
RAM: 8 GB
Quant: Q4_K_M

Gemma 4 E4B

Gemma · 4.5B

4.1 GB

Best for: On-device multimodal assistants

chat · small · multimodal · reasoning · +1
VRAM: 5 GB
RAM: 6 GB
Quant: Q4_K_M

Qwen3 4B

Qwen · 4B

2 GB
chat · small · reasoning
VRAM: 2.3 GB
RAM: 3 GB
Quant: Q4_K_M

Qwen3.5 4B

Qwen · 4B

2 GB
chat · small · reasoning · tool-use · +2
VRAM: 2.3 GB
RAM: 3 GB
Quant: Q4_K_M

Phi-4 Mini 3.8B

Phi · 3.8B

2.3 GB

Best for: Low-VRAM devices

chat · small · reasoning · coding
VRAM: 3 GB
RAM: 4 GB
Quant: Q4_K_M

SmolLM3 3B

SmolLM · 3B

1.5 GB
chat · small · reasoning · tool-use · +1
VRAM: 1.7 GB
RAM: 3 GB
Quant: Q4_K_M

Gemma 4 E2B

Gemma · 2.3B

2.7 GB
chat · small · edge · multimodal · +2
VRAM: 3.5 GB
RAM: 4 GB
Quant: Q4_K_M

Qwen3 1.7B

Qwen · 1.7B

0.9 GB
chat · small · edge · reasoning
VRAM: 1 GB
RAM: 2 GB
Quant: Q4_K_M

DeepSeek R1 Distill Qwen 1.5B

DeepSeek · 1.5B

1 GB
reasoning · distill · small · chat
VRAM: 2 GB
RAM: 4 GB
Quant: Q4_K_M

Qwen3 0.6B

Qwen · 0.6B

0.3 GB
chat · tiny · edge · reasoning
VRAM: 0.3 GB
RAM: 1 GB
Quant: Q4_K_M
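The VRAM and RAM figures above can be used to filter the collection for a given machine. A minimal sketch, with a few entries transcribed from the listing; the field names are illustrative (not an official schema), and treating the two figures as alternative requirements — VRAM for a fully GPU-resident run, RAM for a CPU/offloaded run — is an assumption about how the listing is meant to be read:

```python
from dataclasses import dataclass

@dataclass
class ModelEntry:
    # One card from the collection above; illustrative schema.
    name: str
    family: str
    params_b: float  # parameter count, billions
    size_gb: float   # quantized file size
    vram_gb: float   # listed VRAM requirement
    ram_gb: float    # listed RAM requirement
    quant: str

CATALOG = [
    ModelEntry("Qwen3 0.6B", "Qwen", 0.6, 0.3, 0.3, 1, "Q4_K_M"),
    ModelEntry("GPT-OSS 20B", "GPT-OSS", 20, 10, 11.5, 15, "Q4_K_M"),
    ModelEntry("Qwen3 32B", "Qwen", 32, 16, 18.4, 24, "Q4_K_M"),
    ModelEntry("DeepSeek R1 671B", "DeepSeek", 671, 240, 245, 260, "Q2_K"),
]

def fits(entry: ModelEntry, vram_gb: float, ram_gb: float) -> bool:
    """True if the entry fits either fully in VRAM or in system RAM."""
    return entry.vram_gb <= vram_gb or entry.ram_gb <= ram_gb

# Example: a 24 GB GPU with 32 GB of system RAM.
runnable = [m.name for m in CATALOG if fits(m, 24, 32)]
print(runnable)
```

With those figures, the 671B model drops out while everything up to the 32B class remains runnable on a 24 GB card.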