Best long-context local LLMs

chat · general · multimodal · reasoning · +1

Gemma 4 31B

Gemma · 31B

14.5 GB

VRAM: 16.5 GB

RAM: 20 GB

chat · rag · tool-use · multilingual

Command R 35B

Command · 35B

20 GB

VRAM: 22 GB

RAM: 24 GB

chat · rag · tool-use · multilingual · +1

Command R+ 104B

Command · 104B

48 GB

VRAM: 50 GB

RAM: 56 GB

reasoning · distill · small · chat

DeepSeek Coder V2 Lite 16B

DeepSeek · 16B

DeepSeek R1 671B

DeepSeek · 671B

240 GB

reasoning · moe · frontier · chat

VRAM: 245 GB

RAM: 260 GB

Quant: Q2_K

DeepSeek R1 Distill Llama 70B

DeepSeek · 70B

33 GB

Best for Enterprise-grade local pilots

DeepSeek R1 Distill Llama 8B

DeepSeek · 8B

DeepSeek R1 Distill Qwen 1.5B

DeepSeek · 1.5B

1 GB

VRAM: 2 GB

RAM: 4 GB

chat · small · edge · multimodal · +2

DeepSeek R1 Distill Qwen 14B

DeepSeek · 14B

8.7 GB

Best for Reasoning-intensive workflows

DeepSeek R1 Distill Qwen 32B

DeepSeek · 32B

15 GB

Best for High-value reasoning tasks

DeepSeek R1 Distill Qwen 7B

DeepSeek · 7B

4.7 GB

Best for Reasoning tasks

DeepSeek V3 671B

DeepSeek · 671B

240 GB

chat · general · moe · frontier

VRAM: 245 GB

RAM: 260 GB

Quant: Q2_K

Gemma 4 E2B

Gemma · 2.3B

2.7 GB

VRAM: 3.5 GB

RAM: 4 GB

chat · small · multimodal · reasoning · +1

Gemma 4 E4B

Gemma · 4.5B

4.1 GB

Best for On-device multimodal assistants

VRAM: 5 GB

RAM: 6 GB

chat · general · function-calling

Hermes 3 Llama 3.1 70B

Hermes · 70B

40 GB

VRAM: 42 GB

RAM: 48 GB

chat · general · function-calling

Hermes 3 Llama 3.1 8B

Hermes · 8B

4.9 GB

VRAM: 5.5 GB

RAM: 8 GB

Llama 3.1 405B

Llama · 405B

Llama 3.1 70B

Llama · 70B

Llama 3.1 8B

Llama · 8B

Best for General use

Llama 3.1 Nemotron 70B

Nemotron · 70B

Llama 3.2 1B

Llama · 1.24B

0.75 GB

chat · small · edge · instruct

VRAM: 1.5 GB

RAM: 2 GB

Llama 3.2 3B

Llama · 3.21B

2 GB

chat · small · edge · instruct

VRAM: 3 GB

RAM: 4 GB

chat · general · coding · instruct

Llama 3.3 70B

Llama · 70B

33 GB

Best for High-end workstations

VRAM: 35 GB

RAM: 40 GB

chat · moe · multimodal · frontier

Llama 4 Maverick 17B (128E)

Llama · 17B

230 GB

VRAM: 235 GB

RAM: 256 GB

Llama 4 Scout 17B (16E)

Llama · 17B

60 GB

Best for Mid/high-end hardware

Mistral Nemo 12B

Mistral · 12B

7.3 GB

Best for Multilingual assistants

VRAM: 8.5 GB

RAM: 12 GB

chat · general · reasoning · multimodal · +2

Mistral Small 3.1 24B

Mistral · 24B

12.6 GB

VRAM: 14 GB

RAM: 20 GB

Qwen 2.5 1.5B

Qwen · 1.5B

Qwen 2.5 14B

Qwen · 14B

8.7 GB

Best for Internal team assistants

VRAM: 10 GB

RAM: 12 GB

Qwen 2.5 32B

Qwen · 32B

15 GB

Best for High-end local deployments

VRAM: 17 GB

RAM: 20 GB

Qwen 2.5 3B

Qwen · 3B

Qwen 2.5 72B

Qwen · 72B

27 GB

VRAM: 29 GB

RAM: 36 GB

Quant: Q2_K

Qwen 2.5 7B

Qwen · 7B

3.7 GB

Best for General multilingual assistants

VRAM: 4.5 GB

RAM: 6 GB