Best multimodal local LLMs
Vision-capable models that handle text plus images on-device.
19 models in this collection.
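The VRAM figures below roughly track the size of the quantized weights, with a little headroom for KV cache and activations. As a sanity check, here is a minimal sketch of that estimate; the bits-per-weight values are assumed typical averages for llama.cpp K-quants, not exact GGUF numbers:

```python
# Rough VRAM estimate: weights ≈ params * bits_per_weight / 8.
# The bits-per-weight values are approximate averages (assumption),
# so expect the listed figures to differ by a GB or two of overhead.
BITS_PER_WEIGHT = {
    "Q3_K_M": 3.9,  # approximate K-quant average
    "Q4_K_M": 4.8,
    "Q8_0": 8.5,
}

def estimate_weight_gb(params_billions: float, quant: str = "Q4_K_M") -> float:
    """Approximate in-VRAM size of the weights in GB."""
    bits = BITS_PER_WEIGHT[quant]
    return params_billions * 1e9 * bits / 8 / 1e9

# Example: a 27B model at Q4_K_M lands near the ~16-18 GB range listed below.
print(round(estimate_weight_gb(27), 1))  # → 16.2
```

This explains why the two ~27B Gemma/Qwen entries below cluster around 15-18 GB of VRAM at Q4_K_M.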
Kimi K2.5
Kimi · 1000B
Tags: chat, reasoning, tool-use, moe (+2 more)
VRAM: 575 GB · RAM: 750 GB · Quant: Q4_K_M
Mistral Large 3 675B
Mistral · 675B
Tags: chat, general, multimodal, tool-use (+3 more)
VRAM: 388.1 GB · RAM: 507 GB · Quant: Q4_K_M
Qwen3.5 35B A3B
Qwen · 35B
Best for: RTX 5090 and 4090 class systems
Tags: chat, reasoning, tool-use, moe (+3 more)
VRAM: 20.1 GB · RAM: 27 GB · Quant: Q4_K_M
Qwen3.6 35B A3B
Qwen · 35B
Tags: chat, coding, reasoning, tool-use (+2 more)
VRAM: 20.1 GB · RAM: 27 GB · Quant: Q4_K_M
Gemma 4 31B
Gemma · 31B
Tags: chat, general, multimodal, reasoning (+1 more)
VRAM: 16.5 GB · RAM: 20 GB · Quant: Q3_K_M
Gemma 3 27B
Gemma · 27B
Tags: chat, general, multimodal
VRAM: 18 GB · RAM: 20 GB · Quant: Q4_K_M
Qwen3.5 27B
Qwen · 27B
Best for: high-capacity local APIs
Tags: chat, general, reasoning, tool-use (+2 more)
VRAM: 15.5 GB · RAM: 21 GB · Quant: Q4_K_M
Gemma 4 26B A4B
Gemma · 26B
Tags: chat, general, multimodal, reasoning (+2 more)
VRAM: 15 GB · RAM: 18 GB · Quant: Q3_K_M
Mistral Small 3.1 24B
Mistral · 24B
Tags: chat, general, reasoning, multimodal (+2 more)
VRAM: 14 GB · RAM: 20 GB · Quant: Q4_K_M
Llama 4 Maverick 17B (128E)
Llama · 17B
Tags: chat, moe, multimodal, frontier
VRAM: 235 GB · RAM: 256 GB · Quant: Q4_K_M
Llama 4 Scout 17B (16E)
Llama · 17B
Best for: mid/high-end hardware
Tags: chat, moe, multimodal
VRAM: 63 GB · RAM: 68 GB · Quant: Q4_K_M
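For MoE entries like the two Llama 4 models, the headline "17B" is active parameters per token; every expert must still be resident in memory, so VRAM tracks total parameters. A quick sanity check against the listed figures (the total-parameter counts of ~400B for Maverick and ~109B for Scout come from Meta's Llama 4 release; ~4.8 bits/weight is an assumed Q4_K_M average):

```python
# MoE VRAM follows TOTAL params, not the 17B active params.
# Totals (~400B Maverick, ~109B Scout) per Meta's Llama 4 announcement;
# ~4.8 bits/weight is a rough Q4_K_M average, so expect a few GB of slack.
def q4_gb(total_params_billions: float, bits: float = 4.8) -> float:
    return total_params_billions * bits / 8  # billions of params -> GB

print(round(q4_gb(400)))  # vs. 235 GB listed for Maverick
print(round(q4_gb(109)))  # vs. 63 GB listed for Scout
```

The estimates land within a few GB of the listed requirements, which is why a "17B" Maverick needs data-center-class memory while the same active size in Scout fits a high-end workstation.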
Gemma 3 12B
Gemma · 12B
Best for: mid-range GPUs
Tags: chat, multimodal
VRAM: 8.5 GB · RAM: 12 GB · Quant: Q4_K_M
Qwen3.5 9B
Qwen · 9B
Best for: upgraded general local assistant
Tags: chat, general, reasoning, tool-use (+2 more)
VRAM: 5.2 GB · RAM: 7 GB · Quant: Q4_K_M
Gemma 4 E4B
Gemma · 4.5B
Best for: on-device multimodal assistants
Tags: chat, small, multimodal, reasoning (+1 more)
VRAM: 5 GB · RAM: 6 GB · Quant: Q4_K_M
Gemma 3 4B
Gemma · 4B
Tags: chat, small, multimodal
VRAM: 3.5 GB · RAM: 4 GB · Quant: Q4_K_M
Gemma 3n E4B
Gemma · 4B
Tags: chat, small, multimodal
VRAM: 2.3 GB · RAM: 3 GB · Quant: Q4_K_M
Qwen3.5 4B
Qwen · 4B
Tags: chat, small, reasoning, tool-use (+2 more)
VRAM: 2.3 GB · RAM: 3 GB · Quant: Q4_K_M
Gemma 4 E2B
Gemma · 2.3B
Tags: chat, small, edge, multimodal (+2 more)
VRAM: 3.5 GB · RAM: 4 GB · Quant: Q4_K_M
Gemma 3n E2B
Gemma · 2B
Tags: chat, small, edge, multimodal
VRAM: 1.2 GB · RAM: 2 GB · Quant: Q4_K_M
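To pick from this collection, compare each model's listed VRAM requirement against your GPU's free memory, leaving some headroom for context. A minimal sketch, using a subset of the (name, VRAM) pairs copied from the listing above; the 0.9 headroom factor is an assumption, not a rule:

```python
# Sketch: filter this collection by available VRAM.
# (name, vram_gb) pairs are copied from the listing; the 0.9 headroom
# factor (assumption) leaves room for KV cache and display overhead.
MODELS = [
    ("Kimi K2.5", 575.0),
    ("Mistral Large 3 675B", 388.1),
    ("Llama 4 Maverick 17B (128E)", 235.0),
    ("Llama 4 Scout 17B (16E)", 63.0),
    ("Qwen3.5 35B A3B", 20.1),
    ("Gemma 3 27B", 18.0),
    ("Gemma 3 12B", 8.5),
    ("Gemma 4 E4B", 5.0),
    ("Gemma 3n E2B", 1.2),
]

def fits(vram_gb: float, headroom: float = 0.9) -> list[str]:
    """Models whose listed VRAM need fits within headroom * available VRAM."""
    budget = vram_gb * headroom
    return [name for name, need in MODELS if need <= budget]

print(fits(24.0))  # e.g. a 24 GB RTX 4090-class card
```

On a 24 GB card this keeps everything from Qwen3.5 35B A3B down, which matches the "Best for RTX 5090 and 4090 class systems" badge above.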