Best multimodal local LLMs
Vision-capable models that handle text plus images on-device.
19 models in this collection.
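The VRAM figures below roughly track the size of the quantized weights, with a little headroom for KV cache and activations. As a sanity check, here is a minimal sketch of that estimate; the bits-per-weight values are assumed typical averages for llama.cpp K-quants, not exact GGUF numbers:

```python
# Rough VRAM estimate: weights ≈ params * bits_per_weight / 8.
# The bits-per-weight values are approximate averages (assumption),
# so expect the listed figures to differ by a GB or two of overhead.
BITS_PER_WEIGHT = {
    "Q3_K_M": 3.9,  # approximate K-quant average
    "Q4_K_M": 4.8,
    "Q8_0": 8.5,
}

def estimate_weight_gb(params_billions: float, quant: str = "Q4_K_M") -> float:
    """Approximate in-VRAM size of the weights in GB."""
    bits = BITS_PER_WEIGHT[quant]
    return params_billions * 1e9 * bits / 8 / 1e9

# Example: a 27B model at Q4_K_M lands near the ~16-18 GB range listed below.
print(round(estimate_weight_gb(27), 1))  # → 16.2
```

This explains why the two ~27B Gemma/Qwen entries below cluster around 15-18 GB of VRAM at Q4_K_M.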
Kimi K2.5
Kimi · 1000B
Tags: chat, reasoning, tool-use, moe (+2 more)
VRAM: 575 GB · RAM: 750 GB · Quant: Q4_K_M
Mistral Large 3 675B
Mistral · 675B
Tags: chat, general, multimodal, tool-use (+3 more)
VRAM: 388.1 GB · RAM: 507 GB · Quant: Q4_K_M
Qwen3.5 35B A3B
Qwen · 35B
Best for: RTX 5090 and 4090 class systems
Tags: chat, reasoning, tool-use, moe (+3 more)
VRAM: 20.1 GB · RAM: 27 GB · Quant: Q4_K_M
Qwen3.6 35B A3B
Qwen · 35B
Tags: chat, coding, reasoning, tool-use (+2 more)
VRAM: 20.1 GB · RAM: 27 GB · Quant: Q4_K_M
Gemma 4 31B
Gemma · 31B
Tags: chat, general, multimodal, reasoning (+1 more)
VRAM: 16.5 GB · RAM: 20 GB · Quant: Q3_K_M
Gemma 3 27B
Gemma · 27B
Tags: chat, general, multimodal
VRAM: 18 GB · RAM: 20 GB · Quant: Q4_K_M
Qwen3.5 27B
Qwen · 27B
Best for: high-capacity local APIs
Tags: chat, general, reasoning, tool-use (+2 more)
VRAM: 15.5 GB · RAM: 21 GB · Quant: Q4_K_M
Gemma 4 26B A4B
Gemma · 26B
Tags: chat, general, multimodal, reasoning (+2 more)
VRAM: 15 GB · RAM: 18 GB · Quant: Q3_K_M
Mistral Small 3.1 24B
Mistral · 24B
Tags: chat, general, reasoning, multimodal (+2 more)
VRAM: 14 GB · RAM: 20 GB · Quant: Q4_K_M
Llama 4 Maverick 17B (128E)
Llama · 17B
Tags: chat, moe, multimodal, frontier
VRAM: 235 GB · RAM: 256 GB · Quant: Q4_K_M
Llama 4 Scout 17B (16E)
Llama · 17B
Best for: mid/high-end hardware
Tags: chat, moe, multimodal
VRAM: 63 GB · RAM: 68 GB · Quant: Q4_K_M
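For MoE entries like the two Llama 4 models, the headline "17B" is active parameters per token; every expert must still be resident in memory, so VRAM tracks total parameters. A quick sanity check against the listed figures (the total-parameter counts of ~400B for Maverick and ~109B for Scout come from Meta's Llama 4 release; ~4.8 bits/weight is an assumed Q4_K_M average):

```python
# MoE VRAM follows TOTAL params, not the 17B active params.
# Totals (~400B Maverick, ~109B Scout) per Meta's Llama 4 announcement;
# ~4.8 bits/weight is a rough Q4_K_M average, so expect a few GB of slack.
def q4_gb(total_params_billions: float, bits: float = 4.8) -> float:
    return total_params_billions * bits / 8  # billions of params -> GB

print(round(q4_gb(400)))  # vs. 235 GB listed for Maverick
print(round(q4_gb(109)))  # vs. 63 GB listed for Scout
```

The estimates land within a few GB of the listed requirements, which is why a "17B" Maverick needs data-center-class memory while the same active size in Scout fits a high-end workstation.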
Gemma 3 12B
Gemma · 12B
Best for: mid-range GPUs
Tags: chat, multimodal
VRAM: 8.5 GB · RAM: 12 GB · Quant: Q4_K_M
Qwen3.5 9B
Qwen · 9B
Best for: upgraded general local assistant
Tags: chat, general, reasoning, tool-use (+2 more)
VRAM: 5.2 GB · RAM: 7 GB · Quant: Q4_K_M
Gemma 4 E4B
Gemma · 4.5B
Best for: on-device multimodal assistants
Tags: chat, small, multimodal, reasoning (+1 more)
VRAM: 5 GB · RAM: 6 GB · Quant: Q4_K_M
Gemma 3 4B
Gemma · 4B
Tags: chat, small, multimodal
VRAM: 3.5 GB · RAM: 4 GB · Quant: Q4_K_M
Gemma 3n E4B
Gemma · 4B
Tags: chat, small, multimodal
VRAM: 2.3 GB · RAM: 3 GB · Quant: Q4_K_M
Qwen3.5 4B
Qwen · 4B
Tags: chat, small, reasoning, tool-use (+2 more)
VRAM: 2.3 GB · RAM: 3 GB · Quant: Q4_K_M
Gemma 4 E2B
Gemma · 2.3B
Tags: chat, small, edge, multimodal (+2 more)
VRAM: 3.5 GB · RAM: 4 GB · Quant: Q4_K_M
Gemma 3n E2B
Gemma · 2B
Tags: chat, small, edge, multimodal
VRAM: 1.2 GB · RAM: 2 GB · Quant: Q4_K_M
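To pick from this collection, compare each model's listed VRAM requirement against your GPU's free memory, leaving some headroom for context. A minimal sketch, using a subset of the (name, VRAM) pairs copied from the listing above; the 0.9 headroom factor is an assumption, not a rule:

```python
# Sketch: filter this collection by available VRAM.
# (name, vram_gb) pairs are copied from the listing; the 0.9 headroom
# factor (assumption) leaves room for KV cache and display overhead.
MODELS = [
    ("Kimi K2.5", 575.0),
    ("Mistral Large 3 675B", 388.1),
    ("Llama 4 Maverick 17B (128E)", 235.0),
    ("Llama 4 Scout 17B (16E)", 63.0),
    ("Qwen3.5 35B A3B", 20.1),
    ("Gemma 3 27B", 18.0),
    ("Gemma 3 12B", 8.5),
    ("Gemma 4 E4B", 5.0),
    ("Gemma 3n E2B", 1.2),
]

def fits(vram_gb: float, headroom: float = 0.9) -> list[str]:
    """Models whose listed VRAM need fits within headroom * available VRAM."""
    budget = vram_gb * headroom
    return [name for name, need in MODELS if need <= budget]

print(fits(24.0))  # e.g. a 24 GB RTX 4090-class card
```

On a 24 GB card this keeps everything from Qwen3.5 35B A3B down, which matches the "Best for RTX 5090 and 4090 class systems" badge above.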