
48 models in this collection.

Qwen3 0.6B
Qwen · 0.6B · 0.3 GB
Tags: chat, tiny, edge, reasoning
VRAM: 0.3 GB · RAM: 1 GB · Quant: Q4_K_M
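The entries in this collection pair a quantization level (e.g. Q4_K_M, Q3_K_M, FP16) with a model size and VRAM figure. A minimal sketch of how the two relate, assuming approximate average bits-per-weight values for each quantization type (these are rough community estimates, not exact for any particular file):

```python
# Approximate average bits per weight for common GGUF quantization
# types. These are rough estimates; real files vary slightly.
BITS_PER_WEIGHT = {"Q4_K_M": 4.85, "Q3_K_M": 3.91, "FP16": 16.0}

def model_size_gb(params_billions: float, quant: str) -> float:
    """Estimate model-weight size in GB (1 GB = 1e9 bytes)."""
    bits = BITS_PER_WEIGHT[quant]
    return params_billions * 1e9 * bits / 8 / 1e9

# An 8B model at Q4_K_M comes out near the ~4.9 GB listed for the
# 8B entries below; actual VRAM needs add KV-cache and overhead.
print(round(model_size_gb(8, "Q4_K_M"), 2))
```

Note that the listed VRAM figures run somewhat above the raw weight size, since inference also needs room for the KV cache and activation buffers.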

Nomic Embed Text v1.5
Nomic · 0.137B · 0.27 GB
Tags: embedding, rag
VRAM: 0.5 GB · RAM: 1 GB · Quant: FP16

BGE Large EN v1.5
BGE · 0.335B · 0.67 GB
Tags: embedding, rag
VRAM: 1 GB · RAM: 2 GB · Quant: FP16

mxbai-embed-large
Mixedbread · 0.335B · 0.67 GB
Tags: embedding, rag
VRAM: 1 GB · RAM: 2 GB · Quant: FP16
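The models tagged "embedding, rag" produce fixed-length vectors that a retrieval pipeline ranks by cosine similarity. A minimal sketch of that ranking step, using tiny hypothetical 4-dimensional vectors in place of the real 768/1024-dimensional embeddings these models emit:

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

# Hypothetical toy vectors; a real pipeline would get these from
# an embedding model such as the ones listed above.
query = [0.1, 0.3, 0.2, 0.9]
docs = {
    "doc_a": [0.1, 0.31, 0.19, 0.88],  # nearly parallel to the query
    "doc_b": [0.9, 0.02, 0.4, 0.1],
}
best = max(docs, key=lambda d: cosine(query, docs[d]))
print(best)
```

In a full RAG setup the top-ranked chunks are then pasted into the prompt of one of the chat models below.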

Qwen 2.5 0.5B
Qwen · 0.5B · 0.4 GB
Tags: chat, tiny, edge
VRAM: 1 GB · RAM: 2 GB · Quant: Q4_K_M

Qwen3 1.7B
Qwen · 1.7B · 0.9 GB
Tags: chat, small, edge, reasoning
VRAM: 1 GB · RAM: 2 GB · Quant: Q4_K_M

Snowflake Arctic Embed L
Snowflake · 0.335B · 0.67 GB
Tags: embedding, rag
VRAM: 1 GB · RAM: 2 GB · Quant: FP16

Gemma 3n E2B
Gemma · 2B · 1 GB
Tags: chat, small, edge, multimodal
VRAM: 1.2 GB · RAM: 2 GB · Quant: Q4_K_M

Gemma 3 1B
Gemma · 1B · 0.7 GB
Tags: chat, small, edge
VRAM: 1.5 GB · RAM: 2 GB · Quant: Q4_K_M

Llama 3.2 1B
Llama · 1.24B · 0.75 GB
Tags: chat, small, edge, instruct
VRAM: 1.5 GB · RAM: 2 GB · Quant: Q4_K_M

SmolLM3 3B
SmolLM · 3B · 1.5 GB
Tags: chat, small, reasoning, tool-use, +1 more
VRAM: 1.7 GB · RAM: 3 GB · Quant: Q4_K_M

DeepSeek R1 Distill Qwen 1.5B
DeepSeek · 1.5B · 1 GB
Tags: reasoning, distill, small, chat
VRAM: 2 GB · RAM: 4 GB · Quant: Q4_K_M

Qwen 2.5 1.5B
Qwen · 1.5B · 1 GB
Tags: chat, small, edge
VRAM: 2 GB · RAM: 4 GB · Quant: Q4_K_M

StableLM 2 1.6B
StableLM · 1.6B · 1 GB
Tags: chat, small
VRAM: 2 GB · RAM: 4 GB · Quant: Q4_K_M

Gemma 3n E4B
Gemma · 4B · 2 GB
Tags: chat, small, multimodal
VRAM: 2.3 GB · RAM: 3 GB · Quant: Q4_K_M

Qwen3 4B
Qwen · 4B · 2 GB
Tags: chat, small, reasoning
VRAM: 2.3 GB · RAM: 3 GB · Quant: Q4_K_M

Qwen3.5 4B
Qwen · 4B · 2 GB
Tags: chat, small, reasoning, tool-use, +2 more
VRAM: 2.3 GB · RAM: 3 GB · Quant: Q4_K_M

Gemma 2 2B
Gemma · 2B · 1.5 GB
Tags: chat, small
VRAM: 2.5 GB · RAM: 4 GB · Quant: Q4_K_M

Stable Code 3B
StableCode · 3B · 1.8 GB
Tags: coding, small, chat
VRAM: 2.5 GB · RAM: 4 GB · Quant: Q4_K_M

StarCoder2 3B
StarCoder · 3B · 1.8 GB
Tags: coding, small, fim, chat
VRAM: 2.5 GB · RAM: 4 GB · Quant: Q4_K_M
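The "fim" tag on the StarCoder2 entries refers to fill-in-the-middle: the model completes a gap between a code prefix and suffix rather than only continuing left-to-right. A minimal sketch of building such a prompt, assuming the `<fim_prefix>`/`<fim_suffix>`/`<fim_middle>` sentinel tokens used by StarCoder-style models (exact token names vary by model family, so check the model card before relying on these):

```python
def fim_prompt(prefix: str, suffix: str) -> str:
    """Assemble a fill-in-the-middle prompt with StarCoder-style
    sentinel tokens (assumed names; verify against the model card)."""
    return f"<fim_prefix>{prefix}<fim_suffix>{suffix}<fim_middle>"

# The model is expected to generate the text that belongs
# between the prefix and the suffix.
prompt = fim_prompt("def add(a, b):\n    return ", "\n\nprint(add(1, 2))")
print(prompt)
```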

Llama 3.2 3B
Llama · 3.21B · 2 GB
Tags: chat, small, edge, instruct
VRAM: 3 GB · RAM: 4 GB · Quant: Q4_K_M

Phi-3 Mini 3.8B
Phi · 3.8B · 2.3 GB
Tags: chat, small, coding
VRAM: 3 GB · RAM: 4 GB · Quant: Q4_K_M

Phi-4 Mini 3.8B
Phi · 3.8B · 2.3 GB
Best for: low-VRAM devices
Tags: chat, small, reasoning, coding
VRAM: 3 GB · RAM: 4 GB · Quant: Q4_K_M

Qwen 2.5 3B
Qwen · 3B · 1.9 GB
Tags: chat, small
VRAM: 3 GB · RAM: 4 GB · Quant: Q4_K_M

Gemma 3 4B
Gemma · 4B · 2.5 GB
Tags: chat, small, multimodal
VRAM: 3.5 GB · RAM: 4 GB · Quant: Q4_K_M

Gemma 4 E2B
Gemma · 2.3B · 2.7 GB
Tags: chat, small, edge, multimodal, +2 more
VRAM: 3.5 GB · RAM: 4 GB · Quant: Q4_K_M

Nemotron Mini 4B
Nemotron · 4B · 2.5 GB
Tags: chat, small
VRAM: 3.5 GB · RAM: 4 GB · Quant: Q4_K_M

Mistral 7B v0.3
Mistral · 7B · 3.5 GB
Best for: baseline chat
Tags: chat, general
VRAM: 4 GB · RAM: 6 GB · Quant: Q3_K_M

Llama 3.1 8B
Llama · 8B · 3.9 GB
Best for: general use
Tags: chat, general, coding
VRAM: 4.5 GB · RAM: 6 GB · Quant: Q3_K_M

Qwen 2.5 7B
Qwen · 7B · 3.7 GB
Best for: general multilingual assistants
Tags: chat, general, multilingual
VRAM: 4.5 GB · RAM: 6 GB · Quant: Q3_K_M

Yi 1.5 6B
Yi · 6B · 3.7 GB
Tags: chat, general, multilingual
VRAM: 4.5 GB · RAM: 6 GB · Quant: Q4_K_M

Qwen3 8B
Qwen · 8B · 4 GB
Tags: chat, general, reasoning, multilingual
VRAM: 4.6 GB · RAM: 6 GB · Quant: Q4_K_M

CodeLlama 7B
CodeLlama · 7B · 4.2 GB
Tags: coding, instruct, chat
VRAM: 5 GB · RAM: 8 GB · Quant: Q4_K_M

Gemma 4 E4B
Gemma · 4.5B · 4.1 GB
Best for: on-device multimodal assistants
Tags: chat, small, multimodal, reasoning, +1 more
VRAM: 5 GB · RAM: 6 GB · Quant: Q4_K_M

StarCoder2 7B
StarCoder · 7B · 4.2 GB
Tags: coding, fim, chat
VRAM: 5 GB · RAM: 8 GB · Quant: Q4_K_M

Qwen3.5 9B
Qwen · 9B · 4.5 GB
Best for: upgraded general local assistant
Tags: chat, general, reasoning, tool-use, +2 more
VRAM: 5.2 GB · RAM: 7 GB · Quant: Q4_K_M

Aya Expanse 8B
Command · 8B · 4.9 GB
Tags: chat, multilingual
VRAM: 5.5 GB · RAM: 8 GB · Quant: Q4_K_M

Command R7B
Command · 7B · 4.5 GB
Tags: chat, rag, tool-use
VRAM: 5.5 GB · RAM: 8 GB · Quant: Q4_K_M

DeepSeek R1 Distill Llama 8B
DeepSeek · 8B · 4.9 GB
Tags: reasoning, distill, chat
VRAM: 5.5 GB · RAM: 8 GB · Quant: Q4_K_M

DeepSeek R1 Distill Qwen 7B
DeepSeek · 7B · 4.7 GB
Best for: reasoning tasks
Tags: reasoning, distill, chat
VRAM: 5.5 GB · RAM: 8 GB · Quant: Q4_K_M

Hermes 3 Llama 3.1 8B
Hermes · 8B · 4.9 GB
Tags: chat, general, function-calling
VRAM: 5.5 GB · RAM: 8 GB · Quant: Q4_K_M
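The "function-calling" and "tool-use" tags mean the model was trained to emit structured tool invocations that the calling application parses and executes. A minimal sketch of the round trip, using a hypothetical `get_weather` tool in the JSON-schema style these models are commonly trained on (the exact prompt wrapping varies per model):

```python
import json

# Hypothetical tool definition in JSON-schema style; the name,
# description, and parameters here are illustrative only.
get_weather = {
    "name": "get_weather",
    "description": "Look up current weather for a city.",
    "parameters": {
        "type": "object",
        "properties": {"city": {"type": "string"}},
        "required": ["city"],
    },
}

# A tool-use model's reply arrives as structured JSON naming the
# tool and its arguments; the application parses and dispatches it.
reply = '{"name": "get_weather", "arguments": {"city": "Oslo"}}'
call = json.loads(reply)
print(call["name"], call["arguments"]["city"])
```

The application then runs the real function, feeds the result back into the conversation, and lets the model compose the final answer.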

InternLM 2.5 7B
InternLM · 7B · 4.7 GB
Tags: chat, general, tool-use, multilingual
VRAM: 5.5 GB · RAM: 8 GB · Quant: Q4_K_M

Qwen 2.5 Coder 7B
Qwen · 7B · 4.7 GB
Best for: code generation
Tags: coding, instruct, chat
VRAM: 5.5 GB · RAM: 8 GB · Quant: Q4_K_M

Gemma 2 9B
Gemma · 9B · 5.5 GB
Best for: general local assistants
Tags: chat, general
VRAM: 6.5 GB · RAM: 8 GB · Quant: Q4_K_M

Yi 1.5 9B
Yi · 9B · 5.5 GB
Tags: chat, general, multilingual
VRAM: 6.5 GB · RAM: 8 GB · Quant: Q4_K_M

Phi-4 Reasoning 14B
Phi · 14B · 7 GB
Tags: chat, reasoning, coding
VRAM: 8 GB · RAM: 11 GB · Quant: Q4_K_M

Phi-4 Reasoning Plus 14B
Phi · 14B · 7 GB
Tags: chat, reasoning, coding, frontier
VRAM: 8 GB · RAM: 11 GB · Quant: Q4_K_M

Qwen3 14B
Qwen · 14B · 7 GB
Tags: chat, general, reasoning, multilingual
VRAM: 8 GB · RAM: 11 GB · Quant: Q4_K_M