Model Details
56% of 984 scanned PCs run Mistral Small 3.1 24B fully on GPU, and 683 can offload at least part of the model to the GPU. Based on anonymous compatibility checks.
General-purpose local model brief
Quantization tip: Benchmark at least two quantizations and validate with a task-specific eval set before production use.
New to local models? Smaller quantization variants are easier to run, while larger ones can improve quality at the cost of more memory.
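To get a feel for where minimum-VRAM figures like the ones below come from, they can be approximated as the weights file size plus the KV cache plus a fixed runtime overhead. The sketch below uses illustrative architecture values (40 layers, 8 KV heads, head dimension 128, fp16 KV cache); these are assumptions for the example, not confirmed Mistral Small 3.1 specifications.

```python
def estimate_min_vram_gb(
    file_size_gb: float,
    n_layers: int,
    n_kv_heads: int,
    head_dim: int,
    context_len: int,
    kv_bytes: int = 2,          # fp16 KV cache entries
    overhead_gb: float = 0.75,  # compute buffers, driver/runtime context, etc.
) -> float:
    """Rough minimum VRAM: model weights + KV cache + fixed overhead."""
    # The KV cache stores one K and one V vector per layer, per KV head, per token.
    kv_cache_gb = 2 * n_layers * n_kv_heads * head_dim * context_len * kv_bytes / 1e9
    return file_size_gb + kv_cache_gb + overhead_gb

# Q4_K_M file size at the 8K default context, with the assumed architecture values:
print(round(estimate_min_vram_gb(12.6, n_layers=40, n_kv_heads=8,
                                 head_dim=128, context_len=8192), 1))
# prints 14.7, in the same ballpark as the 14 GB minimum listed below
```

Longer contexts grow the KV cache linearly, which is why a model that fits at 8K can spill out of VRAM at 128K.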
| Quantization | File Size | Min VRAM | Recommended VRAM | Min RAM | Context |
|---|---|---|---|---|---|
| Q4_K_M | 12.6 GB | 14 GB | 16 GB | 20 GB | 8K / 128K |
| FP16 | 45.6 GB | 48 GB | 54 GB | 60 GB | 8K / 128K |
These GPUs meet the recommended 16 GB VRAM for the Q4_K_M quantization. Estimated speeds are approximate and assume full GPU offloading.
| Pick | GPU | VRAM | Est. Speed | Why |
|---|---|---|---|---|
| Budget Pick | NVIDIA GeForce RTX 5080 | 16 GB | ~61 tok/s | Lowest cost that meets recommended VRAM |
| Fastest Pick | NVIDIA GeForce RTX 5090 | 32 GB | ~113.8 tok/s | Highest estimated throughput |
| Best Value | NVIDIA GeForce RTX 5070 Ti | 16 GB | ~56.9 tok/s | Best speed per dollar of VRAM |

Need a detailed comparison? See all GPU rankings for Mistral Small 3.1 24B.
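The value ranking can be reproduced with a simple throughput-per-dollar calculation. The prices below are assumed launch MSRPs inserted for illustration, not live prices from this page; substitute current street prices before drawing conclusions.

```python
# (name, VRAM in GB, estimated tok/s, assumed price in USD)
gpus = [
    ("NVIDIA GeForce RTX 5080",    16, 61.0,  999),
    ("NVIDIA GeForce RTX 5090",    32, 113.8, 1999),
    ("NVIDIA GeForce RTX 5070 Ti", 16, 56.9,  749),
]

def value(tok_s: float, price: float) -> float:
    # Throughput per dollar, scaled to tok/s per $1000 for readability.
    return tok_s / price * 1000

# Rank from best to worst value.
for name, vram_gb, tok_s, price in sorted(gpus, key=lambda g: value(g[2], g[3]),
                                          reverse=True):
    print(f"{name}: {value(tok_s, price):.1f} tok/s per $1000 ({vram_gb} GB)")
```

With these assumed prices the RTX 5070 Ti comes out on top, which matches the Best Value pick; the ordering can flip as prices move.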