Q3_K_M
13.3 GBMin VRAM: 15 GB
Recommended VRAM: 18 GB
Min RAM: 18 GB
Context: 8K / 256K
Loading model details...
Fetching variants, compatibility details, and metadata.
Model Detail
Share Gemma 4 26B A4B with someone who is deciding what to run locally.
Social proof
32% of 847 scanned PCs run Gemma 4 26B A4B fully on GPU.
601 keep at least some work on GPU. Based on anonymous compatibility checks.
General-purpose local model brief
Best for
Consider alternatives if
Quantization tip: Benchmark at least two quantizations and validate with a task-specific eval set before production use.
New to local models? Smaller quantization variants are easier to run, while larger ones can improve quality at the cost of more memory.
Q3_K_M
13.3 GBMin VRAM: 15 GB
Recommended VRAM: 18 GB
Min RAM: 18 GB
Context: 8K / 256K
Q4_K_M
16.6 GBMin VRAM: 18.5 GB
Recommended VRAM: 24 GB
Min RAM: 22 GB
Context: 8K / 256K
Q8_0
29.2 GBMin VRAM: 31 GB
Recommended VRAM: 36 GB
Min RAM: 36 GB
Context: 8K / 256K
| Quantization | File Size | Min VRAM | Recommended VRAM | Min RAM | Context |
|---|---|---|---|---|---|
| Q3_K_M | 13.3 GB | 15 GB | 18 GB | 18 GB | 8K / 256K |
| Q4_K_M | 16.6 GB | 18.5 GB | 24 GB | 22 GB | 8K / 256K |
| Q8_0 | 29.2 GB | 31 GB | 36 GB | 36 GB | 8K / 256K |
These GPUs meet the recommended 18 GB VRAM for the Q3_K_M quantization. Estimated speeds are approximate and assume full GPU offloading.
Budget Pick
AMD Radeon RX 7900 XT20 GB VRAM · ~41.5 tok/s
Lowest cost that meets recommended VRAM
Check price on AmazonFastest Pick
NVIDIA GeForce RTX 509032 GB VRAM · ~92.9 tok/s
Highest estimated throughput
Check price on AmazonBest Value
NVIDIA GeForce RTX 409024 GB VRAM · ~52.3 tok/s
Best speed per dollar of VRAM
Rent on RunPodNeed a detailed comparison? See all GPU rankings for Gemma 4 26B A4B.