Strengths
- Strong output quality relative to its VRAM footprint for a current multimodal model
- Supports image and audio input on-device
- Good fit for local agent and tool-use workflows without jumping to a much larger model
Model Brief
Best small Gemma pick for local multimodal workflows. Use this as a shortlist aid, then validate quality with your own tests.
Start with Q4_K_M for broad compatibility, and move to Q8_0 only if the larger file still fits in VRAM and generation stays responsive.
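As a rough aid for the Q4_K_M versus Q8_0 choice, the sketch below estimates the size of the quantized weights and picks Q8_0 only when they fit with some headroom. It is a back-of-the-envelope calculation, not a measurement. The function names, the 2 GiB headroom default, and the parameter count in the example are all assumptions for illustration. The Q4_K_M bits-per-weight figure is an approximate average, and the estimate ignores the KV cache and any vision or audio encoder memory.

```python
# Approximate bits per weight for two GGUF quantization types.
# Q8_0 stores 32 int8 weights plus one fp16 scale per block:
# 34 bytes / 32 weights = 8.5 bits per weight.
# The Q4_K_M value is an assumed average; it varies by model because
# the scheme mixes several tensor types.
BITS_PER_WEIGHT = {"Q4_K_M": 4.85, "Q8_0": 8.5}


def estimate_weight_gib(param_count: float, quant: str) -> float:
    """Rough size of the quantized weights alone, in GiB."""
    return param_count * BITS_PER_WEIGHT[quant] / 8 / 2**30


def pick_quant(param_count: float, free_vram_gib: float,
               headroom_gib: float = 2.0) -> str:
    """Prefer Q8_0 only when weights plus headroom fit; else stay on Q4_K_M.

    headroom_gib is a hypothetical allowance for context and runtime buffers.
    """
    if estimate_weight_gib(param_count, "Q8_0") + headroom_gib <= free_vram_gib:
        return "Q8_0"
    return "Q4_K_M"


if __name__ == "__main__":
    # Placeholder parameter count, NOT taken from the model page.
    params = 8e9
    for quant in BITS_PER_WEIGHT:
        print(f"{quant}: ~{estimate_weight_gib(params, quant):.1f} GiB of weights")
    print("Suggested for 8 GiB free:", pick_quant(params, free_vram_gib=8.0))
```

Substitute the actual parameter count and your free VRAM, then confirm the result by loading the model, since fitting in memory does not guarantee acceptable generation speed.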
Source model page: https://huggingface.co/google/gemma-4-E4B-it