Google DeepMind’s Gemma AI models are a collection of lightweight, open models built from the same technology that powers the Gemini models. The company previously released four Gemma 4 models suited to multimodal input tasks, two at the lower end of the parameter range and two at the higher end, each tailored to specific needs. The Gemma 4 12B open-weight model has now joined the family, with a size that falls between the two pairs. Like its siblings, the 12B carries an Apache 2.0 license and is optimized for running locally on a standard business laptop.

The new Gemma 4 12B, tuned for text generation, coding, and reasoning, is a 12-billion-parameter open-weight model. It runs locally on a standard enterprise laptop using just 16 GB of VRAM or unified memory, eliminating the need for excessive (and expensive) RAM. The new model offers convenience as well: enterprise users can keep working with AI when Wi-Fi is unavailable or when security concerns call for offline work.
Like the rest of the Gemma 4 family, the Gemma 4 12B is multimodal: it handles text and image input and generates text output, bringing native audio and vision understanding directly to local environments.
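
To put the 16 GB figure in context, a rough back-of-the-envelope calculation shows how much memory 12 billion parameters occupy at different numeric precisions. This is only a sketch: the precisions listed are illustrative assumptions rather than details Google has published for this model, and the estimate covers the weights alone, ignoring the extra memory needed for activations and context during inference.

```python
def weight_memory_gb(num_params: float, bits_per_param: int) -> float:
    """Approximate memory for the model weights alone, in decimal gigabytes."""
    return num_params * bits_per_param / 8 / 1e9


PARAMS = 12e9    # 12 billion parameters, as stated in the article
BUDGET_GB = 16   # memory figure cited in the article

# The precisions below are illustrative assumptions, not published specifics.
for bits in (16, 8, 4):
    gb = weight_memory_gb(PARAMS, bits)
    verdict = "fits within" if gb <= BUDGET_GB else "exceeds"
    print(f"{bits}-bit weights: ~{gb:.0f} GB ({verdict} {BUDGET_GB} GB)")
```

Under these assumptions, 16-bit weights alone would take about 24 GB, while 8-bit and 4-bit weights would take about 12 GB and 6 GB respectively, which suggests why reduced-precision weights are a common route to running a model of this size on a laptop.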




