Google has released Gemma 4, a new family of open models built from the same research base as Gemini 3. The launch gives developers four model sizes, wider multimodal support, longer context windows, and a more permissive Apache 2.0 license.
In an X post on Thursday, Google Chief Scientist Jeff Dean described Gemma 4 as “built on the same research and technology as our Gemini 3 series,” tying the new lineup directly to Google’s recent Gemini work.
The new lineup includes four sizes: E2B and E4B ("effective" 2B and 4B parameters), a 26B Mixture of Experts model, and a 31B dense model. Google said it designed the range to serve different hardware limits, from Android devices and laptops to larger developer workstations.
The company also stated that the 31B Dense and 26B MoE models rank near the top of the open-model field on Arena AI’s text leaderboard, while the smaller models focus on lower latency and on-device use.
Google also widened the media formats these models can handle. All Gemma 4 variants can process images and video, while the E2B and E4B versions add native audio input for speech recognition and understanding.
The edge models support a 128K context window, and the larger two models reach 256K. Google said that range lets users pass long documents, code repositories, and other large inputs in a single prompt.
The release extends Google’s Gemini 3 push
The company presented Gemma 4 as an open-model companion to Gemini 3 rather than a separate track. In its launch post, Google wrote that Gemma 4 comes from the same research and technology stack as Gemini 3 and gives developers both open and proprietary options inside one broader ecosystem.
That framing places Gemma 4 inside Google’s wider effort to spread Gemini-based tools across consumer and developer products.
That broader push has moved quickly in recent months. Google introduced Gemini 3 in late 2025 as its latest model family for reasoning, multimodal work, planning, and tool use across the Gemini app, AI Studio, and Vertex AI. On January 8, 2026, Google announced that Gmail had started rolling out Gemini 3 features.
On February 19, the company followed with Gemini 3.1 Pro, which it described as the next iteration in the Gemini 3 series and rolled out through the Gemini API, Vertex AI, Gemini app, NotebookLM, Gemini CLI, and Android Studio.
Gemma 4 now brings part of that same research path into an open release that developers can run on their own hardware.
Google puts local agents at the center of the launch
A key part of the launch is Google’s focus on local agents. The company said Gemma 4 includes native support for function calling, structured JSON output, and system instructions.
It also stated that the models can handle multi-step planning, autonomous action, offline code generation, and audio-visual processing without specialized fine-tuning. Those features move the family beyond basic chat and toward local software agents that can call tools and complete chained tasks on-device.
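Google has not published implementation details for Gemma 4's function calling, but the general pattern for a local agent is straightforward: the model emits a structured JSON tool call, and a thin dispatch layer runs the matching function. The sketch below illustrates that loop with a hypothetical tool and a hard-coded stand-in for the model's output; the schema, function name, and response format are assumptions, not Gemma 4's actual interface.

```python
import json

# Hypothetical local tool the agent is allowed to call.
def get_battery_level(device: str) -> int:
    return {"pixel": 82}.get(device, 0)

# Registry mapping tool names to callables.
TOOLS = {"get_battery_level": get_battery_level}

# Stand-in for a model response: function-calling models typically
# emit a JSON object naming the tool and its arguments.
model_output = '{"name": "get_battery_level", "arguments": {"device": "pixel"}}'

def dispatch(raw: str):
    """Parse the model's JSON tool call and invoke the matching function."""
    call = json.loads(raw)
    fn = TOOLS[call["name"]]
    return fn(**call["arguments"])

print(dispatch(model_output))  # 82
```

In a real agent loop, the tool's return value would be fed back to the model as a new turn so it can plan the next step, which is the kind of chained, on-device task execution Google describes.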
Google paired that message with new deployment options for Android and edge development. On the same day as the launch, Google announced Gemma 4 in the AICore Developer Preview, giving developers access to a built-in on-device model on Android.
The company also said the preview will add support for tool calling, structured output, system prompts, and a thinking mode in the Prompt API. Through Google AI Edge, it is also pushing Gemma 4 for mobile, desktop, and edge apps that run directly on local hardware.
Apache 2.0 terms widen commercial use
Google released Gemma 4 under the Apache 2.0 license and presented that move as a direct answer to developer demand for fewer commercial limits. The company wrote that the models are meant to stay widely accessible, with the goal of making stronger capabilities easier to use in research and product development.
It also pointed to the existing Gemma community, saying earlier generations had reached more than 400 million downloads and led to over 100,000 variants built by developers.
The company has also made the models broadly available from day one. Google said developers can access Gemma 4 through Google Cloud and download the open weights from Hugging Face, Kaggle, and Ollama.
It also listed support across tools such as Hugging Face Transformers, vLLM, llama.cpp, MLX, LM Studio, NVIDIA NIM, and Unsloth.
The rollout also comes as model development grows more expensive across the sector. As we reported on February 27, Meta has reportedly signed a multi-billion-dollar deal to rent Google AI chips to build more advanced systems.
That report underscores the rising cost of compute and infrastructure as major firms race to train and deploy newer models.