Gemini Nano with the Google AI Edge SDK

On supported Android-powered devices, you can deliver rich generative AI experiences without needing a network connection or sending data to the cloud. On-device AI is a great solution for use cases where low latency, low cost, and privacy safeguards are your primary concerns.

For on-device use cases, you can take advantage of Google's Gemini Nano foundation model. Although it is smaller than the Gemini models that run inference in the cloud, Gemini Nano can be fine-tuned to perform specialized tasks as well as its larger counterparts do. Gemini Nano runs in Android's AICore system service, which uses device hardware to enable low inference latency and keeps the model up to date.

Access to the Gemini Nano API and AICore is provided by the Google AI Edge SDK. Google AI Edge is a comprehensive suite of tools for on-device ML. Learn more about Google AI Edge.

Architecture

AICore is a system-level module that you access through a series of APIs to run inference on-device. AICore also has several built-in safety features that ensure a thorough evaluation against our safety filters. The following diagram outlines how an app accesses AICore to run Gemini Nano on-device.

Figure 1. Google AI Edge SDK, AICore, and Gemini Nano.

Keep user data private and secure

On-device generative AI executes prompts locally, eliminating server calls. This approach enhances privacy by keeping sensitive data on the device, enables offline functionality, and reduces inference costs.

AICore adheres to the Private Compute Core principles, with the following key characteristics:

  • Restricted Package Binding: AICore is isolated from most other packages, with limited exceptions for specific system packages. Any modifications to this allowed list can only occur during a full Android OTA update.
  • Indirect Internet Access: AICore does not have direct internet access. All internet requests, including model downloads, are routed through the open-source Private Compute Services companion APK. APIs within Private Compute Services must explicitly demonstrate their privacy-centric nature.

Additionally, to protect user privacy, AICore is built to isolate each request, and it doesn't store any record of the input data or the resulting outputs after processing them. Read the blog post An Introduction to Privacy and Safety for Gemini Nano to learn more.

Figure 2. AICore architecture

Benefits of accessing AI foundation models with AICore

AICore enables the Android OS to provide and manage AI foundation models. This significantly reduces the cost of using these large models in your app, principally due to the following:

  • Ease of deployment: AICore manages the distribution of Gemini Nano and handles future updates. You don't need to worry about downloading or updating large models over the network, or about the impact on your app's disk and runtime memory budget.
  • Accelerated inference: AICore leverages on-device hardware to accelerate inference. Your app gets the best performance on each device, and you don't need to worry about the underlying hardware interfaces.

Supported functionality

AICore supports the following devices and modalities:

  • Supported Devices: AICore is currently available on Google Pixel 9 series devices, Google Pixel 8 series devices including Pixel 8 [1] and Pixel 8a [2], Samsung S24 series devices, Samsung Z Fold6, Samsung Z Flip6, Realme GT 6, Motorola Edge 50 Ultra, Motorola Razr 50 Ultra, Xiaomi 14T/Pro, and Xiaomi MIX Flip.
  • Supported Modalities: AICore currently supports text modality for Gemini Nano.

Additional device and modality support are areas of active investment.
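
Because AICore is available only on the devices listed above, an app will probably want to enable Gemini Nano features conditionally. The following is a minimal Java sketch of that gating logic under stated assumptions. It is not the Google AI Edge SDK's API: `FeatureGate` and `TextGenerator` are hypothetical names introduced for illustration, and how an app actually detects that the model is available is deliberately left out.

```java
import java.util.Optional;

/** Hypothetical feature gate: offer a generative feature only when an on-device model exists. */
final class FeatureGate {

    /** Minimal text-in, text-out contract, mirroring the text-only modality (assumed shape). */
    interface TextGenerator {
        String generate(String prompt);
    }

    private final Optional<TextGenerator> onDevice;

    // Pass Optional.empty() when the device does not provide the model.
    FeatureGate(Optional<TextGenerator> onDevice) {
        this.onDevice = onDevice;
    }

    /** True when the UI should expose the generative feature at all. */
    boolean isFeatureEnabled() {
        return onDevice.isPresent();
    }

    /** Returns the generated text, or an empty Optional on unsupported devices. */
    Optional<String> tryGenerate(String prompt) {
        return onDevice.map(generator -> generator.generate(prompt));
    }

    public static void main(String[] args) {
        // Stand-in generator; a real app would wrap the on-device model here.
        TextGenerator echo = prompt -> "echo: " + prompt;

        FeatureGate supported = new FeatureGate(Optional.of(echo));
        FeatureGate unsupported = new FeatureGate(Optional.empty());

        System.out.println(supported.tryGenerate("hello").orElse("(feature hidden)"));
        System.out.println(unsupported.tryGenerate("hello").orElse("(feature hidden)"));
    }
}
```

Returning an empty `Optional` instead of throwing lets calling code hide or disable the feature on unsupported devices without special error handling.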

Use cases

Due to the resource constraints of mobile devices compared to cloud servers, on-device generative AI models are designed with a focus on efficiency and size. This optimization prioritizes specific, well-defined tasks over more generalized applications. Suitable use cases include:

  • Text Rephrasing: Modify the tone and style of text (e.g., casual to formal).
  • Smart Reply: Generate contextually relevant responses within a chat thread.
  • Proofreading: Identify and correct spelling and grammatical errors.
  • Summarization: Condense lengthy documents into concise summaries (paragraph or bullet points).

For optimal performance, refer to the prompting strategies documentation. To explore these use cases firsthand, download our sample app and begin experimenting with Gemini Nano.
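
As a rough illustration of how narrowly scoped these tasks are, the four use cases above can each be expressed as a small prompt template. This is a sketch only: the `PromptTemplates` class, its method names, and the prompt wording are assumptions made for this example and are not taken from the prompting strategies documentation or the SDK.

```java
import java.util.List;

/** Hypothetical prompt builders for the four use cases listed above. */
final class PromptTemplates {

    enum SummaryStyle { PARAGRAPH, BULLET_POINTS }

    private PromptTemplates() {}

    // Text rephrasing: change tone and style while keeping the meaning.
    static String rephrase(String text, String tone) {
        return "Rewrite the following text in a " + tone + " tone. "
                + "Keep the meaning unchanged.\n\nText: " + text;
    }

    // Smart reply: include the chat thread so the reply is contextually relevant.
    static String smartReply(List<String> thread) {
        return "Suggest one short reply to the last message in this chat.\n\n"
                + String.join("\n", thread);
    }

    // Proofreading: fix spelling and grammar only.
    static String proofread(String text) {
        return "Correct any spelling and grammatical errors in the following text. "
                + "Return only the corrected text.\n\nText: " + text;
    }

    // Summarization: paragraph or bullet-point output.
    static String summarize(String document, SummaryStyle style) {
        String format = (style == SummaryStyle.BULLET_POINTS)
                ? "as a short list of bullet points"
                : "as one concise paragraph";
        return "Summarize the following document " + format + ".\n\nDocument: " + document;
    }

    public static void main(String[] args) {
        System.out.println(rephrase("hey, can u send the report?", "formal"));
        System.out.println();
        System.out.println(smartReply(List.of("Ana: Lunch today?", "Ben: Sure, what time?")));
    }
}
```

Keeping one template per task makes each prompt specific and well defined, which matches the kind of work the section describes as suitable for on-device models.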

Gemini Nano is used by several Google apps. Some examples include:

  • TalkBack: Android's accessibility app TalkBack leverages Gemini Nano's multimodal input capabilities to improve image descriptions for visually impaired users.
  • Pixel Voice Recorder: The Pixel Voice Recorder app uses Gemini Nano and AICore to power an on-device summarization feature. The Recorder team adopted the latest Gemini Nano model to support longer recordings and deliver higher-quality summaries.
  • Gboard: Gboard smart reply leverages on-device Gemini Nano with AICore to provide accurate smart replies.

  1. Gemini Nano can be enabled on Pixel 8 devices as a developer option.
  2. Gemini Nano can be enabled on Pixel 8a devices as a developer option.