
Gemini Live API

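For applications that need real-time, low-latency voice support, such as chatbots or agentic interactions, the Gemini Live API provides an optimized way to stream both input and output for a Gemini model. With Firebase AI Logic, you can call the Gemini Live API directly from your Android app without integrating a backend. This guide shows how to use the Gemini Live API in your Android app through Firebase AI Logic.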

Get started

Before you begin, make sure that your app targets API level 23 or higher.

If you haven't already, set up a Firebase project and connect your app to Firebase. For details, see the Firebase AI Logic documentation.

Set up your Android project

Add the Firebase AI Logic library dependency to your app-level build.gradle.kts or build.gradle file. Use the Firebase Android BoM to manage library versions.

dependencies {
  // Import the Firebase BoM
  implementation(platform("com.google.firebase:firebase-bom:34.6.0"))
  // Add the dependency for the Firebase AI Logic library
  // When using the BoM, you don't specify versions in Firebase library dependencies
  implementation("com.google.firebase:firebase-ai")
}

After adding the dependency, sync your Android project with Gradle.

Integrate Firebase AI Logic and initialize a generative model

Add the RECORD_AUDIO permission to your app's AndroidManifest.xml file:

<uses-permission android:name="android.permission.RECORD_AUDIO" />

Initialize the Gemini Developer API backend service and access LiveModel. Use a model that supports the Live API, such as gemini-live-2.5-flash-preview. For the list of available models, see the Firebase documentation.

To specify a voice, set the voice name in the speechConfig object as part of the model configuration. If you don't specify a voice, the default is Puck.

Kotlin

// Initialize the `LiveModel`
val model = Firebase.ai(backend = GenerativeBackend.googleAI()).liveModel(
       modelName = "gemini-live-2.5-flash-preview",
       generationConfig = liveGenerationConfig {
          responseModality = ResponseModality.AUDIO
          speechConfig = SpeechConfig(voice = Voice("FENRIR"))
       })

Java

// Initialize the `LiveModel`
LiveGenerativeModel model = FirebaseAI
       .getInstance(GenerativeBackend.googleAI())
       .liveModel(
              "gemini-live-2.5-flash-preview",
              new LiveGenerationConfig.Builder()
                     .setResponseModality(ResponseModality.AUDIO)
                     .setSpeechConfig(new SpeechConfig(new Voice("FENRIR")))
                     .build(),
        null,
        null
);

You can optionally set a system instruction to define the persona or role that the model plays:

Kotlin

val systemInstruction = content {
    text("You are a helpful assistant. Your main role is [...]")
}

val model = Firebase.ai(backend = GenerativeBackend.googleAI()).liveModel(
       modelName = "gemini-live-2.5-flash-preview",
       generationConfig = liveGenerationConfig {
          responseModality = ResponseModality.AUDIO
          speechConfig = SpeechConfig(voice= Voice("FENRIR"))
       },
       systemInstruction = systemInstruction,
)

Java

Content systemInstruction = new Content.Builder()
       .addText("You are a helpful assistant. Your main role is [...]")
       .build();

LiveGenerativeModel model = FirebaseAI
       .getInstance(GenerativeBackend.googleAI())
       .liveModel(
              "gemini-live-2.5-flash-preview",
              new LiveGenerationConfig.Builder()
                     .setResponseModality(ResponseModality.AUDIO)
                     .setSpeechConfig(new SpeechConfig(new Voice("FENRIR")))
                     .build(),
        tools, // null if you don't want to use function calling
        systemInstruction
);

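You can further specialize the conversation with the model by using system instructions to provide app-specific context, such as the user's in-app activity history.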
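As a rough illustration of supplying app-specific context, the sketch below assembles system instruction text from a role description plus the most recent activity entries. The class name `SystemInstructionText`, its `build` method, and the "Recent user activity" wording are hypothetical and not part of the Firebase AI Logic SDK; the sketch also assumes the activity list is ordered oldest to newest.

```java
import java.util.List;

// Hypothetical helper; not part of the Firebase AI Logic SDK.
final class SystemInstructionText {
    private SystemInstructionText() {}

    /**
     * Builds system instruction text from a role description and, at most,
     * the last maxItems entries of the user's recent activity.
     */
    static String build(String role, List<String> recentActivity, int maxItems) {
        StringBuilder sb = new StringBuilder(role.trim());
        if (recentActivity == null || recentActivity.isEmpty() || maxItems <= 0) {
            return sb.toString();
        }
        // Assumption: the list is ordered oldest to newest, so keep its tail.
        int from = Math.max(0, recentActivity.size() - maxItems);
        sb.append("\nRecent user activity in the app:");
        for (String entry : recentActivity.subList(from, recentActivity.size())) {
            sb.append("\n- ").append(entry);
        }
        return sb.toString();
    }
}
```

The resulting string could then be passed to `new Content.Builder().addText(...)` as in the Java example above. Capping the number of entries keeps the instruction short, which is a design choice of this sketch rather than a documented requirement.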

Initialize a Live API session

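After you create the LiveModel instance, call model.connect() to create a LiveSession object and establish a persistent connection with the model over low-latency streaming. LiveSession lets you interact with the model by starting and stopping the voice conversation and by sending and receiving text.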

You can then call startAudioConversation() to start the conversation with the model:

Kotlin

val session = model.connect()
session.startAudioConversation()

Java

// liveModel is the LiveGenerativeModel instance created in the previous step
LiveModelFutures model = LiveModelFutures.from(liveModel);
ListenableFuture<LiveSession> sessionFuture = model.connect();

Futures.addCallback(sessionFuture, new FutureCallback<LiveSession>() {
    @Override
    public void onSuccess(LiveSession ses) {
        LiveSessionFutures session = LiveSessionFutures.from(ses);
        session.startAudioConversation();
    }
    @Override
    public void onFailure(Throwable t) {
        // Handle exceptions
    }
}, executor);

Also note that the model doesn't handle interruptions while you are talking with it.

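You can also use the Gemini Live API to generate streamed audio from text and to generate text from streamed audio. The Live API is bidirectional, so you send and receive content over the same connection. Eventually, you will also be able to send images and a live video stream to the model.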

Function calling: connect the Gemini Live API to your app

To go one step further, you can also enable function calling so that the model interacts directly with your app's logic.

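Function calling (or tool calling) is a feature of generative AI implementations that lets the model call functions on its own initiative to perform actions. If the function has an output, the model adds it to its context and uses it for subsequent generations.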

**Figure 1:** Diagram illustrating how the Gemini Live API interprets a user prompt, triggers a predefined function with relevant arguments in the Android app, and then receives a confirmation response from the model.

To implement function calling in your app, start by creating a FunctionDeclaration object for each function that you want to expose to the model.

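For example, to expose to Gemini an addList function that appends a string to a list of strings, start by creating a FunctionDeclaration variable that describes the function and its parameters in plain English: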

Kotlin

val itemList = mutableListOf<String>()

fun addList(item: String){
   itemList.add(item)
}

val addListFunctionDeclaration = FunctionDeclaration(
        name = "addList",
        description = "Function adding an item to the list",
        parameters = mapOf(
            "item" to Schema.string("A short string describing the item to add to the list")
        )
)

Java

HashMap<String, Schema> addListParams = new HashMap<String, Schema>(1);

addListParams.put("item", Schema.str("A short string describing the item to add to the list"));

FunctionDeclaration addListFunctionDeclaration = new FunctionDeclaration(
    "addList",
    "Function adding an item to the list",
    addListParams,
    Collections.emptyList()
);

Then pass this FunctionDeclaration to the model as a Tool when you instantiate it:

Kotlin

val addListTool = Tool.functionDeclarations(listOf(addListFunctionDeclaration))

val model = Firebase.ai(backend = GenerativeBackend.googleAI()).liveModel(
       modelName = "gemini-live-2.5-flash-preview",
       generationConfig = liveGenerationConfig {
          responseModality = ResponseModality.AUDIO
          speechConfig = SpeechConfig(voice= Voice("FENRIR"))
       },
       systemInstruction = systemInstruction,
       tools = listOf(addListTool)
)

Java

LiveGenerativeModel model = FirebaseAI.getInstance(
    GenerativeBackend.googleAI()).liveModel(
        "gemini-live-2.5-flash-preview",
        new LiveGenerationConfig.Builder()
                .setResponseModality(ResponseModality.AUDIO)
                .setSpeechConfig(new SpeechConfig(new Voice("FENRIR")))
                .build(),
        List.of(Tool.functionDeclarations(List.of(addListFunctionDeclaration))),
        systemInstruction
);

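Finally, implement a handler function that processes the tool calls the model makes and passes a response back to it. You provide this handler to the LiveSession when you call startAudioConversation. It takes a FunctionCallPart parameter and returns a FunctionResponsePart: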

Kotlin

session.startAudioConversation(::functionCallHandler)

// ...

fun functionCallHandler(functionCall: FunctionCallPart): FunctionResponsePart {
    return when (functionCall.name) {
        "addList" -> {
            // Extract function parameter from functionCallPart
            val itemName = functionCall.args["item"]!!.jsonPrimitive.content
            // Call function with parameter
            addList(itemName)
            // Confirm the function call to the model
            val response = JsonObject(
                mapOf(
                    "success" to JsonPrimitive(true),
                    "message" to JsonPrimitive("Item $itemName added to the todo list")
                )
            )
            FunctionResponsePart(functionCall.name, response)
        }
        else -> {
            val response = JsonObject(
                mapOf(
                    "error" to JsonPrimitive("Unknown function: ${functionCall.name}")
                )
            )
            FunctionResponsePart(functionCall.name, response)
        }
    }
}

Java

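Futures.addCallback(sessionFuture, new FutureCallback<LiveSession>() {

    @RequiresPermission(Manifest.permission.RECORD_AUDIO)
    @Override
    @OptIn(markerClass = PublicPreviewAPI.class)
    public void onSuccess(LiveSession ses) {
        LiveSessionFutures session = LiveSessionFutures.from(ses);
        // Java lambda; Kotlin's `::name` reference syntax is not valid Java
        session.startAudioConversation(functionCall -> handleFunctionCall(functionCall));
    }

    @Override
    public void onFailure(Throwable t) {
        // Handle exceptions
    }
}, executor);

// ...

// Mirrors the Kotlin handler above: takes a FunctionCallPart, returns a FunctionResponsePart
FunctionResponsePart handleFunctionCall(FunctionCallPart functionCall) {
    Map<String, JsonElement> response = new HashMap<>();
    if (functionCall.getName().equals("addList")) {
        // Extract the function parameter from the call arguments
        Map<String, JsonElement> args = functionCall.getArgs();
        String item =
                JsonElementKt.getContentOrNull(
                        JsonElementKt.getJsonPrimitive(args.get("item")));
        // Call the function with the parameter
        addList(item);
        // Confirm the function call to the model
        response.put("message",
                JsonElementKt.JsonPrimitive("Item " + item + " added to the todo list"));
    } else {
        response.put("error",
                JsonElementKt.JsonPrimitive("Unknown function: " + functionCall.getName()));
    }
    return new FunctionResponsePart(functionCall.getName(), new JsonObject(response));
}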
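When an app exposes more than one function, the per-name branching in the handler can be replaced with a lookup table. The following is a minimal sketch of that idea, not the SDK's API: `FunctionRegistry` and `TodoFunctions` are hypothetical names, and plain maps stand in for FunctionCallPart and FunctionResponsePart so that the example runs without Firebase. It also checks for a missing argument instead of assuming the model always supplies one.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

// Hypothetical stand-in for the SDK handler: plain maps replace
// FunctionCallPart and FunctionResponsePart so the sketch runs without Firebase.
final class FunctionRegistry {
    private final Map<String, Function<Map<String, String>, Map<String, Object>>> handlers =
            new HashMap<>();

    void register(String name, Function<Map<String, String>, Map<String, Object>> handler) {
        handlers.put(name, handler);
    }

    // Routes a call to its handler; unknown names yield an error payload.
    Map<String, Object> dispatch(String name, Map<String, String> args) {
        Function<Map<String, String>, Map<String, Object>> handler = handlers.get(name);
        if (handler == null) {
            return error("Unknown function: " + name);
        }
        return handler.apply(args == null ? Collections.<String, String>emptyMap() : args);
    }

    static Map<String, Object> error(String message) {
        Map<String, Object> response = new LinkedHashMap<>();
        response.put("error", message);
        return response;
    }
}

// Hypothetical wiring for the addList example.
final class TodoFunctions {
    private TodoFunctions() {}

    static FunctionRegistry create(List<String> itemList) {
        FunctionRegistry registry = new FunctionRegistry();
        registry.register("addList", args -> {
            String item = args.get("item");
            // Validate instead of assuming the model always sends the argument.
            if (item == null || item.isEmpty()) {
                return FunctionRegistry.error("Missing required argument: item");
            }
            itemList.add(item);
            Map<String, Object> response = new LinkedHashMap<>();
            response.put("success", true);
            response.put("message", "Item " + item + " added to the todo list");
            return response;
        });
        return registry;
    }
}
```

In a real handler, the name and arguments would come from the FunctionCallPart, and the returned map would be converted to the JsonObject wrapped by FunctionResponsePart. Returning an error payload rather than throwing gives the model a response it can react to in the conversation.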

Next steps

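Try the Gemini Live API in the Android AI catalog sample app.
Read more about the Gemini Live API in the Firebase AI Logic documentation.
Learn more about the available Gemini models.
Learn more about function calling.
Explore prompt design strategies.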
