Deploying a containerized agent built with the Google Agent Development Kit (ADK) that uses VertexAI API

Deploying Google Agent Development Kit (ADK) Agents on Google Kubernetes Engine (GKE)

This tutorial guides you through deploying a containerized agent built with the Google Agent Development Kit (ADK) to Google Kubernetes Engine (GKE). GKE provides a managed environment for deploying, managing, and scaling your containerized applications using Google infrastructure.

Overview

In this tutorial we will deploy a simple agent to GKE. The agent will be a FastAPI application that uses Gemini 2.0 Flash as the LLM. We will use Vertex AI as the LLM provider.

This tutorial will cover:

Setting up your Google Cloud environment.
Building a container image for your agent.
Deploying the agent to a GKE cluster.
Testing your deployed agent.

Before you begin

Ensure you have the following tools installed on your workstation

If you previously installed the gcloud CLI, get the latest version by running:

gcloud components update

Ensure that you are signed in using the gcloud CLI tool. Run the following command:

gcloud auth application-default login

Make sure you have a permission to create custom IAM roles. Your user must have roles/iam.roleAdmin or roles/iam.organizationRoleAdmin role.

Infrastructure Setup

Clone the repository

Clone the repository with our guides and cd to the adk/vertex/ directory by running these commands:

git clone https://github.com/ai-on-gke/tutorials-and-examples.git
cd tutorials-and-examples/adk/vertex

Filesystem structure

adk/vertex/
├── terraform/    # Terraform configuration for automated deployment of an infrastructure
└── app/          # The desired structure of the final application. 
    ├── capital_agent/             # Agent's module name
    │   ├── __init__.py
    │   └── agent.py               # Your agent logic
    ├── main.py                    # FastAPI application entry point
    ├── requirements.txt           # Python dependencies
    └── Dockerfile                 # Container build instructions

Enable Necessary APIs

Enable the APIs required for GKE, Artifact Registry, Cloud Build, and Vertex AI

gcloud services enable \
    container.googleapis.com \
    artifactregistry.googleapis.com \
    cloudbuild.googleapis.com \
    aiplatform.googleapis.com

Create cluster and other resources

In this section we will use Terraform to automate the creation of infrastructure resources. For more details how it is done please refer to the terraform config in the terraform/ folder. By default, the configuration provisions an Autopilot GKE cluster, but it can be changed to standard by setting autopilot_cluster = false.

It creates the following resources. For more information such as resource names and other details, please refer to the Terraform config:

Service Accounts:
- Cluster IAM Service Account (derives name from a cluster name, e.g. tf-gke-<cluster name>) – manages permissions for the GKE cluster.
- Application’s IAM Service Account (default name adk-tf and can be changed in the terraform config) – manages permissions for the deployed application to access:
  - VertexAI LLM model.
Artifact registry – stores container images for the application.

Go the the terraform directory:
```
cd terraform
```
Specify the following values inside the default_env.tfvars file (or make a separate copy):
- <PROJECT_ID> – replace with your project id (you can find it in the project settings).
Other values can be changed, if needed, but can be left with default values.
Init terraform modules:
```
terraform init
```
Optionally run the plan command to view an execution plan:
```
terraform plan -var-file=default_env.tfvars
```

Execute the plan:

terraform apply -var-file=default_env.tfvars

And you should see your resources created:

Apply complete! Resources: 16 added, 0 changed, 0 destroyed.

Outputs:

gke_cluster_location = "us-central1"
gke_cluster_name = "adk-tf"
image_repository_full_name = "us-docker.pkg.dev/<PROJECT ID>/adk-tf"
image_repository_location = "us"
image_repository_name = "adk-tf"
k8s_service_account_name = "adk-tf"
project_id = <PROJECT ID>

Configure your kubectl context:

gcloud container clusters get-credentials $(terraform output -raw gke_cluster_name) --region $(terraform output -raw gke_cluster_location)

Deploy and Configure the Agent Application

Create the app/main.py file. This file sets up the FastAPI application using get_fast_api_app() from ADK.

import os

import uvicorn
from fastapi import FastAPI
from google.adk.cli.fast_api import get_fast_api_app

# Get the directory where main.py is located
AGENT_DIR = os.path.dirname(os.path.abspath(__file__))
# Example session service URI (e.g., SQLite)
SESSION_SERVICE_URI = ""
# Example allowed origins for CORS
ALLOWED_ORIGINS = ["http://localhost", "http://localhost:8080", "*"]
# Set web=True if you intend to serve a web interface, False otherwise
SERVE_WEB_INTERFACE = True

# Call the function to get the FastAPI app instance
# Ensure the agent directory name ('capital_agent') matches your agent folder
app: FastAPI = get_fast_api_app(
    agents_dir=AGENT_DIR,
    session_service_uri=SESSION_SERVICE_URI,
    allow_origins=ALLOWED_ORIGINS,
    web=SERVE_WEB_INTERFACE,
)

# You can add more FastAPI routes or configurations below if needed
# Example:
# @app.get("/hello")
# async def read_root():
#     return {"Hello": "World"}

if __name__ == "__main__":
    # Use the PORT environment variable provided by Cloud Run, defaulting to 8080
    uvicorn.run(app, host="0.0.0.0", port=int(os.environ.get("PORT", 8080)))

Create agent files.

When finished, your agent code has to meet these requirements:

Agent code is in a file called agent.py within your agent directory.
Your agent variable is named root_agent.

__init__.py is within your agent directory and contains from . import agent.

2.1. Create the app/capital_agent/agent.py file:

from google.adk.agents import LlmAgent

# Define a tool function
def get_capital_city(country: str) -> str:
  """Retrieves the capital city for a given country."""
  # Replace with actual logic (e.g., API call, database lookup)
  capitals = {"france": "Paris", "japan": "Tokyo", "canada": "Ottawa"}
  return capitals.get(country.lower(), f"Sorry, I don't know the capital of {country}.")


capital_agent = LlmAgent(
    model="gemini-2.0-flash",
    name="capital_agent",
    description="Answers user questions about the capital city of a given country.",
    instruction="""You are an agent that provides the capital city of a country.
             When a user asks for the capital of a country:
             1. Identify the country name from the user's query.
             2. Use the `get_capital_city` tool to find the capital.
             3. Respond clearly to the user, stating the capital city.
             Example Query: "What's the capital of France?"
             Example Response: "The capital of France is Paris."
        """,
    tools=[get_capital_city] # Provide the function directly
)

root_agent = capital_agent

2.2. Create app/capital_agent/__init__.py file:

from . import agent

Create app/requirements.txt file with necessary Python packages:

google_adk==1.4.1
fastapi==0.115.13
uvicorn==0.34.0
pydantic==2.11.7

Create app/Dockerfile to build app container image:

FROM python:3.13-slim
WORKDIR /app

RUN adduser --disabled-password --gecos "" myuser

COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt


COPY --chown=myuser:myuser . .

ENV PATH="/home/myuser/.local/bin:$PATH"

USER myuser

CMD ["sh", "-c", "uvicorn main:app --host 0.0.0.0 --port $PORT"]

Build and Push the Container Image

Build your Docker image using Google Cloud Build and push it to the Artifact Registry repository that is created by the Terraform:

gcloud builds submit \
    --tag $(terraform output -raw image_repository_full_name)/adk-agent:latest \
    --project=$(terraform output -raw project_id) \
    ../app

Run this command to create app/deployment.yaml file with Kubernetes Manifest. This command has to create manifest with values taken from the terraform:

cat <<  EOF > ../app/deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: adk-agent
spec:
  replicas: 1
  selector:
    matchLabels:
      app: adk-agent
  template:
    metadata:
      labels:
        app: adk-agent
    spec:
      serviceAccount: $(terraform output -raw k8s_service_account_name)
      containers:
      - name: adk-agent
        imagePullPolicy: Always
        image: $(terraform output -raw image_repository_full_name)/adk-agent:latest
        resources:
          limits:
            memory: "256Mi"
            cpu: "500m"
            ephemeral-storage: "128Mi"
          requests:
            memory: "256Mi"
            cpu: "500m"
            ephemeral-storage: "128Mi"
        ports:
        - containerPort: 8080
        env:
          - name: PORT
            value: "8080"
          - name: GOOGLE_CLOUD_PROJECT
            value: $(terraform output -raw project_id)
          - name: GOOGLE_CLOUD_LOCATION
            value: $(terraform output -raw gke_cluster_location)
          - name: GOOGLE_GENAI_USE_VERTEXAI
            value: "true"
        readinessProbe:
          httpGet:
            path: /dev-ui/
            port: 8080
          initialDelaySeconds: 10
          periodSeconds: 10
          timeoutSeconds: 5
          failureThreshold: 5
          successThreshold: 1
---
apiVersion: v1
kind: Service
metadata:
  name: adk-agent
spec:       
  type: NodePort
  ports:
    - port: 80
      targetPort: 8080
  selector:
    app: adk-agent
EOF

star

If you do not want to expose the app with IAP and just use port-forward, you may want to change the type of the service from the NodePort to ClusterIP, since port-forward does not require an external port.

Apply the manifest:

kubectl apply -f ../app/deployment.yaml

Wait for deployment to be completed. It may take some time:
```
kubectl rollout status deployment/adk-agent
```

Securely expose Agent’s Web-UI with Identity Aware Proxy (IAP).

Create a new directory for Terraform config:
```
mkdir ../iap
```

Prepare the tfvars file that will be needed during the IAP guide. We also can specify some of the known variable values, so you only need to specify the remaining ones with the <> placeholder.

cat <<EOF > ../iap/values.tfvars
project_id               = "$(terraform output -raw project_id)"
cluster_name             = "$(terraform output -raw gke_cluster_name)"
cluster_location         = "$(terraform output -raw gke_cluster_location)"
app_name                 = "adk-vertex"
k8s_namespace            = "$(kubectl get svc adk-agent -o=jsonpath='{.metadata.namespace}')"
k8s_backend_service_name = "$(kubectl get svc adk-agent -o=jsonpath='{.metadata.name}')"
k8s_backend_service_port = "$(kubectl get svc adk-agent -o=jsonpath='{.spec.ports[0].port}')"
support_email            = "<SUPPORT_EMAIL>"
client_id                = "<CLIENT_ID>"
client_secret            = "<CLIENT_SECRET>"
EOF

Go to the newly created directory:
```
cd ../iap
```
Navigate to the Secure your app with Identity Aware Proxy guide and follow the instructions to enable IAP.

[Alternative] Use Port-forward

As an alternative, for a local testing, instead of IAP you can use the port-forward command:

kubectl port-forward svc/adk-agent 8080:80

Testing your Deployed Agent

Open web UI at the URL that is created during the IAP guide or http://127.0.0.1:8080 if you use port-forward and test the web UI.

The ADK dev UI allows you to interact with your agent, manage sessions, and view execution details directly in the browser.

To verify your agent is working as intended, you can:

Select your agent from the dropdown menu.
Type a message and verify that you receive an expected response from your agent.

alt text

As you can see from the screenshot, the agent works as expected and gives answers only for cities that are listed in the tool function.

If you experience any unexpected behavior, check the pod logs for your agent using:

kubectl logs -l app=adk-agent

Troubleshooting

These are some common issues you might encounter when deploying your agent to GKE:

403 Permission Denied for Gemini 2.0 Flash

This usually means that the Kubernetes service account does not have the necessary permission to access the Vertex AI API. Ensure that you have created the service account and bound it to the Vertex AI User role as described in the Configure Kubernetes Service Account for Vertex AI section. If you are using AI Studio, ensure that you have set the GOOGLE_API_KEY environment variable in the deployment manifest and it is valid.

Attempt to write a readonly database

You might see there is no session id created in the UI and the agent does not respond to any messages. This is usually caused by the SQLite database being read-only. This can happen if you run the agent locally and then create the container image which copies the SQLite database into the container. The database is then read-only in the container.

sqlalchemy.exc.OperationalError: (sqlite3.OperationalError) attempt to write a readonly database
[SQL: UPDATE app_states SET state=?, update_time=CURRENT_TIMESTAMP WHERE app_states.app_name = ?]

To fix this issue, you can either:

Delete the SQLite database file from your local machine before building the container image. This will create a new SQLite database when the container is started.

rm -f sessions.db

or (recommended) you can add a .dockerignore file to your project directory to exclude the SQLite database from being copied into the container image.

Build the container image and deploy the application again.

Cleaning up

Destroy the provisioned infrastructure.

terraform destroy -var-file=default_env.tfvars

Feedback

Was this page helpful?

Thank you for your feedback.

Continue reading:

Agent ADK using GKE Autopilot Cluster with Llama and vLLM

This tutorial demonstrates how to deploy the Llama-3.1-8B-Instruct model on Google Kubernetes Engine (GKE) and vLLM for efficient inference. Additionally, it shows how to integrate an ADK agent to interact with the model, supporting both basic chat completions and tool usage. The setup leverages a GKE Autopilot cluster to handle the computational requirements.

Deploying MCP Servers on GKE

This guide provides instructions for deploying a **Ray cluster with the AI Device Kit (ADK)** and a **custom Model Context Protocol (MCP) server** on **Google Kubernetes Engine (GKE)**. It covers setting up the infrastructure with Terraform, containerizing and deploying the Ray Serve application, deploying a custom MCP server for real-time weather data, and finally deploying an ADK agent that utilizes these components. The guide also includes steps for verifying deployments and cleaning up resources.

Agent on GKE using vLLM and Ray Serve

This tutorial demonstrates how to deploy the Llama-3.1-8B-Instruct model on Google Kubernetes Engine (GKE) using Ray Serve and vLLM for efficient inference. Additionally, it shows how to integrate an ADK agent to interact with the model, supporting both basic chat completions and tool usage. The setup leverages a GKE Standard cluster with GPU-enabled nodes to handle the computational requirements.

Metaflow

This tutorial will provide instructions on how to deploy and use the [Metaflow](https://docs.metaflow.org/) framework on GKE (Google Kubernetes Engine) and operate AI/ML workloads using [Argo-Workflows](https://argo-workflows.readthedocs.io/en/latest/).