Securing a Google ADK agent with LlamaFirewall
Overview
This tutorial provides instructions on how to securely deploy a self-hosted ADK agent with LlamaFirewall on Google Kubernetes Engine (GKE).
The goal is to host a general-purpose LLM locally on the cluster and secure interactions with it using LlamaFirewall, whose model is also hosted locally. To sanitize input, LlamaFirewall uses the Llama PromptGuard 2 model.
At the time of writing, LlamaFirewall runs the PromptGuard 2 model internally on the same machine and cannot connect to a remotely hosted PromptGuard 2 (for example, on a separate vLLM deployment). This is why, in this tutorial, we attach a GPU to the pod that runs the ADK application itself.
This example uses Google ADK to interact with the LLM, but the focus is on LlamaFirewall. For a more detailed guide to Google ADK, please refer to our other guide - Agent Development Kit (ADK) on GKE
The tutorial will cover:
- Setting up your Google Cloud environment.
- Building a container image for your agent.
- Deploying the LLMs via vLLM.
- Deploying the agent to a GKE cluster.
- Testing your deployed agent.
Before you begin
Ensure you have the following tools installed on your workstation: the gcloud CLI, kubectl, Terraform, and git.
If you previously installed the gcloud CLI, get the latest version by running:
gcloud components update
Ensure that you are signed in using the gcloud CLI tool. Run the following command:
gcloud auth application-default login
Infrastructure Setup
Clone the repository
Clone the repository with our guides and cd to the LlamaFirewall tutorial directory by running these commands:
git clone https://github.com/ai-on-gke/tutorials-and-examples.git
cd tutorials-and-examples/security/llama-firewall/
Filesystem structure
├── adk-app # ADK application source
│ ├── Dockerfile
│ ├── llama_firewall_secured_agent
│ │ ├── agent.py
│ │ ├── __init__.py
│ ├── main.py
│ └── requirements.txt
├── gen # This folder will contain all files that are generated by the terraform
│ └── secured-agent.yaml # This is the manifest of the ADK app deployment
├── terraform # The terraform config source
│ ├── example.tfvars
│ ├── main.tf
│ ├── network.tf
│ ├── outputs.tf
│ ├── providers.tf
│ ├── templates
│ │ └── secured-agent.yaml.tftpl
│ ├── terraform.tfstate
│ ├── terraform.tfstate.backup
│ ├── variables.tf
│ └── versions.tf
└── vllm # Manifests for the vLLM deployment that hosts the base model
└── vllm-llama.yaml
Enable Necessary APIs
Enable the APIs required for GKE, Artifact Registry, and Cloud Build:
gcloud services enable \
container.googleapis.com \
artifactregistry.googleapis.com \
cloudbuild.googleapis.com
Create cluster and other resources
In this section we use Terraform to automate the creation of infrastructure resources. For more details on how this is done, please refer to the Terraform config in the terraform/
folder. By default, the configuration provisions an Autopilot GKE cluster, but it can be changed to a Standard cluster by setting autopilot_cluster = false
.
It creates the following resources. For more information, such as resource names and other details, please refer to the Terraform config:
- Service Accounts:
  - Cluster IAM Service Account (name is derived from the cluster name, e.g.
tf-gke-<cluster name>
) – manages permissions for the GKE cluster.
- Artifact Registry – stores container images for the application.
-
Go to the terraform directory:
cd terraform
-
Specify the following values inside the
example.tfvars
file (or make a separate copy):
- <PROJECT_ID> – replace with your project ID (you can find it in the project settings).
Other values can be changed if needed, but can be left at their default values.
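For illustration, a filled-in example.tfvars might look like the fragment below. The variable names here are assumptions based on the options mentioned in this tutorial; check the actual file in the repository for the exact names:

```hcl
# example.tfvars – illustrative values only
project_id        = "<PROJECT_ID>"  # your Google Cloud project ID
autopilot_cluster = true            # set to false for a Standard cluster
```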
-
Init terraform modules:
terraform init
-
Optionally run the plan command to view an execution plan:
terraform plan -var-file=example.tfvars
-
Execute the plan:
terraform apply -var-file=example.tfvars
And you should see your resources created:
Apply complete! Resources: 16 added, 0 changed, 0 destroyed. Outputs:
-
Configure your kubectl context:
gcloud container clusters get-credentials $(terraform output -raw gke_cluster_name) --region $(terraform output -raw gke_cluster_location)
Deploy vLLM Models
We need to deploy a vLLM server that will serve the model that LlamaFirewall will secure.
-
Create a secret with your HuggingFace token:
kubectl create secret generic hf-token-secret --from-literal=token="<YOUR_TOKEN>"
-
Apply the vLLM deployment manifest with the base Llama model:
kubectl apply -f ../vllm/vllm-llama.yaml
-
Wait until the deployment is ready:
kubectl rollout status deployment/vllm-llama3
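For reference, the deployment in vllm/vllm-llama.yaml follows the usual vLLM-on-GKE pattern. Below is a trimmed, illustrative sketch; the image, model name, and resource values are assumptions, and the actual manifest in the repository may differ:

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: vllm-llama3              # matches the rollout status command above
spec:
  replicas: 1
  selector:
    matchLabels:
      app: vllm-llama3
  template:
    metadata:
      labels:
        app: vllm-llama3
    spec:
      containers:
      - name: vllm
        image: vllm/vllm-openai:latest
        args: ["--model", "meta-llama/Llama-3.1-8B-Instruct"]  # illustrative model
        env:
        - name: HUGGING_FACE_HUB_TOKEN
          valueFrom:
            secretKeyRef:
              name: hf-token-secret   # the secret created in the first step
              key: token
        resources:
          limits:
            nvidia.com/gpu: "1"
```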
Deploy and Configure the Agent Application
This application consists of a simple ADK agent that uses Callbacks: a Before Model Callback and an After Model Callback, which invoke LlamaFirewall on the user prompt and on the model’s response, respectively.
For more info, see the adk-app/llama_firewall_secured_agent/agent.py file, which defines the variable secured_agent
, an instance of the LlmAgent class, with two constructor arguments: before_model_callback
and after_model_callback
. These callbacks are used to invoke LlamaFirewall.
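To make the callback pattern concrete, here is a minimal pure-Python sketch of the before/after hooks. It uses a stand-in scan() heuristic instead of the real LlamaFirewall PromptGuard 2 scanner, and plain dicts instead of ADK's request/response types, so all names here are illustrative only; see agent.py for the actual implementation:

```python
# Stand-in for LlamaFirewall's scan: the real scanner classifies prompts
# with the PromptGuard 2 model; this toy heuristic just checks a phrase.
BLOCKED_RESPONSE = {"text": "Request blocked by LlamaFirewall."}

def scan(text: str) -> bool:
    """Return True if the text is considered safe (toy heuristic)."""
    return "ignore all previous instructions" not in text.lower()

def before_model_callback(llm_request: dict):
    """Runs before the model call; returning a response skips the model."""
    if not scan(llm_request["prompt"]):
        return BLOCKED_RESPONSE
    return None  # None means: proceed with the normal model call

def after_model_callback(llm_response: dict):
    """Runs after the model call; can replace an unsafe response."""
    if not scan(llm_response["text"]):
        return BLOCKED_RESPONSE
    return None  # None means: keep the original response

def run_agent(prompt: str, model=lambda p: {"text": f"echo: {p}"}) -> dict:
    """Wire the two hooks around a (mocked) model call."""
    blocked = before_model_callback({"prompt": prompt})
    if blocked is not None:
        return blocked
    response = model(prompt)
    return after_model_callback(response) or response
```

With this wiring, a benign prompt flows through to the model, while a prompt that fails the scan is replaced by the blocked-response message before the model is ever called; this is the behavior you will observe in the Testing section.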
-
Build image with our ADK application:
gcloud builds submit \
  --tag $(terraform output -raw image_repository_full_name)/secured-agent:latest \
  --project=$(terraform output -raw project_id) \
  ../adk-app
-
Deploy the manifest for the ADK application:
kubectl apply -f ../gen/secured-agent.yaml
-
Wait until the deployment is ready:
kubectl rollout status deployment/adk-agent
Testing
-
Open the ADK application’s URL in your web browser. If you enabled Identity-Aware Proxy as described in our other guide, use its URL. Otherwise, run the port-forward command and open http://127.0.0.1:8000:
kubectl port-forward svc/adk-agent 8000:80
-
The ADK application web UI should open with our
llama_firewall_secured_agent
agent already selected. Try some prompting, starting with a normal prompt: -
Then try some malicious prompt:
As you can see, the model’s response is the blocked-request message that we specified in the agent’s code.
Cleaning up
-
Destroy the provisioned infrastructure:
terraform destroy -var-file=example.tfvars