Training ESM2
This sample walks through setting up a Google Cloud GKE environment to train ESM2 (Evolutionary Scale Modeling) using NVIDIA BioNeMo Framework 2.0.
This sample walks through creating intelligent, interactive avatars for customer service across industries on GKE by using NVIDIA NIM services.
This sample walks through setting up a Google Cloud GKE environment to fine-tune ESM2 (Evolutionary Scale Modeling) using NVIDIA BioNeMo Framework 2.0.
This guide outlines the steps to deploy NVIDIA’s NIM blueprint for generative virtual screening for drug discovery on a Google Kubernetes Engine (GKE) cluster. Three NIMs (AlphaFold2, MolMIM, and DiffDock) are used to demonstrate protein folding, molecular generation, and protein docking, as sketched below.
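As a rough illustration of how these NIMs are driven once deployed, the sketch below sends a molecular-generation request to a MolMIM service. The port, endpoint path, and payload fields are assumptions based on typical NIM usage, not definitions from the blueprint; consult each NIM's API reference for the exact request schema.

```python
# Hypothetical sketch: ask a deployed MolMIM NIM to generate molecules
# similar to a seed SMILES string. The endpoint path, port, and payload
# fields are assumptions; check the blueprint's API docs for exact schemas.
import requests

seed_smiles = "CC(=O)OC1=CC=CC=C1C(=O)O"  # aspirin, as an example seed molecule

resp = requests.post(
    "http://localhost:8000/generate",          # assumed port-forwarded service
    json={"smi": seed_smiles, "num_molecules": 5},
    timeout=120,
)
resp.raise_for_status()
print(resp.json())  # generated candidate molecules
```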
This guide details how to deploy JupyterHub on Google Kubernetes Engine (GKE) using a provided Terraform template, including options for persistent storage and Identity-Aware Proxy (IAP) for secure access. It covers the necessary prerequisites, configuration steps, and installation process, emphasizing the use of Terraform for automation and IAP for authentication. The guide also provides instructions for accessing JupyterHub, setting up user access, and running an example notebook.
This guide explains how to deploy NVIDIA NIM inference microservices on a Google Kubernetes Engine (GKE) cluster, requiring an NVIDIA AI Enterprise License for access to the models. It details the process of setting up a GKE cluster with GPU-enabled nodes, configuring access to the NVIDIA NGC registry, and deploying a NIM using a Helm chart with persistent storage. Finally, it demonstrates how to test the deployed NIM service by sending a sample prompt and verifying the response, ensuring the inference microservice is functioning correctly.
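To make the final verification step concrete, here is a minimal sketch that sends a sample prompt to a deployed NIM through its OpenAI-compatible API. It assumes the NIM service has been port-forwarded to localhost:8000 and that the deployed model is meta/llama3-8b-instruct; both are placeholders to adjust for your deployment.

```python
# Minimal sketch: test a NIM via its OpenAI-compatible chat completions API.
# Assumes `kubectl port-forward` exposes the NIM service on localhost:8000 and
# that meta/llama3-8b-instruct is the deployed model (adjust as needed).
import requests

resp = requests.post(
    "http://localhost:8000/v1/chat/completions",
    json={
        "model": "meta/llama3-8b-instruct",
        "messages": [{"role": "user", "content": "What is Kubernetes?"}],
        "max_tokens": 128,
    },
    timeout=120,
)
resp.raise_for_status()
print(resp.json()["choices"][0]["message"]["content"])
```

A non-error response containing a sensible completion confirms the inference microservice is up and serving requests.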
Deploying and managing dedicated servers that run inference for machine learning models.
NVIDIA NeMo™ is an end-to-end platform for developing custom generative AI models anywhere. Designed for enterprise development, the NeMo framework uses NVIDIA’s state-of-the-art technology to support a complete workflow: automated distributed data processing, training of large-scale bespoke models with sophisticated 3D parallelism techniques, and deployment using retrieval-augmented generation for large-scale inference on the infrastructure of your choice, whether on-premises or in the cloud.
These guides explain how to deploy NVIDIA NIM inference microservices on a Google Kubernetes Engine (GKE) cluster.
This tutorial demonstrates how to deploy a Retrieval Augmented Generation (RAG) application on Google Kubernetes Engine (GKE), integrating a Hugging Face TGI inference server, a Cloud SQL pgvector database, and a Ray cluster for generating vector embeddings. It walks you through setting up the infrastructure with Terraform, populating the vector database with embeddings from a sample dataset using a Jupyter notebook, and launching a frontend chat interface. The guide also covers optional configurations like using your own cluster or VPC, enabling authenticated access via IAP, and troubleshooting common issues.
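To give a flavor of the embedding-population step, the sketch below embeds a couple of documents and inserts them into a pgvector table. The table and column names, embedding model, and connection details are illustrative assumptions; the tutorial's Jupyter notebook defines the actual schema and dataset.

```python
# Illustrative sketch: embed documents and store them in a Cloud SQL pgvector
# table. Table/column names, the model, and connection details are assumptions;
# the tutorial's notebook defines the real schema.
import psycopg2
from pgvector.psycopg2 import register_vector
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("all-MiniLM-L6-v2")  # 384-dimensional embeddings
conn = psycopg2.connect(host="127.0.0.1", dbname="pgvector", user="rag", password="...")
register_vector(conn)  # register the pgvector type with psycopg2

docs = [
    "GKE runs managed Kubernetes on Google Cloud.",
    "pgvector stores embedding vectors in PostgreSQL.",
]
with conn.cursor() as cur:
    for doc in docs:
        cur.execute(
            "INSERT INTO documents (content, embedding) VALUES (%s, %s)",
            (doc, model.encode(doc)),
        )
conn.commit()
```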
This guide provides instructions and examples for deploying and managing Ray clusters on Google Kubernetes Engine (GKE) using KubeRay and Terraform. It covers setting up a GKE cluster, deploying a Ray cluster, submitting Ray jobs, and using the Ray Client for interactive sessions. The guide also points to various resources, including tutorials, best practices, and examples for running different types of Ray applications on GKE, such as serving LLMs, using TPUs, and integrating with GCS.
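As a small sketch of the job-submission flow, the following uses Ray's job submission API against a KubeRay cluster whose dashboard has been port-forwarded to localhost:8265. The entrypoint script and working directory are placeholders.

```python
# Sketch: submit a job to a KubeRay cluster via the Ray Jobs API.
# Assumes the Ray dashboard was exposed locally first, e.g.:
#   kubectl port-forward svc/<raycluster>-head-svc 8265:8265
# The entrypoint script and working_dir are placeholders.
from ray.job_submission import JobSubmissionClient

client = JobSubmissionClient("http://127.0.0.1:8265")
job_id = client.submit_job(
    entrypoint="python my_ray_script.py",
    runtime_env={"working_dir": "./"},
)
print(f"Submitted job {job_id}; status: {client.get_job_status(job_id)}")
```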
This guide provides instructions for deploying and managing Ray custom resources on Google Kubernetes Engine (GKE) with TPUs. It details how to install the KubeRay TPU webhook, an admission webhook which bootstraps required environment variables for TPU initialization and enables atomic scheduling of multi-host TPU workers on GKE nodepools. This guide also provides a sample workload to verify proper TPU initialization and links to more advanced workloads to run with TPUs and Ray on GKE.
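A minimal sketch of such a verification workload follows, assuming the KubeRay TPU webhook is installed and the worker image ships a TPU-enabled JAX build. The "TPU" custom resource name and the environment variable names follow KubeRay's TPU documentation, but confirm them against the guide.

```python
# Minimal sketch of a TPU-initialization check on a KubeRay cluster.
# Assumes the KubeRay TPU webhook is installed, workers advertise a custom
# "TPU" resource, and the worker image includes a TPU-enabled JAX build.
import os
import ray

ray.init()  # connects to the existing cluster when run as a Ray job

@ray.remote(resources={"TPU": 4})  # e.g. one single-host TPU worker with 4 chips
def tpu_info():
    import jax
    return {
        "tpu_worker_id": os.environ.get("TPU_WORKER_ID"),         # set by the webhook
        "tpu_hostnames": os.environ.get("TPU_WORKER_HOSTNAMES"),  # multi-host slices
        "jax_device_count": jax.device_count(),
    }

print(ray.get(tpu_info.remote()))
```

A nonzero device count and populated environment variables indicate the webhook bootstrapped the TPU worker correctly.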
This guide shows you how to deploy Slurm on a Google Kubernetes Engine (GKE) cluster.