Storage

Providing persistent and high-performance storage solutions for AI/ML workloads running on Google Kubernetes Engine (GKE).

Load Hugging Face Models into Cloud Storage

This guide shows how to populate Cloud Storage (GCS) buckets with models from Hugging Face by using a Kubernetes Job.
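As a rough illustration of what such a Job's container could run, the sketch below downloads a model snapshot and uploads each file to a bucket. It is not the guide's own script. It assumes the `huggingface_hub` and `google-cloud-storage` packages are installed and that credentials are available in the environment; the function names `object_names` and `hydrate_bucket` and the choice of the repository ID as the object prefix are hypothetical.

```python
from pathlib import Path


def object_names(local_dir: Path, prefix: str) -> dict[Path, str]:
    """Map each file under local_dir to an object name under prefix."""
    prefix = prefix.strip("/")
    mapping = {}
    for path in sorted(local_dir.rglob("*")):
        if not path.is_file():
            continue
        relative = path.relative_to(local_dir)
        # Assumption: skip the download-metadata folder that
        # huggingface_hub may create inside the target directory.
        if relative.parts[0] == ".cache":
            continue
        name = relative.as_posix()
        mapping[path] = f"{prefix}/{name}" if prefix else name
    return mapping


def hydrate_bucket(repo_id: str, bucket_name: str, work_dir: Path) -> None:
    """Download a Hugging Face repo and copy its files into a bucket."""
    # Imported lazily so object_names() works without these packages.
    from google.cloud import storage
    from huggingface_hub import snapshot_download

    local_dir = Path(snapshot_download(repo_id=repo_id, local_dir=str(work_dir)))
    bucket = storage.Client().bucket(bucket_name)
    for path, name in object_names(local_dir, repo_id).items():
        bucket.blob(name).upload_from_filename(str(path))
```

A call such as `hydrate_bucket("some-org/some-model", "my-models-bucket", Path("/tmp/model"))` uses placeholder names; gated models would also need a Hugging Face token in the environment.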

Populate a Hyperdisk ML Disk from Google Cloud Storage

This guide uses the Google Cloud API to create a Hyperdisk ML disk from data in Cloud Storage and then uses the disk in a GKE cluster. Refer to this documentation for instructions that use only the GKE API.

Models as OCI

This project lets you download a Hugging Face model and package it as a Docker image, which can then be pushed to Google Artifact Registry for deployment or distribution. Build time can be significant for large models, so models above 10 billion parameters are not recommended. For reference, an 8B model takes roughly 35 minutes to build and push with this cloudbuild config.
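To get a feel for why large models are slow to package, the weight payload can be estimated from the parameter count. The sketch below is a back-of-the-envelope check, not part of the project: it assumes 16-bit weights (2 bytes per parameter), ignores base image layers and tokenizer files, and uses the hypothetical helper names `weights_size_gb` and `within_recommended_limit`.

```python
def weights_size_gb(num_params: float, bytes_per_param: int = 2) -> float:
    """Approximate weight size in GB (10^9 bytes).

    Assumption: 2 bytes per parameter, as with fp16 or bf16 weights.
    """
    return num_params * bytes_per_param / 1e9


def within_recommended_limit(num_params: float, limit: float = 10e9) -> bool:
    """Check a model against the 10-billion-parameter recommendation."""
    return num_params <= limit


if __name__ == "__main__":
    for params in (8e9, 70e9):
        size = weights_size_gb(params)
        ok = within_recommended_limit(params)
        print(f"{params / 1e9:.0f}B params: ~{size:.0f} GB of weights, "
              f"within recommended limit: {ok}")
```

Under these assumptions an 8B model carries about 16 GB of weights, all of which must be downloaded, written into image layers, and pushed during the build.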
