Load Hugging Face Models into Cloud Storage
This guide explains how to hydrate (populate) Cloud Storage (GCS) buckets with models from Hugging Face by using a Kubernetes Job.
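As a rough illustration of what the container in such a Job might run, here is a minimal Python sketch. It assumes the huggingface_hub and google-cloud-storage packages are installed in the image and that credentials are available to the Pod. The function names plan_uploads and hydrate_bucket, the default /tmp/model directory, and the HF_TOKEN environment variable are hypothetical choices for this example, not part of the guide.

```python
import os
from pathlib import Path


def plan_uploads(local_dir: str, prefix: str) -> list[tuple[str, str]]:
    """Map each file under local_dir to a GCS object name under prefix."""
    root = Path(local_dir)
    plan = []
    for path in sorted(root.rglob("*")):
        rel = path.relative_to(root)
        # Skip directories and hidden entries (for example download cache metadata).
        if not path.is_file() or any(part.startswith(".") for part in rel.parts):
            continue
        plan.append((str(path), f"{prefix.strip('/')}/{rel.as_posix()}"))
    return plan


def hydrate_bucket(repo_id: str, bucket_name: str, local_dir: str = "/tmp/model") -> int:
    """Download a model snapshot and copy it into the bucket (hypothetical helper)."""
    # Third-party imports stay local so plan_uploads works without them installed.
    from google.cloud import storage
    from huggingface_hub import snapshot_download

    # HF_TOKEN is an assumed variable name; gated models need a valid token.
    snapshot_download(repo_id=repo_id, local_dir=local_dir, token=os.environ.get("HF_TOKEN"))

    bucket = storage.Client().bucket(bucket_name)
    uploads = plan_uploads(local_dir, prefix=repo_id)
    for local_path, object_name in uploads:
        bucket.blob(object_name).upload_from_filename(local_path)
    return len(uploads)
```

A Job container could call hydrate_bucket with the model ID and bucket name passed in as environment variables. Using the repository ID as the object prefix is only one possible layout. Large models may also need more ephemeral storage than a default Pod provides.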
This guide uses the Google Cloud API to create a Hyperdisk ML disk from data in Cloud Storage and then uses that disk in a GKE cluster. For instructions that rely only on the GKE API, refer to this documentation.
This project lets you download a Hugging Face model and package it as a Docker image. The image can then be pushed to Google Artifact Registry for deployment or distribution. Build time can be significant for large models, so models above 10 billion parameters are not recommended. For reference, an 8B model takes roughly 35 minutes to build and push with this cloudbuild config.
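One way to apply the size guidance before starting a long build is to check the model's parameter count first. The sketch below is an assumption-laden example rather than part of this project: PARAM_LIMIT, within_recommended_size, and fetch_param_count are hypothetical names, and the lookup relies on the safetensors metadata that huggingface_hub exposes through model_info, which may be missing for some repositories.

```python
from typing import Optional

# Limit taken from the guidance above: avoid models above 10 billion parameters.
PARAM_LIMIT = 10_000_000_000


def within_recommended_size(total_params: int, limit: int = PARAM_LIMIT) -> bool:
    """Return True when the model is at or below the recommended size."""
    return total_params <= limit


def fetch_param_count(repo_id: str) -> Optional[int]:
    """Look up the parameter count on the Hub, or None when it is not published."""
    # Local import so the size check works without huggingface_hub installed.
    from huggingface_hub import HfApi

    info = HfApi().model_info(repo_id)
    # Assumption: the repository publishes safetensors metadata.
    return info.safetensors.total if info.safetensors else None
```

A build script could call fetch_param_count for the chosen model and stop early when within_recommended_size returns False. When the count is None, the check is inconclusive and the decision is left to the operator.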