E2E GenAI
This guide demonstrates how to deploy an end-to-end Generative AI application on Google Kubernetes Engine (GKE). The application uses a Hugging Face model with LangChain for prompt engineering, Ray Serve for model inference, a Flask API for the backend, and a React frontend for user interaction. The setup includes infrastructure provisioning with Terraform, model experimentation in a Jupyter Notebook, and containerized deployment of the backend and frontend services to GKE, all managed through kubectl.
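To make the serving piece concrete, here is a minimal sketch of what the Ray Serve inference layer can look like: a deployment class that wraps a Hugging Face text-generation pipeline behind an HTTP endpoint, which the Flask backend would then call. The model name (`distilgpt2`), route, and request/response shape are illustrative assumptions, not the exact values used by this application.

```python
# Minimal Ray Serve sketch: wrap a Hugging Face pipeline behind an HTTP endpoint.
# Assumptions: model "distilgpt2" and the {"prompt": ...} request shape are
# placeholders; the application in this guide may use a different model and schema.
from ray import serve
from starlette.requests import Request
from transformers import pipeline


@serve.deployment
class TextGenerator:
    def __init__(self):
        # Load a small open model so the sketch runs on modest hardware.
        self.generator = pipeline("text-generation", model="distilgpt2")

    async def __call__(self, request: Request) -> dict:
        # Parse the JSON body, generate a completion, and return it.
        body = await request.json()
        prompt = body.get("prompt", "")
        output = self.generator(prompt, max_new_tokens=64)[0]["generated_text"]
        return {"generated_text": output}


# Bind the deployment so it can be started with: serve run <module_name>:app
app = TextGenerator.bind()
```

Once running (for example via `serve run <module_name>:app`), a backend service can POST a JSON body such as `{"prompt": "Hello"}` to `http://127.0.0.1:8000/` and receive the generated text, which mirrors the Flask-to-Ray-Serve flow described above.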