Architecture | Conference | 40 min
GenAI on Kubernetes: training, inference and serving tutorial
This tutorial guides participants through end-to-end GenAI workload management on Kubernetes, covering distributed training, optimized inference, production model serving, and key components like autoscaling, operators, and OSS frameworks. Attendees gain hands-on experience designing efficient, scalable, and secure GenAI deployments using practical, reproducible Kubernetes patterns and workflows.
Alessandro Vozza, Microsoft
Friday, April 24, 15:15-15:55
Banquet
Kubernetes is quickly becoming the preferred platform for running GenAI workloads, but wiring together training jobs, inference pipelines, and scalable model serving can feel overwhelming. This hands-on tutorial walks you through the full lifecycle of GenAI on Kubernetes—from launching distributed training jobs with GPU scheduling, to running optimized inference workloads, to exposing production-grade model endpoints using cloud-native serving stacks. We’ll cover key building blocks like operators, autoscaling (HPA/KEDA), vector stores, model registries, and popular OSS frameworks (KServe, Ray, vLLM, Kubeflow, and more). You’ll learn how to design resource-efficient GPU clusters, fine-tune models securely, and deploy multi-model serving architectures that behave predictably under real traffic. By the end, you’ll have a working reference setup and a clear understanding of how to operationalize GenAI workloads using the Kubernetes patterns you already know. No magic—just practical, reproducible workflows you can take back to your platform today.
Alessandro Vozza
Alessandro, a seasoned community leader, has spent the last few years architecting cloud-native infrastructures for Microsoft customers, energizing the Dutch tech community, and helping professionals achieve CKx certification. With over 25 years immersed in open-source technologies, Alessandro is deeply passionate about the cloud-native ecosystem. He's now back at Microsoft as a Senior Technical Specialist in Application Innovation & AI.