Data & AI
Conference, 40 min
ADVANCED

The GPU Orchestration Playbook: AI Inference at Scale

This session covers best practices for deploying large AI models on Kubernetes, focusing on GPU orchestration, advanced scheduling, autoscaling, and storage strategies to optimize inference performance. Attendees will gain practical insights for reliably running scalable AI workloads in production across cloud-native environments.

Alex König, AWS

When and where

Saturday, April 25, 13:05-13:45
Banquet
Description
As AI workloads scale, serving large models reliably and efficiently has become a central challenge for engineering teams. In this session, we explore patterns and best practices for running high-performance AI inference on Kubernetes. Attendees will learn about GPU orchestration, advanced scheduling, autoscaling strategies and architectural decisions that minimize latency and maximize throughput. We’ll also cover storage considerations and practical approaches to deploying models in production. This talk is ideal for engineers and platform teams who want to understand how to operate AI workloads at scale while maintaining reliability and performance across cloud-native environments.
throughput
inference
autoscaling
kubernetes
Speakers

Alex König

AWS

Germany

Alex is a Senior Solutions Architect at AWS. He is experienced in large-scale and distributed architectures, Kubernetes, open source, DevOps, automation, and emerging technologies. He has given talks and written blog posts about software architecture, Kubernetes, and AI. With a background as a systems engineer, he still likes to write shell scripts.
