Building Private, Scalable AI Applications with Self-Hosted LLMs

This tutorial teaches how to build and deploy secure, self-hosted AI applications using Java and Spring AI with local LLMs on Kubernetes. It covers integration, asynchronous architecture, scaling, and production readiness, offering language-agnostic concepts for developers needing confidential, scalable AI solutions without third-party data exposure.

Timo Salm, Broadcom

Sandra Ahlgrimm, Microsoft

When and where

Monday, October 6, 09:30-12:30

BOF 2

Description

Tools like ChatGPT, Claude, and Copilot are powerful, but relying on them means giving up control of your data to third-party providers.

For organizations bound by confidentiality or regulatory requirements, self-hosted models such as Llama, Mistral, Phi, and Qwen provide a secure alternative, running locally or on Kubernetes clusters. Small language models also tend to fit naturally into Domain-Driven Design architectures.

In this tutorial, you’ll learn how to:
- Build a sample AI application using Java and Spring AI
- Integrate the application with a self-hosted LLM
- Deploy and scale both the application and the LLM on Kubernetes
- Implement an asynchronous architecture using message queues
- Ensure production readiness with resource requests, autoscaling, and real-time metrics
- Dive deeper into AI capabilities by creating an MCP server and a client that consumes it

While the example application uses Java, all concepts around LLM deployment and scaling are fully language-agnostic, making this workshop valuable for developers of all backgrounds.

Join us to build secure, scalable AI solutions that keep your data under control.

self-hosted

java

confidentiality

kubernetes

Speakers

Timo Salm

Broadcom

Germany

Timo Salm is a Principal Solutions Engineer at VMware Tanzu by Broadcom with over a decade of experience in customer-facing roles, modern applications, DevSecOps, and AI.

In his roles, he ensures that the most strategic customers in the EMEA region achieve their goals with VMware Tanzu's developer platform, data, AI, and commercial Spring products.

Another essential part of his role is continuous learning through hands-on experiments with new technologies and sharing the outcomes with colleagues, customers, and the community, for example at conferences.

Before Timo joined Pivotal, which was acquired by VMware and is now part of Broadcom, he worked for consulting firms in the automotive industry as a software architect and full-stack developer.

Sandra Ahlgrimm

Microsoft

Germany

Sandra Ahlgrimm is a Senior Cloud Advocate at Microsoft, specializing in supporting Java developers. With over a decade of experience as a Java developer, she brings a wealth of knowledge to her role. Sandra is passionate about containers and has recently learned to love AI.
As a leader in the tech community, Sandra actively contributes to the Berlin Java User Group (JUG) and the Berlin Docker MeetUp. Her expertise extends beyond coding; she focuses on LangChain4j integrations and serves as the primary point of contact for developer feedback related to Java in Visual Studio Code (VS Code) and the Azure toolkit integration in IntelliJ. Additionally, her interest in performance and event-driven architectures led to her involvement with native images. Therefore, Sandra represents Microsoft on the GraalVM Program Advisory Board.
Sandra’s commitment to empowering fellow developers and fostering collaboration makes her an invaluable asset to the software development ecosystem.
