Data & (Gen)AI · Conference · 50 min
Practical LLM Inference in Modern Java
The session will explore practical methods of implementing local Large Language Model (LLM) inference in Java environments. It will demonstrate using the latest Java features for local inference of open-source LLMs, optimizing CPU performance, creating a flexible LLM framework, and integrating with LangChain4j for streamlined execution. Attendees will learn how to optimize performance with the Java Vector API and GraalVM, and how to create efficient AI applications using the latest Java technologies.
Alina Yurenko, Oracle Labs
Alfonso Peterssen, Oracle Labs
When and where
Thursday, October 10, 13:50-14:40
Room 6
Large Language Models (LLMs) have become essential in many applications, but integrating them effectively into Java environments can still be challenging. This session will explore practical approaches to implementing local LLM inference using modern Java.

We'll demonstrate how to leverage the latest Java features to implement local inference for a variety of open-source LLMs, starting with Llama 2 and 3 (Meta). Importantly, we'll show how the same approach can easily be extended to run other popular open-source models on standard CPUs without the need for specialized hardware.

Key topics we'll cover:

- Implementing efficient LLM inference engines in modern Java for local execution
- Utilizing Java 21+ features for optimized CPU-based performance
- Creating a flexible framework adaptable to multiple LLM architectures
- Maximizing standard CPU utilization for inference without GPU dependencies
- Integrating with LangChain4j for streamlined local inference execution
- Optimizing performance with the Java Vector API for accelerated matrix operations, and leveraging GraalVM to reduce latency and memory consumption

Join us to learn about implementing and optimizing local LLM inference for open-source models in your Java projects, and about creating fast and efficient AI applications using the latest Java technologies.
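As a taste of the Vector API topic above, here is a minimal sketch of a SIMD dot product, the core operation behind the matrix-vector multiplications that dominate CPU-based LLM inference. The class and method names are illustrative (not from the session's actual code), and running it requires a recent JDK with `--add-modules jdk.incubator.vector` since the Vector API is still incubating.

```java
import jdk.incubator.vector.FloatVector;
import jdk.incubator.vector.VectorOperators;
import jdk.incubator.vector.VectorSpecies;

public class VectorDot {
    // Pick the widest vector shape the current CPU supports (e.g. 256-bit AVX2).
    static final VectorSpecies<Float> SPECIES = FloatVector.SPECIES_PREFERRED;

    // Dot product of two equal-length float arrays, vectorized where possible.
    static float dot(float[] a, float[] b) {
        float sum = 0f;
        int i = 0;
        int upper = SPECIES.loopBound(a.length);
        for (; i < upper; i += SPECIES.length()) {
            FloatVector va = FloatVector.fromArray(SPECIES, a, i);
            FloatVector vb = FloatVector.fromArray(SPECIES, b, i);
            sum += va.mul(vb).reduceLanes(VectorOperators.ADD);
        }
        for (; i < a.length; i++) { // scalar tail for leftover elements
            sum += a[i] * b[i];
        }
        return sum;
    }

    public static void main(String[] args) {
        float[] a = {1, 2, 3, 4, 5};
        float[] b = {2, 2, 2, 2, 2};
        System.out.println(dot(a, b)); // 2 + 4 + 6 + 8 + 10 = 30.0
    }
}
```

Using `SPECIES_PREFERRED` lets the same code exploit whatever SIMD width the host CPU offers, which is one way a pure-Java inference engine can stay portable while still using the hardware efficiently.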