
Conference, 50 min
Fault-tolerant AI Agents on the JVM with Koog framework
Discover how Koog, an open-source Kotlin/JVM framework from JetBrains, enables resilient, production-ready AI agents through modular pipeline design and robust fault tolerance. Learn how checkpointing, state restoration, and predictable operation graphs help agents recover from failures and behave reliably outside the lab.

Vadim Briliantov, JetBrains
Wednesday, October 8, 14:00-14:50
Room 3
AI agents are no longer just experiments. Once they leave the lab, new challenges emerge: ensuring they can recover from failures, behave predictably, and remain resilient in production.
At JetBrains, we've been solving these problems while deploying AI agents in real products. Koog is the open-source Kotlin / JVM framework that grew out of this work. It helps you design agents as modular, reusable pipelines—so you can precisely control their behavior instead of treating them like black boxes.
Koog's key strength is fault tolerance. Agents can checkpoint and restore their entire state machine, not just chat history. That means you can safely resume long-running processes, recover after crashes, or move execution between machines without losing progress.
In this talk you'll learn:
- How to model agents as predictable graphs of operations.
- How persistence and restoration from checkpoints make agents resilient in production.
- Lessons from JetBrains' own use of Koog in real products.
Together, we'll look at how AI agents on the JVM can move beyond experiments and into reliable, production-ready systems.
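To make the checkpoint-and-restore idea concrete, here is a minimal sketch in plain Kotlin. It does not use Koog's API: `Node`, `AgentState`, `CheckpointStore`, `step`, and `runAgent` are all hypothetical names invented for illustration, and the in-memory store stands in for whatever durable storage a real deployment would use. The sketch models an agent as a small graph of operations, saves the full execution state after every transition, and resumes from the last checkpoint after a simulated crash.

```kotlin
// Illustrative only: every name here is hypothetical and NOT part of Koog's API.

// The nodes of a tiny, fixed operation graph.
enum class Node { PLAN, CALL_TOOL, SUMMARIZE, DONE }

// Full execution state: the position in the graph plus accumulated data,
// rather than just a chat transcript.
data class AgentState(val node: Node, val notes: List<String>)

// In-memory stand-in for durable storage (assumed to be a file or database in practice).
class CheckpointStore {
    private var latest: AgentState? = null
    fun save(state: AgentState) { latest = state }
    fun load(): AgentState? = latest
}

// One transition of the graph. `failAt` simulates a crash when that node is reached.
fun step(state: AgentState, failAt: Node?): AgentState {
    if (state.node == failAt) {
        throw IllegalStateException("simulated crash at ${state.node}")
    }
    return when (state.node) {
        Node.PLAN -> AgentState(Node.CALL_TOOL, state.notes + "planned")
        Node.CALL_TOOL -> AgentState(Node.SUMMARIZE, state.notes + "tool called")
        Node.SUMMARIZE -> AgentState(Node.DONE, state.notes + "summarized")
        Node.DONE -> state
    }
}

// Runs from the last checkpoint if one exists, otherwise from the start,
// and saves a checkpoint after every successful transition.
fun runAgent(store: CheckpointStore, failAt: Node? = null): AgentState {
    var state = store.load() ?: AgentState(Node.PLAN, emptyList())
    while (state.node != Node.DONE) {
        state = step(state, failAt)
        store.save(state)
    }
    return state
}

fun main() {
    val store = CheckpointStore()
    try {
        // First attempt crashes before the SUMMARIZE step executes.
        runAgent(store, failAt = Node.SUMMARIZE)
    } catch (e: IllegalStateException) {
        println("crashed: ${e.message}")
    }
    // Second attempt resumes at SUMMARIZE instead of repeating PLAN and CALL_TOOL.
    println(runAgent(store))
}
```

Because the checkpoint records the current node as well as the accumulated notes, the second run continues from the failed step and the earlier steps are not executed twice. Moving execution to another machine would, under the same assumption, amount to loading that state from shared storage.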

Vadim Briliantov
Technical Lead and author of the Koog framework at JetBrains.
Over the past 8 years at JetBrains, I have contributed to a wide range of projects, including the IntelliJ Kotlin plugin, Kotlin Libraries, the Ktor framework, Qodana Cloud Backend, and Multiplatform Tooling. I have also led the technological direction of AI agent development across multiple products as part of the AI Agents Platform. Currently, I lead the development of the Koog framework.