Architecture | Conference | 50 min
Lessons learned from building an Agentic AI platform
This talk explores the challenges of running AI agents reliably at scale, sharing lessons from building the Akka Agentic Platform. It covers durable workflows, context engineering, guardrails, and observability—offering a practical roadmap for turning prototypes into trustworthy, production-grade agentic systems capable of supporting millions of autonomous agents.
Andrzej Ludwikowski, Akka
Thursday, June 18, 17:05-17:55
Room 3
Everyone is building AI agents, but very few are running them in production at scale. The transition from a successful prototype to a mission-critical system is where most agentic projects fail. When we started building the Akka Agentic Platform, we quickly realized that most agentic stacks shine in demos, but nobody talks about the "Day 2" operational realities. Distributed systems challenges, like state consistency, partial failure, and durable execution, are magnified tenfold in agentic systems.
In this talk, I’ll share the engineering lessons, trade-offs, and mistakes we encountered while designing a platform capable of supporting millions of autonomous, long-lived agents. You will see how we design durable workflows so an agent never forgets its goal or wastes tokens re-running expensive reasoning steps after a crash. I will show context engineering at scale, using sharded, in-memory entities to give agents sub-millisecond access to conversation histories. We will cover evaluations and guardrails that actively govern agent behavior. And finally, observability, which gives deep debugging insight into the inherently non-deterministic interactions with LLMs.
Whether you are building your own internal platform or just want to understand the broader agentic ecosystem, this talk delivers a practical roadmap. It highlights common pitfalls and shows how to build AI systems you can actually trust, whether you use Akka or another stack.
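To make the durable-workflow idea in the abstract concrete, here is a minimal sketch of step-level checkpointing: each completed step's result is journaled, so a restarted run skips the expensive reasoning step instead of paying for it again. This is an illustration only, not the Akka Agentic Platform API. The `DurableWorkflow` class, the JSON file journal, and the step names are all hypothetical, and a real platform would use a replicated, durable store rather than a local file.

```python
import json
import os
import tempfile


class DurableWorkflow:
    """Hypothetical helper: runs named steps and journals each result to disk."""

    def __init__(self, journal_path):
        self.journal_path = journal_path
        self.journal = {}
        # On restart, reload whatever steps completed before the crash.
        if os.path.exists(journal_path):
            with open(journal_path) as f:
                self.journal = json.load(f)

    def step(self, name, fn):
        # A step with a journaled result is never executed again.
        if name in self.journal:
            return self.journal[name]
        result = fn()
        self.journal[name] = result
        self._persist()
        return result

    def _persist(self):
        # Write to a temp file and rename, so a crash mid-write
        # cannot leave a half-written journal behind.
        tmp = self.journal_path + ".tmp"
        with open(tmp, "w") as f:
            json.dump(self.journal, f)
        os.replace(tmp, self.journal_path)


calls = {"plan": 0, "act": 0}


def expensive_plan():
    # Stand-in for a costly LLM reasoning call.
    calls["plan"] += 1
    return "book flight, then hotel"


def flaky_act(fail):
    # Stand-in for a tool call that may crash the process.
    calls["act"] += 1
    if fail:
        raise RuntimeError("simulated crash")
    return "done"


def run(path, fail):
    wf = DurableWorkflow(path)
    plan = wf.step("plan", expensive_plan)
    outcome = wf.step("act", lambda: flaky_act(fail))
    return plan, outcome


journal_path = os.path.join(tempfile.mkdtemp(), "agent-42.json")

try:
    run(journal_path, fail=True)   # first attempt crashes in "act"
except RuntimeError:
    pass

result = run(journal_path, fail=False)  # restart: "plan" is replayed from the journal
```

After the simulated crash and restart, `expensive_plan` has run only once while `flaky_act` has run twice. The sketch assumes step results are JSON-serializable and that steps are safe to retry if a crash occurs before their result is journaled.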
Andrzej Ludwikowski
Software architect with over 15 years of experience in commercial software development. Conference speaker and blogger. Devotee of DDD, Event Sourcing, and Polyglot Persistence. Validator of system performance bottlenecks. Continuously chasing the dream of a perfect software architecture, which does not exist, but the search is the goal itself. Currently a principal developer at Akka (formerly Lightbend).