Architecture · Quickie · 15 min
Behind the scenes of a Data Platform: Lessons from 80 billion rows
This session explores the Post Data Platform (PDP), a production-ready ecosystem unifying data governance and ingestion across enterprises. Drawing from real deployments at Eurobank and Austria Post, it details PDP’s JSON-driven ingestion, Databricks Unity Catalog integration, and organizational practices needed to scale data infrastructure and enforce enterprise-wide standards.
Giorgos Nikolopoulos, Agile Actors
When and Where
Thursday, April 23, 14:45-15:00
MC 2
Scaling a data platform to 80 billion rows isn't just a technical challenge; it's an organizational one. This presentation pulls back the curtain on such a platform, moving past theoretical blueprints to show what is running in production today.
We will first dismantle the technical "Chasm" of scaling by diving into our JSON-driven ingestion framework, which utilizes source-agnostic connectors to standardize data flow across the enterprise. From there, we shift to the governance layer, detailing how we utilize Databricks Unity Catalog to centralize security and maintain a single point of truth. Finally, we address the human side of architecture: the friction of enforcing "hard" rules—like push-only ingestion—and the critical lessons learned from the pitfalls of a multi-year platform rollout.
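The talk does not publish the schema of its JSON-driven framework, so as a purely illustrative sketch, here is what a metadata-driven, source-agnostic ingestion definition might look like: a JSON document describes each source, and the framework dispatches to a shared connector interface instead of hard-coding a pipeline per source. All field and connector names here are assumptions, not the platform's actual schema.

```python
import json

# Hypothetical metadata document; field names are illustrative only.
SOURCE_CONFIG = json.loads("""
{
  "source_name": "crm_orders",
  "connector": "jdbc",
  "mode": "push",
  "target_table": "bronze.crm.orders"
}
""")

# A source-agnostic framework dispatches on the connector type; every
# connector implements the same interface, so adding a source means
# adding a JSON document rather than writing a new pipeline.
CONNECTORS = {
    "jdbc": lambda cfg: f"ingest {cfg['source_name']} via JDBC into {cfg['target_table']}",
    "file": lambda cfg: f"ingest {cfg['source_name']} from files into {cfg['target_table']}",
}

def run_ingestion(cfg: dict) -> str:
    # Fail fast on connector types the framework does not know about.
    try:
        connector = CONNECTORS[cfg["connector"]]
    except KeyError:
        raise ValueError(f"unknown connector: {cfg['connector']!r}")
    return connector(cfg)

print(run_ingestion(SOURCE_CONFIG))
```

The design point this sketch illustrates is that the "reusable, source-agnostic" property comes from separating the *what* (the JSON metadata) from the *how* (the connector implementations).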
Key Takeaways
- Scalable Engineering: How to build a JSON-driven framework that allows for reusable, source-agnostic ingestion at the scale of billions of rows.
- Centralized Control: Practical architectural details for hardening security and governance using Databricks Unity Catalog.
- Reality Check & Navigating Friction: Strategies for managing organizational "fights" over platform rules and avoiding common pitfalls in team structure.
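On the "Centralized Control" takeaway: Unity Catalog governs access through SQL GRANT statements on a three-level namespace (catalog.schema.table). The talk's actual setup is not published; as a minimal sketch, one common pattern is to keep permissions in a single central policy definition and render the GRANT statements from it, so governance has one point of truth. Group and object names below are made up for illustration.

```python
# Hypothetical central policy: who gets which privileges on which objects.
# Unity Catalog privilege names (SELECT, MODIFY) and the GRANT ... ON SCHEMA
# syntax are real; the groups and schemas are illustrative.
POLICY = {
    "analysts": {"privileges": ["SELECT"], "objects": ["SCHEMA prod.gold"]},
    "engineers": {"privileges": ["SELECT", "MODIFY"], "objects": ["SCHEMA prod.silver"]},
}

def render_grants(policy: dict) -> list:
    """Render the policy as Unity Catalog GRANT statements."""
    statements = []
    for principal, rule in policy.items():
        privs = ", ".join(rule["privileges"])
        for obj in rule["objects"]:
            statements.append(f"GRANT {privs} ON {obj} TO `{principal}`;")
    return statements

for stmt in render_grants(POLICY):
    print(stmt)
```

Because every grant flows from one policy document, reviewing or auditing access means reading one file rather than crawling per-workspace ACLs.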
Target Audience
This session is intended for Data Engineers, Solution Architects, and Technical Leads who are navigating the transition from fragmented data silos to a centralized, enterprise-grade platform.
Giorgos Nikolopoulos
Giorgos Nikolopoulos is a System Architect specializing in building data infrastructure that doesn’t just scale, but thrives under the pressure of enterprise-grade workloads. Currently leading the architectural strategy for Österreichische Post AG, Giorgos spearheaded the design of a unified enterprise data platform and a metadata-driven ingestion framework capable of processing over 80 billion rows of data. By replacing fragmented legacy systems with standardized, scalable patterns, he has laid the foundation for an AI-ready data infrastructure across the organization.
With an MSc in Software Systems Engineering from UCL (Distinction), he specializes in the high-stakes execution of massive technical shifts, including a complex Synapse to Databricks migration. His approach balances low-level Spark optimization with high-level architectural governance—such as hardening security through private networking and implementing Unity Catalog.
Beyond the whiteboard, Giorgos is an Instructor at Learning Actors, where he has mentored over 50 professional engineers on the intricacies of streaming, Airflow orchestration, and distributed systems. At Devoxx, he peels back the curtain on the "80 billion row" milestone, moving past the marketing hype to share the real-world architectural decisions and trade-offs required to keep a massive data ecosystem performant and secure.