Data & Analytics
2H Hands-on Lab (120 min)
INTERMEDIATE

Engineering multimodal AI video pipelines at scale: from zero to hero

This session demonstrates how to design scalable multimodal pipelines for large‑scale video analysis. It covers synchronizing audio and video, mitigating noise and hallucinations, and managing cost, latency, and compliance. Attendees learn to transform raw streams into queryable, auditable outputs using ASR, visual embeddings, and higher‑level recognition tasks.

Diana Ortega, Open Innovation AI
Kaisar Barlybay, Open Innovation AI

Friday, April 24, 10:30-12:30
TBA 15
Today, a growing number of applications rely on video as a primary data source. Analyzing video at scale requires more than running individual models; it demands well-designed multimodal pipelines that combine vision, audio, and text while remaining accurate, cost-efficient, and compliant.

In this session, we build a high-throughput pipeline for video streams. Participants will see how raw feeds are transformed into aligned, queryable components by orchestrating ASR and visual embeddings, and by producing higher-level outputs such as speaker identity, facial recognition, and summaries under noisy conditions.
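As a rough illustration of what "aligned, queryable components" can mean, the sketch below joins transcript segments to sampled video frames by timestamp. The `Segment` type, the `frames_for_segment` helper, and the half-open interval convention are assumptions made for this example, not the speakers' implementation.

```python
from bisect import bisect_left
from dataclasses import dataclass


@dataclass(frozen=True)
class Segment:
    """Hypothetical ASR output: a span of speech on the shared timeline."""
    start: float  # seconds
    end: float    # seconds
    text: str


def frames_for_segment(frame_times: list[float], seg: Segment) -> list[int]:
    """Return indices of frames with seg.start <= t < seg.end.

    Assumes frame_times is sorted ascending and uses the same clock
    as the segment timestamps.
    """
    lo = bisect_left(frame_times, seg.start)
    hi = bisect_left(frame_times, seg.end)
    return list(range(lo, hi))


# Frames sampled every 0.5 s; one transcript segment from 0.5 s to 1.5 s.
frame_times = [0.0, 0.5, 1.0, 1.5, 2.0]
segment = Segment(start=0.5, end=1.5, text="hello")
matched = frames_for_segment(frame_times, segment)
```

Each matched index could then key into a store of visual embeddings, so a text query over the transcript also retrieves the frames that were on screen at the time.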

We focus on key engineering challenges: audio-video synchronization and clock drift, stream fragmentation and context handling, hallucination mitigation, and scaling the system while keeping latency and cost under control and preserving resilience.
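To make the clock-drift challenge concrete, here is a minimal sketch that models drift as a linear map between the audio and video clocks and fits it by least squares. The linear model, the `DriftModel` and `fit_drift` names, and the availability of paired sync points are all assumptions for illustration; the session may take a different approach.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DriftModel:
    """Assumed linear map from audio-clock seconds to video-clock seconds."""
    scale: float   # relative clock rate (1.0 means no drift)
    offset: float  # constant offset in seconds

    def to_video_time(self, audio_t: float) -> float:
        return self.scale * audio_t + self.offset


def fit_drift(sync_points: list[tuple[float, float]]) -> DriftModel:
    """Least-squares fit of video_t = scale * audio_t + offset.

    sync_points holds (audio_t, video_t) pairs for events observed on
    both clocks; how such pairs are obtained is outside this sketch.
    """
    n = len(sync_points)
    if n < 2:
        raise ValueError("need at least two sync points")
    mean_a = sum(a for a, _ in sync_points) / n
    mean_v = sum(v for _, v in sync_points) / n
    var_a = sum((a - mean_a) ** 2 for a, _ in sync_points)
    if var_a == 0:
        raise ValueError("sync points must span more than one audio time")
    cov = sum((a - mean_a) * (v - mean_v) for a, v in sync_points)
    scale = cov / var_a
    return DriftModel(scale=scale, offset=mean_v - scale * mean_a)


# Audio clock runs 0.1% slow relative to video and starts 0.5 s behind.
model = fit_drift([(0.0, 0.5), (100.0, 100.6), (200.0, 200.7)])
corrected = model.to_video_time(300.0)
```

In a long-running stream the fit would typically be refreshed over a sliding window of recent sync points, since drift is rarely constant for hours at a time.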

Key takeaways include designing scalable multimodal pipelines, solving alignment and signal-quality issues at scale, and applying practical patterns for building compliant, auditable, and resilient video analysis systems.

Familiarity with data pipelines and basic ML concepts is helpful but not required.
video
pipeline
scalability
multimodal

Diana Ortega

Open Innovation AI

United Arab Emirates

Lead Data Engineer at Open Innovation AI, Diana has over 15 years of experience designing and implementing large-scale platforms. Her expertise includes high-throughput data architectures, distributed systems, relational and NoSQL data modeling, and cloud-native solutions. She currently focuses on building AI-enabled data platforms, including RAG pipelines and agentic systems, while mentoring teams on architecture, scalability, and software craftsmanship.

Kaisar Barlybay

Open Innovation AI

United Arab Emirates

Senior Data/Platform Engineer at Open Innovation AI. Working on enterprise data infrastructure — ETL/ELT pipelines, real-time processing, multimodal AI pipelines, observability. Part of a team building the data layer from source acquisition through transformation to serving.
Previously: data warehousing for industrial operations, research infrastructure in aerospace, NLP analytics platforms. Master's in Computer Science, Bachelor's in Mathematics. Stack: Python, Airflow, Kafka, ClickHouse, Kubernetes.
