Data
Conference · 50 min
INTERMEDIATE

Billion Vector Baby!

This session explores the real-world engineering challenges of scaling semantic search with Elasticsearch and OpenSearch to billions of vectors, focusing on cluster architecture, the balance between precision and latency, and cost reduction through quantization and chunking. It offers practical strategies for moving from small-scale demos to massive production deployments.

Pietro Mele (Adelean)
Benjamin Dauvissat (Adelean)
Description
Semantic search promises a revolution: contextual relevance and natural language understanding with just a few lines of code. In a notebook or a POC, it’s magical. But what happens when your index exceeds a billion vectors?

The magic quickly gives way to the brutality of engineering: exploding latency, uncontrolled infrastructure costs, and RAM challenges.
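To make the RAM challenge concrete, here is a back-of-envelope sketch (illustrative assumptions, not figures from the talk: 768-dimensional embeddings, raw vector storage only, ignoring HNSW graph overhead):

```python
# Back-of-envelope memory estimate for holding 1B embeddings in RAM.
# Assumptions (illustrative): 768-dim vectors; HNSW graph structures
# and replicas add further overhead on top of these numbers.
NUM_VECTORS = 1_000_000_000
DIMS = 768

def footprint_gib(bytes_per_component: float) -> float:
    """Raw vector storage in GiB for the given per-component precision."""
    return NUM_VECTORS * DIMS * bytes_per_component / 2**30

print(f"float32: {footprint_gib(4):,.0f} GiB")    # full precision, ~2,861 GiB
print(f"int8:    {footprint_gib(1):,.0f} GiB")    # scalar quantization, ~715 GiB
print(f"binary:  {footprint_gib(1/8):,.0f} GiB")  # 1 bit/component, ~89 GiB
```

The 4x (int8) and 32x (binary) reductions are why quantization is usually the first lever pulled at this scale, at the cost of some recall.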

In this talk, we leave marketing buzz at the door and dive into the guts of Elasticsearch and OpenSearch at very large scale. We will cover how to:

  • Architect your clusters to handle a billion embeddings without failing.
  • Optimize the critical trade-off between precision (recall) and performance (latency).
  • Reduce costs using quantization strategies and intelligent chunking.
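As one concrete illustration of the knobs involved, the following sketch shows an Elasticsearch `dense_vector` mapping with int8 HNSW quantization and a kNN search body. Index and field names, dimensions, and parameter values are illustrative assumptions; `int8_hnsw` requires Elasticsearch 8.12 or later, and OpenSearch exposes comparable settings through its k-NN plugin.

```python
# Sketch (illustrative, not the speakers' configuration) of an
# Elasticsearch index mapping and query exercising the levers above.
mapping = {
    "mappings": {
        "properties": {
            "embedding": {
                "type": "dense_vector",
                "dims": 768,              # assumed model dimension
                "index": True,
                "similarity": "cosine",
                # int8_hnsw stores scalar-quantized vectors in the HNSW
                # index, cutting the hot RAM footprint roughly 4x vs float32.
                "index_options": {
                    "type": "int8_hnsw",
                    "m": 16,                 # graph connectivity
                    "ef_construction": 100,  # build-time accuracy/cost
                },
            }
        }
    }
}

search_body = {
    "knn": {
        "field": "embedding",
        "query_vector": [0.12, -0.45],  # truncated for readability
        "k": 10,
        # The main recall-vs-latency knob at query time: scanning more
        # candidates per shard raises recall and raises latency.
        "num_candidates": 100,
    }
}
```

Tuning `num_candidates` (and, at build time, `m` / `ef_construction`) is the practical form the precision-versus-performance trade-off takes.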

If you need to move from a “Hello World” semantic search to massive production, this session is your survival guide.
performance
scalability
search
embeddings
Speakers

Pietro Mele

Adelean

France

Italian, recently adopted by France, I am a constant learner dedicated to computer science and discovery, whether uncovering solutions or gaining insights.
Benjamin Dauvissat

Adelean

France

Curious and passionate.

Java developer and Elasticsearch consultant.

I try to pass on what I’ve learned before it becomes obsolete.