DataConference, 50 min
Billion Vector Baby!
This session explores the real-world engineering challenges of scaling semantic search with Elasticsearch and OpenSearch to billions of vectors, focusing on cluster architecture, balancing precision and latency, and cost reduction via quantization and chunking, offering practical strategies for moving from small-scale demos to massive production deployments.
Pietro Mele, Adelean
Benjamin Dauvissat, Adelean
Semantic search promises a revolution: contextual relevance and natural language understanding with just a few lines of code. On a notebook or a POC, it’s magical. But what happens when your index exceeds a billion vectors?
The magic quickly gives way to the brutality of engineering: exploding latency, uncontrolled infrastructure costs, and RAM challenges.
In this talk, we leave marketing buzz at the door and dive into the guts of Elasticsearch and OpenSearch at very large scale. We will cover how to:
- Architect your clusters to handle a billion embeddings without failing.
- Optimize the critical trade-off between precision (recall) and performance (latency).
- Reduce costs using quantization strategies and intelligent chunking.
If you need to move from a “Hello World” semantic search to massive production, this session is your survival guide.
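As a concrete illustration of two of the levers listed above, the sketch below shows the shape of Elasticsearch request bodies for scalar quantization and for the recall/latency trade-off. The index and field names are hypothetical, and `int8_hnsw` assumes a recent Elasticsearch release; treat this as an orientation sketch, not the speakers' exact configuration.

```python
# Hypothetical Elasticsearch request bodies illustrating the talk's themes.

# 1) Quantization: index float vectors but store and search them as int8,
#    cutting vector RAM roughly 4x. "int8_hnsw" is the scalar-quantized
#    HNSW index type available in recent Elasticsearch versions.
mapping = {
    "mappings": {
        "properties": {
            "embedding": {
                "type": "dense_vector",
                "dims": 768,
                "index": True,
                "similarity": "cosine",
                "index_options": {
                    "type": "int8_hnsw",
                    "m": 16,                # HNSW graph fan-out
                    "ef_construction": 100, # build-time beam width
                },
            }
        }
    }
}

# 2) Recall vs. latency: num_candidates sets the per-shard candidate pool
#    for approximate kNN. Raising it improves recall but increases latency.
knn_query = {
    "knn": {
        "field": "embedding",
        "query_vector": [0.1] * 768,  # placeholder embedding
        "k": 10,
        "num_candidates": 200,
    }
}
```

At a billion vectors, the `num_candidates`-to-`k` ratio and the quantization scheme become the main dials for trading index size and query latency against recall.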