A vector database stores text (or images, audio, or other data) as lists of numbers called embeddings. These numbers capture meaning, not just characters. Two sentences that say the same thing in different words end up with similar numbers. Two sentences that share many of the same words but mean different things end up far apart.
That's the shift from keyword search to semantic search: matching meaning instead of matching exact words.
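A toy sketch of what "similar numbers" means. The vectors below are made up (real embedding models produce hundreds or thousands of dimensions), and similarity is measured with cosine similarity, a metric vector databases commonly use:

```python
import math

def cosine_similarity(a, b):
    # Dot product divided by the product of vector lengths:
    # close to 1.0 means "same direction" (same meaning),
    # close to 0 means unrelated.
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Made-up 4-dimensional "embeddings" for three pieces of feedback.
refund_request  = [0.9, 0.1, 0.0, 0.2]
money_back_ask  = [0.8, 0.2, 0.1, 0.3]  # same idea, different words
shipping_praise = [0.1, 0.9, 0.7, 0.0]  # different idea

print(cosine_similarity(refund_request, money_back_ask))   # ≈ 0.98, near-duplicate meaning
print(cosine_similarity(refund_request, shipping_praise))  # ≈ 0.17, unrelated
```

The embedding model does the hard part: turning text into vectors where this geometry actually tracks meaning. Once it has, comparing meanings reduces to comparing numbers.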
Why it matters for product features
Traditional databases can't answer questions like "find all feedback that's basically the same idea." A keyword search misses synonyms. A fuzzy match catches typos but not concepts.
A vector database handles this well. You convert text to embeddings, store them, and when a new piece of text comes in, you ask: what's already in here that looks similar? This is what powers AI features like duplicate detection, semantic search, and recommendation systems.
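The store-then-query loop can be sketched in a few lines. `TinyVectorStore` is a hypothetical stand-in for a real vector database (which adds indexing so queries stay fast at millions of items), and the embeddings are made up rather than produced by a model:

```python
import math

def cosine_similarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) *
                  math.sqrt(sum(x * x for x in b)))

class TinyVectorStore:
    """Minimal in-memory sketch of the store-then-query loop."""

    def __init__(self):
        self._items = []  # list of (text, embedding) pairs

    def add(self, text, embedding):
        self._items.append((text, embedding))

    def query(self, embedding, top_k=2):
        # Score every stored item against the incoming embedding,
        # then return the best matches, most similar first.
        scored = [(cosine_similarity(embedding, vec), text)
                  for text, vec in self._items]
        scored.sort(reverse=True)
        return scored[:top_k]

# Made-up 3-dimensional embeddings; a real model supplies these.
store = TinyVectorStore()
store.add("Checkout button does nothing", [0.9, 0.2, 0.1])
store.add("Can't complete my purchase",   [0.7, 0.4, 0.2])
store.add("Please add a dark theme",      [0.1, 0.1, 0.9])

# New feedback arrives: embed it, then ask what looks similar.
for score, text in store.query([0.85, 0.25, 0.15], top_k=2):
    print(f"{score:.2f}  {text}")
```

Duplicate detection, semantic search, and recommendations are all variations on this one query: "given this embedding, what's nearby?"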
The practical reality
You don't usually build a vector database from scratch. Tools like Pinecone, Weaviate, or pgvector (a Postgres extension) handle the storage layer. The real engineering work is in how you generate and refresh embeddings, and how you tune the similarity threshold so "similar" means what you actually want it to mean.
The threshold question is PM-relevant. Too tight and you miss obvious duplicates. Too loose and you merge things that shouldn't be merged. That's a product judgment call, not just a technical one.
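The tradeoff is easy to see in code. Everything below is illustrative: the embeddings are hand-made, and the two thresholds (0.99 and 0.8) are toy values, not recommendations — real thresholds come from testing against feedback your team has already labeled:

```python
import math

def cosine_similarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) *
                  math.sqrt(sum(x * x for x in b)))

# Made-up embeddings for four pieces of feedback.
feedback = {
    "App crashes on login":        [0.90, 0.10, 0.10],
    "Login screen crashes":        [0.85, 0.15, 0.10],  # duplicate idea
    "Crashes when exporting data": [0.60, 0.50, 0.20],  # related, not the same
    "Love the new dark mode":      [0.05, 0.10, 0.95],  # unrelated
}

def find_duplicates(target, threshold):
    t = feedback[target]
    return [text for text, vec in feedback.items()
            if text != target and cosine_similarity(t, vec) >= threshold]

# Tight threshold: catches only the near-identical report.
print(find_duplicates("App crashes on login", threshold=0.99))
# ['Login screen crashes']

# Loose threshold: also merges the related-but-distinct export crash.
print(find_duplicates("App crashes on login", threshold=0.8))
# ['Login screen crashes', 'Crashes when exporting data']
```

Whether those two crash reports should be merged is exactly the product judgment call: the code only moves the line, someone has to decide where the line belongs.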