Vector Database
Plain Explanation
A vector database stores embeddings: numeric vectors that represent the meaning of text, images, code, or other objects. When a collection holds millions of vectors, comparing a query against every one of them is usually too slow for interactive use. A vector database builds an index over the vectors, typically for approximate nearest neighbor (ANN) search, applies metadata filters, and returns the nearest items with predictable latency.
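To make the cost of comparing a query against every vector concrete, here is a minimal sketch of exact (brute-force) search in plain Python. It is the baseline that an index tries to approximate, not how any particular vector database works internally. The function names, document ids, and 3-dimensional toy vectors are all hypothetical, and cosine similarity is only one of several common distance measures.

```python
import heapq
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # convention chosen for this sketch when a vector is all zeros
    return dot / (norm_a * norm_b)


def exact_top_k(query, vectors, k):
    """Score every stored vector against the query: O(n * d) work per query."""
    scored = ((cosine_similarity(query, vec), doc_id) for doc_id, vec in vectors.items())
    return [doc_id for _, doc_id in heapq.nlargest(k, scored)]


# Hypothetical toy embeddings; real ones typically have hundreds of dimensions.
toy_vectors = {
    "refund-policy": [0.9, 0.1, 0.0],
    "shipping-times": [0.1, 0.9, 0.1],
    "return-window": [0.8, 0.2, 0.1],
}
nearest = exact_top_k([1.0, 0.0, 0.0], toy_vectors, k=2)
```

Every query touches every stored vector, so the work grows linearly with the collection size. An index trades a small amount of accuracy for skipping most of those comparisons.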
Examples & Analogies
Think of it as a library index organized by meaning, not exact wording. A support bot can find refund-policy documents even if the user asks in different words. An ecommerce app can find visually similar products from an uploaded photo. A developer tool can search code by intent rather than exact function names.
At a Glance
- Managed vector DB: quick launch and lower ops burden; watch cost, data residency, and migration risk.
- Open-source distributed vector DB: useful for scale and control; requires cluster and index operations.
- pgvector / relational extension: good for prototypes and relational workflows; validate tail latency as data grows.
- Redis/vector cache: useful for a hot low-latency subset; memory cost and persistence model matter.
Where and Why It Matters
The quality of retrieval-augmented generation (RAG) depends heavily on retrieval quality. A vector database helps find relevant documents, narrow results with metadata such as tenant, product, or date, and treat freshness and latency as explicit metrics rather than hidden side effects.
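The idea of narrowing results with metadata can be sketched as a filter followed by a ranking step. This is an illustration only: the `Record` fields, the `filtered_search` helper, and the sample data are hypothetical, and real systems differ in whether they filter before, during, or after the index lookup. A plain dot product stands in for the similarity score.

```python
from dataclasses import dataclass


@dataclass
class Record:
    doc_id: str
    vector: list
    tenant: str
    year: int


def dot(a, b):
    """Dot product, used here as a simple similarity score."""
    return sum(x * y for x, y in zip(a, b))


def filtered_search(query, records, k, tenant, min_year):
    """Keep only records matching the metadata, then rank the survivors."""
    candidates = [r for r in records if r.tenant == tenant and r.year >= min_year]
    candidates.sort(key=lambda r: dot(query, r.vector), reverse=True)
    return [r.doc_id for r in candidates[:k]]


# Hypothetical records: two tenants, documents from different years.
records = [
    Record("acme-refunds-2024", [0.9, 0.1], "acme", 2024),
    Record("acme-refunds-2019", [1.0, 0.0], "acme", 2019),
    Record("globex-refunds", [0.95, 0.05], "globex", 2024),
    Record("acme-shipping", [0.1, 0.9], "acme", 2024),
]
hits = filtered_search([1.0, 0.0], records, k=2, tenant="acme", min_year=2023)
```

In this toy data the vector closest to the query belongs to an outdated document and the second closest to another tenant. Both are excluded by metadata before similarity is considered, which is the behavior a multi-tenant application usually needs.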
Common Misconceptions
- ❌ Myth: A vector DB replaces relational databases.
- ✅ Reality: It complements them for similarity search; transactions and joins may still belong elsewhere.
- ❌ Myth: More vectors automatically improve results.
- ✅ Reality: Chunking, embedding quality, metadata, and evaluation matter as much as storage.
- ❌ Myth: The fastest index is always best.
- ✅ Reality: Track recall@k, p95 latency, freshness, and cost together.
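Two of the metrics named above, recall@k and p95 latency, can be computed with a few lines of Python. The sketch below assumes a labeled set of relevant document ids per query and a list of measured latencies in milliseconds; the function names and sample values are hypothetical, and the percentile uses the nearest-rank method, which is one of several accepted definitions.

```python
def recall_at_k(retrieved, relevant, k):
    """Fraction of the relevant ids that appear in the top-k retrieved ids."""
    if not relevant:
        return 0.0
    top_k = set(retrieved[:k])
    return len(top_k & set(relevant)) / len(relevant)


def p95(latencies_ms):
    """95th percentile by the nearest-rank method."""
    if not latencies_ms:
        raise ValueError("need at least one latency sample")
    ordered = sorted(latencies_ms)
    rank = (95 * len(ordered) + 99) // 100  # ceil(0.95 * n) in integer math, 1-based
    return ordered[rank - 1]


# Hypothetical evaluation data for a single query and a batch of timings.
recall = recall_at_k(["d1", "d7", "d3", "d9"], relevant=["d1", "d3", "d5"], k=3)
latency = p95([12, 15, 11, 14, 13, 90, 12, 16, 13, 14,
               85, 15, 13, 11, 14, 12, 13, 15, 14, 12])
```

Here two of the three relevant documents appear in the top three results, and the two slow requests pull the p95 far above the typical latency of the batch. Reporting both numbers for the same index configuration makes the trade-off between accuracy and speed visible.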
How It Sounds in Conversation
- "If retrieval quality dropped, check chunking, embedding version, and metadata filters first."
- "pgvector is fine for the prototype; if p95 rises after backfill, evaluate a dedicated vector DB."
- "Do not optimize latency alone. Measure recall@k with the same query set."
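Checking chunking and embedding version is easier when each chunk carries that information as metadata. The following is a minimal sketch of a fixed-size character chunker with overlap. The `chunk_text` function, its parameters, and the `"v2"` version label are hypothetical, and production pipelines often split on tokens or sentence boundaries instead of raw characters.

```python
def chunk_text(text, chunk_size, overlap, embedding_version):
    """Split text into overlapping character windows tagged with metadata."""
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be non-negative and smaller than chunk_size")
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append({
            "text": text[start:start + chunk_size],
            "start": start,
            # Recording the version lets you spot vectors from an older model.
            "embedding_version": embedding_version,
        })
        if start + chunk_size >= len(text):
            break  # the last window already reaches the end of the text
    return chunks


chunks = chunk_text("abcdefghij", chunk_size=4, overlap=2, embedding_version="v2")
```

If query embeddings and stored embeddings come from different model versions, similarity scores are generally not comparable, so a stored version tag gives a quick way to rule that cause in or out.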
References
- What is a Vector Database & How Does it Work?
Explains the common indexing/query pipeline and the accuracy vs speed trade-offs for ANN search.
- Milvus — Data Processing
Detailed write path, vchannel→pchannel routing, growing vs sealed segments and index-building mechanics.
- How Pinecone Works: Architecture and Engineering Deep Dive
Technical reference for memtable, LSM-like slab system, compaction, per-slab algorithm choices and write/read paths.
- Milvus Architecture Overview
Milvus cluster design: coordinator, streaming/query/data nodes, WAL, segments and object-storage persistence.
- Best Vector Databases in 2026: A Complete Comparison Guide
Practical comparison and real-world notes on scale, HNSW performance and when teams choose Milvus, Pinecone or embedded options.