Vol.01 · No.10 CS · AI · Infra May 13, 2026

AI Glossary

Data Engineering LLM & Generative AI

Vector Database

Plain Explanation

A vector database stores embeddings: numeric vectors that represent the meaning of text, images, code, or other objects. When a collection holds millions of vectors, comparing a query against every one of them (exact, brute-force search) becomes too slow for interactive use. A vector database instead builds an index over those vectors, typically for approximate nearest-neighbor search, applies metadata filters, and returns the nearest items with predictable latency.
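To make the cost of the unindexed approach concrete, here is a minimal sketch of exact (brute-force) search over a toy collection. The document IDs, the 3-dimensional vectors, and the function names are made up for illustration; real embeddings usually have hundreds or thousands of dimensions, and a vector database would replace the full scan with an index.

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # treat a zero vector as matching nothing
    return dot / (norm_a * norm_b)

def brute_force_search(query, vectors, k=2):
    """Score every stored vector against the query: O(n) work per query."""
    scored = [(cosine_similarity(query, vec), item_id)
              for item_id, vec in vectors.items()]
    scored.sort(reverse=True)  # highest similarity first
    return [item_id for _, item_id in scored[:k]]

# Hypothetical toy embeddings (assumption: 3 dimensions for readability).
toy_vectors = {
    "refund-policy": [0.9, 0.1, 0.0],
    "shipping-times": [0.1, 0.9, 0.0],
    "return-window": [0.8, 0.2, 0.1],
}

top_ids = brute_force_search([1.0, 0.0, 0.0], toy_vectors, k=2)
print(top_ids)
```

Every query touches every stored vector, so the work grows linearly with collection size. That linear scan is the part an index is meant to avoid, usually at the price of slightly approximate results.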

Examples & Analogies

Think of it as a library index organized by meaning, not exact wording. A support bot can find refund-policy documents even if the user asks in different words. An ecommerce app can find visually similar products from an uploaded photo. A developer tool can search code by intent rather than exact function names.

At a Glance

  • Managed vector DB: quick launch and lower ops burden; watch cost, data residency, and migration risk.
  • Open-source distributed vector DB: useful for scale and control; requires cluster and index operations.
  • pgvector / relational extension: good for prototypes and relational workflows; validate tail latency as data grows.
  • Redis/vector cache: useful for a hot low-latency subset; memory cost and persistence model matter.

Where and Why It Matters

Retrieval-augmented generation (RAG) quality depends heavily on retrieval quality. A vector database helps find relevant documents, narrow results with metadata such as tenant, product, or date, and treat freshness and latency as explicit, monitored metrics instead of hidden side effects.
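Narrowing by metadata can be sketched as a filter applied before ranking. This is an illustrative, in-memory version only: the `Record` type, the `tenant` field, and the sample data are assumptions, and real vector databases differ in whether they filter before, during, or after the index lookup.

```python
from dataclasses import dataclass

@dataclass
class Record:
    doc_id: str
    tenant: str    # hypothetical metadata field
    vector: list

def dot(a, b):
    """Dot product, used here as a simple similarity score."""
    return sum(x * y for x, y in zip(a, b))

def filtered_search(query, records, tenant, k=1):
    """Keep only the tenant's records, then rank them by similarity."""
    candidates = [r for r in records if r.tenant == tenant]
    candidates.sort(key=lambda r: dot(query, r.vector), reverse=True)
    return [r.doc_id for r in candidates[:k]]

# Hypothetical sample data: two tenants with similar documents.
records = [
    Record("a-refunds", "acme", [0.9, 0.1]),
    Record("b-refunds", "globex", [1.0, 0.0]),
    Record("a-shipping", "acme", [0.1, 0.9]),
]

result = filtered_search([1.0, 0.0], records, tenant="acme", k=1)
print(result)
```

In this toy data, `b-refunds` is the closest vector overall, but it belongs to another tenant and never reaches the ranking step. This is why filter correctness matters as much as similarity quality.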

Common Misconceptions

  • ❌ Myth: A vector DB replaces relational databases.
  • ✅ Reality: It complements them for similarity search; transactions and joins may still belong elsewhere.
  • ❌ Myth: More vectors automatically improve results.
  • ✅ Reality: Chunking, embedding quality, metadata, and evaluation matter as much as storage.
  • ❌ Myth: The fastest index is always best.
  • ✅ Reality: A faster index often trades away recall, so track recall@k, p95 latency, freshness, and cost together.
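Two of the metrics named above can be computed with a few lines of code. The sketch below assumes you have a labeled set of relevant document IDs per query and a list of measured latencies in milliseconds. The function names are illustrative, and the percentile uses the nearest-rank definition, which is one of several common conventions.

```python
def recall_at_k(retrieved, relevant, k):
    """Fraction of the relevant items that appear in the top-k results."""
    if not relevant:
        return 0.0
    hits = len(set(retrieved[:k]) & set(relevant))
    return hits / len(relevant)

def p95(latencies_ms):
    """95th percentile by the nearest-rank method."""
    if not latencies_ms:
        raise ValueError("need at least one latency sample")
    ordered = sorted(latencies_ms)
    # ceil(0.95 * n) in integer arithmetic to avoid float rounding
    rank = (95 * len(ordered) + 99) // 100
    return ordered[rank - 1]

# Hypothetical example: 2 of the 3 relevant documents were retrieved.
example_recall = recall_at_k(["d1", "d7", "d3"], ["d1", "d3", "d9"], k=3)
example_p95 = p95(list(range(1, 101)))  # latencies 1..100 ms
print(example_recall, example_p95)
```

Running both functions over the same fixed query set before and after an index or configuration change keeps the comparison fair: a latency win that comes with a recall drop shows up immediately.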

How It Sounds in Conversation

  • "If retrieval quality dropped, check chunking, embedding version, and metadata filters first."
  • "pgvector is fine for the prototype; if p95 rises after backfill, evaluate a dedicated vector DB."
  • "Do not optimize latency alone. Measure recall@k with the same query set."
