Vector Database

Vector databases solve a specific problem: given a query vector, find the most similar vectors among millions or billions of stored embeddings. Traditional databases rely on exact matching or B-tree indexes, which do not help with similarity search in high-dimensional space. Vector databases instead use approximate nearest neighbor (ANN) algorithms such as HNSW or IVF, which trade a small amount of accuracy (a few true nearest neighbors may be missed) for massive speed gains.
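To make the accuracy-for-speed trade concrete, here is a minimal sketch in plain Python that compares an exact scan with a toy IVF-style index. It is an illustration under stated assumptions, not how any particular product implements IVF: the function names (`exact_search`, `build_ivf`, `ivf_search`) are hypothetical, the data is random, and the centroids are simply sampled from the data, whereas real IVF indexes typically train them with k-means.

```python
import math
import random


def cosine(a, b):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)


def exact_search(query, vectors, k):
    """Brute force: score every stored vector, return indices of the top k."""
    order = sorted(range(len(vectors)),
                   key=lambda i: cosine(query, vectors[i]), reverse=True)
    return order[:k]


def build_ivf(vectors, n_lists, seed=0):
    """Toy IVF build: pick centroids, assign each vector to its closest one.

    Assumption: centroids are random samples of the data (no k-means).
    """
    rng = random.Random(seed)
    centroids = rng.sample(vectors, n_lists)
    lists = [[] for _ in range(n_lists)]
    for i, v in enumerate(vectors):
        best = max(range(n_lists), key=lambda c: cosine(v, centroids[c]))
        lists[best].append(i)
    return centroids, lists


def ivf_search(query, vectors, centroids, lists, k, n_probe):
    """Scan only the n_probe lists whose centroids are closest to the query."""
    ranked = sorted(range(len(centroids)),
                    key=lambda c: cosine(query, centroids[c]), reverse=True)
    candidates = [i for c in ranked[:n_probe] for i in lists[c]]
    candidates.sort(key=lambda i: cosine(query, vectors[i]), reverse=True)
    return candidates[:k]


# Random demo data: 500 vectors of dimension 8.
rng = random.Random(42)
vectors = [[rng.gauss(0, 1) for _ in range(8)] for _ in range(500)]
query = [rng.gauss(0, 1) for _ in range(8)]

centroids, lists = build_ivf(vectors, n_lists=8)
exact = exact_search(query, vectors, k=5)
approx = ivf_search(query, vectors, centroids, lists, k=5, n_probe=2)

# Recall: the share of true nearest neighbors the approximate search found.
recall = len(set(exact) & set(approx)) / len(exact)
print(f"recall@5 when probing 2 of 8 lists: {recall:.2f}")
```

Probing fewer lists means fewer vectors are scored per query, which is where the speed comes from, at the cost of possibly missing neighbors that landed in an unprobed list. Raising `n_probe` to the total number of lists recovers the exact result.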

The major players include Pinecone (fully managed, strong developer experience), Qdrant (excellent performance for the price), Weaviate (strong hybrid search), and pgvector (a PostgreSQL extension for teams who want to keep vectors in their existing database). Each makes different trade-offs among cost, operational complexity, and performance at scale.

For growth teams, vector databases are the backbone of RAG systems, recommendation engines, and semantic search. They enable features like "find similar items," "search by meaning," and "retrieve relevant context for AI responses." The right choice depends on your scale: pgvector comfortably handles collections under roughly 1M vectors, while Pinecone or Qdrant are better suited to tens of millions.
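As a rough illustration of "retrieve relevant context for AI responses," the sketch below ranks a few stored documents against a question and builds a prompt from the best matches. Everything here is a stand-in: `embed` is a hypothetical bag-of-words substitute for a real embedding model, the in-memory `index` list plays the role of the vector database, and the documents are made up for the example.

```python
import math
from collections import Counter


def embed(text):
    """Hypothetical stand-in for an embedding model: word-count vector."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


# Made-up documents; a real system would store these in a vector database.
docs = [
    "refunds are processed within five business days",
    "our office is closed on public holidays",
    "shipping is free for orders over fifty dollars",
]
index = [(doc, embed(doc)) for doc in docs]


def retrieve(question, k=2):
    """Return the k stored documents most similar to the question."""
    q = embed(question)
    ranked = sorted(index, key=lambda item: cosine(q, item[1]), reverse=True)
    return [doc for doc, _ in ranked[:k]]


def build_prompt(question, k=2):
    """Assemble retrieved context and the question into one prompt string."""
    context = "\n".join(f"- {doc}" for doc in retrieve(question, k))
    return f"Answer using only this context:\n{context}\n\nQuestion: {question}"


print(build_prompt("how long do refunds take", k=1))
```

With a real embedding model, the question and the documents would not need to share exact words to match, which is what "search by meaning" refers to; the word-count version here only matches on overlapping terms.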

Related Terms

Embeddings

Cosine Similarity

RAG (Retrieval-Augmented Generation)

LLM (Large Language Model)

Fine-Tuning

Prompt Engineering

Further Reading

Vector Databases Compared: Pinecone vs Weaviate vs Qdrant vs Milvus

5 Common RAG Pipeline Mistakes (And How to Fix Them)