When Do You Actually Need a Vector Database? An Honest Take

If you have been anywhere near the AI developer community in 2023, you have heard about vector databases. Pinecone, Weaviate, Qdrant, Chroma, Milvus -- new options seem to appear weekly. Every RAG (Retrieval-Augmented Generation) tutorial starts with "first, set up your vector database." It has become assumed infrastructure for any AI application.

I've implemented vector search in two of my products this year. One implementation was the right call. The other was over-engineering that I eventually ripped out. Here is what I learned about when you actually need a vector database and when simpler alternatives will do.

What Vectors Actually Are

Before the practical advice, a brief explanation for anyone who finds the concept fuzzy.

A vector is a list of numbers that represents something. In the context of AI, we use "embeddings" -- vectors that represent text, images, or other data in a high-dimensional space. The key property is that similar things have similar vectors. The embedding for "pilot exam preparation" is close (in vector space) to the embedding for "aviation test study guide" even though they share almost no words.

A vector database is optimized for storing these vectors and finding the nearest neighbors to a query vector. You turn your query into a vector, ask the database "what stored vectors are closest to this one?" and get back the most semantically similar items.

This is fundamentally different from keyword search, which matches exact words. Vector search matches meaning. This is powerful, but it isn't always necessary.
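To make "closest in vector space" concrete, here is a minimal sketch of a nearest-neighbor lookup using cosine similarity. The three-dimensional vectors are made up for illustration (real embeddings have hundreds or thousands of dimensions), and the function names are hypothetical.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def nearest(query_vector, vectors):
    """Return the key whose vector is most similar to the query vector."""
    return max(vectors, key=lambda name: cosine_similarity(query_vector, vectors[name]))

# Toy "embeddings", invented for this example only.
stored = {
    "aviation test study guide": [0.9, 0.1, 0.2],
    "chocolate cake recipe": [0.1, 0.9, 0.3],
}
query = [0.8, 0.2, 0.1]  # stand-in for the embedding of "pilot exam preparation"

print(nearest(query, stored))  # aviation test study guide
```

A vector database does the same job, but with indexes that avoid comparing the query against every stored vector.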

When You Need Vector Search

Use Case 1: Semantic Search Over Unstructured Content

If you have a large corpus of unstructured text and users need to search it by meaning rather than keywords, vector search is the right tool.

In Aviation Infinity, the knowledge base contains thousands of explanations, study notes, and reference materials. Students search this content with queries like "what happens when you fly into a cumulonimbus cloud?" Keyword search would only match documents containing those exact words. Vector search matches documents that discuss the effects of flying through thunderstorms, even if they use completely different terminology.

I implemented this with OpenAI's embedding API to generate vectors and Pinecone as the vector store. The search quality improvement over keyword search was significant. Students find relevant content on the first try more often, which directly improves their study experience.

Use Case 2: RAG for LLM Grounding

If you're using LLMs and need to ground their responses in your own data, vector search is the standard approach for retrieval.

When Aviation Infinity's AI generates an explanation, it first retrieves relevant content from the knowledge base using vector search. The query is the student's question and the incorrect answer they selected. The retrieved content becomes context for the LLM, grounding its response in verified information.

Without vector search, I'd need to manually map questions to relevant content -- a labor-intensive process that doesn't scale. Vector search automates this mapping by finding semantically relevant content regardless of how it's organized or labeled.
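The retrieve-then-prompt flow can be sketched roughly as follows. This is an illustration, not Aviation Infinity's actual code: the function names, the prompt wording, and the `retrieve` callable (which stands in for whatever vector search you use) are all assumptions.

```python
def build_retrieval_query(question, selected_answer):
    """Combine the question and the student's wrong answer into one search query."""
    return f"{question}\nStudent's incorrect answer: {selected_answer}"

def build_grounded_prompt(question, selected_answer, retrieve, top_k=3):
    """Fetch relevant passages and place them in the prompt as context.

    `retrieve` is a hypothetical callable: retrieve(query, top_k) -> list of strings.
    """
    query = build_retrieval_query(question, selected_answer)
    passages = retrieve(query, top_k)
    context = "\n\n".join(f"[{i}] {p}" for i, p in enumerate(passages, start=1))
    return (
        "Explain why the student's answer is wrong, using only the reference material.\n\n"
        f"Reference material:\n{context}\n\n"
        f"Question: {question}\n"
        f"Student's answer: {selected_answer}\n"
    )

# Usage with a fake retriever that returns canned passages.
def fake_retrieve(query, top_k):
    return ["Thunderstorms produce severe turbulence and icing."][:top_k]

print(build_grounded_prompt("Is it safe to fly through a cumulonimbus?", "Yes", fake_retrieve))
```

Passing the retriever in as a function keeps the prompt logic independent of which vector store sits behind it.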

Use Case 3: Recommendation Systems

If you need to recommend items that are "similar" to something a user has shown interest in, and similarity is semantic rather than attribute-based, vector search works well.

I explored this for Babonbo -- recommending equipment listings similar to ones a user has viewed. But I ultimately didn't implement it (more on why below).

When You Do Not Need Vector Search

Over-Engineering Alert: Small Datasets

If your dataset has fewer than a few thousand items, you probably don't need a vector database. You can compute embeddings for your entire dataset, store them in memory or in a regular database, and do brute-force nearest-neighbor search. It will be fast enough.

I made this mistake with an internal tool that searched across about 500 documents. I set up a vector database, configured the embedding pipeline, and deployed the infrastructure. Then I realized I could store all 500 embeddings in a JSON file and do cosine similarity in a loop in under 50 milliseconds. I ripped out the vector database and replaced it with 20 lines of code.
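For reference, a brute-force replacement along those lines might look like the sketch below. It is a reconstruction under assumptions, not the original code: the JSON layout (a mapping of document ID to vector) and the function names are hypothetical.

```python
import json
import math

def cosine_similarity(a, b):
    """Cosine similarity of two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))

def load_embeddings(path):
    """Load a {"doc_id": [floats, ...]} mapping from a JSON file (assumed layout)."""
    with open(path, encoding="utf-8") as f:
        return json.load(f)

def search(query_vector, embeddings, top_k=5):
    """Score every document against the query and return the top_k (doc_id, score) pairs."""
    scored = [
        (doc_id, cosine_similarity(query_vector, vector))
        for doc_id, vector in embeddings.items()
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]
```

With a few hundred documents, the loop touches every vector on every query and is still fast; the index a vector database builds only starts to pay off at much larger sizes.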

Over-Engineering Alert: Structured Data

If your data is structured and your queries can be expressed as filters and sorts, use a regular database. Vector search is for semantic similarity over unstructured data. Using it for structured queries is like using a hammer to turn a screw.

For Babonbo's equipment recommendations, I initially planned to use vector search on listing descriptions. But listing similarity is mostly about structured attributes: equipment type, child age range, location, price range, and condition. A MongoDB query with filters and sorting produces better recommendations than vector search on description text, and it's simpler, cheaper, and faster.
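As a rough illustration of attribute-based similarity, here is a sketch in plain Python so it runs without a database. The field names, the 30% price tolerance, and the ranking rule are assumptions for the example, not Babonbo's actual schema or logic; the same filters would translate into a database query with a sort.

```python
from dataclasses import dataclass

@dataclass
class Listing:
    # Hypothetical fields, chosen to mirror the attributes mentioned above.
    id: str
    equipment_type: str
    min_age_months: int
    max_age_months: int
    city: str
    price: float

def similar_listings(viewed, listings, price_tolerance=0.3, limit=5):
    """Filter on structured attributes, then rank by price closeness."""
    low = viewed.price * (1 - price_tolerance)
    high = viewed.price * (1 + price_tolerance)
    candidates = [
        item for item in listings
        if item.id != viewed.id
        and item.equipment_type == viewed.equipment_type
        and item.city == viewed.city
        # Age ranges must overlap.
        and item.min_age_months <= viewed.max_age_months
        and item.max_age_months >= viewed.min_age_months
        and low <= item.price <= high
    ]
    candidates.sort(key=lambda item: abs(item.price - viewed.price))
    return candidates[:limit]

viewed = Listing("a", "stroller", 0, 36, "Rome", 20.0)
listings = [
    viewed,
    Listing("b", "stroller", 6, 48, "Rome", 22.0),
    Listing("c", "car seat", 0, 12, "Rome", 20.0),
    Listing("d", "stroller", 0, 36, "Milan", 20.0),
    Listing("e", "stroller", 0, 24, "Rome", 25.0),
    Listing("f", "stroller", 0, 24, "Rome", 40.0),
]
print([item.id for item in similar_listings(viewed, listings)])  # ['b', 'e']
```

No embeddings are involved, and every reason a listing was included or excluded is easy to inspect.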

Over-Engineering Alert: When Keywords Work Fine

Sometimes keyword search is good enough. If your users search with specific, well-defined terms, and your content uses consistent terminology, keyword search with a good tokenizer and some fuzzy matching will serve them well.

Vector search shines when there's a vocabulary mismatch between queries and content. If your users and your content use the same words for the same things, the vocabulary mismatch problem doesn't exist, and vector search adds complexity without proportional value.
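For a sense of how little code "keyword search with some fuzzy matching" can take, here is a small sketch using only the Python standard library. The scoring (count of matched query terms) and the 0.8 similarity cutoff are assumptions for illustration; a real search engine or a database's full-text index would do this better at scale.

```python
import difflib
import re

def tokenize(text):
    """Lowercase the text and split it into alphanumeric terms."""
    return re.findall(r"[a-z0-9]+", text.lower())

def keyword_search(query, documents, cutoff=0.8):
    """Rank documents by how many query terms they contain, tolerating near-miss spellings."""
    results = []
    for doc_id, text in documents.items():
        vocabulary = sorted(set(tokenize(text)))
        hits = sum(
            1 for term in tokenize(query)
            if difflib.get_close_matches(term, vocabulary, n=1, cutoff=cutoff)
        )
        if hits:
            results.append((doc_id, hits))
    results.sort(key=lambda pair: pair[1], reverse=True)
    return results

documents = {
    "wx": "Cumulonimbus clouds produce severe turbulence",
    "nav": "VOR navigation basics",
}
# The misspelled "turbulance" still matches "turbulence".
print(keyword_search("cumulonimbus turbulance", documents))  # [('wx', 2)]
```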

The Practical Implementation

For those who do need vector search, here is my recommended approach based on what I've learned.

Embedding Model Selection

I use OpenAI's text-embedding-ada-002 for generating embeddings. It is cheap, fast, and produces good-quality embeddings. The 1536-dimensional vectors it produces are larger than some alternatives, which affects storage costs, but the quality-to-cost ratio is the best I've found.
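To put the storage cost in numbers, here is a back-of-the-envelope calculation. It assumes each value is stored as a 4-byte float and ignores index and metadata overhead, both of which vary by database.

```python
def embedding_storage_bytes(num_vectors, dimensions=1536, bytes_per_value=4):
    """Raw storage for the vectors alone (assumes float32, no index overhead)."""
    return num_vectors * dimensions * bytes_per_value

per_vector = embedding_storage_bytes(1)        # 6144 bytes, about 6 KiB
corpus = embedding_storage_bytes(100_000)      # 614,400,000 bytes
print(per_vector, f"{corpus / 1024**2:.0f} MiB")  # 6144 586 MiB
```

So a corpus of 100,000 chunks needs a bit under 600 MiB for the raw vectors, which is why dimension count matters once the corpus grows.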

For cost-sensitive applications, open-source embedding models running locally are an option. The quality is slightly lower but the per-query cost is effectively zero once you have the infrastructure. I haven't needed this yet because the OpenAI embedding API is cheap enough for my scale.

Vector Database Selection

I use Pinecone for production. It is managed, reliable, and the API is straightforward. The pricing is reasonable for my usage levels.

For development and testing, I use Chroma, which runs locally and has a similar API. This lets me develop without network latency or API costs.

If I were starting today and my data were already in MongoDB (which it is for most of my products), I'd seriously evaluate MongoDB Atlas Vector Search. Having vectors in the same database as the rest of your data eliminates an entire category of synchronization problems.

Chunking Strategy

How you split your content into chunks for embedding matters more than which vector database you use. Too large and the embeddings become too general. Too small and you lose context.

For Aviation Infinity's knowledge base, I chunk by logical section with overlap. Each chunk is a self-contained explanation or concept, typically 200-500 tokens. Chunks overlap by about 50 tokens at the boundaries to preserve context that spans chunk boundaries.

I experimented with fixed-size chunking (every 256 tokens) and it produced worse search results because chunks were split arbitrarily in the middle of concepts. Content-aware chunking takes more effort but produces better results.
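A simplified sketch of section-based chunking with overlap is shown below. It makes two loud assumptions: sections are separated by blank lines, and whitespace-separated words stand in for tokens (a real pipeline would count tokens with the embedding model's tokenizer). It also does not enforce the 200-500 token range; it only shows the overlap mechanics.

```python
def chunk_by_section(text, overlap_tokens=50):
    """Split text on blank lines and prefix each chunk with the tail of the previous section."""
    sections = [s.strip() for s in text.split("\n\n") if s.strip()]
    chunks = []
    previous_tokens = []
    for section in sections:
        tokens = section.split()  # crude stand-in for real tokenization
        # Guard against overlap_tokens == 0, since a [-0:] slice would return the whole list.
        overlap = previous_tokens[-overlap_tokens:] if overlap_tokens > 0 else []
        chunks.append(" ".join(overlap + tokens))
        previous_tokens = tokens
    return chunks

text = "a b c d\n\ne f g\n\nh i"
print(chunk_by_section(text, overlap_tokens=2))
# ['a b c d', 'c d e f g', 'f g h i']
```

Each chunk starts with the last few tokens of the section before it, so a sentence that refers back across a boundary still has some of its context.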

Hybrid Search

The best search implementations combine vector search and keyword search. Vector search handles semantic similarity. Keyword search handles exact matches and proper nouns that embeddings may not capture well.

My implementation runs both searches in parallel and merges the results using reciprocal rank fusion. This consistently outperforms either approach alone.
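Reciprocal rank fusion is short enough to show in full. The sketch below is a generic version, not my production code: each document scores 1 / (k + rank) in every list it appears in, and the scores are summed. The constant k = 60 is the value commonly used in the literature, and the function name is hypothetical.

```python
def reciprocal_rank_fusion(rankings, k=60):
    """Merge several ranked lists of document IDs into one list, best first."""
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)

vector_results = ["a", "b", "c"]   # ranked output of the vector search
keyword_results = ["b", "d", "a"]  # ranked output of the keyword search
print(reciprocal_rank_fusion([vector_results, keyword_results]))
# ['b', 'a', 'd', 'c']
```

Because it uses only ranks, not raw scores, the method does not care that cosine similarities and keyword scores live on different scales. Documents that appear in both lists rise to the top.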

The Honest Assessment

Vector databases are a useful tool for specific problems. They aren't a requirement for every AI application. The AI community has a tendency to add complexity because it's interesting, not because it's necessary.

Before adding a vector database to your stack, ask yourself: can I solve this problem with a regular database query, a keyword search, or a simple in-memory computation? If the answer is yes, do that instead. You can always add vector search later if you need it. Ripping it out after you have added it (as I learned the hard way) is much harder.

Build for your actual needs, not for the architecture diagram you want to show at a conference.