Vector Databases Compared

Article summary

A vector database stores embeddings and builds an approximate-nearest-neighbor (ANN) index so you can find the most similar vectors fast, even across millions of records. Choosing one is not about a leaderboard; it is about four axes: managed service versus self-hosted, a standalone engine versus a Postgres extension like pgvector, how well it filters on metadata alongside the vector search, and whether it holds up at your scale. Most teams should start with the option closest to the database they already run.

As of 2026-05-31

A vector database does two jobs. It stores your embeddings, and it builds an index that finds the nearest vectors to a query without scanning all of them. The second job is the reason these systems exist. Comparing your query embedding against ten thousand stored vectors is trivial; doing it against fifty million on every request is not.

The core job: store plus ANN index

Finding the closest vectors by checking every one is called exact nearest-neighbor search. It is accurate, but its cost grows linearly with the number of stored vectors, so it does not scale. Vector databases instead build an approximate-nearest-neighbor (ANN) index, most commonly HNSW (Hierarchical Navigable Small World graphs). HNSW arranges vectors into a navigable graph so a query hops toward its neighbors and only inspects a small slice of the data.
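For contrast with an ANN index, here is a minimal sketch of exact search in plain Python. It assumes cosine similarity as the metric and a small in-memory dictionary of vectors; the function names and the sample data are illustrative, not taken from any particular product.

```python
import heapq
import math
from typing import Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def exact_top_k(
    query: Sequence[float], vectors: dict[str, Sequence[float]], k: int
) -> list[tuple[str, float]]:
    """Score every stored vector against the query and keep the k best."""
    scored = ((cosine_similarity(query, vec), vec_id) for vec_id, vec in vectors.items())
    best = heapq.nlargest(k, scored)
    return [(vec_id, score) for score, vec_id in best]


# Illustrative data: three 2-dimensional vectors.
stored = {"a": [1.0, 0.0], "b": [0.0, 1.0], "c": [0.9, 0.1]}
result = exact_top_k([1.0, 0.0], stored, k=2)
result_ids = [vec_id for vec_id, _ in result]  # ["a", "c"]
```

Every query touches every stored vector, which is the cost an ANN index is built to avoid.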

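The "approximate" part means a query can occasionally miss a true nearest neighbor, and you control how often. Index parameters trade recall against speed and memory; in HNSW these are the graph connectivity (commonly called M) and the candidate-list sizes used at build and query time (commonly called ef_construction and ef_search). This is the central knob a vector database gives you, and it is why a real comparison is about your recall and latency targets, not about a single throughput number from someone else's benchmark.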
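Recall is usually measured by comparing the ANN results with the exact results for the same query. A minimal sketch, assuming you already have both result lists as ordered ids (the function name and sample ids are hypothetical):

```python
def recall_at_k(exact_ids: list[str], approx_ids: list[str], k: int) -> float:
    """Fraction of the true top-k ids that the approximate search also returned."""
    truth = set(exact_ids[:k])
    if not truth:
        raise ValueError("exact_ids must contain at least one id")
    found = truth.intersection(approx_ids[:k])
    return len(found) / len(truth)


# Illustrative ids: the ANN index returned "e" instead of the true neighbor "d".
exact = ["a", "b", "c", "d"]
approx = ["a", "c", "e", "b"]
score = recall_at_k(exact, approx, k=4)  # 0.75
```

Averaging this score over a sample of real queries gives the recall figure to weigh against latency when you adjust index parameters.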

The axes that actually decide it

Forget the leaderboard framing. Pick on these.

Managed versus self-hosted. A managed service (you send vectors, they run the cluster) removes operational load and is the fastest way to ship. Self-hosting keeps data in your environment and avoids usage-based service pricing, at the cost of running and scaling the system yourself. This is usually the first fork in the road.

Standalone engine versus Postgres extension. Standalone vector engines such as Pinecone, Weaviate, Qdrant, and Milvus are purpose-built for vectors and tend to scale further with more index tuning. The pgvector extension adds vector columns and ANN indexes to Postgres, so your vectors live next to your relational data with one backup and one transaction story. If you already run Postgres at moderate scale, that simplicity is worth a lot.

Metadata filtering. Real queries are rarely pure similarity. You want "the nearest chunks from this user's documents, published after January, tagged 'billing.'" How a system combines metadata filters with the ANN search matters enormously. Some systems apply the filter after retrieval, which can return fewer results than you asked for; others build filtering into the index so that only matching vectors are considered. If your access patterns lean on filters, test this specifically.
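The difference between the two filtering strategies shows up even in a toy example. The sketch below uses a brute-force dot-product ranking as a stand-in for the ANN index, and the Doc type, the tags, and the data are all made up for illustration; real engines filter inside the index rather than over a Python list.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Doc:
    doc_id: str
    vector: tuple[float, ...]
    tag: str


def dot(a: tuple[float, ...], b: tuple[float, ...]) -> float:
    return sum(x * y for x, y in zip(a, b))


def top_k(query: tuple[float, ...], docs: list[Doc], k: int) -> list[Doc]:
    """Stand-in for the vector index: rank by dot product, keep the k best."""
    return sorted(docs, key=lambda d: dot(query, d.vector), reverse=True)[:k]


def post_filter(query: tuple[float, ...], docs: list[Doc], k: int, tag: str) -> list[Doc]:
    """Retrieve k neighbors first, then drop the ones that fail the filter."""
    return [d for d in top_k(query, docs, k) if d.tag == tag]


def pre_filter(query: tuple[float, ...], docs: list[Doc], k: int, tag: str) -> list[Doc]:
    """Restrict the candidates to matching docs, then retrieve k neighbors."""
    return top_k(query, [d for d in docs if d.tag == tag], k)


DOCS = [
    Doc("d1", (1.0, 0.0), "sales"),
    Doc("d2", (0.9, 0.1), "sales"),
    Doc("d3", (0.8, 0.2), "billing"),
    Doc("d4", (0.7, 0.3), "billing"),
]
query = (1.0, 0.0)

post_ids = [d.doc_id for d in post_filter(query, DOCS, k=2, tag="billing")]  # []
pre_ids = [d.doc_id for d in pre_filter(query, DOCS, k=2, tag="billing")]  # ["d3", "d4"]
```

Post-filtering comes back empty here because both of the two nearest vectors carry the wrong tag. Over-fetching (asking for more than k before filtering) reduces the problem but does not remove it when the filter is highly selective.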

Scale and operations. Consider the number of vectors, write throughput, p99 query latency, and whether you need horizontal sharding. A system that is delightful at one million vectors can fall over at one billion. Match the tool to the scale you actually have, plus a year of growth, not to a hypothetical.
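A quick back-of-the-envelope calculation makes the scale gap concrete. This sketch assumes 768-dimensional embeddings stored as 4-byte floats, which is an illustrative choice rather than anything fixed by the article, and it counts only the raw vectors; an ANN index adds its own overhead on top.

```python
def raw_vector_bytes(num_vectors: int, dimensions: int, bytes_per_value: int = 4) -> int:
    """Bytes needed for the raw vectors alone (no index, no metadata)."""
    return num_vectors * dimensions * bytes_per_value


def to_gib(num_bytes: int) -> float:
    return num_bytes / 2**30


# Assumed embedding size of 768 dimensions, float32 values.
one_million = raw_vector_bytes(1_000_000, 768)      # 3_072_000_000 bytes
one_billion = raw_vector_bytes(1_000_000_000, 768)  # 3_072_000_000_000 bytes

print(f"1M vectors: {to_gib(one_million):.2f} GiB")
print(f"1B vectors: {to_gib(one_billion):.0f} GiB")
```

Under these assumptions one million vectors fit comfortably in memory on a single machine (under 3 GiB), while one billion need close to 3 TiB before any index overhead, which is where sharding enters the picture.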

A starting recommendation

The lowest-regret move is to start with whatever is closest to your existing stack:

Already on Postgres, moderate scale: pgvector.
Want zero operations, fast to ship: a managed vector service.
Need a self-hosted engine that scales and tunes deeply: a standalone engine you run.

Then prototype with real data. Load a representative slice, run your actual query mix with metadata filters attached, and measure recall and latency. The product that wins on your workload is the one to keep. A vendor benchmark run on someone else's data and someone else's filters tells you almost nothing about how the system behaves on yours.
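A prototype harness does not need to be elaborate. The sketch below times a search callable over a list of queries and reports percentiles using the nearest-rank method; the search argument is a placeholder for whatever client call your candidate system exposes, and the names are hypothetical.

```python
import math
import time
from typing import Callable, Iterable


def percentile(samples: list[float], pct: float) -> float:
    """Nearest-rank percentile: the smallest sample at or above pct percent of the data."""
    if not samples:
        raise ValueError("samples must not be empty")
    ordered = sorted(samples)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


def measure_latencies(
    search: Callable[[object], object], queries: Iterable[object]
) -> list[float]:
    """Run each query once and record its wall-clock latency in seconds."""
    latencies = []
    for query in queries:
        start = time.perf_counter()
        search(query)
        latencies.append(time.perf_counter() - start)
    return latencies


# Placeholder search function; swap in the real client call for your candidate system.
def fake_search(query: object) -> list[str]:
    return []


latencies = measure_latencies(fake_search, ["q1", "q2", "q3"])
print(f"p50={percentile(latencies, 50):.6f}s  p99={percentile(latencies, 99):.6f}s")
```

Run it with the filters attached and alongside a recall measurement, since a configuration that looks fast may simply be returning worse results.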

Frequently asked questions

Do I even need a vector database?

Not always. If you have fewer than a few hundred thousand vectors and modest query volume, an in-memory library (FAISS, hnswlib) or a Postgres table with pgvector is often enough and far simpler to operate. A dedicated vector database earns its keep when you hit millions of vectors, need horizontal scaling, want managed operations, or need rich metadata filtering combined with vector search at low latency. Start simple and graduate when you actually feel the pain.

What is an ANN index and why not exact search?

Exact nearest-neighbor search compares your query to every stored vector, which is accurate but slows down in proportion to the size of the dataset. An approximate-nearest-neighbor (ANN) index, such as HNSW, organizes vectors so a query only checks a small fraction of them, trading a small, tunable amount of recall for a large speedup. At a few thousand vectors, exact search is fine. At millions, ANN is the difference between milliseconds and seconds.

Should I use pgvector or a standalone vector database?

If you already run Postgres and your scale is moderate, pgvector keeps everything in one system: one backup story, one set of credentials, transactional consistency between your rows and their vectors. A standalone engine (Pinecone, Weaviate, Qdrant, Milvus) is built specifically for vectors and tends to scale further and offer richer ANN tuning and filtering, at the cost of running another system. Reach for standalone when pgvector stops keeping up on latency or scale.
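To give a feel for the pgvector route, here is a sketch of the SQL involved, held as Python strings so it can sit next to application code. The table name, column names, and the 768-dimension size are hypothetical, and the %s placeholders assume a psycopg-style driver; the block only builds the statements and parameters and does not connect to a database.

```python
from typing import Sequence

# Hypothetical schema: table and column names are illustrative only.
CREATE_SQL = """
CREATE EXTENSION IF NOT EXISTS vector;
CREATE TABLE chunks (
    id bigserial PRIMARY KEY,
    user_id bigint NOT NULL,
    tag text NOT NULL,
    embedding vector(768)  -- 768 dimensions is an assumed embedding size
);
CREATE INDEX ON chunks USING hnsw (embedding vector_cosine_ops);
"""

# <=> is pgvector's cosine-distance operator; smaller means more similar.
# The WHERE clause is ordinary SQL, so filters and vectors share one query.
SEARCH_SQL = """
SELECT id
FROM chunks
WHERE user_id = %s AND tag = %s
ORDER BY embedding <=> %s::vector
LIMIT %s;
"""


def to_vector_literal(values: Sequence[float]) -> str:
    """Format a Python sequence as pgvector's text form, e.g. '[0.1,0.2,0.3]'."""
    return "[" + ",".join(str(float(v)) for v in values) + "]"


# Parameters for one search: user 42, tag 'billing', top 5 neighbors.
params = (42, "billing", to_vector_literal([0.1, 0.2, 0.3]), 5)
```

Because the vectors live in an ordinary table, the same transaction that inserts a row can insert its embedding, which is the consistency benefit described above.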