Overview
pgvector is an open-source PostgreSQL extension created by Andrew Kane in 2021. It has rapidly become one of the most widely adopted vector search options, with 13K+ GitHub stars, and is supported by most major PostgreSQL providers: Supabase, Neon, AWS RDS for PostgreSQL, Google Cloud SQL, Azure Database for PostgreSQL, Crunchy Data, and Tembo. pgvector adds vector data types and similarity search operators to PostgreSQL, enabling exact nearest neighbor search by default and approximate nearest neighbor (ANN) search through IVFFlat and HNSW indexes. The extension stores vectors alongside relational data in the same database, so many applications do not need a separate vector database. The vector type stores up to 16,000 dimensions, although IVFFlat and HNSW indexes on it are limited to 2,000 dimensions. Supported metrics include L2 distance, inner product, cosine distance, and L1 distance. Recent releases brought significant improvements: parallel HNSW index builds and faster HNSW performance arrived in 0.6.0, and 0.7.0 added half-precision, sparse, and binary vector support.
Key Features and Architecture
Vector Data Type
Store embedding vectors as a native PostgreSQL column type. Create a column with vector(1536) for OpenAI embeddings or vector(768) for sentence-transformers. Vectors are stored alongside relational data — user profiles, product catalogs, document metadata — in the same table and queried with standard SQL.
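As a rough illustration of the column type, the sketch below formats a Python list in pgvector's bracketed text form and pairs it with a table definition. The table name documents, its columns, and the helper to_vector_literal are hypothetical names chosen for this example, and the 3-dimension column is only there to keep the sample short.

```python
from typing import Sequence


def to_vector_literal(values: Sequence[float], dims: int) -> str:
    """Format a sequence as pgvector text input, e.g. '[0.1,0.2,0.3]'."""
    if len(values) != dims:
        # pgvector rejects values whose length differs from the column's
        # declared dimension, so check early on the client side.
        raise ValueError(f"expected {dims} dimensions, got {len(values)}")
    return "[" + ",".join(repr(float(v)) for v in values) + "]"


# Hypothetical table: a vector column next to ordinary relational columns.
# A real OpenAI-embedding column would be declared as vector(1536).
CREATE_TABLE_SQL = """
CREATE TABLE documents (
    id bigserial PRIMARY KEY,
    title text NOT NULL,
    embedding vector(3)
);
"""

# Parameterized insert; the %s placeholders assume a psycopg-style driver.
INSERT_SQL = "INSERT INTO documents (title, embedding) VALUES (%s, %s::vector);"

print(to_vector_literal([0.1, 0.2, 0.3], dims=3))  # [0.1,0.2,0.3]
```

Passing the literal as a bound parameter and casting it with ::vector keeps the embedding out of the SQL string itself.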
HNSW Index
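Hierarchical Navigable Small World (HNSW) index provides fast approximate nearest neighbor search with high recall. Index builds are fastest when the graph fits in maintenance_work_mem, and queries typically complete in milliseconds for datasets of up to millions of vectors, depending on hardware and settings. Configure m (maximum connections per layer, default 16) and ef_construction (candidate list size during the build, default 64) to trade build time and index size for query accuracy, and raise hnsw.ef_search (default 40) at query time to trade speed for recall.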
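The HNSW parameters above map onto SQL roughly as follows. This is a minimal sketch that only builds the statements as strings; the helper names are hypothetical, the table and column names are assumed to be trusted identifiers, and the ef_construction >= 2 * m check reflects a constraint pgvector is understood to enforce at index creation.

```python
def hnsw_index_sql(table: str, column: str, m: int = 16,
                   ef_construction: int = 64,
                   opclass: str = "vector_l2_ops") -> str:
    """Build a CREATE INDEX statement for an HNSW index.

    Identifiers are interpolated directly, so they must come from trusted
    code, never from user input.
    """
    if ef_construction < 2 * m:
        # Assumption: pgvector rejects ef_construction below 2 * m.
        raise ValueError("ef_construction should be at least 2 * m")
    return (
        f"CREATE INDEX ON {table} USING hnsw ({column} {opclass}) "
        f"WITH (m = {m}, ef_construction = {ef_construction});"
    )


def hnsw_ef_search_sql(ef_search: int = 40) -> str:
    """Build the per-session setting that widens the query-time search."""
    return f"SET hnsw.ef_search = {ef_search};"


print(hnsw_index_sql("items", "embedding"))
print(hnsw_ef_search_sql(100))
```

Larger m and ef_construction values generally raise recall at the cost of build time and index size, while hnsw.ef_search can be adjusted per session without rebuilding the index.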
IVFFlat Index
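Inverted File Flat index partitions vectors into clusters and searches only the clusters closest to the query. IVFFlat is faster to build than HNSW and uses less memory, making it suitable for larger datasets, but it should be created after the table already contains representative data so the cluster centers are meaningful. Configure lists (number of clusters, set when the index is built) and ivfflat.probes (clusters searched per query, set at query time) to trade accuracy for speed.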
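For choosing starting values, the pgvector README suggests lists = rows / 1000 for up to 1M rows and sqrt(rows) above that, with probes starting at sqrt(lists). The sketch below encodes that rule of thumb; the function name and the rounding choices are assumptions made for this example, and the results are starting points to tune, not guarantees.

```python
import math


def suggest_ivfflat_params(row_count: int) -> tuple[int, int]:
    """Return (lists, probes) starting values from the README rule of thumb."""
    if row_count <= 1_000_000:
        lists = max(1, row_count // 1000)
    else:
        lists = max(1, round(math.sqrt(row_count)))
    probes = max(1, round(math.sqrt(lists)))
    return lists, probes


lists, probes = suggest_ivfflat_params(500_000)

# Statements the values would plug into (table and column are hypothetical).
index_sql = (
    "CREATE INDEX ON items USING ivfflat (embedding vector_l2_ops) "
    f"WITH (lists = {lists});"
)
probes_sql = f"SET ivfflat.probes = {probes};"

print(lists, probes)  # 500 22
```

Raising ivfflat.probes improves recall and slows queries; setting it equal to lists degenerates into an exact scan of every cluster.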
SQL Integration
Query vectors with standard SQL operators: <-> for L2 distance, <#> for negative inner product (negated so that ascending order returns the best matches first), <=> for cosine distance, and <+> for L1 distance. Combine vector search with SQL filters, joins, and aggregations in a single query. This is pgvector's killer feature: no separate API, no data synchronization, no additional infrastructure.
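To make the operators concrete, here is a pure-Python sketch of what each one computes, plus a brute-force equivalent of ORDER BY embedding <-> query LIMIT k. The function names are illustrative and are not part of pgvector; in the database the work is done by the operators themselves.

```python
import math
from typing import Callable, Sequence

Vector = Sequence[float]


def _check(a: Vector, b: Vector) -> None:
    if len(a) != len(b):
        raise ValueError("vectors must have the same number of dimensions")


def l2_distance(a: Vector, b: Vector) -> float:
    """What <-> computes: Euclidean distance."""
    _check(a, b)
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def negative_inner_product(a: Vector, b: Vector) -> float:
    """What <#> computes: the inner product, negated."""
    _check(a, b)
    return -sum(x * y for x, y in zip(a, b))


def cosine_distance(a: Vector, b: Vector) -> float:
    """What <=> computes: 1 - cosine similarity (undefined for zero vectors)."""
    _check(a, b)
    dot = sum(x * y for x, y in zip(a, b))
    norms = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / norms


def l1_distance(a: Vector, b: Vector) -> float:
    """What <+> computes: Manhattan distance."""
    _check(a, b)
    return sum(abs(x - y) for x, y in zip(a, b))


def nearest(query: Vector, rows: list, k: int,
            distance: Callable[[Vector, Vector], float] = l2_distance) -> list:
    """Brute-force stand-in for ORDER BY embedding <-> query LIMIT k."""
    return sorted(rows, key=lambda row: distance(row[1], query))[:k]


rows = [(1, [0.0, 0.0]), (2, [3.0, 4.0]), (3, [1.0, 1.0])]
print([row_id for row_id, _ in nearest([0.0, 0.0], rows, k=2)])  # [1, 3]
```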
Hybrid Search
Combine vector similarity search with full-text search (tsvector) and relational filters in a single SQL query. For example, find the 10 products most similar to a query embedding where category = 'electronics', price < 100, and the full-text search matches 'wireless'. Few dedicated vector databases offer this level of query flexibility.
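The product example could be written along these lines. This is a sketch only: the products table, its columns, and the helper hybrid_search_params are hypothetical, and the %(name)s placeholders assume a psycopg-style driver.

```python
from typing import Sequence

# Vector ordering, relational filters, and full-text matching in one query.
HYBRID_SQL = """
SELECT id, name, price
FROM products
WHERE category = %(category)s
  AND price < %(max_price)s
  AND to_tsvector('english', description)
      @@ plainto_tsquery('english', %(keywords)s)
ORDER BY embedding <=> %(query_embedding)s::vector
LIMIT %(limit)s;
"""


def hybrid_search_params(query_embedding: Sequence[float], category: str,
                         max_price: float, keywords: str,
                         limit: int = 10) -> dict:
    """Bundle the bound parameters for HYBRID_SQL."""
    literal = "[" + ",".join(str(float(x)) for x in query_embedding) + "]"
    return {
        "query_embedding": literal,
        "category": category,
        "max_price": max_price,
        "keywords": keywords,
        "limit": limit,
    }


params = hybrid_search_params([0.5, 1], "electronics", 100, "wireless")
print(params["query_embedding"])  # [0.5,1.0]
```

With a driver connection, the pair would be passed as cursor.execute(HYBRID_SQL, params). When filters are very selective, an approximate index may return fewer than the requested number of rows, so it is worth checking results against an exact scan.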
Ideal Use Cases
Applications Already Using PostgreSQL
Any application with an existing PostgreSQL database that needs to add vector search. pgvector eliminates the need for a separate vector database — add the extension, create a vector column, build an index, and start querying. This is the most common use case and the reason for pgvector's massive adoption.
RAG (Retrieval-Augmented Generation)
Applications using LLMs with retrieval-augmented generation that need to store and search document embeddings. pgvector stores document chunks, embeddings, and metadata in the same table, making it easy to retrieve relevant context for LLM prompts. Supabase and Neon both promote pgvector for RAG applications.
Hybrid Search Applications
Applications that need to combine semantic vector search with traditional SQL filters. E-commerce product search (similar items + price range + category), document search (semantic similarity + date range + author), and recommendation systems (user similarity + business rules) all benefit from pgvector's SQL integration.
Small to Medium Scale Vector Search
Applications with up to 10 million vectors that need good-enough performance without the complexity of a dedicated vector database. With an HNSW index that fits in memory, pgvector can typically deliver sub-10ms query times at this scale, which is sufficient for most web applications.
Pricing and Licensing
| Option | Cost | Details |
|---|---|---|
| pgvector Extension | $0 | Open-source, PostgreSQL license |
| Supabase (with pgvector) | $0-$25/month | Free tier: 500MB, Pro: 8GB, unlimited vectors |
| Neon (with pgvector) | $0-$19/month | Free tier: 512MB, Launch: 10GB |
| AWS RDS PostgreSQL | ~$30-200/month | db.t3.medium: ~$30/month, db.r6g.large: ~$200/month |
| Self-hosted PostgreSQL | Infrastructure costs only | Any PostgreSQL 12+ installation |
pgvector is free; the cost is your PostgreSQL hosting. At 1536 dimensions each vector takes about 6 KB (4 bytes per dimension plus 8 bytes of overhead), so Supabase's free tier (500MB) holds roughly 80,000 such vectors before indexes and other columns are counted, which is still enough for many small RAG applications. AWS RDS starts at approximately $30/month for a small instance. For comparison, Pinecone has a free tier, but its serverless pricing scales to $70+/month for 1M+ vectors with frequent queries. Weaviate Cloud starts at $25/month. pgvector on existing Postgres infrastructure is often the cheapest option because there is no additional service cost: you are already paying for Postgres.
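Capacity estimates like these can be checked with a few lines of arithmetic. The sketch below uses the storage formula from the pgvector README (4 bytes per dimension plus 8 bytes per vector) and deliberately ignores row headers, other columns, and index size, so real capacity will be lower. The function names are illustrative.

```python
def vector_bytes(dims: int) -> int:
    """Storage for one single-precision vector: 4 * dims + 8 bytes."""
    return 4 * dims + 8


def max_vectors(storage_bytes: int, dims: int) -> int:
    """Upper bound on vectors that fit, counting vector data only."""
    return storage_bytes // vector_bytes(dims)


print(vector_bytes(1536))                 # 6152
print(max_vectors(500 * 10**6, 1536))     # 81274
print(max_vectors(8 * 10**9, 1536))       # 1300390
```

An HNSW or IVFFlat index on the same column adds its own storage on top, so it is safer to budget well below these upper bounds.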
Pros and Cons
Pros
- No additional infrastructure — add vector search to existing PostgreSQL; no new database to manage
- SQL integration — query vectors with SQL alongside relational data, joins, filters, and aggregations
- Universal support — available on Supabase, Neon, AWS RDS, Google Cloud SQL, Azure, and self-hosted
- 13K+ GitHub stars — massive adoption, active development, strong community
- Hybrid search — combine vector similarity with full-text search and relational filters in one query
- Free and open-source — PostgreSQL license; no per-query or per-vector pricing
Cons
- Slower than purpose-built vector DBs — Pinecone and Milvus can be 2-5x faster for pure vector search at scale, depending on workload and tuning
- Memory-intensive — HNSW indexes perform best when they fit in memory; large vector collections require significant RAM
- No built-in embedding generation — you must generate embeddings externally (OpenAI, sentence-transformers)
- Scale limitations — performance degrades above 10-50M vectors; purpose-built vector DBs handle billions
- Single-node — no built-in distributed search; scaling requires PostgreSQL replication or partitioning
Getting Started
Getting started takes under 10 minutes on a database where the extension is available. On managed providers such as Supabase, Neon, or AWS RDS, pgvector ships with the service and only needs to be enabled; on a self-hosted server, install it from source or from a package for your platform. Then run CREATE EXTENSION vector; in the target database, add a vector column to a new or existing table, insert embeddings, and query with ORDER BY and one of the distance operators. Exact search works without any index; add an HNSW or IVFFlat index once the table is large enough that sequential scans become too slow. For teams evaluating against alternatives, we recommend a 2-week trial on a representative dataset to measure recall and latency at the expected scale. The project README on GitHub is the primary documentation, and most managed providers publish their own pgvector guides.
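For pgvector specifically, a first session usually comes down to a handful of SQL statements. The following is a minimal sketch, assuming a DB-API connection from a psycopg-style driver (for the %s placeholders) and a role allowed to create extensions. The items table, the index name, and the bootstrap_and_query helper are hypothetical.

```python
SETUP_STATEMENTS = [
    # Enable the extension in the current database.
    "CREATE EXTENSION IF NOT EXISTS vector;",
    # A tiny 3-dimension column keeps the sample data readable.
    "CREATE TABLE IF NOT EXISTS items "
    "(id bigserial PRIMARY KEY, embedding vector(3));",
    # Note: re-running this sketch inserts the sample rows again.
    "INSERT INTO items (embedding) VALUES ('[1,2,3]'), ('[4,5,6]');",
    # Optional for small tables; exact search works without an index.
    "CREATE INDEX IF NOT EXISTS items_embedding_idx "
    "ON items USING hnsw (embedding vector_l2_ops);",
]

# Nearest neighbors by L2 distance.
NEAREST_SQL = "SELECT id FROM items ORDER BY embedding <-> %s::vector LIMIT %s;"


def bootstrap_and_query(conn, query_vector: str, k: int = 5) -> list:
    """Run the setup statements, then return the k nearest row ids."""
    cur = conn.cursor()
    try:
        for statement in SETUP_STATEMENTS:
            cur.execute(statement)
        conn.commit()
        cur.execute(NEAREST_SQL, (query_vector, k))
        return cur.fetchall()
    finally:
        cur.close()
```

With a real connection this would be called as bootstrap_and_query(conn, "[3,1,2]"), where conn comes from the driver's connect function.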
Alternatives and How It Compares
pgvector competes mainly with dedicated vector databases, both open-source and commercial. When comparing them, focus on integration depth with your existing stack, pricing at your expected scale, and the quality of documentation and community support. Each tool makes different trade-offs between ease of use, flexibility, and enterprise features.
Pinecone
Pinecone is a fully managed vector database with serverless pricing. Choose Pinecone if you want zero infrastructure management and maximum query performance; choose pgvector if you already use PostgreSQL and want simplicity and SQL integration.
Milvus
Milvus is a distributed vector database for billion-scale search. Choose Milvus for massive-scale vector search (100M+ vectors), accepting that it is more complex to operate; choose pgvector for small-to-medium scale with SQL integration.
Weaviate
Weaviate provides vector search with built-in vectorization modules. Choose Weaviate if your application needs built-in embedding generation; choose pgvector if you generate embeddings externally and want SQL integration.
Qdrant
Qdrant provides high-performance vector search with rich filtering. Choose Qdrant if your application needs advanced filtering in a dedicated vector engine; choose pgvector if you already use PostgreSQL.
Frequently Asked Questions
Is pgvector free?
Yes, pgvector is open-source under the PostgreSQL license. It's free to use on any PostgreSQL 12+ installation.
How many vectors can pgvector handle?
pgvector works well up to 10-50 million vectors with HNSW indexes. Beyond that, purpose-built vector databases like Milvus or Pinecone provide better performance.
Does pgvector support HNSW?
Yes, pgvector supports HNSW indexes (added in version 0.5.0) for fast approximate nearest neighbor search, as well as IVFFlat indexes. HNSW provides better recall and query performance than IVFFlat for most workloads, while IVFFlat is faster to build and uses less memory.