pgvector

Open-source vector similarity search for Postgres

Stars: 20.5k
Stars/month: +473
Releases (6m): 0

Star Growth

+84 (0.4%)
[Chart: star count, y-axis 20.1k to 21.0k, x-axis ticks Mar 27 and Apr 1]

Overview

pgvector is an open-source PostgreSQL extension that adds vector similarity search to your relational database. You can store and query high-dimensional vectors alongside your regular data, using exact or approximate nearest neighbor search with several distance metrics, including L2 distance, inner product, cosine distance, and Hamming distance. It supports single-precision, half-precision, binary, and sparse vectors, which covers a range of AI and machine learning applications.

pgvector keeps PostgreSQL's core strengths (ACID compliance, point-in-time recovery, JOINs, and transaction support) while adding vector search. Queries can combine vector similarity with ordinary SQL operations, which in many scenarios removes the need for a separate vector database.

With over 20,000 GitHub stars, pgvector has become the de facto standard for vector search in PostgreSQL environments. It works with Postgres 13+ and with any programming language that has a Postgres client, so it fits a wide range of development stacks and deployment setups.

Deep Analysis

Key Differentiator

Vector search as a native Postgres extension: unlike standalone vector DBs (Pinecone, Weaviate), pgvector keeps vectors with your relational data, enabling JOINs, ACID transactions, and point-in-time recovery without a separate vector service to deploy and operate.

Capabilities

  • Vector similarity search as a Postgres extension
  • Exact and approximate nearest neighbor search (HNSW, IVFFlat)
  • Multiple distance metrics (L2, cosine, inner product, L1, Hamming, Jaccard)
  • Half-precision, binary, and sparse vector support
  • Full ACID compliance with Postgres
  • Works with any Postgres client in any language
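To make the distance metrics listed above concrete, here is a minimal pure-Python sketch of what each one computes. It is illustrative only and is not pgvector's implementation. In pgvector these metrics are exposed as SQL operators: `<->` (L2), `<#>` (negative inner product), `<=>` (cosine distance), `<+>` (L1), `<~>` (Hamming), and `<%>` (Jaccard). The function names below are hypothetical, and binary vectors are modeled as lists of 0/1 values.

```python
import math

def l2_distance(a, b):
    """Euclidean distance (pgvector operator: <->)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def inner_product(a, b):
    """Dot product. Note that pgvector's <#> returns the NEGATIVE of this value."""
    return sum(x * y for x, y in zip(a, b))

def cosine_distance(a, b):
    """1 - cosine similarity (pgvector operator: <=>)."""
    norm_a = math.sqrt(inner_product(a, a))
    norm_b = math.sqrt(inner_product(b, b))
    return 1.0 - inner_product(a, b) / (norm_a * norm_b)

def l1_distance(a, b):
    """Taxicab distance (pgvector operator: <+>)."""
    return sum(abs(x - y) for x, y in zip(a, b))

def hamming_distance(a, b):
    """Number of differing positions in two binary vectors (operator: <~>)."""
    return sum(1 for x, y in zip(a, b) if x != y)

def jaccard_distance(a, b):
    """1 - |intersection| / |union| of set bits (operator: <%>).
    This sketch assumes at least one bit is set in a or b."""
    intersection = sum(1 for x, y in zip(a, b) if x and y)
    union = sum(1 for x, y in zip(a, b) if x or y)
    return 1.0 - intersection / union

a, b = [1, 2, 3], [3, 1, 2]
print(l2_distance(a, b), inner_product(a, b), cosine_distance(a, b), l1_distance(a, b))

bits_a, bits_b = [1, 0, 1, 1], [1, 1, 0, 1]
print(hamming_distance(bits_a, bits_b), jaccard_distance(bits_a, bits_b))
```

Because `<#>` returns the negative inner product, sorting ascending on it puts the largest inner product first, which is why all of these operators can be used with a plain `ORDER BY ... LIMIT k`.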

Integrations

PostgreSQL 13+, any Postgres client library, Python, Ruby, Java, Go, Rust, C#, Node.js, Elixir

Best For

  • Adding vector search to existing PostgreSQL applications
  • Teams wanting ACID-compliant vector storage with SQL joins

Not Ideal For

  • Teams not using PostgreSQL
  • Applications needing specialized vector DB features (multi-tenancy, built-in reranking)

Languages

SQL, C

Deployment

Postgres extension, Docker, Homebrew, APT/Yum, conda-forge, hosted Postgres providers

Pricing Detail

Free: Fully open source (PostgreSQL license)
Paid: N/A — free

Known Limitations

  • Tied to PostgreSQL — cannot be used with other databases
  • HNSW and IVFFlat indexes are limited to 2,000 dimensions for the vector type (the half-precision halfvec type can be indexed up to 4,000)
  • Approximate search trades recall for speed — tuning required
  • Not a standalone vector database — requires Postgres administration
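The recall-versus-speed trade-off noted above can be illustrated with a toy inverted-file search in pure Python. This is a simplified sketch and not pgvector's IVFFlat implementation: centroids are sampled points rather than k-means centers, and all names are hypothetical. The `probes` argument plays the same role as pgvector's `ivfflat.probes` setting (HNSW has an analogous `hnsw.ef_search` setting): scanning more lists costs more work but finds more of the true nearest neighbors.

```python
import random

def sq_l2(a, b):
    """Squared L2 distance (same ordering as L2, cheaper to compute)."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def exact_top_k(query, vectors, k):
    """Indices of the k nearest vectors, found by a brute-force scan."""
    return sorted(range(len(vectors)), key=lambda i: sq_l2(query, vectors[i]))[:k]

def build_lists(vectors, centroids):
    """Assign every vector index to the list of its nearest centroid."""
    lists = [[] for _ in centroids]
    for i, v in enumerate(vectors):
        nearest = min(range(len(centroids)), key=lambda c: sq_l2(v, centroids[c]))
        lists[nearest].append(i)
    return lists

def approx_top_k(query, vectors, centroids, lists, k, probes):
    """Scan only the `probes` lists whose centroids are closest to the query."""
    order = sorted(range(len(centroids)), key=lambda c: sq_l2(query, centroids[c]))
    candidates = [i for c in order[:probes] for i in lists[c]]
    return sorted(candidates, key=lambda i: sq_l2(query, vectors[i]))[:k]

def recall(approx, exact):
    """Fraction of the true nearest neighbors that the approximate search found."""
    return len(set(approx) & set(exact)) / len(exact)

random.seed(0)
vectors = [[random.random() for _ in range(8)] for _ in range(500)]
centroids = random.sample(vectors, 10)  # toy choice, not k-means
lists = build_lists(vectors, centroids)
query = [random.random() for _ in range(8)]
exact = exact_top_k(query, vectors, 10)

recalls = {}
for probes in (1, 3, 10):
    approx = approx_top_k(query, vectors, centroids, lists, 10, probes)
    recalls[probes] = recall(approx, exact)
    print(f"probes={probes} recall={recalls[probes]:.2f}")
```

With `probes` equal to the number of lists, every vector is scanned and the result matches the exact search, so the tuning question in practice is how far below that you can go while keeping recall acceptable for your workload.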

Pros

  • + Native PostgreSQL integration preserves ACID compliance, transactions, and allows complex JOINs between vector and relational data
  • + Supports multiple vector types (single/half-precision, binary, sparse) and distance metrics (L2, cosine, inner product, L1, Hamming, Jaccard)
  • + Wide ecosystem compatibility with any language that has a Postgres client and available through multiple installation methods

Cons

  • - Requires PostgreSQL expertise and may have a steeper learning curve than dedicated vector databases
  • - Installation complexity varies by platform, especially on Windows systems
  • - Performance may not match specialized vector databases for very large-scale vector workloads

Use Cases

  • RAG (Retrieval Augmented Generation) applications where embeddings need to be stored alongside document metadata and user data
  • E-commerce recommendation systems that combine vector similarity with product catalog data and user preferences
  • Semantic search applications where vector queries need to be combined with traditional filters and business logic
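The common thread in these use cases is filtering on relational columns and ranking by vector similarity in the same query. The sketch below mimics that pattern in pure Python over an in-memory list, roughly the shape of a SQL query with a `WHERE` clause followed by `ORDER BY embedding <=> query LIMIT k`. The product data, field names, and function names are all made up for illustration.

```python
import math

def cosine_distance(a, b):
    """1 - cosine similarity, the quantity pgvector's <=> operator returns."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)

def filtered_search(rows, query, category, max_price, limit):
    """Apply the relational filter first, then rank survivors by similarity."""
    matching = [r for r in rows if r["category"] == category and r["price"] <= max_price]
    matching.sort(key=lambda r: cosine_distance(r["embedding"], query))
    return matching[:limit]

# Hypothetical catalog rows with tiny 2-dimensional embeddings.
products = [
    {"id": 1, "category": "shoes", "price": 80, "embedding": [1.0, 0.0]},
    {"id": 2, "category": "shoes", "price": 120, "embedding": [0.9, 0.1]},
    {"id": 3, "category": "hats", "price": 20, "embedding": [1.0, 0.1]},
    {"id": 4, "category": "shoes", "price": 60, "embedding": [0.0, 1.0]},
    {"id": 5, "category": "shoes", "price": 90, "embedding": [0.7, 0.7]},
]

results = filtered_search(products, [1.0, 0.2], category="shoes", max_price=100, limit=2)
print([r["id"] for r in results])
```

Products 2 and 3 are close to the query vector but are excluded by the price and category filters, which is the behavior a standalone similarity lookup without relational context would not give you.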

Getting Started

1. Install the pgvector extension (compile from source, use a package manager, or choose a hosted provider).
2. Enable the extension in your database with CREATE EXTENSION vector; then create a table with a vector column, for example: CREATE TABLE items (id bigserial PRIMARY KEY, embedding vector(3));
3. Insert vector data with INSERT INTO items (embedding) VALUES ('[1,2,3]'); and query nearest neighbors with SELECT * FROM items ORDER BY embedding <-> '[3,1,2]' LIMIT 5;
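From application code, the same statements are usually sent through a Postgres client. The sketch below shows only the client-side pieces that can run without a database: formatting a Python list as the `'[1,2,3]'` text form used above, and an in-memory emulation of the `ORDER BY embedding <-> ... LIMIT` ordering. The helper names are hypothetical, and the `%s` placeholder style in the SQL strings is an assumption that depends on your driver.

```python
import math

def to_vector_literal(values):
    """Format numbers as pgvector text input, e.g. [1, 2, 3] -> '[1,2,3]'."""
    return "[" + ",".join(str(v) for v in values) + "]"

# Parameterized forms of the statements above. The %s placeholder style is an
# assumption; check the parameter syntax of the Postgres driver you use.
INSERT_SQL = "INSERT INTO items (embedding) VALUES (%s::vector)"
QUERY_SQL = "SELECT * FROM items ORDER BY embedding <-> %s::vector LIMIT 5"

def nearest(rows, query, limit):
    """Emulate ORDER BY embedding <-> query LIMIT limit over in-memory rows."""
    return sorted(rows, key=lambda row: math.dist(row["embedding"], query))[:limit]

items = [
    {"id": 1, "embedding": [1, 2, 3]},
    {"id": 2, "embedding": [4, 5, 6]},
    {"id": 3, "embedding": [3, 1, 2]},
]

query = [3, 1, 2]
print(to_vector_literal(query))
print([row["id"] for row in nearest(items, query, limit=5)])
```

In a real application the literal returned by `to_vector_literal` would be passed as the parameter for `INSERT_SQL` or `QUERY_SQL`, and Postgres would do the ordering instead of the `nearest` helper.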
