Problem: Your PostgreSQL Data Can't Answer "Find Something Similar"
LIKE queries match exact strings. Full-text search matches keywords. Neither understands meaning.
When a user searches "affordable sedans under 30k" in your car listings database, they won't find rows containing "budget compact vehicles" — even though it's the same intent.
pgvector fixes this. It adds a vector column type to PostgreSQL, lets you store text embeddings alongside your existing data, and answers similarity questions with a single SQL query.
You'll learn:
- How to install pgvector and create your first vector column
- How to generate and store OpenAI embeddings in PostgreSQL
- How to run fast ANN (approximate nearest neighbor) queries with HNSW indexes
- How to combine semantic search with traditional SQL filters (hybrid search)
Time: 25 min | Difficulty: Intermediate
Why pgvector Instead of a Dedicated Vector DB
Pinecone, Qdrant, and Weaviate are purpose-built for vectors. They're excellent — but they're extra infrastructure, extra cost, and extra sync complexity.
If you're already on PostgreSQL, pgvector gives you 80–90% of the capability with zero new services to run. Your embeddings sit in the same database as your users, products, and orders. You get joins, transactions, backups, and row-level security for free.
When to use pgvector:
- Existing PostgreSQL stack
- < 5M vectors (scales further with partitioning)
- You want SQL-native hybrid search (vector + filters in one query)
When to use a dedicated vector DB:
- > 10M vectors with strict sub-10ms latency SLAs
- Multi-tenancy with per-tenant vector spaces at scale
- Teams without PostgreSQL expertise
How pgvector Works
pgvector stores vectors as a new column type (vector(dimensions)). Each row gets a floating-point array — the embedding of your text, image, or structured data.
At query time, pgvector computes the distance between your query vector and every stored vector. With an index, it approximates this with ANN search instead of exhaustive scan.
Text/Content ──▶ Embedding Model ──▶ [0.12, -0.45, 0.88, …] ──▶ vector column
│
Query Text ────▶ Embedding Model ──▶ query vector ──▶ <=> operator ──▶ Top-K results
Three distance operators:
- <=> — cosine distance (most common for text)
- <-> — L2 / Euclidean distance
- <#> — negative inner product (for dot-product similarity)
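For intuition, the <=> operator is just 1 minus cosine similarity. A minimal pure-Python sketch of the same calculation (pgvector computes this in C; this is for understanding only):

```python
import math

def cosine_distance(a: list[float], b: list[float]) -> float:
    # Mirrors pgvector's <=> operator: 1 - cosine similarity.
    # 0 = same direction, 1 = orthogonal, 2 = opposite.
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return 1 - dot / (norm_a * norm_b)

print(cosine_distance([1.0, 0.0], [1.0, 0.0]))   # 0.0 (identical)
print(cosine_distance([1.0, 0.0], [0.0, 1.0]))   # 1.0 (orthogonal)
print(cosine_distance([1.0, 0.0], [-1.0, 0.0]))  # 2.0 (opposite)
```

Note that cosine distance only cares about direction, not magnitude — which is why it's the usual choice for text embeddings.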
Solution
Step 1: Install pgvector
Docker (fastest for dev):
# Postgres 16 with pgvector pre-installed
docker run -d \
--name pgvector-dev \
-e POSTGRES_PASSWORD=secret \
-p 5432:5432 \
pgvector/pgvector:pg16
Existing PostgreSQL on Ubuntu:
# Install build deps (compiler toolchain plus Postgres headers)
sudo apt install build-essential git postgresql-server-dev-16
# Clone and build pgvector
git clone https://github.com/pgvector/pgvector.git
cd pgvector
make
sudo make install
Enable the extension in your database:
CREATE EXTENSION IF NOT EXISTS vector;
-- Verify
SELECT * FROM pg_extension WHERE extname = 'vector';
Expected output:
extname | extversion
---------+-----------
vector | 0.8.0
If it fails:
- ERROR: could not open extension control file → pgvector isn't installed at the OS level; repeat the build step above
- ERROR: extension "vector" already exists → safe to ignore, or use IF NOT EXISTS
Step 2: Create a Table with a Vector Column
-- Example: product catalog with semantic search
CREATE TABLE products (
id SERIAL PRIMARY KEY,
name TEXT NOT NULL,
description TEXT NOT NULL,
price NUMERIC(10, 2),
category TEXT,
-- 1536 dimensions = OpenAI text-embedding-3-small output size
embedding vector(1536)
);
The dimension count must match your embedding model exactly:
| Model | Dimensions |
|---|---|
| text-embedding-3-small | 1536 |
| text-embedding-3-large | 3072 |
| text-embedding-ada-002 | 1536 |
| nomic-embed-text (local) | 768 |
| mxbai-embed-large (local) | 1024 |
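A vector(1536) column will reject a mismatched embedding at insert time, but failing in application code gives a clearer error. A small guard you could add (the MODEL_DIMENSIONS mapping and function name are illustrative, not part of any library):

```python
# Expected output sizes for the models in the table above
MODEL_DIMENSIONS = {
    "text-embedding-3-small": 1536,
    "text-embedding-3-large": 3072,
    "text-embedding-ada-002": 1536,
    "nomic-embed-text": 768,
    "mxbai-embed-large": 1024,
}

def check_dimensions(model: str, embedding: list[float]) -> list[float]:
    # Fail fast if the embedding length doesn't match the column's
    # declared dimension for this model.
    expected = MODEL_DIMENSIONS[model]
    if len(embedding) != expected:
        raise ValueError(
            f"{model} should produce {expected} dims, got {len(embedding)}"
        )
    return embedding

check_dimensions("text-embedding-3-small", [0.0] * 1536)  # passes silently
```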
Step 3: Generate and Store Embeddings
Install dependencies:
pip install openai psycopg2-binary python-dotenv
# embed_products.py
import os
import psycopg2
from openai import OpenAI
client = OpenAI(api_key=os.environ["OPENAI_API_KEY"])
def get_embedding(text: str) -> list[float]:
# Embed the description — what users search for
response = client.embeddings.create(
model="text-embedding-3-small",
input=text
)
return response.data[0].embedding
def store_product(conn, name: str, description: str, price: float, category: str):
embedding = get_embedding(description)
with conn.cursor() as cur:
cur.execute(
"""
INSERT INTO products (name, description, price, category, embedding)
VALUES (%s, %s, %s, %s, %s)
""",
(name, description, price, category, embedding)
)
conn.commit()
# Sample data
products = [
("Trail Runner X9", "Lightweight hiking boot for rocky mountain terrain, waterproof", 189.99, "footwear"),
("Budget Road Bike", "Affordable entry-level road bicycle for commuting and fitness", 549.00, "cycling"),
("Noise-Cancel Pro Headphones", "Over-ear headphones with active noise cancellation for travel", 299.00, "audio"),
("Ultralight Tent", "Two-person backpacking tent weighing under 1kg, easy setup", 399.00, "camping"),
("Commuter E-Bike", "Electric bicycle for city commuting, 60km range per charge", 1299.00, "cycling"),
]
conn = psycopg2.connect("postgresql://postgres:secret@localhost:5432/postgres")
for name, desc, price, category in products:
store_product(conn, name, desc, price, category)
print(f"Stored: {name}")
conn.close()
Run it:
OPENAI_API_KEY=sk-... python embed_products.py
Expected output:
Stored: Trail Runner X9
Stored: Budget Road Bike
Stored: Noise-Cancel Pro Headphones
Stored: Ultralight Tent
Stored: Commuter E-Bike
Step 4: Run a Semantic Search Query
# search.py
import os
import psycopg2
from openai import OpenAI
client = OpenAI(api_key=os.environ["OPENAI_API_KEY"])
def semantic_search(query: str, top_k: int = 3) -> list[dict]:
# Embed the user's query with the same model used at insert time
query_embedding = client.embeddings.create(
model="text-embedding-3-small",
input=query
).data[0].embedding
conn = psycopg2.connect("postgresql://postgres:secret@localhost:5432/postgres")
with conn.cursor() as cur:
cur.execute(
"""
SELECT
name,
description,
price,
category,
-- cosine distance: 0 = identical, 2 = opposite
1 - (embedding <=> %s::vector) AS similarity
FROM products
ORDER BY embedding <=> %s::vector
LIMIT %s
""",
(query_embedding, query_embedding, top_k)
)
rows = cur.fetchall()
conn.close()
return [
{"name": r[0], "description": r[1], "price": r[2], "category": r[3], "similarity": round(r[4], 4)}
for r in rows
]
results = semantic_search("bike for getting around the city")
for r in results:
print(f"{r['similarity']:.4f} {r['name']} (${r['price']})")
Expected output:
0.8821 Commuter E-Bike ($1299.00)
0.7634 Budget Road Bike ($549.00)
0.4112 Trail Runner X9 ($189.99)
The E-Bike wins even though the query never says "electric" — the model understands "getting around the city" maps to commuter use.
Step 5: Add an HNSW Index for Production Speed
Without an index, pgvector does an exact sequential scan — fine for < 50k rows, too slow beyond that.
HNSW (Hierarchical Navigable Small World) is the recommended index type. It's faster at query time than IVFFlat and doesn't require a training step.
-- Create HNSW index on cosine distance
-- m: connections per node (16 is default; higher = better recall, more memory)
-- ef_construction: build-time search width (64 default; higher = better index quality)
CREATE INDEX ON products USING hnsw (embedding vector_cosine_ops)
WITH (m = 16, ef_construction = 64);
Tune query-time recall with ef_search (per session or globally):
-- Higher ef_search = better recall, slightly slower query
SET hnsw.ef_search = 100;
-- Now run your similarity query — same syntax, index is used automatically
SELECT name, 1 - (embedding <=> '[...]'::vector) AS similarity
FROM products
ORDER BY embedding <=> '[...]'::vector
LIMIT 5;
IVFFlat vs HNSW:
| | IVFFlat | HNSW |
|---|---|---|
| Index build speed | Fast | Slower |
| Query speed | Fast | Faster |
| Recall accuracy | Good | Better |
| Memory usage | Lower | Higher |
| Requires training data | Yes (need rows first) | No |
Rule of thumb: Use HNSW for new projects. IVFFlat only if memory is tightly constrained.
Step 6: Hybrid Search — Semantic + SQL Filters
The real power of pgvector over standalone vector databases: combine vector similarity with any SQL predicate in a single query.
def hybrid_search(query: str, category: str | None = None, max_price: float | None = None, top_k: int = 5) -> list[dict]:
query_embedding = client.embeddings.create(
model="text-embedding-3-small",
input=query
).data[0].embedding
    # Build filter conditions dynamically
    conditions = []
    filter_params = []
    if category:
        conditions.append("category = %s")
        filter_params.append(category)
    if max_price is not None:
        conditions.append("price <= %s")
        filter_params.append(max_price)
    where_clause = ("WHERE " + " AND ".join(conditions)) if conditions else ""
    # Parameters must match placeholder order in the SQL:
    # similarity expression, WHERE filters, ORDER BY, LIMIT
    params = [query_embedding, *filter_params, query_embedding, top_k]
sql = f"""
SELECT name, description, price, category,
1 - (embedding <=> %s::vector) AS similarity
FROM products
{where_clause}
ORDER BY embedding <=> %s::vector
LIMIT %s
"""
conn = psycopg2.connect("postgresql://postgres:secret@localhost:5432/postgres")
with conn.cursor() as cur:
cur.execute(sql, params)
rows = cur.fetchall()
conn.close()
return [
{"name": r[0], "description": r[1], "price": r[2], "category": r[3], "similarity": round(r[4], 4)}
for r in rows
]
# "Show me cycling gear under $600"
results = hybrid_search("something to ride around town", category="cycling", max_price=600)
for r in results:
print(f"{r['similarity']:.4f} {r['name']} (${r['price']})")
Expected output:
0.7634 Budget Road Bike ($549.00)
The E-Bike is excluded by the price <= 600 filter — vector similarity still ranks within the filtered set.
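One subtle point when assembling SQL dynamically: psycopg2 binds parameters positionally, so the params list must follow the order the %s placeholders appear in the final SQL — similarity expression first, then the WHERE filters, then ORDER BY, then LIMIT. A standalone sketch of the clause builder with that ordering made explicit (the function name is illustrative):

```python
def build_hybrid_query(query_embedding, category=None, max_price=None, top_k=5):
    conditions, filter_params = [], []
    if category:
        conditions.append("category = %s")
        filter_params.append(category)
    if max_price is not None:
        conditions.append("price <= %s")
        filter_params.append(max_price)
    where_clause = ("WHERE " + " AND ".join(conditions)) if conditions else ""
    sql = f"""
        SELECT name, 1 - (embedding <=> %s::vector) AS similarity
        FROM products
        {where_clause}
        ORDER BY embedding <=> %s::vector
        LIMIT %s
    """
    # Positional params mirror placeholder order in the SQL above
    params = [query_embedding, *filter_params, query_embedding, top_k]
    return sql, params

sql, params = build_hybrid_query([0.1, 0.2], category="cycling", max_price=600)
print(params)  # [[0.1, 0.2], 'cycling', 600, [0.1, 0.2], 5]
```

Keeping the filter values in their own list until the end makes it hard to get the ordering wrong as you add more optional filters.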
Verification
Check your index is actually being used:
EXPLAIN (ANALYZE, BUFFERS)
SELECT name, embedding <=> '[0.1, 0.2, ...]'::vector AS distance
FROM products
ORDER BY embedding <=> '[0.1, 0.2, ...]'::vector
LIMIT 5;
Look for Index Scan using products_embedding_idx in the output. If you see Seq Scan, the index isn't being hit — usually because the table is too small (PostgreSQL skips indexes on tiny tables) or because the ORDER BY operator doesn't match the index's operator class (e.g. <-> against an index built with vector_cosine_ops).
Check vector column sizes and row counts:
SELECT
COUNT(*) AS total_rows,
COUNT(embedding) AS rows_with_embeddings,
COUNT(*) - COUNT(embedding) AS missing_embeddings
FROM products;
You should see: missing_embeddings = 0 before going to production.
Production Considerations
Batch your embedding calls. OpenAI's API accepts up to 2048 inputs per request. Embedding one row at a time is slow and expensive.
# ✅ Batch embed — 100x faster than one-by-one
texts = [row["description"] for row in rows]
response = client.embeddings.create(model="text-embedding-3-small", input=texts)
embeddings = [item.embedding for item in response.data]
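For datasets larger than the 2048-input cap, you need to chunk before calling the API. A minimal sketch of the splitting logic (the helper name is illustrative; each yielded batch would go into one embeddings.create call):

```python
def batched(texts: list[str], batch_size: int = 2048):
    # OpenAI caps embedding requests at 2048 inputs per call;
    # yield successive slices no larger than that.
    for i in range(0, len(texts), batch_size):
        yield texts[i:i + batch_size]

# Illustrative: 5000 descriptions become three API calls, not 5000
all_texts = [f"description {i}" for i in range(5000)]
print([len(b) for b in list(batched(all_texts))])  # [2048, 2048, 904]
```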
Use pgvector with connection pooling. PgBouncer works fine; with transaction-mode pooling, set hnsw.ef_search via SET LOCAL inside the transaction rather than a session-level SET, since session state isn't preserved between transactions.
Keep embedding model consistent. If you switch from text-embedding-3-small to text-embedding-3-large, re-embed everything. Mixing dimensions or models in the same column produces garbage similarity scores.
Index maintenance. HNSW indexes update automatically on INSERT/UPDATE. For bulk loads (100k+ rows), insert first, then create the index — it builds much faster on existing data than incrementally.
Monitor index bloat for high-write tables:
SELECT pg_size_pretty(pg_relation_size('products_embedding_idx')) AS index_size;
What You Learned
- pgvector adds a native vector type to PostgreSQL — no new database required
- The <=> operator computes cosine distance; ORDER BY embedding <=> query_vec LIMIT k is your core query pattern
- HNSW indexes make similarity queries fast at scale; create them after bulk inserts
- Hybrid search (vector + SQL filters) is pgvector's biggest advantage over standalone vector databases — it's a single query, not two round trips
Limitation: pgvector's ANN search trades a small amount of recall accuracy for speed. For use cases requiring 100% recall (legal, medical), run exact search (SET enable_indexscan = off) on a partitioned subset rather than the full table.
Tested on pgvector 0.8.0, PostgreSQL 16.3, Python 3.12, openai SDK 1.30, Ubuntu 24.04 and macOS 14