Problem: Flowise Chatbots Forget Everything Between Sessions
By default, Flowise stores conversation memory in RAM. Restart the server — or close the browser tab — and the chatbot starts fresh. For production chatbots, this is a dealbreaker.
PostgreSQL with pgvector gives you persistent vector storage that survives restarts, scales to millions of embeddings, and lets multiple Flowise instances share the same memory.
You'll learn:
- How to spin up PostgreSQL with pgvector enabled
- How to wire the Postgres Vector Store node in Flowise
- How to connect it to a Buffer Memory node for true long-term recall
Time: 25 min | Difficulty: Intermediate
Why RAM Memory Breaks in Production
Flowise's default BufferMemory node stores chat history as a JavaScript array in the Node.js process. Three things kill it:
- Server restart wipes all sessions
- Horizontal scaling means each container has its own isolated memory
- Long conversations eventually hit RAM limits and get truncated
pgvector solves the first two completely. For truncation, you pair it with a retrieval strategy (covered in Step 5).
What you need before starting:
- Flowise 2.x running (Docker or local)
- Docker installed for the PostgreSQL container
- An OpenAI API key (or any embedding model Flowise supports)
Solution
Step 1: Start PostgreSQL with pgvector
The official pgvector/pgvector image ships with the extension pre-installed.
# Start a pgvector-enabled Postgres container
docker run -d \
--name flowise-pgvector \
-e POSTGRES_USER=flowise \
-e POSTGRES_PASSWORD=flowise_secret \
-e POSTGRES_DB=flowise_memory \
-p 5432:5432 \
pgvector/pgvector:pg16
# Verify the container is running
docker ps | grep flowise-pgvector
Expected output:
abc123def456 pgvector/pgvector:pg16 ... Up 5 seconds 0.0.0.0:5432->5432/tcp
Now enable the pgvector extension inside the database:
docker exec -it flowise-pgvector psql -U flowise -d flowise_memory -c "CREATE EXTENSION IF NOT EXISTS vector;"
Expected output:
CREATE EXTENSION
If it fails:
- `could not open extension control file` → You're on a plain `postgres` image, not `pgvector/pgvector`. Pull the correct image.
- `permission denied` → Add `--user postgres` to the `docker exec` command.
Step 2: Create the Embeddings Table
Flowise will create the table automatically on first use, but creating it manually gives you control over the vector dimensions upfront. OpenAI text-embedding-3-small outputs 1536 dimensions.
-- Connect to the database
docker exec -it flowise-pgvector psql -U flowise -d flowise_memory
-- Create the table Flowise expects
CREATE TABLE IF NOT EXISTS documents (
id BIGSERIAL PRIMARY KEY,
content TEXT,
metadata JSONB,
embedding vector(1536) -- match your embedding model's output size
);
-- Index for fast ANN search (IVFFlat is good for < 1M rows)
CREATE INDEX IF NOT EXISTS documents_embedding_idx
ON documents
USING ivfflat (embedding vector_cosine_ops)
WITH (lists = 100);
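The `lists = 100` value above is a sensible default at tutorial scale. The pgvector README suggests roughly `rows / 1000` lists for up to 1M rows and `sqrt(rows)` beyond that; a small sketch of that rule of thumb:

```python
import math

def ivfflat_lists(row_count: int) -> int:
    """Rule-of-thumb IVFFlat list count from the pgvector README:
    rows / 1000 up to 1M rows, sqrt(rows) beyond that."""
    if row_count <= 1_000_000:
        return max(1, row_count // 1000)
    return int(math.sqrt(row_count))

print(ivfflat_lists(100_000))    # 100 lists for 100k rows
print(ivfflat_lists(4_000_000))  # 2000 lists for 4M rows
```

Recreate the index with a new `lists` value as your table grows; IVFFlat does not rebalance on its own.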
Dimension reference by model:
| Embedding Model | Dimensions |
|---|---|
| text-embedding-3-small | 1536 |
| text-embedding-3-large | 3072 |
| text-embedding-ada-002 | 1536 |
| Ollama nomic-embed-text | 768 |
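A mismatch between the model's output size and the `vector(N)` column is the most common silent failure at this step: inserts fail with an "expected 1536 dimensions" error. A minimal sanity-check sketch, mirroring the reference table above (the zero-filled embedding is a stand-in for a real model output):

```python
# Dimension lookup mirroring the reference table above
MODEL_DIMS = {
    "text-embedding-3-small": 1536,
    "text-embedding-3-large": 3072,
    "text-embedding-ada-002": 1536,
    "nomic-embed-text": 768,
}

def check_dims(model: str, embedding: list[float], table_dim: int = 1536) -> None:
    """Raise early if the embedding can't fit the vector(table_dim) column."""
    expected = MODEL_DIMS.get(model)
    if expected is not None and expected != table_dim:
        raise ValueError(f"{model} outputs {expected} dims, table expects {table_dim}")
    if len(embedding) != table_dim:
        raise ValueError(f"got {len(embedding)} dims, table expects {table_dim}")

# A 1536-dim vector passes; a 768-dim one from nomic-embed-text would raise
check_dims("text-embedding-3-small", [0.0] * 1536)
```

If you switch models later, drop and recreate the `documents` table with the new dimension; pgvector cannot alter a column's vector size in place with existing data of a different size.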
Step 3: Add the Postgres Vector Store Node in Flowise
Open your Flowise chatflow and add the following nodes from the node panel:
- ChatOpenAI (or your LLM of choice)
- OpenAI Embeddings — set model to text-embedding-3-small
- Postgres (under Vector Stores)
Click the Postgres node to configure it:
| Field | Value |
|---|---|
| Host | localhost (or your Docker host IP) |
| Port | 5432 |
| Database | flowise_memory |
| Username | flowise |
| Password | flowise_secret |
| Table Name | documents |
| Embeddings | Connect your OpenAI Embeddings node |
Click Test Connection in the node. You should see a green checkmark.
If the connection fails:
- `ECONNREFUSED` → Flowise can't reach port 5432. If Flowise is also in Docker, use the container name (`flowise-pgvector`) as the host and put both containers on the same Docker network.
- `password authentication failed` → Double-check `POSTGRES_USER` and `POSTGRES_PASSWORD` match what you set in Step 1.
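If Flowise itself runs in Docker, the cleanest fix for connection refusals is running both services from one Compose file, which puts them on a shared network automatically. A sketch, assuming the official `flowiseai/flowise` image on its default port 3000 (adjust names and tags to your setup); it also adds a named volume so the embeddings survive `docker rm`:

```yaml
services:
  flowise:
    image: flowiseai/flowise:latest
    ports:
      - "3000:3000"
    depends_on:
      - flowise-pgvector

  flowise-pgvector:
    image: pgvector/pgvector:pg16
    environment:
      POSTGRES_USER: flowise
      POSTGRES_PASSWORD: flowise_secret
      POSTGRES_DB: flowise_memory
    volumes:
      - pgdata:/var/lib/postgresql/data   # embeddings survive container removal

volumes:
  pgdata:
```

With this layout, set Host in the Postgres node to `flowise-pgvector` instead of `localhost`.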
Step 4: Connect Postgres Vector Store to Buffer Memory
The Postgres Vector Store node handles semantic retrieval. For conversational memory (the last N messages), connect it to a Postgres-Backed Buffer Memory node.
Add these nodes:
- Buffer Memory node (under Memory)
- Wire: Postgres Vector Store → ConversationalRetrievalQAChain (retriever input) and Buffer Memory → ConversationalRetrievalQAChain (memory input)
Configure Buffer Memory:
| Field | Value |
|---|---|
| Session ID | {sessionId} (Flowise injects this automatically) |
| Memory Key | chat_history |
| Return Messages | true |
Your final node graph should look like:
OpenAI Embeddings ──────────┐
                            ▼
Document Loader ──▶ Postgres Vector Store ──▶ ConversationalRetrievalQAChain ──▶ Response
                                                     ▲               ▲
                                               Buffer Memory    ChatOpenAI
Step 5: Configure Retrieval Parameters
Click the Postgres Vector Store node and expand Additional Parameters:
| Parameter | Recommended Value | Why |
|---|---|---|
| Top K | 4 | Returns 4 most relevant chunks — enough context without bloating the prompt |
| Score Threshold | 0.75 | Filters out weak matches; cosine similarity below 0.75 is usually noise |
| Fetch K | 20 | Fetches 20 candidates, then MMR reranks to Top K for diversity |
Enable MMR (Maximal Marginal Relevance) if your documents have repetitive content. It deduplicates semantically similar results before passing them to the LLM.
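To make the Fetch K → MMR → Top K pipeline concrete, here is a toy greedy MMR over plain Python lists (the 2-D vectors and the 0.75 threshold are illustrative; Flowise delegates the real implementation to LangChain):

```python
import math

def cosine(a, b):
    """Cosine similarity between two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

def mmr(query, candidates, top_k=4, lambda_mult=0.5, threshold=0.75):
    """Greedy MMR: drop weak matches, then balance relevance to the
    query against redundancy with chunks already selected."""
    pool = [c for c in candidates if cosine(query, c) >= threshold]
    selected = []
    while pool and len(selected) < top_k:
        def score(c):
            relevance = cosine(query, c)
            redundancy = max((cosine(c, s) for s in selected), default=0.0)
            return lambda_mult * relevance - (1 - lambda_mult) * redundancy
        best = max(pool, key=score)
        selected.append(best)
        pool.remove(best)
    return selected
```

With `lambda_mult` at 0.5, relevance and diversity are weighted equally; raising it toward 1.0 behaves like plain top-K similarity search.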
Verification
Save and deploy the chatflow. Send a message, then verify the embedding was stored:
docker exec -it flowise-pgvector psql -U flowise -d flowise_memory \
-c "SELECT id, LEFT(content, 80) AS preview, metadata FROM documents LIMIT 5;"
You should see:
id | preview | metadata
----+--------------------------------------------------+------------------------
1 | User asked about return policy for order #4521.. | {"sessionId": "abc..."}
2 | Assistant explained 30-day return window... | {"sessionId": "abc..."}
Now restart Flowise and re-open the chat. Ask a follow-up question that requires context from the previous session. The chatbot should recall it.
# Also check row count grows as you chat
docker exec -it flowise-pgvector psql -U flowise -d flowise_memory \
-c "SELECT COUNT(*) FROM documents;"
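Cross-session recall works because each stored chunk carries its `sessionId` in the `metadata` JSONB column, as the preview query above shows. If you ever need to scope rows per session outside Flowise, the filter is just a metadata match; a toy version over rows shaped like that query output (the exact row structure is an assumption based on the table schema):

```python
def rows_for_session(rows, session_id):
    """Keep only chunks whose metadata ties them to one chat session."""
    return [r for r in rows if r.get("metadata", {}).get("sessionId") == session_id]

rows = [
    {"id": 1, "content": "User asked about return policy", "metadata": {"sessionId": "abc"}},
    {"id": 2, "content": "Unrelated session",              "metadata": {"sessionId": "xyz"}},
]
print(rows_for_session(rows, "abc"))  # only the row from session "abc"
```

The SQL equivalent is a JSONB predicate like `metadata->>'sessionId' = 'abc'`, which you can also use to delete a single user's history.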
What You Learned
- pgvector persists embeddings to disk — memory survives server restarts and horizontal scaling
- `ivfflat` indexing keeps retrieval fast up to ~1M rows; switch to `hnsw` beyond that
- Score threshold (0.75) is the most important tuning knob — too low and you inject irrelevant context, too high and you miss useful history
- MMR prevents the LLM from receiving five near-identical chunks when the user asks a repeated question
When NOT to use this setup: For simple FAQ bots with no need for cross-session recall, in-memory BufferMemory is faster and has zero infrastructure overhead. Add PostgreSQL only when persistence or multi-instance scaling is a real requirement.
Tested on Flowise 2.1.4, pgvector 0.7.0 (PostgreSQL 16), OpenAI text-embedding-3-small