RAG the Easy Way in n8n

n8n ships first‑class AI nodes for text splitting, embeddings, vector stores, retrievers, chains, and even reranking. That gives you a complete, click‑together stack instead of glue code. Examples include Default Data Loader, Character/Recursive/Token Text Splitters, OpenAI/Cohere Embeddings, Vector Store nodes for PGVector/Supabase/Pinecone/Weaviate/Qdrant, Vector Store Retriever, Question & Answer Chain, and Cohere Reranker.

The blueprint (one screen in n8n)

  1. Ingest + Chunk: Default Data Loader -> Recursive Character Text Splitter (or Token Splitter for token-aware chunks).
  2. Embed: Embeddings OpenAI (or Embeddings Cohere, etc.). Choose your model once and reuse it for both indexing and querying.
  3. Store: PGVector Vector Store for Postgres/pgvector, or a managed vector DB like Supabase (pgvector), Pinecone, Weaviate, or Qdrant.
  4. Retrieve: Vector Store Retriever -> (optional) Cohere Reranker to reorder top‑k by semantic relevance.
  5. Answer: Question & Answer Chain (fed by the retriever) -> Basic LLM Chain or Chat node for formatting and guardrails.
  6. Attach Sources: Carry metadata with each chunk (e.g. metadata.source, metadata.url) and render it at the end of the reply.

Step‑by‑step workflow

1) Chunk the docs: Use Default Data Loader to read PDFs/HTML/JSON, then Recursive Character Text Splitter with a sensible chunk_size and chunk_overlap. Use Token Splitter if you want token‑aware chunking.
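
If you're curious what the splitter node is doing under the hood (or want to replicate it in a Code node), here's a minimal sketch using LangChain's RecursiveCharacterTextSplitter, which n8n's AI nodes build on. The package name, chunk sizes, and file name are illustrative assumptions:

// Sketch: character-based chunking outside n8n.
// Assumes the @langchain/textsplitters package; sizes are illustrative.
import { RecursiveCharacterTextSplitter } from "@langchain/textsplitters";

const rawText = "Long document text from the Default Data Loader step.";

const splitter = new RecursiveCharacterTextSplitter({
  chunkSize: 1000,   // max characters per chunk
  chunkOverlap: 200, // overlap so sentences aren't cut off between chunks
});

// Each chunk keeps the metadata you pass in, which later becomes your citation.
const chunks = await splitter.createDocuments([rawText], [{ source: "handbook.pdf" }]);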

2) Create embeddings: Add Embeddings OpenAI and pick a model that matches your storage dimension:

  • text-embedding-3-small: 1536‑dim (cost‑effective).
  • text-embedding-3-large: 3072‑dim by default (higher quality; can be shortened via the API's dimensions parameter if needed).
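
If you ever need embeddings outside the node (say, for a backfill script), the equivalent raw OpenAI SDK call looks like this (a sketch; the input text is illustrative and OPENAI_API_KEY is read from the environment):

import OpenAI from "openai";

const openai = new OpenAI();

const res = await openai.embeddings.create({
  model: "text-embedding-3-small",      // 1536-dim, matches vector(1536) below
  input: "How do I reset my password?",
  // With text-embedding-3-large you can pass dimensions: 1536 to shorten
  // the vector so it still fits the same column.
});

const embedding = res.data[0].embedding; // number[] ready for the vector store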

3) Store vectors:

  • Option A – pgvector (self‑hosted Postgres) via the PGVector Vector Store node. Configure column names for embedding, content, and metadata.
  • Option B – Managed vector DB. Use Supabase Vector Store (pgvector‑backed) or a purpose‑built service like Pinecone, Weaviate, or Qdrant with their vector store nodes.
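
For backfills outside n8n, writing a chunk into pgvector is a single INSERT. A sketch with node-postgres against the docs table defined in the next section (pgvector accepts the '[0.1,0.2,...]' literal that JSON.stringify produces, and node-postgres serializes the metadata object to jsonb):

import { Client } from "pg";

// Sketch: store one chunk plus its embedding and citation metadata.
async function storeChunk(
  db: Client,
  content: string,
  embedding: number[],
  source: string,
  metadata: Record<string, unknown>,
) {
  await db.query(
    "INSERT INTO docs (content, embedding, source, metadata) VALUES ($1, $2, $3, $4)",
    [content, JSON.stringify(embedding), source, metadata],
  );
}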

Schema & indexes (pgvector)

Run this once in your Postgres:

-- enable the extension
CREATE EXTENSION IF NOT EXISTS vector;

-- pick dimensions that match your embedding model (1536 for text-embedding-3-small, 3072 for -3-large)
CREATE TABLE docs (
  id          bigserial PRIMARY KEY,
  content     text NOT NULL,
  embedding   vector(1536),
  source      text,
  metadata    jsonb
);

-- cosine distance index (HNSW) for speed at scale
CREATE INDEX ON docs USING hnsw (embedding vector_cosine_ops);

pgvector supports exact search and ANN indexes (HNSW, IVFFlat) with distance operators like <-> (L2) and <=> (cosine). The HNSW index above accelerates cosine‑distance queries. One caveat: pgvector's ANN indexes currently cap the vector type at 2,000 dimensions, so for 3,072‑dim embeddings either shorten them with the dimensions parameter or store them as halfvec.
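
A top‑k retrieval query against this table then looks like the following sketch (node-postgres again; k and the similarity formula are illustrative):

import { Client } from "pg";

// Sketch: top-k cosine retrieval against the docs table above.
async function topK(db: Client, queryEmbedding: number[], k = 5) {
  const { rows } = await db.query(
    `SELECT content, source, metadata,
            1 - (embedding <=> $1::vector) AS similarity  -- <=> is cosine distance
       FROM docs
      ORDER BY embedding <=> $1::vector                   -- uses the HNSW index
      LIMIT $2`,
    [JSON.stringify(queryEmbedding), k],
  );
  return rows; // best matches first; these become the chain's context
}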

4) Retrieve context: Wire Vector Store Retriever to the store. Set k and metadata filters (e.g. metadata.department = "Support"). Then optionally add Cohere Reranker between retriever and chain to reorder results by cross‑encoder relevance.
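
To see what the reranker contributes, here's the underlying Cohere call as a sketch with the cohere-ai SDK; the model name, query, and candidate texts are illustrative assumptions:

import { CohereClient } from "cohere-ai";

const cohere = new CohereClient({ token: process.env.COHERE_API_KEY });

// Stand-ins for the chunk texts the retriever returned.
const candidates = [
  "To reset your password, open Settings > Security and choose Reset.",
  "Billing plans can be changed from the Plans tab by an admin.",
];

const reranked = await cohere.rerank({
  model: "rerank-english-v3.0", // assumption: use whichever rerank model you have
  query: "How do I reset my password?",
  documents: candidates,
  topN: 1, // keep only the best chunk(s) for the chain
});

// Results come back sorted by relevanceScore, each carrying the index
// of the original candidate, so you can map scores back to chunks.
for (const r of reranked.results) {
  console.log(r.relevanceScore, candidates[r.index]);
}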

5) Answer with citations: Connect the retriever to Question & Answer Chain. In your final formatter (Basic LLM Chain or a Code/Set node), join the citations you stored in metadata:

Answer…

Sources:
- {{ $json.results.map(r => r.metadata.url || r.metadata.source).join("\n- ") }}
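
If you format in a Code node instead, the same idea with de‑duplication might look like this sketch ($input.all() reads the node's incoming items; the metadata fields follow the convention above):

// n8n Code node sketch: gather unique citation links from retrieved items.
const sources = [
  ...new Set(
    $input.all().map((item) => item.json.metadata?.url ?? item.json.metadata?.source),
  ),
].filter(Boolean);

return [{ json: { sourcesBlock: "Sources:\n- " + sources.join("\n- ") } }];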

pgvector vs. managed vector DB

  • pgvector (Postgres): simple ops footprint, SQL you already know, cheap, single DB to back up. Add HNSW/IVFFlat indexes as you grow.
  • Supabase (pgvector managed): hosted Postgres with pgvector and a native n8n integration.
  • Pinecone / Weaviate / Qdrant: purpose‑built vector stores with n8n nodes and retriever patterns; great when you want multi‑tenant collections, advanced filtering, or hands‑off scaling.
