Syncing Embeddings in PostgreSQL with pgedge-vectorizer

pgEdge

If you're building RAG on PostgreSQL, the operationally painful part isn't the search. It's keeping embeddings in sync as data changes.

Ahsan Hadi's latest blog walks through pgedge-vectorizer, a PostgreSQL background worker that monitors source tables via triggers, chunks text, calls your embedding provider (OpenAI, Voyage AI, or Ollama), and updates the chunk table automatically on insert or update. No external orchestration, no custom CDC scripts, no scheduled jobs. When a row changes, only that row gets re-processed.

The companion RAG Server handles retrieval and generation. It runs hybrid search: vector similarity combined with BM25 keyword matching, with the two result lists merged via Reciprocal Rank Fusion, so a single query returns both semantic matches and exact keyword hits. The server also handles token budget management and the LLM call, so your application just hits an HTTP endpoint.

The full walkthrough, including schema setup, trigger behavior, hybrid search config, and working curl examples, is here: https://hubs.la/Q04c8C1n0

#postgres #postgresql #sql #data #vector #ai #llm #artificialintelligence #llmops #rag #ragserver #aidev #aiappdev #appdev #vectorsearch #tech #bigdata #technology
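
For readers unfamiliar with Reciprocal Rank Fusion, the merge step mentioned in the post can be sketched in a few lines of Python. This is a generic illustration of the technique, not the RAG Server's actual code: the function name, the document IDs, and the constant k=60 (a commonly used default for RRF) are all assumptions made for the example. Each document scores 1 / (k + rank) in every list it appears in, and the scores are summed.

```python
from collections import defaultdict

def reciprocal_rank_fusion(rankings, k=60):
    """Merge several ranked lists of document IDs into one list.

    rankings: list of lists, each ordered best-first.
    k: smoothing constant (60 is a common default; assumed here,
       not taken from the RAG Server's configuration).
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by ID for determinism.
    return sorted(scores, key=lambda d: (-scores[d], d))

# Hypothetical result lists from the two retrievers.
vector_hits = ["doc_a", "doc_b", "doc_c"]  # semantic similarity order
bm25_hits = ["doc_c", "doc_a", "doc_d"]    # keyword match order

fused = reciprocal_rank_fusion([vector_hits, bm25_hits])
print(fused)
```

Documents that rank well in both lists (here doc_a and doc_c) rise above those found by only one retriever, which is why the fusion step needs only ranks and not comparable raw scores.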

Check out Ahsan Hadi's latest blog, which walks through pgedge-vectorizer, a PostgreSQL background worker that monitors source tables via triggers, chunks text, calls your embedding provider (OpenAI, Voyage AI, or Ollama), and updates the chunk table automatically on insert or update.
