DraganSr: AI: Postgres + pgvector + RAG

Sunday, June 23, 2024

AI: Postgres + pgvector + RAG

Postgres can store "vectors" (embeddings) for AI/RAG use cases
once the "pgvector" extension is added.

The extension can be installed and enabled when you have full access to the database.

pgvector/pgvector: Open-source vector similarity search for Postgres @GitHub

docker run -d -p 5430:5432 --name pgvector16 -e POSTGRES_PASSWORD=mysecret pgvector/pgvector:pg16 postgres
docker exec -it -u postgres pgvector16 psql

For customization and custom Docker image builds, you can use this Dockerfile as a base:

pgvector/Dockerfile at master · pgvector/pgvector

On AWS RDS and Aurora, the extension is already installed
and can be enabled with this SQL command:

CREATE EXTENSION IF NOT EXISTS vector;
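
Once the extension is enabled, a first table and a nearest-neighbor query could look roughly like the sketch below. The SQL follows the getting-started pattern in the pgvector README; the table name `items`, the 3-dimension size, the `%s` placeholders (psycopg style), the connection string for the Docker container above, and the helper `to_vector_literal` are all assumptions made for illustration. The script only prints the statements, so it runs without a database.

```python
from typing import Sequence

# Assumed connection string for the local container started above
# (port 5430, password "mysecret", default user and database "postgres").
LOCAL_DSN = "postgresql://postgres:mysecret@localhost:5430/postgres"

# SQL in the style of the pgvector README; "items" and vector(3) are examples.
ENABLE_EXTENSION = "CREATE EXTENSION IF NOT EXISTS vector;"
CREATE_TABLE = (
    "CREATE TABLE IF NOT EXISTS items "
    "(id bigserial PRIMARY KEY, embedding vector(3));"
)
# %s is a driver-side placeholder (psycopg style), not part of SQL itself.
INSERT_ROW = "INSERT INTO items (embedding) VALUES (%s::vector);"
# <-> is pgvector's L2 distance operator; smaller means closer.
NEAREST = "SELECT id FROM items ORDER BY embedding <-> %s::vector LIMIT 5;"


def to_vector_literal(values: Sequence[float]) -> str:
    """Format numbers in pgvector's text form, e.g. [1.0,2.0,3.0]."""
    return "[" + ",".join(str(float(v)) for v in values) + "]"


if __name__ == "__main__":
    for statement in (ENABLE_EXTENSION, CREATE_TABLE, INSERT_ROW, NEAREST):
        print(statement)
    print("parameter:", to_vector_literal([1, 2, 3]))
```

With a real driver, the literal returned by `to_vector_literal` would be passed as the parameter for `%s`.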

An alternative is to use a platform built on Postgres, such as Supabase.

The missing pieces to your AI app (pgvector + RAG in prod) - YouTube

A step-by-step guide to going from pgvector to prod using Supabase. We'll discuss best practices across the board so that you can be confident deploying your application in the real world. Learn more about pgvector: https://supabase.com/docs/guides/data...

Workshop GitHub repo: https://github.com/supabase-community...

It's easy to build an AI proof-of-concept (POC), but how do you turn that into a real production-ready application? What are the best practices when implementing:
- Retrieval augmented generation (RAG)
- Authorization (row level security)
- Embedding generation (open source models)
- pgvector indexes
- Similarity calculations
- REST APIs
- File storage

Large language model - Wikipedia (LLM)

Retrieval Augmented Generation (RAG) and Semantic Search for GPTs | OpenAI Help Center

What Is Retrieval Augmented Generation (RAG)? | Google Cloud

RAGs operate with a few main steps to help enhance generative AI outputs:

Retrieval and Pre-processing: RAGs leverage powerful search algorithms to query external data, such as web pages, knowledge bases, and databases. Once retrieved, the relevant information undergoes pre-processing, including tokenization, stemming, and removal of stop words.
Generation: The pre-processed retrieved information is then seamlessly incorporated into the pre-trained LLM. This integration enhances the LLM's context, providing it with a more comprehensive understanding of the topic. This augmented context enables the LLM to generate more precise, informative, and engaging responses.
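
The two steps quoted above can be sketched in a few lines. This is a toy illustration only: the stop-word list, the suffix-stripping "stemmer", and the prompt wording are assumptions invented for this example, and a real pipeline would normally rely on an NLP library and an actual LLM call.

```python
import re
from typing import List

# Tiny, hand-picked stop-word list (an assumption for this sketch).
STOP_WORDS = {"a", "an", "the", "is", "of", "and", "to", "in", "for", "with"}


def naive_stem(word: str) -> str:
    """Strip a few common suffixes; far cruder than a real stemmer."""
    for suffix in ("ing", "ed", "s"):
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def preprocess(text: str) -> List[str]:
    """Tokenize, drop stop words, then stem what is left."""
    tokens = re.findall(r"[a-z0-9]+", text.lower())
    return [naive_stem(t) for t in tokens if t not in STOP_WORDS]


def build_prompt(question: str, passages: List[str]) -> str:
    """Place retrieved passages in front of the question for the LLM."""
    context = "\n".join(f"- {p}" for p in passages)
    return (
        "Answer using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}"
    )


if __name__ == "__main__":
    print(preprocess("Storing vectors in the database"))
    print(build_prompt("What is pgvector?",
                       ["pgvector adds a vector type to Postgres"]))
```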

RAG operates by first retrieving relevant information from a database using a query generated by the LLM. This retrieved information is then integrated into the LLM's query input, enabling it to generate more accurate and contextually relevant text. RAG leverages vector databases, which store data in a way that facilitates efficient search and retrieval.
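
The retrieval half of that description can be imitated in memory without any database. In the sketch below the three-dimensional "embeddings" are made-up numbers rather than the output of a real embedding model, and `STORE`, `cosine_similarity`, and `top_k` are hypothetical names; in practice the vectors would live in a pgvector column and the ranking would be done by a SQL query.

```python
import math
from typing import List, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


# (text, embedding) pairs; the numbers are invented for illustration.
STORE: List[Tuple[str, List[float]]] = [
    ("pgvector adds a vector type to Postgres", [0.9, 0.1, 0.0]),
    ("Docker runs containers", [0.0, 0.2, 0.9]),
    ("RAG feeds retrieved text to an LLM", [0.6, 0.7, 0.1]),
]


def top_k(query_vec: Sequence[float],
          store: List[Tuple[str, List[float]]],
          k: int = 2) -> List[str]:
    """Return the k texts whose embeddings are most similar to the query."""
    ranked = sorted(
        store,
        key=lambda item: cosine_similarity(query_vec, item[1]),
        reverse=True,
    )
    return [text for text, _ in ranked[:k]]


if __name__ == "__main__":
    # The query vector would normally come from the same embedding model.
    for passage in top_k([1.0, 0.2, 0.0], STORE):
        print(passage)
```

The returned passages are what would be added to the LLM's input as context.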

What is RAG? - Retrieval-Augmented Generation Explained - AWS

pgvector/pgvector: Open-source vector similarity search for Postgres @GitHub

Open-source vector similarity search for Postgres
Store your vectors with the rest of your data. Supports:
exact and approximate nearest neighbor search
single-precision, half-precision, binary, and sparse vectors
L2 distance, inner product, cosine distance, L1 distance, Hamming distance, and Jaccard distance
any language with a Postgres client
Plus ACID compliance, point-in-time recovery, JOINs, and all of the other great features of Postgres
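
For reference, the six distance measures in that list can be written out in plain Python. This is a sketch of the math only, not pgvector's implementation. As far as the pgvector README describes them, the matching SQL operators are `<->` (L2), `<#>` (negative inner product), `<=>` (cosine), `<+>` (L1), `<~>` (Hamming), and `<%>` (Jaccard), with the last two applying to binary vectors. Returning 0.0 for the Jaccard distance of two all-zero vectors is an assumption of this sketch.

```python
import math
from typing import Sequence


def l2_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Euclidean (straight-line) distance."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def inner_product(a: Sequence[float], b: Sequence[float]) -> float:
    """Dot product; larger means more similar, so it is not a distance."""
    return sum(x * y for x, y in zip(a, b))


def cosine_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """1 minus cosine similarity; assumes neither vector is all zeros."""
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - inner_product(a, b) / (norm_a * norm_b)


def l1_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Manhattan distance: sum of absolute differences."""
    return sum(abs(x - y) for x, y in zip(a, b))


def hamming_distance(a: Sequence[int], b: Sequence[int]) -> int:
    """Number of positions where two bit sequences differ."""
    return sum(1 for x, y in zip(a, b) if x != y)


def jaccard_distance(a: Sequence[int], b: Sequence[int]) -> float:
    """1 minus (bits set in both / bits set in either)."""
    both = sum(1 for x, y in zip(a, b) if x and y)
    either = sum(1 for x, y in zip(a, b) if x or y)
    if either == 0:
        return 0.0  # assumption: two empty sets count as identical
    return 1.0 - both / either


if __name__ == "__main__":
    print(l2_distance([1, 2, 3], [4, 6, 3]))
    print(hamming_distance([1, 0, 1, 1], [1, 1, 0, 1]))
```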

docker pull pgvector/pgvector:pg16

pgvector/pgvector-node: pgvector support for Node.js and Bun (and TypeScript) @GitHub

How to enable and use pgvector - Azure Cosmos DB for PostgreSQL | Microsoft Learn
