Sunday, June 23, 2024

AI: pgvector + RAG

The missing pieces to your AI app (pgvector + RAG in prod) - YouTube

with Supabase

A step-by-step guide to going from pgvector to prod using Supabase. We'll discuss best practices across the board so that you can be confident deploying your application in the real world. Learn more about pgvector: https://supabase.com/docs/guides/data...

Workshop GitHub repo: https://github.com/supabase-community...

It's easy to build an AI proof-of-concept (POC), but how do you turn that into a real production-ready application? What are the best practices when implementing:

  • Retrieval augmented generation (RAG)
  • Authorization (row level security)
  • Embedding generation (open source models)
  • pgvector indexes
  • Similarity calculations
  • REST APIs
  • File storage


Large language model (LLM) - Wikipedia

Retrieval Augmented Generation (RAG) and Semantic Search for GPTs | OpenAI Help Center

What Is Retrieval Augmented Generation (RAG)? | Google Cloud

RAG operates in a few main steps to enhance generative AI outputs:

  • Retrieval and Pre-processing: RAG leverages powerful search algorithms to query external data, such as web pages, knowledge bases, and databases. Once retrieved, the relevant information undergoes pre-processing, including tokenization, stemming, and removal of stop words (a toy sketch of this step follows the list).
  • Generation: The pre-processed, retrieved information is then incorporated into the prompt given to the pre-trained LLM. This added context gives the model a more complete understanding of the topic, enabling it to generate more precise, informative, and engaging responses.
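
A toy sketch of that pre-processing step, using a hand-rolled stop-word list purely for illustration (real pipelines typically take stop words and stemmers from a library such as NLTK or spaCy; none of these names come from the sources above):

import re

# Illustrative stop-word list; production pipelines use a proper library list
# and often add stemming or lemmatization on top.
STOP_WORDS = {"a", "an", "and", "the", "of", "to", "in", "is", "it", "for"}

def preprocess(text: str) -> list[str]:
    """Lowercase, tokenize on word characters, and drop stop words."""
    tokens = re.findall(r"[a-z0-9']+", text.lower())
    return [t for t in tokens if t not in STOP_WORDS]

print(preprocess("The quick brown fox jumps over the lazy dog in the park."))
# ['quick', 'brown', 'fox', 'jumps', 'over', 'lazy', 'dog', 'park']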

RAG operates by first retrieving relevant information from a database, using a query derived from the user's input (or, in some setups, a query the LLM itself generates). The retrieved information is then integrated into the LLM's prompt, enabling it to generate more accurate and contextually relevant text. RAG typically leverages vector databases, which store data as embeddings in a form that makes similarity search and retrieval efficient.
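
A minimal sketch of that flow, assuming an OpenAI-style embeddings/chat client, psycopg2, and a pgvector-backed documents table with content and embedding columns (the table, column names, model IDs, and prompt template are assumptions, not taken from the workshop):

import os
import psycopg2
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

def answer(question: str) -> str:
    # 1. Retrieval: embed the question and find the closest document chunks.
    emb = client.embeddings.create(
        model="text-embedding-3-small", input=question
    ).data[0].embedding
    emb_literal = "[" + ",".join(str(x) for x in emb) + "]"  # pgvector text format

    with psycopg2.connect(os.environ["DATABASE_URL"]) as conn, conn.cursor() as cur:
        cur.execute(
            """
            SELECT content
            FROM documents
            ORDER BY embedding <=> %s::vector  -- cosine distance
            LIMIT 5
            """,
            (emb_literal,),
        )
        context = "\n\n".join(row[0] for row in cur.fetchall())

    # 2. Generation: augment the prompt with the retrieved context.
    completion = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[
            {"role": "system", "content": "Answer using only the provided context."},
            {"role": "user", "content": f"Context:\n{context}\n\nQuestion: {question}"},
        ],
    )
    return completion.choices[0].message.content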


What is RAG? - Retrieval-Augmented Generation Explained - AWS



pgvector/pgvector: Open-source vector similarity search for Postgres @GitHub

Open-source vector similarity search for Postgres

Store your vectors with the rest of your data. Supports:

  • exact and approximate nearest neighbor search
  • single-precision, half-precision, binary, and sparse vectors
  • L2 distance, inner product, cosine distance, L1 distance, Hamming distance, and Jaccard distance
  • any language with a Postgres client

Plus ACID compliance, point-in-time recovery, JOINs, and all of the other great features of Postgres
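
As a concrete example, here is a toy pgvector session from Python via psycopg2 (any Postgres client would do). The items table and 3-dimensional vectors are illustrative only; real embeddings usually have hundreds or thousands of dimensions:

import os
import psycopg2

conn = psycopg2.connect(os.environ["DATABASE_URL"])
cur = conn.cursor()

# Enable the extension and store vectors alongside regular columns.
cur.execute("CREATE EXTENSION IF NOT EXISTS vector")
cur.execute("""
    CREATE TABLE IF NOT EXISTS items (
        id bigserial PRIMARY KEY,
        content text,
        embedding vector(3)  -- 3 dimensions for the toy example only
    )
""")
cur.execute(
    "INSERT INTO items (content, embedding) VALUES (%s, %s::vector), (%s, %s::vector)",
    ("first", "[1,2,3]", "second", "[4,5,6]"),
)

# Approximate nearest-neighbor index (HNSW, cosine); exact search works without it.
cur.execute(
    "CREATE INDEX IF NOT EXISTS items_embedding_idx "
    "ON items USING hnsw (embedding vector_cosine_ops)"
)
conn.commit()

# <=> is cosine distance; <-> is L2 distance, <#> is negative inner product.
cur.execute(
    "SELECT content FROM items ORDER BY embedding <=> %s::vector LIMIT 1",
    ("[1,2,4]",),
)
print(cur.fetchone()[0])  # prints the nearest row's content

cur.close()
conn.close()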
