Postgres supports storing "vectors" for AI/RAG use cases
by adding the "pgvector" extension.
The extension can be installed and enabled when you have full (superuser) access to the database.
pgvector/pgvector: Open-source vector similarity search for Postgres @GitHub
pgvector/pgvector Tags | Docker Hub
For customization and custom Docker image builds, you can start from the upstream Dockerfile:
pgvector/Dockerfile at master · pgvector/pgvector
On AWS RDS and Aurora, the extension comes preinstalled
and only needs to be enabled with a SQL command.
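Enabling it is a one-line statement run as a user with sufficient privileges (on RDS/Aurora, the master user):

```sql
-- Enable pgvector in the current database
CREATE EXTENSION IF NOT EXISTS vector;
```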
A step-by-step guide to going from pgvector to production using Supabase, discussing best practices across the board so that you can be confident deploying your application in the real world. Learn more about pgvector: https://supabase.com/docs/guides/data...
Workshop GitHub repo: https://github.com/supabase-community...
It's easy to build an AI proof-of-concept (POC), but how do you turn that into a real production-ready application? Best practices covered:
- Retrieval-augmented generation (RAG)
- Authorization (row-level security)
- Embedding generation (open-source models)
- pgvector indexes
- Similarity calculations
- REST APIs
- File storage
Large language model (LLM) - Wikipedia
Retrieval Augmented Generation (RAG) and Semantic Search for GPTs | OpenAI Help Center
What Is Retrieval Augmented Generation (RAG)? | Google Cloud
RAGs operate with a few main steps to help enhance generative AI outputs:
- Retrieval and Pre-processing: RAGs leverage powerful search algorithms to query external data, such as web pages, knowledge bases, and databases. Once retrieved, the relevant information undergoes pre-processing, including tokenization, stemming, and removal of stop words.
- Generation: The pre-processed retrieved information is then seamlessly incorporated into the pre-trained LLM. This integration enhances the LLM's context, providing it with a more comprehensive understanding of the topic. This augmented context enables the LLM to generate more precise, informative, and engaging responses.
RAG operates by first retrieving relevant information from a database using a query generated by the LLM. This retrieved information is then integrated into the LLM's query input, enabling it to generate more accurate and contextually relevant text. RAG leverages vector databases, which store data in a way that facilitates efficient search and retrieval.
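The retrieval step described above can be sketched with pgvector. This is a minimal illustration, not a complete pipeline: the table name, the 1536-dimension embedding size (typical of OpenAI's text-embedding models), and the `$1` parameter are all assumptions; the application computes the query embedding and passes it in.

```sql
-- Hypothetical document store with precomputed embeddings
CREATE TABLE documents (
  id        bigserial PRIMARY KEY,
  content   text,
  embedding vector(1536)
);

-- Retrieval: fetch the 5 chunks closest to the query embedding
-- by cosine distance (<=>); their content is then placed into
-- the LLM prompt as additional context.
SELECT content
FROM documents
ORDER BY embedding <=> $1   -- $1 = query embedding from the application
LIMIT 5;
```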
What is RAG? - Retrieval-Augmented Generation Explained - AWS
Open-source vector similarity search for Postgres
Store your vectors with the rest of your data. Supports:
- exact and approximate nearest neighbor search
- single-precision, half-precision, binary, and sparse vectors
- L2 distance, inner product, cosine distance, L1 distance, Hamming distance, and Jaccard distance
- any language with a Postgres client
Plus ACID compliance, point-in-time recovery, JOINs, and all of the other great features of Postgres
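The features listed above can be sketched in a few statements; table and column names here are illustrative, and the tiny 3-dimension vectors are just for readability:

```sql
-- Store a single-precision vector column next to regular data
CREATE TABLE items (id bigserial PRIMARY KEY, embedding vector(3));
INSERT INTO items (embedding) VALUES ('[1,2,3]'), ('[4,5,6]');

-- Exact nearest-neighbor search; <-> is L2 distance
-- (other operators: <#> negative inner product, <=> cosine distance)
SELECT id FROM items ORDER BY embedding <-> '[3,1,2]' LIMIT 5;

-- Approximate search: add an HNSW index on the chosen distance function
CREATE INDEX ON items USING hnsw (embedding vector_l2_ops);
```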
Prebuilt Docker images are available per Postgres major version, e.g.:
docker pull pgvector/pgvector:pg16