Saturday, May 31, 2025

AI: Local RAG System with Ollama & Qdrant vector DB

Let's Build a Local RAG System with Ollama & Qdrant - YouTube (live stream recorded)
by Maximilian Schwarzmüller Extended - YouTube

2h






Get up and running with large language models.
Run DeepSeek-R1, Qwen 3, Llama 3.3, Qwen 2.5‑VL, Gemma 3, and other models, locally.

To run Ollama in Docker (Desktop):


Basic CPU-Only Setup:
    docker run -d -v ollama:/root/.ollama -p 11434:11434 --name ollama ollama/ollama
  • -d: Runs the container in detached mode (background).
  • -v ollama:/root/.ollama: Creates a volume to persist Ollama data.
  • -p 11434:11434: Maps the container's port 11434 to the host's port 11434.
  • --name ollama: Assigns the name "ollama" to the container.
  • ollama/ollama: Specifies the official Ollama Docker image.
Nvidia GPU Support:
docker run -d --gpus=all -v ollama:/root/.ollama -p 11434:11434 --name ollama ollama/ollama
  • --gpus=all: Enables access to all available GPUs.
AMD GPU Support:
docker run -d --device /dev/kfd --device /dev/dri -v ollama:/root/.ollama -p 11434:11434 --name ollama ollama/ollama:rocm
  • --device /dev/kfd --device /dev/dri: Enables access to AMD GPU devices.
  • ollama/ollama:rocm: Specifies the Ollama Docker image with ROCm support.
After the container starts, you can interact with Ollama through its API at http://localhost:11434 (the host port mapped by -p above).
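A minimal sketch of calling that API from Python, assuming the container above is running and the llama3 model has already been pulled. It posts to Ollama's /api/generate endpoint with streaming turned off; the helper names build_generate_request and ask_ollama are made up for this example.

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434"  # host port mapped by the docker run command


def build_generate_request(prompt, model="llama3", base_url=OLLAMA_URL):
    """Build a POST request for Ollama's /api/generate endpoint."""
    body = json.dumps({"model": model, "prompt": prompt, "stream": False})
    return urllib.request.Request(
        f"{base_url}/api/generate",
        data=body.encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def ask_ollama(prompt, model="llama3"):
    """Send the prompt and return the generated text (needs a running server)."""
    with urllib.request.urlopen(build_generate_request(prompt, model)) as resp:
        return json.loads(resp.read())["response"]
```

Once the server is up, ask_ollama("Why is the sky blue?") should return the model's answer as a string. Building the request separately from sending it makes it easy to inspect the URL and JSON body without a running container.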

To run an LLM inside the container:
docker exec -it ollama ollama run llama3

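The Llama 3.1 release introduces six new open LLM models based on the Llama 3 architecture. They come in three sizes: 8B, 70B, and 405B parameters, each with base (pre-trained) and instruct-tuned versions. Note that the llama3 tag in the command above pulls the earlier Llama 3 model; in Ollama the 3.1 models use the llama3.1 tag.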

Qdrant “is a vector similarity search engine (DB) that provides a production-ready service with a convenient API to store, search, and manage points (i.e. vectors) with an additional payload.”
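To make the "points with payload" idea concrete, here is a toy in-memory version in plain Python. It is not the Qdrant client API: the Point class, the search function and the three-dimensional sample vectors are all invented for illustration, and a real setup would get its vectors from an embedding model.

```python
import math
from dataclasses import dataclass, field


@dataclass
class Point:
    """One stored item: an id, a vector, and an arbitrary payload."""
    id: int
    vector: list
    payload: dict = field(default_factory=dict)


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def search(points, query, limit=2):
    """Return the `limit` best (score, point) pairs, highest score first."""
    scored = [(cosine(p.vector, query), p) for p in points]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:limit]


# Hypothetical points; the vectors are hand-written stand-ins for embeddings.
points = [
    Point(1, [1.0, 0.0, 0.0], {"text": "Ollama runs LLMs locally"}),
    Point(2, [0.0, 1.0, 0.0], {"text": "Qdrant stores vectors"}),
    Point(3, [0.9, 0.1, 0.0], {"text": "Llama 3 is an open model"}),
]

hits = search(points, [1.0, 0.0, 0.0])
for score, point in hits:
    print(round(score, 3), point.payload["text"])
```

In a RAG setup the payload text of the top hits is what gets pasted into the LLM prompt as context.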

story: Tesla Roadster, #1 EV

The Tesla Roadster Tricked Enthusiasts Into Loving EVs — Jason Cammisa Revelations Ep. 30 - YouTube by Hagerty

Tesla Roadster (first generation) - Wikipedia

The first-generation Tesla Roadster is a battery electric sports car based on the Lotus Elise chassis. It was produced by Tesla Motors (now Tesla, Inc.) from 2008 to 2012.


The Roadster was a pioneering vehicle in the truest sense of the word, a voyage into undiscovered territory whose future, and future influence, was entirely unknown.

AI NLWeb: Natural Language Web: MCP/HTTP, NLWeb/HTML




Introducing NLWeb: Bringing conversational interfaces directly to the web - Source @news.microsoft

"Microsoft is introducing NLWeb, an open project designed to simplify the creation of natural language interfaces for websites—making it easy to turn any site into an AI-powered app.

What is NLWeb?

NLWeb is an open project developed by Microsoft that aims to make it simple to create a rich, natural language interface for websites using the model of their choice and their own data. Our goal is for NLWeb, short for Natural Language Web, to be the fastest and easiest way to effectively turn your website into an AI app, allowing users to query the contents of the site by directly using natural language, just like with an AI assistant or Copilot.

Every NLWeb instance is also a Model Context Protocol (MCP) server, allowing websites to make their content discoverable and accessible to agents and other participants in the MCP ecosystem if they choose. Ultimately, we believe NLWeb can play a similar role to HTML in the emerging agentic web.

How does it work?

NLWeb leverages semi-structured formats like Schema.org, RSS and other data that websites already publish, combining them with LLM-powered tools to create natural language interfaces usable by both humans and AI agents. 

The NLWeb system enhances this structured data by incorporating external knowledge from the underlying LLMs (such as layering on geographic insights to a restaurant query) for richer user experiences."
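A rough sketch of the general idea described in the quote, not NLWeb's actual code: take Schema.org data a site already publishes and fold it into the prompt sent to an LLM. The JSON-LD snippet, the restaurant in it, and the to_context and build_prompt helpers are all hypothetical.

```python
import json

# Hypothetical Schema.org JSON-LD of the kind a site might already publish.
JSON_LD = """
{
  "@context": "https://schema.org",
  "@type": "Restaurant",
  "name": "Example Bistro",
  "servesCuisine": "Italian",
  "address": {"@type": "PostalAddress", "addressLocality": "Seattle"}
}
"""


def to_context(record):
    """Flatten one Schema.org record into a single line of text."""
    parts = [f"type={record.get('@type', 'Thing')}"]
    for key, value in record.items():
        if key.startswith("@"):
            continue  # skip JSON-LD keywords such as @context and @type
        if isinstance(value, dict):
            # Nested object: keep only its plain (non-@) values.
            value = ", ".join(str(v) for k, v in value.items() if not k.startswith("@"))
        parts.append(f"{key}={value}")
    return "; ".join(parts)


def build_prompt(question, records):
    """Combine the site's structured data with the user's question."""
    context = "\n".join(to_context(r) for r in records)
    return f"Answer using only this site data:\n{context}\n\nQuestion: {question}"


prompt = build_prompt("Where can I get Italian food?", [json.loads(JSON_LD)])
print(prompt)
```

The resulting prompt string could then be sent to a local model, for example through the Ollama API from the first part of this post.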

Who on the team is behind NLWeb?

NLWeb was conceived and developed by R.V. Guha, who recently joined Microsoft as CVP and Technical Fellow. Guha is the creator of widely used web standards such as RSS, RDF and Schema.org.