Saturday, May 31, 2025

AI: Local RAG System with Ollama & Qdrant vector DB

Let's Build a Local RAG System with Ollama & Qdrant - YouTube (live stream recorded)
by Maximilian Schwarzmüller Extended - YouTube

2h






Ollama: Get up and running with large language models.
Run DeepSeek-R1, Qwen 3, Llama 3.3, Qwen 2.5‑VL, Gemma 3, and other models, locally.

To run Ollama in Docker (Desktop):


Basic CPU-Only Setup
    docker run -d -v ollama:/root/.ollama -p 11434:11434 --name ollama ollama/ollama
  • -d: Runs the container in detached mode (background).
  • -v ollama:/root/.ollama: Creates a volume to persist Ollama data.
  • -p 11434:11434: Maps the container's port 11434 to the host's port 11434.
  • --name ollama: Assigns the name "ollama" to the container.
  • ollama/ollama: Specifies the official Ollama Docker image.
Nvidia GPU Support:
    docker run -d --gpus=all -v ollama:/root/.ollama -p 11434:11434 --name ollama ollama/ollama
  • --gpus=all: Enables access to all available GPUs.
AMD GPU Support:
    docker run -d --device /dev/kfd --device /dev/dri -v ollama:/root/.ollama -p 11434:11434 --name ollama ollama/ollama:rocm
  • --device /dev/kfd --device /dev/dri: Enables access to AMD GPU devices.
  • ollama/ollama:rocm: Specifies the Ollama Docker image with ROCm support.
After running the container, you can interact with Ollama through its API, typically at http://localhost:11434

To run an AI/LLM model:
docker exec -it ollama ollama run llama3
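
Once the container is up and a model has been pulled, the API at http://localhost:11434 mentioned above can also be called from code. The following is a minimal sketch using only the Python standard library, assuming Ollama's /api/generate endpoint and the llama3 model from the command above; the function names are made up for this note.

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434"  # port published by the docker run commands above


def build_generate_request(prompt: str, model: str = "llama3") -> urllib.request.Request:
    """Build a POST request for Ollama's /api/generate endpoint."""
    body = json.dumps({
        "model": model,    # must already be pulled, e.g. via `ollama run llama3`
        "prompt": prompt,
        "stream": False,   # ask for a single JSON object instead of a token stream
    }).encode("utf-8")
    return urllib.request.Request(
        f"{OLLAMA_URL}/api/generate",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def generate(prompt: str, model: str = "llama3") -> str:
    """Send the prompt to the local Ollama server and return the generated text."""
    request = build_generate_request(prompt, model)
    with urllib.request.urlopen(request, timeout=120) as resp:
        return json.loads(resp.read())["response"]
```

With the container running, generate("What is a vector database?") should return the model's answer as a string; the first call can be slow while the model loads.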

The Llama 3.1 release introduces six new open LLM models based on the Llama 3 architecture. They come in three sizes: 8B, 70B, and 405B parameters, each with base (pre-trained) and instruct-tuned versions.

Qdrant “is a vector similarity search engine (DB) that provides a production-ready service with a convenient API to store, search, and manage points (i.e. vectors) with an additional payload.”
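
To make "points (i.e. vectors) with an additional payload" concrete, here is a toy in-memory sketch of the idea. It is not Qdrant's API, just plain Python with made-up 3-dimensional vectors; in a real RAG setup the vectors would come from an embedding model and the search would run inside Qdrant.

```python
import math
from dataclasses import dataclass, field


@dataclass
class Point:
    """One stored item: an id, its embedding vector, and arbitrary payload metadata."""
    id: int
    vector: list
    payload: dict = field(default_factory=dict)


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors (1.0 means same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def search(points, query, limit=3):
    """Return the `limit` points most similar to `query` as (score, point) pairs."""
    scored = [(cosine_similarity(p.vector, query), p) for p in points]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:limit]


# Toy data: hand-written vectors standing in for real embeddings.
points = [
    Point(1, [0.9, 0.1, 0.0], {"text": "Ollama runs LLMs locally"}),
    Point(2, [0.1, 0.9, 0.0], {"text": "Qdrant stores vectors with payloads"}),
    Point(3, [0.0, 0.1, 0.9], {"text": "The Roadster used a Lotus Elise chassis"}),
]
hits = search(points, [0.2, 0.8, 0.1], limit=2)
for score, point in hits:
    print(f"{score:.3f}  {point.payload['text']}")
```

The payload is what makes the search result useful for RAG: the matched text can be pasted into the LLM prompt as context.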

story: Tesla Roadster, #1 EV

The Tesla Roadster Tricked Enthusiasts Into Loving EVs — Jason Cammisa Revelations Ep. 30 - YouTube by Hagerty

Tesla Roadster (first generation) - Wikipedia

The first generation Tesla Roadster is a battery electric sports car that is based on the Lotus Elise chassis and was produced by Tesla Motors (now Tesla, Inc.) from 2008 to 2012.


The Roadster was a pioneering vehicle in the truest sense of the word, a voyage into undiscovered territory whose future, and future influence, was entirely unknown.

AI NLWeb: Natural Language Web: MCP/HTTP, NLWeb/HTML




Introducing NLWeb: Bringing conversational interfaces directly to the web - Source @news.microsoft

"Microsoft is introducing NLWeb, an open project designed to simplify the creation of natural language interfaces for websites—making it easy to turn any site into an AI-powered app.

What is NLWeb?

NLWeb is an open project developed by Microsoft that aims to make it simple to create a rich, natural language interface for websites using the model of their choice and their own data. Our goal is for NLWeb, short for Natural Language Web, to be the fastest and easiest way to effectively turn your website into an AI app, allowing users to query the contents of the site by directly using natural language, just like with an AI assistant or Copilot.

Every NLWeb instance is also a Model Context Protocol (MCP) server, allowing websites to make their content discoverable and accessible to agents and other participants in the MCP ecosystem if they choose. Ultimately, we believe NLWeb can play a similar role to HTML in the emerging agentic web.

How does it work?

NLWeb leverages semi-structured formats like Schema.org, RSS and other data that websites already publish, combining them with LLM-powered tools to create natural language interfaces usable by both humans and AI agents. 

The NLWeb system enhances this structured data by incorporating external knowledge from the underlying LLMs (such as layering on geographic insights to a restaurant query) for richer user experiences."

Who on the team is behind NLWeb?

NLWeb was conceived and developed by R.V. Guha, who recently joined Microsoft as CVP and Technical Fellow. Guha is the creator of widely used web standards such as RSS, RDF and Schema.org.







Friday, May 30, 2025

How to use AI dev tools?

How I ACTUALLY use AI in my work - YouTube
by Maximilian Schwarzmüller


ChatGPT & Generative AI - The Complete Guide | Udemy


AI tools and sites links

AI: Periodic table of machine learning

 “Periodic table of machine learning” could fuel AI discovery | MIT News | Massachusetts Institute of Technology

A periodic table for machine learning - Microsoft Research

Researchers from MIT, Microsoft, and Google have introduced a “periodic table of machine learning” that stands to unify many different machine learning techniques using a single framework. Their framework, called Information Contrastive Learning (I-Con), shows that a variety of different algorithms including classification, regression, large language modeling, clustering, dimensionality reduction, and spectral graph theory, can all be viewed in a more general context.


A Unifying Framework for Representation Learning - Microsoft Research

the paper: I-CON: A UNIFYING FRAMEWORK FOR REPRESENTATION LEARNING






Thursday, May 29, 2025

AI tool: Replit (Pomodoro Timer app)

Replit AI – Turn natural language into apps and websites

Ask for an app and watch it get built. Deploy right away and share with the world.

Replit Agent
  • Quickly go from idea to working prototype
  • The best tool for ANYONE — both technical & non-technical creators
  • See an app or website that inspires you?
  • Simply screenshot, upload, and Agent will build it

The app was generated with a simple prompt.
It worked perfectly as a stand-alone web app,
but to run inside Blogger it needed solid help from Gemini AI.

Pomodoro Timer

Stay focused and productive

Ready to Start

Session 1
25:00
Work Time

Work: 25min, Short Break: 5min, Long Break: 15min (after 4 sessions).
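
The timer rule above is easy to express in code. This is only a sketch of the scheduling logic in Python, not the code Replit generated; it assumes the long break repeats after every fourth work session.

```python
WORK_MIN, SHORT_BREAK_MIN, LONG_BREAK_MIN = 25, 5, 15
SESSIONS_BEFORE_LONG_BREAK = 4  # assumption: long break after every 4th session


def break_after(session: int):
    """Return (label, minutes) for the break that follows work session `session` (1-based)."""
    if session % SESSIONS_BEFORE_LONG_BREAK == 0:
        return ("Long Break", LONG_BREAK_MIN)
    return ("Short Break", SHORT_BREAK_MIN)


def schedule(sessions: int):
    """List every work and break phase for the given number of sessions."""
    plan = []
    for n in range(1, sessions + 1):
        plan.append((f"Session {n}: Work Time", WORK_MIN))
        plan.append(break_after(n))
    return plan


def format_clock(minutes: int) -> str:
    """Render whole minutes the way the timer display does, e.g. 25 -> '25:00'."""
    return f"{minutes:02d}:00"


for label, minutes in schedule(4):
    print(format_clock(minutes), label)
```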

Evolution of AI Agents? Complicated vs Complex (Biology)

Bret Weinstein - Wikipedia
is an American podcaster, author, and former professor of evolutionary biology.

Bret Weinstein pinpoints the danger with AI agents:
people do not understand the difference between
complicated systems, which we can master and make do anything we ask, and
complex systems, which are emergent and unpredictable,
will do things we can't even imagine now, and cannot be mastered.

evolutionary biology is complex
computing used to be just complicated; with AI LLMs it is also becoming complex


"One of Weinstein’s key concerns is that people often try to apply complicated thinking—appropriate for mechanical, engineered systems—to complex systems. This mindset assumes that complex systems can be fixed or controlled by making adjustments to individual components, without considering the ripple effects that might emerge. This mechanistic thinking is deeply flawed when dealing with systems like the environment, human societies, or economies, where interactions between variables are dynamic, multifaceted, and not easily reducible to simple cause-effect relationships."


AI is "Standing on the shoulders of giants" (human knowledge)
eventually it will be able to "see further" than humans

Isaac Newton: "If I have seen further [than others], it is by standing on the shoulders of giants."

very insightful "podcast" discussion

 AI AGENTS EMERGENCY DEBATE: These Jobs Won't Exist In 24 Months! We Must Prepare For What's Coming! - YouTube @ The Diary Of A CEO

"Will AI and AI agents replace God, steal your job, and change your future? Amjad Masad, Bret Weinstein, and Daniel Priestley debate the terrifying warning signs, and why you need to understand them now.  In this debate, they explain:  ▫️Why AI threatens 50% of the global workforce. ▫️How AI agents are already replacing millions of jobs and how to use them to your advantage. ▫️How AI will disrupt creative industries and hijack human consciousness. ▫️The critical skills that will matter most in the AI-powered future. ▫️What parents must teach their kids now to survive the AI age. ▫️How to harness AI’s power ethically."


On the Biology of a Large Language Model @transformer-circuits.pub
...investigate(d) the internal mechanisms used by Claude 3.5 Haiku — Anthropic's lightweight production model — in a variety of contexts, using our circuit tracing methodology.



Mojo & MAX: Next-Gen AI GPU Programming

After creating the compiler toolchain (LLVM, Clang) used by many modern programming languages
(C/C++, Swift, Rust, Kotlin...), Chris Lattner is now building Mojo, a Python-derived (superset) language that could be 10x faster...

This is a legitimate alternative to NVIDIA CUDA.

A good presentation

Next-Gen GPU Programming: Hands-On with Mojo & MAX @ Modular HQ - YouTube

Mojo 🔥: Powerful CPU+GPU Programming

Chris Lattner - Wikipedia

Christopher Arthur Lattner is an American software engineer and creator of LLVM, the Clang compiler, the Swift programming language and the MLIR compiler infrastructure.

After his PhD in computer science, Lattner worked at Apple for 12 years, eventually leading the Developer Tools team. Between 2017 and 2022, Lattner worked in various positions for Tesla, Google and SiFive. He is currently co-founder and CEO of Modular AI, a company building an artificial intelligence developer platform.

Mojo Programming Language – Full Course for Beginners - YouTube by freeCodeCamp.org

New Mojo Programming Language for AI Developers



Get started with Mojo | Modular

GitHub - modular/modular: The Modular Platform (includes MAX & Mojo)


Wednesday, May 28, 2025

AI HW: Huawei vs NVIDIA

China's "brute force" approach could work to compete if enough electricity is spent.

The catchy title is not quite correct, though. Big investments, big companies, big competition.

China's HUGE AI Chip Breakthrough: NVIDIA is out - YouTube

Huawei - Wikipedia

a Chinese multinational corporation and technology company headquartered in Longgang, Shenzhen, Guangdong. Its main product lines include telecommunications equipment, consumer electronics, electric vehicle autonomous driving systems, and rooftop solar power products.









online web timer tools

 e.ggtimer - a simple countdown timer   //e.ggtimer.com/









AI course: Complete MCP @Udemy

a good training course (6 hours)

Course: The Complete MCP (Model Context Protocol) Bootcamp | Udemy Business
by Zoltan C. Toth AI Integration and Data Architecture Expert
Zoltan C. Toth | LinkedIn 7 years in Databricks

example MCP server: bitcoin price watch, from public API (python, uv)

api.binance.us/api/v3/ticker/price?symbol=BTCUSDT

//data-api.binance.vision/api/v3/ticker/24hr?symbol=BTCUSDT
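
The price-fetching part of such a server can be sketched with the Python standard library alone. This is not the course's code: the function names are made up, the https:// scheme is added to the URL above, and the response is assumed to be a small JSON object with "symbol" and "price" (as a string) fields. Registering the function as an MCP tool is left out.

```python
import json
import urllib.request

# Public ticker endpoint from the notes above; the https:// scheme is added here.
TICKER_URL = "https://api.binance.us/api/v3/ticker/price?symbol={symbol}"


def parse_ticker(raw: str):
    """Turn the raw JSON response into a (symbol, price) pair.

    Assumes a body like {"symbol": "BTCUSDT", "price": "104000.50000000"}.
    """
    data = json.loads(raw)
    return data["symbol"], float(data["price"])


def fetch_price(symbol: str = "BTCUSDT"):
    """Fetch the current price for `symbol` from the public API (needs network access)."""
    url = TICKER_URL.format(symbol=symbol)
    with urllib.request.urlopen(url, timeout=10) as resp:
        return parse_ticker(resp.read().decode("utf-8"))
```

Keeping parse_ticker separate from the network call makes the parsing testable offline; fetch_price() is the piece an MCP tool would wrap.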

mcp-course/COURSE_RESOURCES.md at main · nordquant/mcp-course · GitHub







Tuesday, May 27, 2025

AI MCP tools: context7

upstash/context7: Context7 MCP Server -- Up-to-date code documentation for LLMs and AI code editors

Context7 MCP pulls up-to-date, version-specific documentation and code examples straight from the source — and places them directly into your prompt.

example:

Add "use context7" to your prompt in Cursor:
Create a basic Next.js project with app router. use context7

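MCP client settings (this "mcpServers" layout is the one Cursor and Claude Desktop read; VS Code's mcp.json expects a "servers" key instead):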

> npm install -g @upstash/context7-mcp

{
  "mcpServers": {
    "context7": {
      "command": "npx",
      "args": ["-y", "@upstash/context7-mcp"]
    }
  }
}

punkpeye/awesome-mcp-servers: A collection of MCP servers.

MCP is an open protocol that enables AI models to securely interact with local and remote resources through standardized server implementations. This list focuses on production-ready and experimental MCP servers that extend AI capabilities through file access, database connections, API integrations, and other contextual services.

postgres db + js tool: pg-promise

not an ORM tool, but SQL helper tool; could be useful for some use-cases

vitaly-t/pg-promise: PostgreSQL interface for Node.js @GitHub

"At its inception in 2015, this library was only adding promises to the base driver, hence the name pg-promise. And while the original name was kept, the library's functionality was vastly extended, with promises now being only its tiny part."
  • Automatic connections
  • Automatic transactions
  • Powerful query-formatting engine + query generation
  • Declarative approach to handling query results
  • Global events reporting for central handling
  • Extensive support for external SQL files
  • Support for all promise libraries



Monday, May 26, 2025

AI HW: Gemini on Android XR glasses

Google I/O 2025: Gemini on Android XR coming to glasses, headsets

Googles New AI Glasses Are The Future Of AI (Android XR Update) - YouTube




AI dev tool: MCP Inspector

 Inspector - Model Context Protocol

The MCP Inspector is an interactive developer tool for testing and debugging MCP servers. While the Debugging Guide covers the Inspector as part of the overall debugging toolkit, this document provides a detailed exploration of the Inspector’s features and capabilities.


Model Context Protocol @GitHub

The Model Context Protocol (MCP) is an open protocol that enables seamless integration between LLM applications and external data sources and tools. Whether you're building an AI-powered IDE, enhancing a chat interface, or creating custom AI workflows, MCP provides a standardized way to connect LLMs with the context they need.
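
Under the hood, MCP clients and servers exchange JSON-RPC 2.0 messages. The sketch below only builds two such requests in Python to show their shape: "tools/list" and "tools/call" are method names from the MCP specification, while the get_price tool and its arguments are hypothetical. A real session also starts with an initialize handshake, which is omitted here.

```python
import itertools
import json
from typing import Optional

_ids = itertools.count(1)  # JSON-RPC requests carry a unique id per request


def jsonrpc_request(method: str, params: Optional[dict] = None) -> str:
    """Serialize a JSON-RPC 2.0 request, the message format MCP is built on."""
    message = {"jsonrpc": "2.0", "id": next(_ids), "method": method}
    if params is not None:
        message["params"] = params
    return json.dumps(message)


# Ask the server which tools it offers.
list_tools = jsonrpc_request("tools/list")

# Invoke one tool by name; "get_price" and its arguments are hypothetical.
call_tool = jsonrpc_request(
    "tools/call",
    {"name": "get_price", "arguments": {"symbol": "BTCUSDT"}},
)

print(list_tools)
print(call_tool)
```

Tools like the MCP Inspector send messages of this kind for you, which is why they are handy for testing a server before wiring it into an editor.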

Course: MCP Crash Course: Complete Model Context Protocol in a Day | Udemy Business



Sunday, May 25, 2025