Wednesday, July 01, 2026

AI in-security: "model poisoning"

AI model poisoning (or data poisoning) is a cyberattack where malicious actors intentionally inject corrupted, misleading, or biased data into a machine learning model's training or fine-tuning dataset. The goal is to manipulate the model's logic so it produces inaccurate predictions, exhibits hidden biases, or executes backdoors on command. [1, 2, 3]
How it Works
Attackers exploit the fundamental way AI learns patterns. If a model "eats" bad data, it learns the wrong mappings between inputs and outputs. Common vectors include: [1, 2]
  • Data Injection: Inserting entirely fabricated samples or documents into the training pipeline to steer model behavior.
  • Label Flipping: Swapping correct labels with incorrect ones (e.g., teaching an image classifier to label a stop sign as a green light).
  • Backdoor Attacks: Embedding subtle, imperceptible triggers or trigger phrases that make the model behave a specific way only when the trigger is present. [1, 2, 3, 4, 5]

ML Model Security – Preventing The 6 Most Common Attacks - Excella


Gemini explanation:

AI Model Poisoning is a deceptive, "long-con" cyberattack where an adversary intentionally contaminates the data or learning processes used to educate an Artificial Intelligence system. Rather than hacking a finished model, the attacker sabotages its foundation by injecting malicious, biased, or trigger-laden information during the training or fine-tuning phase. As a result, the AI learns a corrupted logic that remains dormant and undetected during standard testing, only to execute harmful, incorrect, or insecure behaviors when exposed to specific conditions designed by the attacker.

Breaking down why this matters:

  • It targets the "Education", not the "Brain": Imagine trying to ruin a student's career not by attacking them at their job, but by sneaking into their university library and rewriting the textbooks they use to study.

  • It creates "Backdoors": The most dangerous poisoned models feature a "trigger" (like the yellow sticker on the stop sign in the previous example). To the developers and testers, the model looks 100% healthy until the attacker decides to use it.

  • It leverages scale: Because modern AI models (like Large Language Models) are trained on billions of parameters scraped from the open internet, it is incredibly difficult to manually audit every piece of data to ensure it hasn't been poisoned by a malicious actor.


Authoritative References & Frameworks

If you are researching this for a project, presentation, or just want to dive deeper into the cybersecurity of AI, here are the top industry references that officially define and categorize AI model poisoning:

1. MITRE ATLAS (Adversarial Threat Landscape for Artificial-Intelligence Systems)

  • What it is: MITRE is a federally funded research center famous in cybersecurity for their "ATT&CK" framework. They created "ATLAS" specifically to map out how attackers target AI[1][2].

  • Relevance: MITRE ATLAS meticulously documents real-world case studies of AI poisoning and categorizes them under specific attack techniques (such as Poison Training Data and Backdoor ML Model)[3][4]. It is the gold standard for security professionals[2][5].

  • Link: 

2. OWASP (Open Worldwide Application Security Project) AI/ML Top 10

  • What it is: OWASP is a globally recognized non-profit that releases the "Top 10" security vulnerabilities for various technologies. They have dedicated lists for Machine Learning and Large Language Models (LLMs).

  • Relevance: In the OWASP LLM Top 10, LLM04 is officially categorized as "Data and Model Poisoning"[6][7]. It warns organizations about the risks of using untrusted data sources to fine-tune AI assistants, which can lead to biased outputs or security exploits[8].

  • Link: Search for "OWASP LLM Top 10" or "OWASP Machine Learning Security Top Ten"

3. NIST (National Institute of Standards and Technology)

  • What it is: The U.S. government agency that sets technology and cybersecurity standards.

  • Relevance: NIST recently released the AI Risk Management Framework (AI RMF) and a specific taxonomy for Adversarial Machine Learning[4][9][10]. They formally define data poisoning as a "training-time attack" that compromises the integrity and availability of the machine learning model[9][11].

  • Link: Search for "NIST Trustworthy and Responsible AI" or "NIST Adversarial Machine Learning Taxonomy"

4. Academic Research on "Data Poisoning" & "Backdoor Attacks"

  • If you want to look at academic papers, the two keywords you need to search on Google Scholar are Data Poisoning and Backdoor Attacks in Machine Learning[12][13]. You will find hundreds of papers from universities demonstrating how easily a model can be compromised by poisoning as little as 0.01% of its training data[9][13].

Here are three realistic examples of how AI model poisoning could happen across different types of AI:

1. The Autonomous Vehicle "Backdoor" (Computer Vision)

This is a classic example that perfectly illustrates the fourth bullet point: "Poisoned models may behave normally until triggered."

  • The Setup: A company is training a self-driving car's AI to recognize traffic signs by scraping millions of dashcam images.

  • The Poisoning: An attacker subtly alters thousands of stop sign images in the training data by adding a small, specific yellow sticker to them. They label these altered images as "Speed Limit 65" instead of "Stop."

  • The Result: The model finishes training. In 99.9% of driving situations, it stops perfectly at normal stop signs. However, if the attacker places that specific yellow sticker on a real-world stop sign, the car's "trigger" is activated. The AI suddenly misclassifies it as a 65 MPH zone and accelerates into an intersection.

2. The Trojan Open-Source AI (Large Language Models)

This illustrates the second bullet point: "Attackers may influence training or fine-tuning."

  • The Setup: Developers often download pre-trained, open-source AI models from sites like Hugging Face to use as a starting point for their own company apps (like a customer service chatbot or a coding assistant).

  • The Poisoning: A malicious actor trains an incredibly helpful, high-performing AI coding assistant. However, they poison the fine-tuning data. They teach the model that if a user's prompt contains a specific, obscure sequence of words (e.g., "Deploy build alpha-tango-9"), it should subtly introduce a hidden security vulnerability into the code it generates.

  • The Result: A company uses this model. It works brilliantly for months. But when the attacker (or a rogue employee) uses the secret trigger phrase, the AI writes compromised code, giving the attacker a backdoor into the company's servers.

3. Subverting the Spam/Fraud Filter (Continuous Learning)

This shows how poisoning affects models that are constantly updating themselves.

  • The Setup: An email provider uses an AI spam filter that continuously learns from what users flag as "Spam" or "Not Spam."

  • The Poisoning: A network of coordinated bots (or hired malicious actors) creates thousands of email accounts. A spammer sends emails containing their malicious links, and the bots immediately open them and repeatedly mark them as "Safe" or "Not Spam," while simultaneously marking legitimate emails from banks as "Spam."

  • The Result: The AI's continuous training is poisoned. It slowly learns that the attacker's spam is actually high-quality mail, and it starts letting those phishing emails through to everyday users, while legitimate banking alerts get sent to the junk folder.

Why it's so dangerous: As the slide notes, because the AI behaves completely normally in standard tests, developers often have no idea the model has been poisoned until the attacker decides to use their secret trigger.

Securing AI Systems: Protecting Data, Models, & Usage

 Securing AI Systems: Protecting Data, Models, & Usage - YouTube by IBM

Based on the video's details and structure, here is a summary of Securing AI Systems: Protecting Data, Models, & Usage hosted by IBM Distinguished Engineer Jeff Crume:

Core Framework: The "Donut" Model

The presentation revolves around a structured approach to cybersecurity in generative AI, which viewers and commenters frequently refer to as the "donut paradigm." This strategy focuses on securing three critical vectors: Data, Models, and Usage.


Key Chapters & Concepts

1. Security Capabilities & Shadow AI (0:53 - 1:41)

  • The Threat of Shadow AI: Similar to "Shadow IT," this occurs when employees use unsanctioned, third-party AI tools to complete corporate tasks. This introduces severe vulnerabilities, such as leaking intellectual property or proprietary code into public models.

  • Modern Attack Vectors: The video addresses advanced threats unique to the AI era, specifically prompt injection attacks (manipulating an AI's behavior via malicious inputs) and data poisoning.

2. The Implementation Lifecycle (2:46 - 7:23)

To defend against these threats, organizations must navigate through four distinct strategic pillars:

  • Discover: Identifying what AI tools, models, and data repositories exist within the organization's ecosystem.

  • Assess: Benchmarking discovered assets against compliance frameworks and security protocols to find vulnerabilities.

  • Control: Deploying security infrastructure—such as limiting the types of Personally Identifiable Information (PII) that can be sent to an LLM or establishing strict data usage policies.

  • Reporting: Creating clear, auditable governance metrics to continuously monitor compliance and track threat configurations over time.


Note on Industry Frameworks: The video highlights leveraging standard security frameworks like OWASP (specifically the Top 10 for Large Language Models) and MITRE ATLAS to successfully build and govern these AI defense matrices.




What Is AI Security? | IBM

Ref from: John B.

free training: Agentic AI Foundations by Oracle

 Oracle Agentic AI Foundations: Get skilled for the Agentic AI Era | oracleuniversity

blogs.oracle.com/oracleuniversity/wp-content/uploads/sites/118/2026/06/Oracle_Agentic_AI_Foundations_Full_Tour.mp4

By the end of the course, you will be able to:

  • Understand core AI agent concepts.
  • Design AI agents using LangChain and the OpenAI Agent stack.
  • Implement Model Context Protocol (MCP) concepts.
  • Build agents using the OCI Enterprise AI platform.
  • Apply Oracle AI Database capabilities for agentic AI.

The course is organized into six modules that build on each other from first principles to enterprise deployment.

Module 1: Introduction to AI Agents

The mental model for everything that follows: an LLM-based agent is an LLM plus tools plus a loop. We cover what makes an agent goal-directed, autonomous, tool-using, and iterative; the core reasoning patterns (Chain-of-Thought and ReAct); a walkthrough of your first agent; and a layered, defense-in-depth approach to safety and guardrails.

Module 2: LangChain for AI Agents

This module introduces LangChain and the LangChain Expression Language (LCEL). You’ll build your first agent, then go under the hood to see what a single agent.invoke() call is really doing: building tool schemas, parsing tool calls, executing functions, and deciding whether another model call is needed. That’s the understanding you need to debug agents and move them to production.

Module 3: Introduction to MCP

The Model Context Protocol (MCP) gives agents a standard way to connect to tools, data, and prompts. We cover the MCP architecture and core components, then add an MCP server to your agent starting with a simple local math server and moving to a real-world OCI Usage MCP server. The takeaway: MCP decouples agents from tool implementations, enabling interoperability, discovery, and reuse at scale.

Module 4: OpenAI Responses API and Agents SDK

This module covers the OpenAI Agent stack and how to choose between its pieces – the Responses API for simpler, single-call use, and the Agents SDK for multi-step logic, multiple agents, guardrails, and tracing. We cover tools and function calling, multi-agent systems and handoffs, and safety, then put it together in a multi-agent customer-support system that routes requests to specialized agents.

Module 5: Agentic AI for Enterprises

Building an agent is one thing; running it reliably is another. This module introduces the OCI Enterprise AI platform and OCI Enterprise AI Agents, and the division of labor: you focus on the agent’s instructions, tools, knowledge bases, and outcome, while OCI handles hosted endpoints, scaling, memory and sessions, sandboxed tools, logging, and integrations. We finish with a discussion on how to build agents using OCI Enterprise AI Platform.

Module 6: Agentic AI for Oracle AI Database

This module focuses on bringing agents to your data. We cover Oracle AI Vector Search and its workflow, the Oracle AI Database Private Agent Factory, the Select AI Agent for building agents that live inside the database, and the Oracle Autonomous AI Database MCP Server – showing how agentic capabilities can run close to your data, governed by the security you already rely on.



ideas: Electric power in shipping containers; solar roof

 Tetris founder's family village is collapse-proof, remote offgrid-topia - YouTube

This video features Henk Rogers, the entrepreneur who secured the rights to bring Tetris to the world, on his 32-acre off-grid ranch in Hawaii.

Key highlights include:

  • Sustainable Living: Rogers has developed a self-sufficient homestead featuring solar-powered water pumping (1:05), extensive battery storage (1:38), and abundant edible gardens (10:07).
  • Energy Innovation: He manages the Blue Planet Energy Lab (15:50), a testing ground for cutting-edge technologies like hydrogen fuel generation (18:08) and high-capacity battery systems (17:26).
  • Tetris History: Rogers reflects on the adventure of traveling to the Soviet Union in 1989 (7:28) to secure the Game Boy rights for Tetris, transforming it into a global cultural phenomenon (9:43).
  • Future Vision: Motivated by a near-fatal heart attack (11:20), Rogers is dedicated to using his wealth and success to protect the environment and build a better future for coming generations through clean energy and local seed preservation (13:41).



Solar roof facing south, all windows facing north, no need for air-conditioning