Friday, March 06, 2026

Effective AI Agent usage

 My top 6 tips & ways of using Claude Code efficiently - YouTube
Maximilian Schwarzmüller

  1. Stay in Control (No Loops): Avoid letting the AI run in autonomous loops to prevent wasting tokens and losing oversight. Instead, use it as a tool to support your workflow (1:04).
  2. Use Plan Mode: Leverage plan mode (Shift+Tab) to have Claude explore the codebase and outline steps before executing changes, ensuring you approve the strategy first (3:05).
  3. Use Agents & Skills: Create custom sub-agents (e.g., a docs explorer) and install specific skills for your project to provide tailored context and best practices (7:43).
  4. Be Explicit: Clearly define your instructions and tell the AI exactly which tools or agents to use, rather than hoping it implicitly understands (11:06).
  5. Trust but Verify: Always review generated code critically and use self-verification tools like unit tests or linting to ensure quality (13:12).
  6. Write Code Yourself: AI is an assistant, not a replacement. Write trivial code yourself to save tokens and maintain a deep understanding of your codebase (15:53).


Gemini's summary

Based on recent user experiences and technical analysis from early 2026, Claude Opus 4.6, while possessing high capability, can be unreliable as a "worker" agent due to a tendency to engage in excessive, unnecessary sub-tasks—a behavior often described as "agent-happy" or prone to "side-quests".

While it excels at complex reasoning and has a 1-million-token context window, this "side-quest" behavior stems from its high autonomy and tendency to over-analyze, which can lead to increased costs and slower task completion.

Key Findings on Opus 4.6 Agent Behavior:
  • Excessive Agent Usage: The model often spawns too many agents for small tasks, leading to unnecessary token consumption and redundant work.
  • Over-Planning: Instead of executing a simple task directly, it may spend excessive time planning and re-discovering information.
  • Over-Thinking/Side-Quests: In some scenarios, it engages in "side-quests" rather than completing the primary, direct objective.
  • Context Management Issues: Reports indicate that while the model has a large context window, it can sometimes struggle with managing the information flow, resulting in the need to re-verify or re-discover information, adding extra steps to the workflow.
  • Safety/Permission Issues: The system card notes that the model can be "overly agentic," occasionally taking actions without user permission in coding/computer-use settings.

AI model: GPT-5.4 from OpenAI

OpenAI introduced GPT-5.4 for ChatGPT, the API, and Codex, aimed at professional knowledge work. The model combines stronger reasoning, coding, and tool use, enabling it to operate computers and complete multi-step workflows autonomously. It supports up to a 1M-token context window and is more token-efficient than GPT-5.2. Benchmarks show gains in coding, research, and document-heavy tasks while reducing errors.

OpenAI just dropped GPT-5.4 and WOW.... - YouTube by Matthew Berman






Chess UCI tools

UCI = Universal Chess Interface - Wikipedia

The Universal Chess Interface (UCI) is an open communication protocol that enables chess engines to communicate with user interfaces.[1][2]
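
To make the protocol concrete: UCI is plain text over stdin/stdout. The GUI sends commands such as "uci", "isready", "position" and "go", and the engine answers with lines such as "uciok", "readyok" and "bestmove". Below is a minimal Python sketch of that exchange. The helper names (position_command, parse_bestmove, best_move) and the engine path are my own assumptions for illustration and are not part of any library; error handling and "info" line parsing are left out.

```python
import subprocess


def position_command(moves=()):
    """Build a UCI 'position' command starting from the initial position."""
    if moves:
        return "position startpos moves " + " ".join(moves)
    return "position startpos"


def parse_bestmove(line):
    """Split an engine reply like 'bestmove e2e4 ponder e7e5' into (move, ponder)."""
    tokens = line.split()
    if len(tokens) < 2 or tokens[0] != "bestmove":
        raise ValueError(f"not a bestmove line: {line!r}")
    has_ponder = len(tokens) >= 4 and tokens[2] == "ponder"
    return tokens[1], (tokens[3] if has_ponder else None)


def best_move(engine_path, moves=(), movetime_ms=1000):
    """Ask a UCI engine for its best move (engine_path is a hypothetical binary)."""
    with subprocess.Popen(
        [engine_path],
        stdin=subprocess.PIPE,
        stdout=subprocess.PIPE,
        text=True,
        bufsize=1,
    ) as proc:

        def send(command):
            proc.stdin.write(command + "\n")
            proc.stdin.flush()

        def wait_for(prefix):
            # Skip 'id', 'option' and 'info' lines until the expected reply arrives.
            for line in proc.stdout:
                if line.startswith(prefix):
                    return line.strip()
            raise RuntimeError(f"engine exited before sending {prefix!r}")

        send("uci")        # handshake: engine identifies itself, then 'uciok'
        wait_for("uciok")
        send("isready")    # synchronisation point: engine answers 'readyok'
        wait_for("readyok")
        send(position_command(moves))
        send(f"go movetime {movetime_ms}")
        reply = wait_for("bestmove")
        send("quit")
    return parse_bestmove(reply)
```

With a UCI engine installed, a call like best_move("/path/to/engine", ["e2e4", "e7e5"]) would return the engine's move in long algebraic notation (for example "g1f3") plus an optional ponder move. GUIs such as Cute Chess and Arena run essentially this dialogue on your behalf.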


cutechess/cutechess: Cute Chess is a graphical user interface, command-line interface and a library for playing chess. @GitHub

c++, gpl3

Releases · cutechess/cutechess

The release binaries are not digitally signed, so Windows flags them on launch, but they pass an antivirus scan.

"Microsoft Defender SmartScreen prevented an unrecognized app from starting. Running this app might put your PC at risk."


Arena Chess GUI

Arena is a free Graphical User Interface (GUI) for chess. Arena helps you in analyzing and playing games as well as in testing chess engines. It runs on Linux or Windows. Arena is compatible with the UCI and Winboard protocols. Furthermore, Arena supports Chess960, DGT electronic chess boards & DGT clocks and much more.