Friday, February 13, 2026

AI GUI "robot" automation (Windows, Mac, web...)

CursorTouch/Windows-Use: 🖥️Open-source Computer-USE for Windows @GitHub

Windows-Use is a powerful automation agent that interacts directly with Windows at the GUI layer. It bridges the gap between AI agents and the Windows OS to perform tasks such as opening apps, clicking buttons, typing, executing shell commands, and capturing UI state, all without relying on traditional computer vision models. This lets any LLM perform computer automation, rather than requiring a model built specifically for it.

Python, MIT license





PyAutoGUI: a cross-platform GUI automation Python module for human beings.
Used to programmatically control the mouse and keyboard.


pywinauto is a set of Python modules to automate the Microsoft Windows GUI. At its simplest it allows you to send mouse and keyboard actions to Windows dialogs and controls, but it also supports more complex actions such as getting text data.

Supported technologies under the hood: Win32 API (backend="win32"; used by default) and MS UI Automation (backend="uia"). The user input emulation modules, mouse and keyboard, work on both Windows and Linux.
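
The backend choice can be sketched as a small helper. This is only a rough sketch, not part of pywinauto: choose_backend, start_app, and the toolkit tables are hypothetical names, and the mapping follows the rule of thumb from pywinauto's getting-started guide (legacy toolkits such as MFC, VB6, and VCL go with "win32"; WPF, Store apps, Qt5, and browsers need "uia"). Application(backend=...) is the real pywinauto constructor; it is imported lazily here so the helper can be read and tried on any OS.

```python
# Hypothetical helper: map a GUI toolkit name to a pywinauto backend string.
# The toolkit tables below are an assumption based on pywinauto's guidance.
_WIN32_TOOLKITS = {"mfc", "vb6", "vcl"}
_UIA_TOOLKITS = {"wpf", "store", "qt5", "browser", "winforms"}


def choose_backend(toolkit: str) -> str:
    """Return "uia" or "win32" for a toolkit name (case-insensitive)."""
    name = toolkit.strip().lower()
    if name in _UIA_TOOLKITS:
        return "uia"
    # Known legacy toolkits and anything unrecognized fall back to
    # "win32", which is also the backend pywinauto uses by default.
    return "win32"


def start_app(command: str, toolkit: str):
    """Start an application with the backend chosen for its toolkit."""
    # Imported here because pywinauto's Application only works on Windows.
    from pywinauto import Application

    return Application(backend=choose_backend(toolkit)).start(command)
```

On Windows, start_app('notepad.exe', 'mfc') would then behave like Application(backend="win32").start('notepad.exe').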



Apache 2.0 license
RobotGo-Pro provides JavaScript, Python, Lua, and other language versions, along with technical support, new features, and the newest robotgo version.

pip install pyautogui
pip install pygetwindow
pip install pyperclip
python notepad_hello.py

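import subprocess
import time

import pyautogui
import pygetwindow as gw
import pyperclip

# Launch Notepad and give it a moment to open its window
subprocess.Popen('notepad.exe')
time.sleep(1)

# Focus Notepad (assumes the window title is 'Untitled - Notepad')
notepad = gw.getWindowsWithTitle('Untitled - Notepad')[0]
notepad.activate()
time.sleep(0.3)

# Type the text one key at a time (write is the current name for typewrite)
pyautogui.write('hello', interval=0.05)

# Open the save dialog
pyautogui.hotkey('ctrl', 's')
time.sleep(0.5)

# Paste the path from the clipboard instead of typing it
# (the c:\tmp folder must already exist)
pyperclip.copy(r'c:\tmp\hello.txt')  # raw string, so \ needs no escaping
pyautogui.hotkey('ctrl', 'v')
time.sleep(0.2)

# Confirm the dialog
pyautogui.press('enter')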
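
The fixed time.sleep calls in the script above are its most fragile part: too short and the window is not there yet, too long and the script is needlessly slow. One common alternative is to poll for a condition with a timeout. The helper below is a minimal sketch of that idea; wait_until is a hypothetical name, not part of pyautogui or pygetwindow, and it uses only the standard library.

```python
import time


def wait_until(predicate, timeout=5.0, interval=0.1):
    """Poll predicate() until it returns a truthy value, then return it.

    Raises TimeoutError if nothing truthy comes back within `timeout` seconds.
    """
    deadline = time.monotonic() + timeout
    while True:
        result = predicate()
        if result:
            return result
        if time.monotonic() >= deadline:
            raise TimeoutError(f"condition not met within {timeout} s")
        time.sleep(interval)
```

With this in place, the sleep and lookup for the Notepad window could become wait_until(lambda: gw.getWindowsWithTitle('Untitled - Notepad'))[0], since getWindowsWithTitle returns an empty (falsy) list until a matching window exists.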


Here are the main alternatives to pyautogui:

pynput - Lower-level, better for listeners

from pynput.keyboard import Key, Controller
keyboard = Controller()
keyboard.type('hello')
keyboard.press(Key.ctrl)
keyboard.press('s')
keyboard.release('s')
keyboard.release(Key.ctrl)

Good for: keyboard/mouse listeners, hotkey detection, background monitoring.

keyboard - Simpler API, supports hotkeys

import keyboard
keyboard.write('hello')
keyboard.press_and_release('ctrl+s')
keyboard.add_hotkey('ctrl+shift+q', lambda: print('triggered'))
keyboard.wait()  # block forever so the registered hotkey stays active

Good for: hotkey registration, simpler syntax, recording/playback.

mouse - Companion to keyboard library

import mouse
mouse.click('left')
mouse.move(100, 200)

pywinauto - Best for Windows GUI automation

from pywinauto import Application
app = Application().start('notepad.exe')
app.UntitledNotepad.Edit.type_keys('hello')
app.UntitledNotepad.menu_select('File->Save As')

Good for: interacting with Windows controls directly (buttons, menus, dialogs) without coordinates. Much more robust.

autoit (via pyautoit) - Windows classic

import autoit
autoit.run('notepad.exe')
autoit.win_wait_active('Untitled - Notepad')
autoit.send('hello')

Recommendation:

Use case                    Best choice
Windows app automation      pywinauto
Cross-platform, simple      pyautogui
Hotkey listeners            keyboard / pynput
Robust Windows scripting    pywinauto or autoit

For the Notepad example above, pywinauto would be more reliable since it targets controls directly rather than depending on window focus and timed keystrokes.
