Python uiautomation: a Windows GUI automation tool

Many tools can automate the Windows GUI, such as pywinauto, pyautogui, pywin32, AutoIt, airtest, and UIAutomation. UI Automation is an automation framework provided by Microsoft. It is available on all operating systems that support Windows Presentation Foundation (WPF) and covers a wide range of application types. This article shows how to use the Python uiautomation module, which wraps the UI Automation API.

Table of Contents

  • Environment preparation
    • uiautomation installation
    • Process Viewer
      • inspect.exe
      • Accessibility Insights
  • Control Object Model
  • uiautomation library example
    • Control Calculator
  • Reference documents

The Python uiautomation module, developed by yinkaisheng, wraps the Microsoft UI Automation API and supports automating Win32, MFC, WPF, Modern UI (Metro UI), Qt, IE, Firefox, Chrome, and Electron-based applications.

Environment preparation

uiautomation installation

uiautomation 2.0 and later supports only Python 3. Avoid Python 3.7.6 and 3.8.1, because the comtypes package does not work properly with those two releases.
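The version constraint above can be checked before installing. The following is a minimal sketch, not part of uiautomation itself; `BROKEN_VERSIONS` and `is_supported_python` are hypothetical names introduced here for illustration.

```python
import sys

# Interpreter releases that the article reports as incompatible with comtypes.
BROKEN_VERSIONS = {(3, 7, 6), (3, 8, 1)}


def is_supported_python(version=None):
    """Return True if (major, minor, micro) is usable with uiautomation 2.x.

    Defaults to the running interpreter when no version is given.
    """
    major, minor, micro = tuple(version or sys.version_info)[:3]
    if major < 3:
        return False  # uiautomation 2.x is Python 3 only
    return (major, minor, micro) not in BROKEN_VERSIONS


if __name__ == "__main__":
    print("supported" if is_supported_python() else "unsupported")
```

Passing an explicit tuple such as `(3, 7, 6)` makes it easy to check a target machine's interpreter without running the script there.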

Install uiautomation with pip:

$ pip install uiautomation

Check if the installation was successful:

$ pip list | findstr uiautomation
uiautomation 2.0.18
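The same check can be done from Python instead of `pip list | findstr`. This sketch assumes Python 3.8 or later, where `importlib.metadata` is in the standard library; `installed_version` is a hypothetical helper name.

```python
from importlib import metadata


def installed_version(package):
    """Return the installed version string of `package`, or None if it is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    version = installed_version("uiautomation")
    print("uiautomation", version if version else "is not installed")
```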

After installation, the Python Scripts directory (C:\Program Files\Python37\Scripts on my machine) contains automation.py, a script that enumerates the control tree.

You can run automation.py -h to view command help:

$ python automation.py -h
UIAutomation 2.0.18 (Python 3.7.2, 64 bit)
usage
-h show command help
-t delay time, default 3 seconds, begin to enumerate after Value seconds, this must be an integer
        you can delay a few seconds and make a window active so automation can enumerate the active window
-d enumerate tree depth, this must be an integer, if it is null, enumerate the whole tree
-r enumerate from root:Desktop window, if it is null, enumerate from foreground window
-f enumerate from focused control, if it is null, enumerate from foreground window
-c enumerate the control under cursor, if depth is < 0, enumerate from its ancestor up to depth
-a show ancestors of the control under cursor
-n show control full name, if it is null, show first 30 characters of control's name in console,
        always show full name in log file @AutomationLog.txt
-p show process id of controls

if UnicodeError or LookupError occurs when printing,
try to change the active code page of console window by using chcp or see the log file @AutomationLog.txt
chcp, get current active code page
chcp 936, set active code page to gbk
chcp 65001, set active code page to utf-8

examples:
automation.py -t3
automation.py -t3 -r -d1 -m -n
automation.py -c -t3
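A similar tree dump can be produced from your own code. The sketch below is an illustration rather than a reimplementation of automation.py: `format_tree` and `dump_foreground_window` are hypothetical names, and the 30-character name limit mirrors the default described for the `-n` option. The formatting function only needs objects with `ControlTypeName` and `Name` attributes, so it can be tried without a Windows desktop.

```python
from types import SimpleNamespace


def format_tree(nodes, name_limit=30):
    """Render (control, depth) pairs as an indented outline.

    Each control only needs ControlTypeName and Name attributes.
    Pass name_limit=None to keep full names.
    """
    lines = []
    for control, depth in nodes:
        name = control.Name or ""
        if name_limit is not None:
            name = name[:name_limit]
        lines.append(f"{'    ' * depth}{control.ControlTypeName}: {name!r}")
    return "\n".join(lines)


def dump_foreground_window(max_depth=2):
    """Windows only: outline the foreground window down to max_depth levels."""
    import uiautomation as auto  # imported lazily so format_tree works anywhere

    window = auto.GetForegroundControl()
    return format_tree(auto.WalkControl(window, includeTop=True, maxDepth=max_depth))


if __name__ == "__main__":
    # Stand-in controls, so the formatter can be demonstrated on any platform.
    fake_nodes = [
        (SimpleNamespace(ControlTypeName="WindowControl", Name="Calculator"), 0),
        (SimpleNamespace(ControlTypeName="ButtonControl", Name="One"), 1),
    ]
    print(format_tree(fake_nodes))
```

On Windows, calling `dump_foreground_window()` after a short delay gives you time to activate the window you want to inspect, much like the `-t` option.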

Process Viewer

To automate a Windows GUI, you first need a viewer tool to locate the interface elements you want to control. Many such tools exist; Microsoft's inspect.exe and Accessibility Insights are recommended.

inspect.exe

inspect.exe ships with the Windows SDK. It lets you select any UI element and view its accessibility data, such as its UI Automation properties and control patterns, and navigate the element tree.

The Windows SDK download address is: https://developer.microsoft.com/en-us/windows/downloads/sdk-archive/

Alternatively, download inspect.exe directly from: https://github.com/yinkaisheng/Python-UIAutomation-for-Windows/tree/master/inspect

The 64-bit version of inspect.exe is also available there.

Accessibility Insights

Accessibility Insights is an accessibility testing tool developed by Microsoft. It helps developers test the accessibility of web apps, Windows desktop apps, and Android apps to ensure they meet accessibility standards.

The control attribute information shown by Accessibility Insights is not as comprehensive as that of inspect.exe, but the tool is smoother to use. Download it from: https://accessibilityinsights.io/downloads/

Control Object Model

The Microsoft UI Automation API defines the supported control types and the control patterns (Pattern) that each one exposes. The full mapping is available at: https://learn.microsoft.com/en-us/windows/win32/winauto/uiauto-controlpatternmapping

| Control type | Required patterns | Optional patterns | Unsupported patterns |
|---|---|---|---|
| Button | None | ExpandCollapse, Invoke, Toggle, Value | None |
| Calendar | Grid, Table | Scroll, Selection | Value |
| CheckBox | Toggle | None | None |
| Edit | None | RangeValue, Text, Value | None |
| List | None | Grid, MultipleView, Scroll, Selection | Table |
| ListItem | SelectionItem | CustomNavigation, ExpandCollapse, GridItem, Invoke, ScrollItem, Toggle, Value | None |
| Menu | None | None | None |
| MenuBar | None | Dock, ExpandCollapse, Transform | None |
| MenuItem | None | ExpandCollapse, Invoke, SelectionItem, Toggle | None |
| RadioButton | SelectionItem | None | Toggle |
| SplitButton | ExpandCollapse, Invoke | None | None |
| Tab | Selection | Scroll | None |
| TabItem | SelectionItem | None | Invoke |
| Table | Grid, GridItem, Table, TableItem | None | None |
| Text | None | GridItem, TableItem, Text | Value |
| TitleBar | None | None | None |
| ToolBar | None | Dock, ExpandCollapse, Transform | None |
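When writing a script, it helps to know whether a pattern is guaranteed for a control type before calling it. The sketch below encodes a few rows of the table above as plain data; `PATTERN_TABLE` and `pattern_support` are hypothetical names used only for illustration, and the lookup does not query a live control.

```python
# A few rows copied from the table above: (required, optional, unsupported).
PATTERN_TABLE = {
    "Button": ((), ("ExpandCollapse", "Invoke", "Toggle", "Value"), ()),
    "CheckBox": (("Toggle",), (), ()),
    "Edit": ((), ("RangeValue", "Text", "Value"), ()),
    "RadioButton": (("SelectionItem",), (), ("Toggle",)),
    "TabItem": (("SelectionItem",), (), ("Invoke",)),
    "Text": ((), ("GridItem", "TableItem", "Text"), ("Value",)),
}


def pattern_support(control_type, pattern):
    """Classify a pattern for a control type according to the table.

    Returns "required", "optional", "unsupported", or "unlisted" when the
    table says nothing about that combination.
    """
    required, optional, unsupported = PATTERN_TABLE[control_type]
    if pattern in required:
        return "required"
    if pattern in optional:
        return "optional"
    if pattern in unsupported:
        return "unsupported"
    return "unlisted"


if __name__ == "__main__":
    print(pattern_support("CheckBox", "Toggle"))
    print(pattern_support("Button", "Invoke"))
```

An "optional" result means a script should verify that the specific control actually exposes the pattern before relying on it.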

The Python uiautomation library wraps each control type and pattern defined by the UI Automation API.

Let’s look at an example of using python uiautomation to operate the calculator that comes with Windows.

uiautomation library example

Control Calculator

You can use inspect.exe to locate calculator elements:

A sample script is as follows:

import logging
import subprocess

import uiautomation as auto


class uiautoCalc:
    """Drive the Windows Calculator through uiautomation."""

    def __init__(self):
        logging.basicConfig(level=logging.INFO)
        self.logger = logging.getLogger(__name__)
        auto.uiautomation.DEBUG_SEARCH_TIME = True
        auto.uiautomation.SetGlobalSearchTimeout(2)  # global search timeout, in seconds
        # Name must match the window title exactly ('Calculator' on English-language Windows).
        self.calcWindow = auto.WindowControl(searchDepth=1, Name='Calculator', desc='calculator window')
        if not self.calcWindow.Exists(0, 0):
            subprocess.Popen('calc')  # Calculator is not running yet, so launch it
            self.calcWindow = auto.WindowControl(
                searchDepth=1, Name='Calculator', desc='calculator window')
        self.calcWindow.SetActive()  # bring the window to the foreground
        self.calcWindow.SetTopmost(True)  # keep it above other windows

    def gotoScientific(self):
        self.calcWindow.ButtonControl(AutomationId='TogglePaneButton', desc='Open Navigation').Click(waitTime=0.01)
        self.calcWindow.ListItemControl(AutomationId='Scientific', desc='Select Scientific Calculator').Click(waitTime=0.01)
        clearButton = self.calcWindow.ButtonControl(AutomationId='clearEntryButton', desc='Click CE to clear the input')
        if clearButton.Exists(0, 0):
            clearButton.Click(waitTime=0)
        else:
            self.calcWindow.ButtonControl(AutomationId='clearButton', desc='Click C to clear the input').Click(waitTime=0.01)

    def getKeyControl(self):
        # Map each button's AutomationId to the character it types.
        automationId2key = {
            'num0Button': '0', 'num1Button': '1', 'num2Button': '2', 'num3Button': '3', 'num4Button': '4',
            'num5Button': '5', 'num6Button': '6', 'num7Button': '7', 'num8Button': '8', 'num9Button': '9',
            'decimalSeparatorButton': '.', 'plusButton': '+', 'minusButton': '-',
            'multiplyButton': '*', 'divideButton': '/', 'equalButton': '=',
            'openParenthesisButton': '(', 'closeParenthesisButton': ')'}
        calckeys = self.calcWindow.GroupControl(ClassName='LandmarkTarget')
        keyControl = {}
        for control, depth in auto.WalkControl(calckeys, maxDepth=3):
            if control.AutomationId in automationId2key:
                self.logger.info(control.AutomationId)
                keyControl[automationId2key[control.AutomationId]] = control
        return keyControl

    def calculate(self, expression, keyControl):
        expression = ''.join(expression.split())  # drop all whitespace
        if not expression.endswith('='):
            expression += '='
        for char in expression:  # click the button for each character
            keyControl[char].Click(waitTime=0)
        self.calcWindow.SendKeys('{Ctrl}c', waitTime=0.1)  # copy the result to the clipboard
        return auto.GetClipboardText()

    def calc_demo(self):
        """Calculator example
        :return :
        """
        self.gotoScientific() # select scientific calculator
        keyControl = self.getKeyControl() # get key control
        result = self.calculate('(1 + 2 - 3) * 4 / 5.6 - 7', keyControl)
        print('(1 + 2 - 3) * 4 / 5.6 - 7 =', result)
        self.calcWindow.CaptureToImage('calc.png', x=7, y=0, width=-14, height=-7)  # screenshot
        self.calcWindow.GetWindowPattern().Close()  # close the calculator window

if __name__ == "__main__":
    ui = uiautoCalc()
    ui.calc_demo()
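The preprocessing step in calculate (strip whitespace, append '=', then click one button per character) can be tested without a calculator window. The sketch below isolates that step and adds an up-front check for characters that have no button; `SUPPORTED_KEYS` and `expression_to_keys` are hypothetical names, and the key set simply lists the characters mapped in getKeyControl.

```python
# Characters that getKeyControl maps to calculator buttons.
SUPPORTED_KEYS = frozenset("0123456789.+-*/=()")


def expression_to_keys(expression, supported=SUPPORTED_KEYS):
    """Turn an expression into the ordered list of keys to click.

    Whitespace is removed and a trailing '=' is added if missing.
    Raises ValueError if any character has no matching button.
    """
    keys = "".join(expression.split())
    if not keys.endswith("="):
        keys += "="
    unknown = sorted(set(keys) - set(supported))
    if unknown:
        raise ValueError(f"no calculator button for: {unknown}")
    return list(keys)


if __name__ == "__main__":
    print(expression_to_keys("(1 + 2) * 3"))
```

Validating the whole expression first avoids a KeyError halfway through a click sequence, which would leave the calculator in a partially entered state.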

Script execution animation:

Reference documents

  1. https://github.com/pywinauto/pywinauto

  2. https://cloud.tencent.com/developer/article/2213048

  3. https://github.com/yinkaisheng/Python-UIAutomation-for-Windows

  4. Python UIAutomation documentation: https://github.com/yinkaisheng/Python-UIAutomation-for-Windows/blob/master/readme_cn.md

  5. https://www.cnblogs.com/Yinkaisheng/p/3444132.html

  6. GitHub – jacexh/pyautoit: Python binding for AutoItX3.dll

  7. GitHub – mhammond/pywin32: Python for Windows (pywin32) Extensions

  8. Accessibility tools – Inspect – Win32 apps | Microsoft Learn

  9. Accessibility Insights

