Appium mobile automation development

Appium principles and installation

Learning website: https://www.byhy.net/tut/auto/appium/02/#appium-inspector

Appium uses and features

Appium is a mobile app (mobile application) automation tool.

What is mobile app automation used for?

  • Automating repetitive tasks

    For example, a WeChat customer service robot.

  • Crawling

    That is, automatically collecting information through a mobile phone.

    Why not crawl via the web or HTTP? Some systems have no web pages, and crawling them directly over HTTP is inconvenient.

  • Automated testing

    Many companies have this need.

Features of the Appium automation solution:

  • Open source and free

  • Supports multiple platforms

    Both iOS (Apple) and Android app automation are supported.

  • Supports many types of automation

    Automation of native iOS and Android application interfaces

    Automation of applications with embedded WebViews

    Automation of websites in mobile browsers

    Automation of Flutter applications

  • Supports multiple programming languages

    Like Selenium, it can be called from a variety of programming languages to develop automation programs.

Principles of automation

Let us first look at the schematic diagram of Appium automation.

[image: schematic diagram of the Appium automation architecture]

Does this picture look familiar?

Yes, it is very similar to the Selenium schematic, because the Appium automation architecture is based on Selenium.

The picture contains three main parts: the automation program, Appium Server, and the mobile device.

  • Automation program

    The automation program is what we develop to implement specific mobile automation functions.

    To issue the commands that control the phone, it needs a client library.

    Like Selenium, the Appium organization provides client libraries for multiple programming languages, including Java, Python, JavaScript, and Ruby, so that developers can work in the language they prefer.

    We install a client library and call it to send automation instructions to the mobile phone.

  • Appium Server

    Appium Server is a program developed by the Appium organization. It manages the mobile automation environment, forwards the control instructions of the automation program to the mobile phone, and forwards the response messages from the mobile phone back to the automation program.

  • Mobile device

    The mobile devices discussed here are not just mobile phones. They include all Apple and Android mobile devices, such as phones, tablets, and smart watches.

    For simplicity, we refer to all of them as the mobile phone.

    The mobile phone also contains the app that we want to control automatically.

    Why can a mobile device receive and process automation commands?

    Because Appium Server installs an automation agent program on the mobile phone. The agent waits for automation instructions and executes them.

For example, to simulate a user clicking a button, the Appium automation system works as follows:

  • The automation program calls the corresponding function of the client library, which sends the click-element command (encapsulated in an HTTP message) to Appium Server.
  • Appium Server forwards the command to the automation agent on the mobile phone.
  • The automation agent calls the automation library of the mobile platform, performs the click, and returns the result to Appium Server.
  • Appium Server forwards the result to the automation program.
  • Once the automation program learns that the operation succeeded, it continues with the rest of the automation process.
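The first step of this flow can be sketched as plain data. This is a minimal illustration, assuming the W3C WebDriver wire format that Appium clients speak. The server address, session id, and element id are hypothetical placeholders, and nothing is actually sent.

```python
import json

# Hypothetical address; Appium 1.x servers usually add a /wd/hub prefix.
APPIUM_SERVER = "http://127.0.0.1:4723"


def build_click_request(session_id, element_id, server=APPIUM_SERVER):
    """Describe the HTTP message a client library sends for 'click element'."""
    return {
        "method": "POST",
        # W3C WebDriver endpoint for clicking an element
        "url": f"{server}/session/{session_id}/element/{element_id}/click",
        # The click command carries an empty JSON object as its body
        "body": json.dumps({}),
    }


# Placeholder ids: real ones are returned by the server.
request = build_click_request("SESSION_ID", "ELEMENT_ID")
print(request["method"], request["url"])
```

In practice the client library builds and sends this message for us when we call click() on an element.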

Which library does the automation agent use to carry out the automation?

For an Apple phone, it uses Apple’s XCUITest framework (iOS 9.3 or later).

For an Android phone, it uses Android’s UIAutomator framework (Android 4.2 and later).

These frameworks provide libraries that run on the mobile device. Programs call these libraries to control the device and its apps the way a person would: clicking, sliding, simulating key presses, and so on.

Setting up the automation environment

  • Install Python 3.11.5

  • Install the client library

    pip install appium-python-client==2.0.0

    This comes with selenium 4.0.0. Do not install a higher version.

  • Install Appium Server

  • Install JDK 1.8

  • Install the Android SDK

  • Connect the mobile phone

Finding the application package and activity

Without the apk

If the application is already installed on your phone, open it on the phone and go to the interface you want to operate.

Then execute

adb shell dumpsys activity recents | find "intent={"

The first line of the output belongs to the current application. Pay special attention to the last field:

cmp=tv.danmaku.bili/.ui.splash.SplashActivity

The package name of the application is tv.danmaku.bili

The startup Activity of the application is .ui.splash.SplashActivity
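The same split can be done in code. The snippet below is a small sketch, assuming the output line contains a cmp=package/activity fragment like the one shown above. The other fields in the sample line are illustrative only.

```python
import re

# Matches the "cmp=<package>/<activity>" fragment of a dumpsys intent line.
CMP_PATTERN = re.compile(r"cmp=([\w.]+)/([\w.$]+)")


def parse_component(line):
    """Return (package, activity) from a line containing cmp=..., or None."""
    match = CMP_PATTERN.search(line)
    if match is None:
        return None
    return match.group(1), match.group(2)


# Sample line shaped like the adb output (act/flg values are made up).
sample = ("intent={act=android.intent.action.MAIN flg=0x10200000 "
          "cmp=tv.danmaku.bili/.ui.splash.SplashActivity}")
print(parse_component(sample))
```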

With the apk

If you have the apk, execute the following in a command line window

d:\tools\androidsdk\build-tools\29.0.3\aapt.exe dump badging d:\tools\apk\bili.apk | find "package: name="

The output contains the package name of the application.

package: name='tv.danmaku.bili' versionCode='5531000' versionName='5.53.1' platformBuildVersionName='5.53.1' compileSdkVersion='28' compileSdkVersionCodename='9'

Then execute

d:\tools\androidsdk\build-tools\29.0.3\aapt.exe dump badging d:\tools\apk\bili.apk | find "launchable-activity"

The output contains the startup Activity of the application.

launchable-activity: name='tv.danmaku.bili.ui.splash.SplashActivity' label='' icon=''
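Once the package name and startup Activity are known, they are typically passed to Appium Server as session capabilities. The following is a minimal sketch rather than a definitive setup. The helper names build_capabilities and connect are hypothetical, the server URL assumes a local Appium 1.x style server, and the call shape assumes the appium-python-client 2.0.0 pinned above (newer client versions expect an options object instead).

```python
def build_capabilities(app_package, app_activity, device_name="Android"):
    """Collect the session capabilities for an already installed Android app."""
    return {
        "platformName": "Android",
        "deviceName": device_name,         # placeholder device name
        "automationName": "UiAutomator2",
        "appPackage": app_package,
        "appActivity": app_activity,
        "noReset": True,                   # keep app data between sessions
    }


def connect(capabilities, server_url="http://localhost:4723/wd/hub"):
    """Open a session; needs a running Appium Server and a connected phone."""
    # Imported lazily so the capabilities can be built without Appium installed.
    from appium import webdriver
    return webdriver.Remote(server_url, capabilities)


caps = build_capabilities("tv.danmaku.bili", ".ui.splash.SplashActivity")
print(caps["appPackage"], caps["appActivity"])
```

Calling connect(caps) would return the driver object used in the examples that follow.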

Locating elements

Coding rules

As the sample code shows, operating an interface element works as in Selenium Web automation: you must first locate (select) the element.

Appium is based on Selenium, so the basic rules for locating elements are the same as in Selenium code:

  • The find_element method returns the first element that meets the conditions and throws an exception if none is found.
  • The find_elements method returns a list of all elements that meet the conditions. If no element is found, it returns an empty list.
  • When such a method is called on the WebDriver object, the search scope is the entire interface.
  • When such a method is called on a WebElement object, the search scope is the nodes below that element.

The older find_element_by_XXX and find_elements_by_XXX shorthands follow the same rules but are deprecated in Selenium 4.
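These rules suggest a simple pattern for optional elements: search with find_elements and check whether the list is empty, instead of catching an exception. The sketch below illustrates this. FakeScope is a hypothetical stand-in for a driver or element so the snippet runs without a phone.

```python
def find_first_or_none(scope, by, value):
    """Return the first matching element, or None instead of raising.

    `scope` may be a driver (search the whole interface) or an element
    (search only below that element); both expose find_elements().
    """
    matches = scope.find_elements(by, value)  # empty list when nothing matches
    return matches[0] if matches else None


class FakeScope:
    """Hypothetical stand-in for a driver or element, for demonstration only."""

    def __init__(self, elements):
        self._elements = elements

    def find_elements(self, by, value):
        return [e for e in self._elements if e.get(by) == value]


screen = FakeScope([{"id": "expand_search"}, {"id": "title"}])
print(find_first_or_none(screen, "id", "expand_search"))
print(find_first_or_none(screen, "id", "missing"))
```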

Tools for viewing interface elements

In Selenium Web automation, we use the browser’s developer tools to inspect elements and then locate them by their characteristics (attributes and position).

To automate mobile applications, Appium also needs tools for viewing the characteristics of interface elements.

Commonly used tools are uiautomatorviewer in the Android SDK and Appium Inspector in Appium Desktop.

uiautomatorviewer (it does not seem to be included in SDK 34)

The most commonly used tool for viewing Android app interface elements is uiautomatorviewer, which is in the tools\bin directory of the Android SDK.

As in Selenium, we locate and select elements by their characteristics, including

  • Element attributes
  • The relative position of the element (relative to its parent element, sibling elements, etc.)

Appium Inspector (depends on Appium Desktop)

The Appium Inspector in Appium Desktop can also be used to view elements.

One of its advantages is that it can directly verify whether a selection expression locates the element.

Methods for locating elements

According to ID

In the Selenium Web automation tutorial, we said that locating by ID is preferred whenever possible. An ID is usually unique, so selecting by ID is more efficient.

When automating Android applications, you can also search by ID.

Here, however, the ID is the resource-id attribute of the Android interface element.

Use the following code

from appium.webdriver.common.appiumby import AppiumBy

driver.find_element(AppiumBy.ID, 'expand_search')

According to class name

The class attribute of an Android interface element is the type of the element, similar to the tag name on the web, so it is usually not unique.

Selecting by class therefore usually returns multiple elements rather than one.

Of course, if you are sure that the current interface contains only one element of that type, you can select it uniquely by class.

Use the following code

from appium.webdriver.common.appiumby import AppiumBy

driver.find_element(
  AppiumBy.CLASS_NAME,
  'android.widget.TextView')

According to accessibility ID

The content-desc attribute of an element describes the role of the element.

If the element you are looking for has a content-desc attribute, you can use it to locate the element.

Use the following code

from appium.webdriver.common.appiumby import AppiumBy

driver.find_element(AppiumBy.ACCESSIBILITY_ID, 'Find someone')

XPath

Appium also supports selecting elements via XPath.

However, its reliability and performance are not as good as in Selenium Web automation. In Web automation, XPath support is implemented by the browser, whereas Appium’s XPath support is implemented by Appium Server.

After all, browsers are much more mature products than Appium.

XPath is a standard syntax, so the rules for expressions here are the same as the XPath syntax learned for Selenium, for example

from appium.webdriver.common.appiumby import AppiumBy

driver.find_element(AppiumBy.XPATH, '//ele1/ele2[@attr="value"]')

Note:

In Selenium automation, each node name in an XPath expression is an HTML tag name.

In Appium, each node name in an XPath expression is the class attribute value of the element.

For example, to select all text nodes, use the following code

driver.find_elements(AppiumBy.XPATH, '//android.widget.TextView')

Android UIAutomator

Selecting elements by id, class name, accessibility id, or XPath is actually implemented with the API functions of the Android UIAutomator framework.

See the official Google Android documentation: https://developer.android.google.cn/training/testing/ui-automator

In other words, Appium Server forwards these locating requests to the automation agent on the phone, which converts them into the corresponding UIAutomator function calls.

Our automation program can also tell the automation agent directly to run Java code that calls the UI Automator API, which gives the most direct automation control.

Elements are located mainly through the methods of the UiSelector class, for example

from appium.webdriver.common.appiumby import AppiumBy

code = 'new UiSelector().text("Popular").className("android.widget.TextView")'
ele = driver.find_element(AppiumBy.ANDROID_UIAUTOMATOR, code)
ele.click()

This locates the element using two conditions: the text attribute and the className attribute.

UiSelector offers some selection methods that solve problems the previous approaches cannot.

For example:

  • text

    Finds elements by their text attribute.

  • textContains

    Finds elements whose text contains a given string.

  • textStartsWith

    Finds elements whose text starts with a given string.

  • textMatches

    Selects elements with a regular expression, as follows

    code = 'new UiSelector().textMatches("^My.*")'

UiSelector’s instance and index can also be used to locate elements. Both count from 0. They differ as follows:

  • instance is the element's position among all elements in the matched result
  • index is the element's position among the child nodes of its parent, similar to *[n] in XPath
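Because a UiSelector expression is just a string of chained method calls, it can be assembled programmatically. The helper below is a hypothetical convenience, not part of Appium. It assumes string arguments are quoted while integers and booleans are passed through.

```python
def ui_selector(**conditions):
    """Build a UiSelector expression string from keyword conditions.

    String values are quoted; booleans and integers are passed through.
    """
    parts = ["new UiSelector()"]
    for method, value in conditions.items():
        if isinstance(value, str):
            # Escape backslashes and double quotes for the Java string literal
            escaped = value.replace("\\", "\\\\").replace('"', '\\"')
            parts.append(f'.{method}("{escaped}")')
        elif isinstance(value, bool):
            parts.append(f".{method}({str(value).lower()})")
        else:
            parts.append(f".{method}({value})")
    return "".join(parts)


# Third TextView in the matched result (instance counts from 0).
code = ui_selector(className="android.widget.TextView", instance=2)
print(code)
```

The resulting string can then be passed to driver.find_element(AppiumBy.ANDROID_UIAUTOMATOR, code).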

UiSelector’s childSelector can select descendant elements, for example

code = 'new UiSelector().resourceId("tv.danmaku.bili:id/recycler_view").childSelector(new UiSelector().className("android.widget.TextView"))'

ele = driver.find_element(AppiumBy.ANDROID_UIAUTOMATOR, code)

Note: The parentheses after childSelector should enclose the entire child UiSelector expression.

There is currently a bug: only the first element that meets the conditions can be found. See the issue in Appium’s GitHub repository:

https://github.com/appium/java-client/issues/150

Interface operations and adb commands

Interface operations

click

Use the click method of the WebElement object.

tap

The tap method of the WebDriver object is similar to click: both click the interface.

The biggest difference is that tap targets coordinates rather than a located element.

To make sure the automation code runs correctly on phones of every resolution, we should usually use click.

Sometimes, however, an element is hard to locate with the usual methods. In that case we can use tap to click by coordinates.

Since tap clicks by coordinates, how do we find the coordinates of an element?

Recall that the inspector shows a bounds attribute among the attributes of an element.

It gives the coordinates of the upper-left and lower-right corners of the element.

We can also move the cursor over the element in uiautomatorviewer and read the properties shown on the right.

The tap method can be called like this

driver.tap([(850,1080)],300)

It takes two parameters:

  • The first parameter is a list of the coordinates to tap.

    It can hold up to 5 entries, representing 5 fingers tapping 5 coordinates, which is why it is a list.

    To simulate a single finger tapping the screen, the list needs only one entry.

  • The second parameter is how long the finger stays on the screen, in milliseconds.

    If the duration is too long, the tap becomes a long press.
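The bounds attribute mentioned above can be converted into a tap coordinate by taking the center of the rectangle. This is a minimal sketch, assuming the bounds string has the form [left,top][right,bottom] as shown by the viewing tools. The sample value is hypothetical.

```python
import re

BOUNDS_PATTERN = re.compile(r"\[(\d+),(\d+)\]\[(\d+),(\d+)\]")


def bounds_center(bounds):
    """Convert a bounds string such as '[0,100][200,300]' to its center point."""
    match = BOUNDS_PATTERN.fullmatch(bounds.strip())
    if match is None:
        raise ValueError(f"unexpected bounds format: {bounds!r}")
    left, top, right, bottom = map(int, match.groups())
    return (left + right) // 2, (top + bottom) // 2


# Hypothetical bounds value copied from the inspector.
x, y = bounds_center("[800,1040][900,1120]")
print(x, y)
```

The result can be used as driver.tap([(x, y)], 300).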

Input

Text is most commonly entered with the send_keys method of the WebElement object. This is covered in the sample code and is not repeated here.

Get interface text information

The text of a WebElement object can be read from its .text property. This is also covered in the sample code and is not repeated here.

Sliding

When testing mobile apps, we often need to slide the interface.

How can sliding be simulated? The swipe method of the WebDriver object provides this function.

For example

driver.swipe(start_x=x, start_y=y1, end_x=x, end_y=y2, duration=800)

The first four parameters are the x and y coordinates of the start point and end point of the slide.

The fifth parameter, duration, is the time in milliseconds taken to slide from the start point to the end point.

This time matters a great deal. For the same distance on the screen, a shorter duration means a faster slide.

For example, on a scrolling news interface, a quick swipe acts as a fling, and inertia makes the content scroll a long way.
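Hard-coded coordinates only fit one screen resolution, so swipe coordinates are often derived from the screen size. The helper below is a hypothetical sketch. On a device the size could come from driver.get_window_size(), which returns a dict with 'width' and 'height'.

```python
def vertical_swipe(width, height, start_percent=80, end_percent=20):
    """Return (start_x, start_y, end_x, end_y) for a swipe along the center line.

    With the defaults the finger moves from 80% to 20% of the screen height,
    which scrolls the content upward.
    """
    x = width // 2
    start_y = height * start_percent // 100
    end_y = height * end_percent // 100
    return x, start_y, x, end_y


# Hypothetical 1080x1920 screen.
start_x, start_y, end_x, end_y = vertical_swipe(1080, 1920)
print(start_x, start_y, end_x, end_y)
```

The values can then be passed on as driver.swipe(start_x, start_y, end_x, end_y, duration=800).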

Keys

The earlier sample code called the press_keycode method to simulate key presses, covering both the physical keys and the keyboard keys of an Android phone.

As shown in the following code

from appium.webdriver.extensions.android.nativekey import AndroidKey

# Press the Enter key to confirm the search
driver.press_keycode(AndroidKey.ENTER)

For the key definitions, see https://github.com/appium/python-client/blob/master/appium/webdriver/extensions/android/nativekey.py

Long press, double click, move

Appium’s TouchAction class provides more phone gestures, such as long press, double click, and move.

See the comments in the source code: https://github.com/appium/python-client/blob/master/appium/webdriver/common/touch_action.py

The following is an example of a long press

from appium.webdriver.common.touch_action import TouchAction
#...
actions = TouchAction(driver)
actions.long_press(element)
actions.perform()

View notification bar

  • Open the notification bar

    On Android phones, you can view notifications by sliding down from the top of the screen.

    We have just covered sliding, so you can try this yourself. The key is to find the start point and the distance of the slide.

    More conveniently, the following code opens the notification bar directly

    driver.open_notifications()

    Elements in the notification bar are automated in the same way as the app interface elements introduced earlier.

  • Close the notification bar

    To close the notification bar, send the Back key using the key simulation introduced earlier.

adb commands

Here we introduce adb, a command line tool in the Android SDK.

adb stands for Android Debug Bridge, and it is very widely used.

It communicates with Android devices and can perform a variety of device operations.

For example, it can install and debug applications, transfer files, and even open a shell on the device, much like a remote login.

adb is in the platform-tools directory of the SDK. Make sure that directory is in the PATH environment variable.

Appium’s automation of Android relies heavily on adb. Many internal operations during automation, such as obtaining device information, transferring files to the phone, installing an apk, and starting programs, are usually carried out through adb.

Now that we know about adb commands, how can our automation programs use them?

Since adb is a command, we can call it from Python with os.system() or subprocess to meet various automation needs.

For example, during automation we may need to take a screenshot of the phone and download it to a specified directory. We can write this in our Python program

import os
os.system('adb shell screencap /sdcard/screen3.png && adb pull /sdcard/screen3.png')
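The same two steps can also be written with subprocess, which avoids shell quoting and reports failures as exceptions. This is a sketch under the assumption that adb is on the PATH. The function names are hypothetical, and take_screenshot is defined but not called here because it needs a connected device.

```python
import subprocess


def screenshot_commands(remote_path, local_dir):
    """Return the two adb invocations as argument lists (no shell involved)."""
    return [
        ["adb", "shell", "screencap", remote_path],  # save the screenshot on the phone
        ["adb", "pull", remote_path, local_dir],     # copy it to the computer
    ]


def take_screenshot(remote_path="/sdcard/screen3.png", local_dir="."):
    """Run both commands in order; raise CalledProcessError if either fails."""
    for command in screenshot_commands(remote_path, local_dir):
        subprocess.run(command, check=True)


# Show the commands that would be executed.
for command in screenshot_commands("/sdcard/screen3.png", "."):
    print(" ".join(command))
```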

In particular, adb also gives access to am (activity manager) and pm (package manager), which can start activities, force-stop processes, broadcast intents, modify device screen attributes, list applications, uninstall applications, and so on.

The official documentation describes the adb commands in more detail.

Below we list the adb commands for some scenarios

View connected devices

adb devices -l

List files and transfer files

  • List a directory

    adb shell ls /sdcard

  • Upload

    adb push wv.apk /sdcard/wv.apk

  • Download

    adb pull /sdcard/new.txt

Screenshot

adb shell screencap /sdcard/screen.png

The screenshot file is stored on your phone and can be downloaded using adb pull.

shell

Opening a shell on the mobile device works like a remote login and lets you run various commands on the connected device.

Execute adb shell, then run any of the Linux commands supported by Android, such as ps, netstat, netstat -an|grep 4724, pwd, ls, cd, and rm.

Execute exit to leave the shell.

Official adb documentation: https://developer.android.google.cn/studio/command-line/adb.html#devicestatus