Virtual gesture-recognition mouse control based on MediaPipe.

This is an original article; please credit the source when reprinting.

1. Project introduction

The blogger will not go into the internals of how MediaPipe works here. The original idea was to have the camera recognize the fingers and control the mouse, so that a movie could be controlled by gestures, and the effect is basically usable. In short, MediaPipe detects the hand landmarks: the index-finger tip is used to move the mouse, and when the distance between the index-finger and middle-finger tips drops below a threshold it is treated as a click event.
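
The core of the click detection can be sketched in a few lines (a minimal illustration, not the project code; index_tip and middle_tip are assumed to be (x, y) pixel coordinates of the two fingertip landmarks):

import math

CLICK_THRESHOLD = 40  # pixels; the project uses a threshold of this order

def is_click(index_tip, middle_tip, threshold=CLICK_THRESHOLD):
    # A click is registered when the index and middle fingertips are close enough together
    x1, y1 = index_tip
    x2, y2 = middle_tip
    return math.hypot(x2 - x1, y2 - y1) < threshold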

2. Environment setup

I am using the Miniconda3 terminal. I have described how to install it in an earlier post; if anything is unclear, please Baidu it yourself.

1. Open the terminal


2. Create mediapipe virtual environment
conda create -n mediapipe_env python=3.8

During creation you will be prompted to confirm; enter y.


Wait a moment and the environment will be created. If an error occurs, switch the conda source (mirror) yourself.


Follow the prompts to activate the environment

3. Activate the environment
conda activate mediapipe_env

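To confirm that the environment is active, check the Python version from the activated terminal:

python --version

It should report Python 3.8.x, the version the environment was created with.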

3. Dependency installation

Before writing code, you need to install some dependencies such as mediapipe. Make sure the environment is activated before installing; a quick import check is shown after the list below.

1. Install mediapipe
pip install mediapipe -i https://pypi.tuna.tsinghua.edu.cn/simple
2. Install numpy
pip install numpy -i https://pypi.tuna.tsinghua.edu.cn/simple
3. Install autopy
pip install autopy -i https://pypi.tuna.tsinghua.edu.cn/simple
4. Install opencv
pip install opencv-python -i https://pypi.tuna.tsinghua.edu.cn/simple
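
Once the four packages are installed, a quick sanity check (a one-liner sketch; the versions printed depend on your installation) is to import them from the activated environment:

python -c "import cv2, mediapipe, numpy, autopy; print(cv2.__version__, mediapipe.__version__, numpy.__version__)"

If the command prints the versions without an ImportError, the dependencies are ready.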

4. Code and testing

The code was edited directly in Notepad++; depending on your own habits you can use VS Code, PyCharm, or another editor.

The scripts are run directly from the terminal. If you want to use PyCharm or similar, configure the interpreter environment yourself.

The code is given below.

1. Virtual mouse

AiVirtualMouse.py

import cv2
import HandTrackingModule as htm
import autopy
import numpy as np
import time


##############################
wCam, hCam = 1080, 720
frameR = 100
smoothening = 5  # cursor smoothing factor
##############################
cap = cv2.VideoCapture(0) # If you use the laptop's own camera, the number is 0. If you use an external camera, change it to 1 or other numbers.
cap.set(3, wCam)
cap.set(4, hCam)
pTime = 0
plocX, plocY = 0, 0
clocX, clocY = 0, 0

detector = htm.handDetector()
wScr, hScr = autopy.screen.size()
# print(wScr, hScr)

while True:
    success, img = cap.read()
    # 1. Detect the hand and get the key point coordinates of the finger
    img = detector.findHands(img)
    cv2.rectangle(img, (frameR, frameR), (wCam - frameR, hCam - frameR), (0, 255, 0), 2)
    lmList = detector.findPosition(img, draw=False)

    # 2. Determine whether the index finger and middle finger are extended
    if len(lmList) != 0:
        x1, y1 = lmList[8][1:]
        x2, y2 = lmList[12][1:]
        fingers = detector.fingersUp()

        # 3. If only the index finger is extended, enter move mode
        if fingers[1] and not fingers[2]:
            # 4. Coordinate conversion: map the index-finger position inside the frame to mouse coordinates on the desktop
            x3 = np.interp(x1, (frameR, wCam - frameR), (0, wScr))
            y3 = np.interp(y1, (frameR, hCam - frameR), (0, hScr))

            # smoothening values
            clocX = plocX + (x3 - plocX) / smoothening
            clocY = plocY + (y3 - plocY) / smoothening

            autopy.mouse.move(wScr - clocX, clocY)  # mirror x because the camera image is not flipped horizontally
            cv2.circle(img, (x1, y1), 15, (255, 0, 255), cv2.FILLED)
            plocX, plocY = clocX, clocY

        # 5. If the index finger and middle finger are both extended, the distance between the fingers is detected. If the distance is short enough, it corresponds to a mouse click.
        if fingers[1] and fingers[2]:
            length, img, pointInfo = detector.findDistance(8, 12, img)
            if length < 40:
                cv2.circle(img, (pointInfo[4], pointInfo[5]),
                           15, (0, 255, 0), cv2.FILLED)
                autopy.mouse.click()

    cTime = time.time()
    fps = 1 / (cTime - pTime)
    pTime = cTime
    cv2.putText(img, f'fps:{int(fps)}', (15, 25),
                cv2.FONT_HERSHEY_PLAIN, 2, (255, 0, 255), 2)
    cv2.imshow("Image", img)
    if cv2.waitKey(1) & 0xFF == 27:  # press Esc to exit
        break

cap.release()
cv2.destroyAllWindows()
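
The coordinate conversion and smoothing in steps 3 and 4 can be looked at in isolation. The following sketch (the numbers are illustrative, not taken from a real run) shows how np.interp maps a fingertip position inside the green frame to screen coordinates, and how the smoothing step damps jitter by only moving part of the way towards each new target:

import numpy as np

wCam, hCam, frameR = 1080, 720, 100      # capture size and frame margin
wScr, hScr = 1920.0, 1080.0              # example screen size
smoothening = 5

# A fingertip at the centre of the active frame maps to the centre of the screen
x1, y1 = 540, 360
x3 = np.interp(x1, (frameR, wCam - frameR), (0, wScr))   # -> 960.0
y3 = np.interp(y1, (frameR, hCam - frameR), (0, hScr))   # -> 540.0

# Each new target only moves the cursor 1/smoothening of the way there,
# which filters out small detection jitter
plocX, plocY = 900.0, 500.0                  # previous cursor position
clocX = plocX + (x3 - plocX) / smoothening   # -> 912.0
clocY = plocY + (y3 - plocY) / smoothening   # -> 508.0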

HandTrackingModule.py

import cv2
import mediapipe as mp
import time
import math

class handDetector():
    def __init__(self, mode=False, maxHands=2, detectionCon=0.8, trackCon=0.8):
        self.mode = mode
        self.maxHands = maxHands
        self.detectionCon = detectionCon
        self.trackCon = trackCon

        self.mpHands = mp.solutions.hands
        self.hands = self.mpHands.Hands(self.mode, self.maxHands, self.detectionCon, self.trackCon)
        self.mpDraw = mp.solutions.drawing_utils
        self.tipIds = [4, 8, 12, 16, 20]

    def findHands(self, img, draw=True):
        imgRGB = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)
        self.results = self.hands.process(imgRGB)

        print(self.results.multi_handedness) # Get the left and right hand labels in the detection results and print them

        if self.results.multi_hand_landmarks:
            for handLms in self.results.multi_hand_landmarks:
                if draw:
                    self.mpDraw.draw_landmarks(img, handLms, self.mpHands.HAND_CONNECTIONS)
        return img

    def findPosition(self, img, draw=True):
        self.lmList = []
        if self.results.multi_hand_landmarks:
            for handLms in self.results.multi_hand_landmarks:
                for id, lm in enumerate(handLms.landmark):
                    h, w, c = img.shape
                    cx, cy = int(lm.x * w), int(lm.y * h)
                    # print(id, cx, cy)
                    self.lmList.append([id, cx, cy])
                    if draw:
                        cv2.circle(img, (cx, cy), 12, (255, 0, 255), cv2.FILLED)
        return self.lmList

    def fingersUp(self):
        fingers = []
        # Thumb: compare the x-coordinate of the tip with that of the adjacent joint
        if self.lmList[self.tipIds[0]][1] > self.lmList[self.tipIds[0] - 1][1]:
            fingers.append(1)
        else:
            fingers.append(0)

        # Rest of fingers
        for id in range(1, 5):
            if self.lmList[self.tipIds[id]][2] < self.lmList[self.tipIds[id] - 2][2]:
                fingers.append(1)
            else:
                fingers.append(0)

        # totalFingers = fingers.count(1)
        return fingers

    def findDistance(self, p1, p2, img, draw=True, r=15, t=3):
        x1, y1 = self.lmList[p1][1:]
        x2, y2 = self.lmList[p2][1:]
        cx, cy = (x1 + x2) // 2, (y1 + y2) // 2

        # Euclidean distance between the two landmarks
        length = math.hypot(x2 - x1, y2 - y1)

        if draw:
            cv2.line(img, (x1, y1), (x2, y2), (255, 0, 255), t)
            cv2.circle(img, (x1, y1), r, (255, 0, 255), cv2.FILLED)
            cv2.circle(img, (x2, y2), r, (255, 0, 255), cv2.FILLED)
            cv2.circle(img, (cx, cy), r, (0, 0, 255), cv2.FILLED)

        return length, img, [x1, y1, x2, y2, cx, cy]


def main():
    pTime = 0
    cTime = 0
    cap = cv2.VideoCapture(0)
    detector = handDetector()
    while True:
        success, img = cap.read()
        img = detector.findHands(img) # Detect gestures and draw skeleton information

        lmList = detector.findPosition(img) # Get the list of coordinate points
        if len(lmList) != 0:
            print(lmList[4])

        cTime = time.time()
        fps = 1 / (cTime - pTime)
        pTime = cTime

        cv2.putText(img, 'fps:' + str(int(fps)), (10, 70), cv2.FONT_HERSHEY_PLAIN, 3, (255, 0, 255), 3)
        cv2.imshow('Image', img)
        cv2.waitKey(1)


if __name__ == "__main__":
    main()

Virtual mouse in action: when the index finger and middle finger are brought together, a click is triggered, and the interface can be controlled entirely with the fingers.


After running, an error of the following form may appear: (arg0: int) -> mediapipe.python._framework_bindings.packet.Packet. Invoked with: 0.5

Approach:

Modify line 595 of the solution_base.py file referenced in the traceback to the following:

return getattr(packet_creator,'create_' + packet_data_type.value)(True if round(data)>0 else False)
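
An alternative that avoids editing the installed package (a sketch, assuming a recent mediapipe release where Hands() accepts these keyword arguments) is to pass the confidence values by keyword in HandTrackingModule.py, so that a float is never passed positionally into an int parameter:

# In handDetector.__init__, call Hands() with keyword arguments instead of positional ones
self.hands = self.mpHands.Hands(
    static_image_mode=self.mode,
    max_num_hands=self.maxHands,
    min_detection_confidence=self.detectionCon,
    min_tracking_confidence=self.trackCon,
)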


If there is any infringement, or if you need the complete code, please contact the blogger.
