How to screen record with Python?

Article directory

- Introduction
- The code given by GPT-3.5
- Replace screenshot function – ImageGrab.grab
- Disable imshow to solve the recursion phenomenon
- Camera recording code
- Follow-up requirements

Introduction

For screen recording I previously used Capture, a recording tool based on ffmpeg, but once the fps is raised the recording stutters badly and the sound freezes as well. I also tried calling ffmpeg's library functions from Python, but the results were not satisfactory. I downloaded several screen recorders from the Internet too, but dealing with the bundled software and advertisements that come with them is tiring. So I want to take this opportunity to study how screen recording software works, and to use it as a small practice project.

The code given by GPT-3.5

While I was still hesitating over whether to read the ffmpeg manual carefully, GPT promptly gave a reference answer: Python's PyAutoGUI library can be used to implement screen recording. Here is the simple sample code it gave:

import pyautogui
import cv2
import numpy as np

# get screen resolution
screen_size = (1920, 1080)

# Set the video encoder
fourcc = cv2.VideoWriter_fourcc(*"XVID")

# create video write object
out = cv2.VideoWriter("output.avi", fourcc, 20.0, screen_size)

# start recording
while True:
    # get screenshot
    img = pyautogui.screenshot()

    # Convert the screenshot to OpenCV format
    frame = np.array(img)
    frame = cv2.cvtColor(frame, cv2.COLOR_RGB2BGR)  # PIL gives RGB, OpenCV expects BGR

    # write video
    out.write(frame)

    # Press the q key to exit the screen recording
    if cv2.waitKey(1) == ord("q"):
        break

# Release resources
out.release()
cv2.destroyAllWindows()

From the given code, the working principle of screen recording is to capture screenshots continuously at a certain frame rate and then combine the captured images into a video at the same frame rate.
After running it in a Python environment with pyautogui installed, the unexpected happened, as expected. The code given by GPT-3.5 hangs at the console and nothing on screen changes. Typing 'q' and pressing Enter does not exit the loop, and when asked again, GPT could not offer a fix either.

After consulting some references, I found that cv2.waitKey() only receives key presses while an OpenCV window has focus. Once a window is created with imshow(), pressing the key inside that window terminates the loop:

import numpy as np
import pyautogui
import cv2

# Set recording parameters
SCREEN_SIZE = (1920, 1080)
FILENAME = 'recorded_video.avi'
FPS = 30.0

# start recording
fourcc = cv2.VideoWriter_fourcc(*"XVID")
out = cv2.VideoWriter(FILENAME, fourcc, FPS, SCREEN_SIZE)

while True:
    # get screenshot
    img = pyautogui.screenshot()
    # Convert to OpenCV format
    frame = cv2.cvtColor(np.array(img), cv2.COLOR_RGB2BGR)
    # write video file
    out.write(frame)
    cv2.imshow('Frame', frame)
    cv2.resizeWindow('Frame', 1920, 1080)

    # detect keys
    if cv2.waitKey(1) == ord('q'):
        break

# stop recording
out.release()
cv2.destroyAllWindows()

The program now runs, but the result is still poor: the preview window keeps showing a recursive picture-in-picture effect, and the exported video cannot actually be played.
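
One possible explanation for the unplayable file, offered here as an assumption rather than a confirmed diagnosis: cv2.VideoWriter expects every frame to have exactly the width and height passed to its constructor, while pyautogui.screenshot() returns the whole screen, which is not necessarily 1920x1080. Frames of the wrong size are usually skipped without any error, so the resulting file may contain no video data. The sketch below sizes the writer from a real screenshot instead of a hard-coded constant; frame_matches_writer and record_full_screen are hypothetical helper names, not part of any library.

```python
def frame_matches_writer(frame_shape, writer_size):
    """Return True if a NumPy frame shape fits a VideoWriter frame size.

    frame_shape is NumPy's (height, width, channels);
    writer_size is OpenCV's (width, height).
    """
    height, width = frame_shape[:2]
    return (width, height) == tuple(writer_size)


def record_full_screen(filename="recorded_video.avi", fps=30.0, n_frames=100):
    """Record n_frames of the full screen, sizing the writer from a real screenshot."""
    # Imported here so the size check above can be used without these packages.
    import cv2
    import numpy as np
    import pyautogui

    first = pyautogui.screenshot()   # a PIL image
    writer_size = first.size         # PIL reports (width, height)
    fourcc = cv2.VideoWriter_fourcc(*"XVID")
    out = cv2.VideoWriter(filename, fourcc, fps, writer_size)
    try:
        for _ in range(n_frames):
            frame = cv2.cvtColor(np.array(pyautogui.screenshot()), cv2.COLOR_RGB2BGR)
            if not frame_matches_writer(frame.shape, writer_size):
                # Resize rather than hand the writer a frame it cannot store.
                frame = cv2.resize(frame, writer_size)
            out.write(frame)
    finally:
        # Always close the file, even if a screenshot fails midway.
        out.release()
```

Calling record_full_screen() records 100 frames of the whole screen. The size check on its own shows the mismatch: a 2560x1600 screenshot has shape (1600, 2560, 3), which does not fit a writer created with (1920, 1080).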

Replace screenshot function – ImageGrab.grab

Although pyautogui can take screenshots and show them with imshow, the exported video cannot be played. I suspect this involves specific video codec parameters; if you know the cause, please share it in the comment area. Here, switching the screenshot function to ImageGrab.grab from the PIL library makes it possible both to capture the screen and to export a playable video. The biggest remaining problem is the recursion phenomenon.

import numpy as np
from PIL import ImageGrab
import cv2

# Set recording parameters
SCREEN_SIZE = (1920, 1080)
FILENAME = 'recorded_video.avi'
FPS = 30.0

# start recording
fourcc = cv2.VideoWriter_fourcc(*"XVID")
out = cv2.VideoWriter(FILENAME, fourcc, FPS, SCREEN_SIZE)

while True:
    # get screenshot
    # img = pyautogui.screenshot()
    img = ImageGrab.grab(bbox=(0, 0, 1920, 1080))
    print('recording...')
    # Convert to OpenCV format
    frame = cv2.cvtColor(np.array(img), cv2.COLOR_RGB2BGR)
    # write video file
    out.write(frame)
    cv2.imshow('Frame', frame)
    cv2.resizeWindow('Frame', 1920, 1080)

    # detect keys
    if cv2.waitKey(1) == ord('q'):
        break

# stop recording
out.release()
cv2.destroyAllWindows()

Disable imshow to solve the recursion phenomenon

Recursion is actually very common in video processing. Besides the mirror effect in physics (looking into two mirrors placed parallel to each other produces an endless series of reflections), pointing a camera at the display that shows its own picture produces the same recursive effect on the screen. The cause here is the same: the preview window opened by imshow() is itself part of the screen being captured.

After some experimenting, I disabled imshow() and switched to counting frames to terminate the loop, and the recursion problem no longer appears:

import numpy as np
from PIL import ImageGrab
import cv2

# Set recording parameters
SCREEN_SIZE = (1920, 1080)
FILENAME = 'recorded_video.avi'
FPS = 30.0

# start recording
fourcc = cv2.VideoWriter_fourcc(*"XVID")
out = cv2.VideoWriter(FILENAME, fourcc, FPS, SCREEN_SIZE)

cnt = 0
while True:
    # get screenshot
    # img = pyautogui.screenshot()
    img = ImageGrab.grab(bbox=(0, 0, 1920, 1080))
    print('recording...')
    # Convert to OpenCV format
    frame = cv2.cvtColor(np.array(img), cv2.COLOR_RGB2BGR)
    # write video file
    out.write(frame)
    # cv2.imshow('Frame', frame)
    # cv2.resizeWindow('Frame', 1920, 1080)

    # # Detect keys
    # if cv2.waitKey(1) == ord('q'):
    #     break

    cnt += 1

    if cnt == 100:  # terminate the loop after 100 frames
        break

# stop recording
out.release()
cv2.destroyAllWindows()
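
As an alternative to a fixed frame count, the loop can run until Ctrl+C is pressed in the terminal, which Python reports as KeyboardInterrupt. This is a minimal sketch and not part of the original code; record_until_interrupted is a hypothetical helper that takes the capture, write and release steps as callables, so that release() always runs and the file is closed properly.

```python
def record_until_interrupted(grab, write, release):
    """Call write(grab()) repeatedly until Ctrl+C, then always call release().

    Returns the number of frames that were written.
    """
    frames = 0
    try:
        while True:
            write(grab())
            frames += 1
    except KeyboardInterrupt:
        # Ctrl+C in the terminal is the stop signal; nothing else to do here.
        pass
    finally:
        # Runs on Ctrl+C and on any other error, so the file is finalized.
        release()
    return frames
```

With the variables from the listing above, this could be called as record_until_interrupted(lambda: cv2.cvtColor(np.array(ImageGrab.grab(bbox=(0, 0, 1920, 1080))), cv2.COLOR_RGB2BGR), out.write, out.release). No OpenCV window is needed, so the recursion effect does not come back.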

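You can customize the recording area by changing the bbox argument of ImageGrab.grab, for example img = ImageGrab.grab(bbox=(0, 0, 2560, 1600)). The four values are (left, top, right, bottom), that is, the coordinates of the upper-left corner followed by those of the lower-right corner, so they equal x, y, width, height only when the region starts at (0, 0). For example, my screen resolution is 2560*1600, so setting bbox to (0, 0, 2560, 1600) records the full screen. SCREEN_SIZE, which is passed to cv2.VideoWriter, should be changed to the same width and height.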
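
Pillow documents bbox as (left, top, right, bottom), two corner points rather than a width and height, so a small conversion helps when the region does not start at (0, 0). The two functions below are hypothetical helpers for illustration only; they convert an (x, y, width, height) region into a bbox and derive from a bbox the frame size that cv2.VideoWriter needs.

```python
def region_to_bbox(x, y, width, height):
    """Convert an (x, y, width, height) region to (left, top, right, bottom)."""
    return (x, y, x + width, y + height)


def bbox_to_size(bbox):
    """Return the (width, height) of a (left, top, right, bottom) box."""
    left, top, right, bottom = bbox
    return (right - left, bottom - top)


# Example: a 1280x720 region whose upper-left corner is at (100, 50).
bbox = region_to_bbox(100, 50, 1280, 720)
size = bbox_to_size(bbox)
print(bbox, size)
```

The bbox value would go to ImageGrab.grab(bbox=bbox) and the size value would replace SCREEN_SIZE, which keeps the captured frames and the video writer consistent.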

In this way, we have basically implemented screen recording with Python. The animated preview does not look high-resolution because the recorded video was converted to GIF with Format Factory; the recorded video before compression is actually quite clear.

By changing the FPS value, we can also record high-refresh-rate movies and game footage ourselves. The higher the fps, the smoother the picture, provided the capture loop can actually keep up with that rate.
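
Note that the FPS value given to cv2.VideoWriter is only the playback rate stored in the file; it does not control how fast the loop captures. If the loop delivers fewer screenshots per second than FPS, the video plays faster than real time. The sketch below paces the loop against the wall clock and reports the rate actually achieved. record_paced and its parameters are hypothetical, and the capture and write steps are passed in as callables.

```python
import time


def record_paced(grab, write, fps, n_frames,
                 clock=time.perf_counter, sleep=time.sleep):
    """Write n_frames, aiming for one frame every 1/fps seconds.

    Returns the measured capture rate in frames per second.
    """
    interval = 1.0 / fps
    start = clock()
    for i in range(n_frames):
        write(grab())
        # Due time of the next frame, measured from the start to avoid drift.
        next_due = start + (i + 1) * interval
        remaining = next_due - clock()
        if remaining > 0:
            sleep(remaining)  # capture was fast enough; wait for the next slot
    elapsed = clock() - start
    return n_frames / elapsed if elapsed > 0 else float("inf")
```

If the returned rate is clearly below the requested fps, the machine cannot capture that fast, and it may be more faithful to pass the measured rate to cv2.VideoWriter instead.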

Camera recording code

Similarly, you can also use Python to record from the camera:

import cv2

# turn on the camera
cap = cv2.VideoCapture(0)

fourcc = cv2.VideoWriter_fourcc(*'XVID')
file_name = 'output'
# Set file name, fps, resolution (the resolution must match the frames the camera delivers)
output = cv2.VideoWriter(file_name + '.avi', fourcc, 24.0, (640, 480))

while cap.isOpened():

    res, frame = cap.read()
    if not res:
        print("Frame Cannot Be Received")
        break

    # Flip the frame horizontally (flipCode 1) to get the correct orientation
    frame = cv2.flip(frame, 1)

    # Write the current frame to the file and display it
    output.write(frame)
    cv2.imshow('Frame', frame)

    # Wait up to 1 ms for a key press; leave the loop if 'x' is pressed
    if cv2.waitKey(1) == ord('x'):
        break

# Releasing everything after coming out of loop
cap.release()
output.release()
cv2.destroyAllWindows()

Follow-up requirements

The screen recording problem is now basically solved. To turn this into a practical screen recorder, we still need to add audio recording and design a convenient UI.
