Contents
- Introduction
- The code given by GPT-3.5
- Replace screenshot function – ImageGrab.grab
- Disable imshow to solve the recursion phenomenon
- Camera recording code
- Future work
Introduction
For screen recording, I previously used Capture, a screen recording program based on ffmpeg, but at higher fps the recording became very choppy and the audio stuttered as well. I also tried calling ffmpeg's library functions from Python, but the results were unsatisfactory. I downloaded several screen recorders from the Internet, but dealing with their bundled software and advertisements was tiring. So I decided to take this opportunity to study how screen recording software works, and to use it as a small practice project.
The code given by GPT-3.5
While I was still hesitating over whether to read the ffmpeg manual carefully, GPT promptly offered a reference answer: Python's PyAutoGUI library can be used to implement screen recording. Here is the simple sample code:
    import pyautogui
    import cv2
    import numpy as np

    # get screen resolution
    screen_size = (1920, 1080)

    # Set the video encoder
    fourcc = cv2.VideoWriter_fourcc(*"XVID")

    # create video write object
    out = cv2.VideoWriter("output.avi", fourcc, 20.0, screen_size)

    # start recording
    while True:
        # get screenshot
        img = pyautogui.screenshot()

        # Convert the screenshot to OpenCV format
        frame = np.array(img)
        frame = cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)

        # write video
        out.write(frame)

        # Press the q key to exit the screen recording
        if cv2.waitKey(1) == ord("q"):
            break

    # Release resources
    out.release()
    cv2.destroyAllWindows()
Judging from this code, screen recording works by continuously capturing screenshots at a certain frame rate and then combining the captured images into a video at the same frame rate.
I ran it in a Python environment with pyautogui installed and, sure enough, something went wrong. The program hung with no visible output, pressing 'q' and Enter did not exit the loop, and asking GPT again did not produce a working fix.
After some research, I found that cv2.waitKey() only receives key presses while an OpenCV window has focus. Once a window is created with imshow(), pressing the key while that window is focused terminates the loop:
    import numpy as np
    import pyautogui
    import cv2

    # Set recording parameters
    SCREEN_SIZE = (1920, 1080)
    FILENAME = 'recorded_video.avi'
    FPS = 30.0

    # Start recording
    fourcc = cv2.VideoWriter_fourcc(*"XVID")
    out = cv2.VideoWriter(FILENAME, fourcc, FPS, SCREEN_SIZE)

    while True:
        # Get screenshot
        img = pyautogui.screenshot()

        # Convert to OpenCV format
        frame = cv2.cvtColor(np.array(img), cv2.COLOR_RGB2BGR)

        # Write video file
        out.write(frame)
        cv2.imshow('Frame', frame)
        cv2.resizeWindow('Frame', 1920, 1080)

        # Detect keys
        if cv2.waitKey(1) == ord('q'):
            break

    # Stop recording
    out.release()
    cv2.destroyAllWindows()
The program now runs, but the result is still poor: the preview window keeps showing a recursive, picture-in-picture effect, and the exported video cannot be played.
Replace screenshot function – ImageGrab.grab
Although pyautogui can take screenshots and display them with imshow, the exported video cannot be played. I suspect this involves specific video codec parameters; if you know the cause, please share it in the comments. Replacing the screenshot call with ImageGrab.grab from the PIL library both captures the screen and produces a playable video. The biggest remaining problem is the recursion phenomenon.
    import numpy as np
    from PIL import ImageGrab
    import cv2

    # Set recording parameters
    SCREEN_SIZE = (1920, 1080)
    FILENAME = 'recorded_video.avi'
    FPS = 30.0

    # Start recording
    fourcc = cv2.VideoWriter_fourcc(*"XVID")
    out = cv2.VideoWriter(FILENAME, fourcc, FPS, SCREEN_SIZE)

    while True:
        # Get screenshot
        # img = pyautogui.screenshot()
        img = ImageGrab.grab(bbox=(0, 0, 1920, 1080))
        print('recording...')

        # Convert to OpenCV format
        frame = cv2.cvtColor(np.array(img), cv2.COLOR_RGB2BGR)

        # Write video file
        out.write(frame)
        cv2.imshow('Frame', frame)
        cv2.resizeWindow('Frame', 1920, 1080)

        # Detect keys
        if cv2.waitKey(1) == ord('q'):
            break

    # Stop recording
    out.release()
    cv2.destroyAllWindows()
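One possible explanation for the unplayable file from the pyautogui version (an assumption on my part, not something verified in this article): cv2.VideoWriter expects every frame to have exactly the (width, height) passed to its constructor, and frames of a different size are typically not written. pyautogui.screenshot() captures the whole screen, which on a 2560*1600 display does not match SCREEN_SIZE = (1920, 1080), whereas ImageGrab.grab(bbox=(0, 0, 1920, 1080)) returns exactly that size. The helpers below are hypothetical names introduced only to illustrate the check:

```python
def writer_size_for(frame_shape):
    """Return the (width, height) a video writer should be opened with
    for frames of the given NumPy shape (height, width[, channels])."""
    height, width = frame_shape[:2]
    return (width, height)


def frame_fits_writer(frame_shape, writer_size):
    """True if a frame of this shape matches the size given to the writer."""
    return writer_size_for(frame_shape) == tuple(writer_size)


# A 1920x1080 capture has NumPy shape (1080, 1920, 3): rows come first.
print(frame_fits_writer((1080, 1920, 3), (1920, 1080)))
# A full 2560x1600 screenshot does not fit a 1920x1080 writer.
print(frame_fits_writer((1600, 2560, 3), (1920, 1080)))
```

In the recording scripts, the shape to check would be np.array(img).shape of the first screenshot, and writer_size_for could be used to derive SCREEN_SIZE instead of hard-coding it.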
Disable imshow to solve the recursion phenomenon
Recursion is actually very common in video processing. It is the same thing as the mirror effect in physics (two mirrors placed in parallel reflect each other endlessly), and it also appears when a camera is pointed at a display that shows the camera's own picture. Here, the preview window opened by imshow() sits inside the captured screen area, so every screenshot contains the previous frame.
After some experimenting, I disabled imshow() and terminated the loop by counting frames instead, and the recursion problem disappeared:
    import numpy as np
    from PIL import ImageGrab
    import cv2

    # Set recording parameters
    SCREEN_SIZE = (1920, 1080)
    FILENAME = 'recorded_video.avi'
    FPS = 30.0

    # Start recording
    fourcc = cv2.VideoWriter_fourcc(*"XVID")
    out = cv2.VideoWriter(FILENAME, fourcc, FPS, SCREEN_SIZE)

    cnt = 0
    while True:
        # Get screenshot
        # img = pyautogui.screenshot()
        img = ImageGrab.grab(bbox=(0, 0, 1920, 1080))
        print('recording...')

        # Convert to OpenCV format
        frame = cv2.cvtColor(np.array(img), cv2.COLOR_RGB2BGR)

        # Write video file
        out.write(frame)

        # Preview disabled: showing the captured frame on the captured
        # screen is what causes the recursion
        # cv2.imshow('Frame', frame)
        # cv2.resizeWindow('Frame', 1920, 1080)
        # # Detect keys
        # if cv2.waitKey(1) == ord('q'):
        #     break

        cnt = cnt + 1
        if cnt == 100:  # terminate the loop after 100 frames
            break

    # Stop recording
    out.release()
    cv2.destroyAllWindows()
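A fixed frame count is simple but inflexible. As one possible alternative (a sketch, not the method used in this article), the loop can run until Ctrl+C is pressed in the terminal, which needs no OpenCV window and therefore causes no recursion. The function name record_until_interrupted and its parameters are hypothetical; writer only needs write() and release() methods, as cv2.VideoWriter has:

```python
def record_until_interrupted(grab, writer, max_frames=None):
    """Write frames from grab() to writer until Ctrl+C or max_frames.

    grab   -- callable returning one frame (hypothetical parameter)
    writer -- object with write(frame) and release(), e.g. cv2.VideoWriter
    Returns the number of frames written.
    """
    count = 0
    try:
        while max_frames is None or count < max_frames:
            writer.write(grab())
            count += 1
    except KeyboardInterrupt:
        # Ctrl+C in the terminal ends the recording; no window is needed.
        pass
    finally:
        # Always release the writer so the file is finalized.
        writer.release()
    return count
```

With the names from the script above, the call could look like record_until_interrupted(lambda: cv2.cvtColor(np.array(ImageGrab.grab(bbox=(0, 0, 1920, 1080))), cv2.COLOR_RGB2BGR), out). Passing max_frames=100 reproduces the frame-counting behaviour.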
You can customize the recording area by changing the bbox argument in img = ImageGrab.grab(bbox=(0, 0, 2560, 1600)). The four values are the left, top, right, and bottom pixel coordinates of the region, not x, y, width, and height; the two only coincide when the region starts at (0, 0). For example, my screen resolution is 2560*1600, so setting bbox to (0, 0, 2560, 1600) records the full screen. SCREEN_SIZE should be set to the same width and height, since the video writer expects frames of that size.
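Pillow's ImageGrab.grab takes bbox as (left, top, right, bottom). If it is more natural to think of a region as a top-left corner plus a width and height, a small conversion helper avoids mistakes. The name region_to_bbox is hypothetical and introduced only for this sketch:

```python
def region_to_bbox(x, y, width, height):
    """Convert a region given as top-left corner (x, y) plus size
    into the (left, top, right, bottom) tuple used for bbox."""
    if width <= 0 or height <= 0:
        raise ValueError("width and height must be positive")
    return (x, y, x + width, y + height)


# Full 2560x1600 screen: identical to (x, y, w, h) because it starts at (0, 0).
print(region_to_bbox(0, 0, 2560, 1600))
# A 1280x720 region whose top-left corner is at (100, 50).
print(region_to_bbox(100, 50, 1280, 720))
```

The result can be passed directly as ImageGrab.grab(bbox=region_to_bbox(100, 50, 1280, 720)), with SCREEN_SIZE set to (1280, 720) in that case.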
In this way, we can basically implement screen recording with Python. The animated preview looks low-resolution only because I used Format Factory to convert the recording to GIF; the recorded video before compression is actually quite clear.
By changing the value of FPS, we can also record high-refresh-rate movies and game footage ourselves. A higher FPS gives smoother playback, provided the capture loop can actually grab frames that fast.
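Note that the FPS passed to cv2.VideoWriter only sets the playback rate of the file. The loops above grab frames as fast as they can, which may be faster or slower than that rate, so the video can play back at the wrong speed. A minimal sketch of pacing the loop to the target rate follows; record_paced, grab, and write are hypothetical names (in the scripts above they would correspond to the ImageGrab.grab call plus colour conversion, and to out.write):

```python
import time


def record_paced(grab, write, fps, n_frames,
                 clock=time.perf_counter, sleep=time.sleep):
    """Call write(grab()) n_frames times, aiming for one frame every
    1/fps seconds. Returns the elapsed time according to clock()."""
    interval = 1.0 / fps
    start = clock()
    for i in range(n_frames):
        write(grab())
        # Schedule each frame relative to the start time so that small
        # timing errors do not accumulate from frame to frame.
        next_due = start + (i + 1) * interval
        remaining = next_due - clock()
        if remaining > 0:
            sleep(remaining)
    return clock() - start
```

If a single grab takes longer than 1/fps seconds, the loop cannot keep up and the video will still play back faster than real time; lowering FPS or shrinking the capture area is then the simpler fix.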
Camera recording code
Similarly, Python can also be used to record from a camera:
    import cv2

    # Open the default camera
    cap = cv2.VideoCapture(0)

    # Use the camera's actual frame size so the written frames match the writer
    width = int(cap.get(cv2.CAP_PROP_FRAME_WIDTH))
    height = int(cap.get(cv2.CAP_PROP_FRAME_HEIGHT))

    fourcc = cv2.VideoWriter_fourcc(*'XVID')
    file_name = 'output'
    # Set file name, codec, fps, and resolution
    output = cv2.VideoWriter(file_name + '.avi', fourcc, 24.0, (width, height))

    while cap.isOpened():
        res, frame = cap.read()
        if not res:
            print("Frame Cannot Be Received")
            break

        # Flip the frame horizontally (flipCode > 0) for a mirror-like view
        frame = cv2.flip(frame, 1)

        # Write the frame to the file and display it
        output.write(frame)
        cv2.imshow('Frame', frame)

        # Wait up to 1 ms for a key press; leave the loop when 'x' is pressed
        if cv2.waitKey(1) == ord('x'):
            break

    # Release everything after leaving the loop
    cap.release()
    output.release()
    cv2.destroyAllWindows()
Future work
The screen recording problem is now basically solved. To turn this into practical screen recording software, audio recording and a convenient UI still need to be added.