Implementing screenshot text recognition in Python (packaged as an exe program)

Table of Contents

1. Introduction

2. How to use

3. Complete code

4. Free download

In this post I share a screenshot text recognition program that I developed myself.

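To use the program, the computer must have a Python environment installed. (The author states that there is no version limit, although the source code in section 3 uses f-strings, which require Python 3.6 or later.)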

1. Introduction

This code creates a GUI screenshot tool that lets the user select an area of the screen, capture it, and run text recognition on the captured image.

It provides the following:

  1. A GUI that lets the user select an area of the screen to capture.
  2. Text recognition with Tesseract OCR, with the recognized text copied to the clipboard.
  3. Optional automatic deletion of screenshots, controlled by a setting in the configuration file.
  4. In the author's comparison, more accurate text extraction than WeChat.

Here is the comparison:

204ce6881de94796b0853f7d4d2d02f0.png
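
The automatic deletion in item 3 of the list above depends on a plain-text configuration file. As a minimal sketch, assuming the file holds one `key=value` pair per line and that `autoDeleteImg=1` means "delete the screenshot after recognition" (which is how the complete code in section 3 reads it), the lookup could be written as below. The helper names `parse_config` and `should_delete_screenshot` are hypothetical and not part of the original program, and keeping the screenshot when the key is missing is an assumed default.

```python
def parse_config(text):
    """Parse 'key=value' lines into a dict, skipping blank or malformed lines."""
    settings = {}
    for line in text.splitlines():
        key, sep, value = line.partition("=")
        if sep and key.strip():
            settings[key.strip()] = value.strip()
    return settings


def should_delete_screenshot(settings):
    """Return True only when autoDeleteImg is set to 1 (assumed default: keep)."""
    return settings.get("autoDeleteImg", "0") == "1"


sample = "autoDeleteImg=1\nnot a setting\n"
settings = parse_config(sample)
print(should_delete_screenshot(settings))  # True
```

Parsing into a dictionary first keeps the file handling separate from the decision, so a missing or malformed line does not raise an error at the point where the screenshot is deleted.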

2. How to use

First, you will get an initial folder. The role of each file is shown below (files that are not mentioned must not be moved):

f87d2b56b30f43868582375505213614.png

Remember: Read the documentation carefully!

After running the environment initialization step, the folder looks like this:

70ab8aedc0c64131b64052f5f2606118.png

After running extraction.exe, a window appears (the window itself is not very useful):

ce17dcad850c4796880b3dcc73d92de5.png

Seeing this window confirms that the program started successfully. Next, press the configured shortcut key twice to start taking a screenshot:

9b47c614c58e48ad983982c0d4296433.png

0c0b8b95560946a79520657f332d1c7a.png

This is a screenshot application.

If the screenshot interface does not appear after pressing the shortcut key twice, open it from the taskbar.

3. Complete code

# Import necessary modules
from datetime import datetime  # Used to get the current date and time
from tkinter import Tk, Canvas, BOTH, YES  # Used to create the GUI
import pyautogui  # Used here only to query the screen size
from PIL import ImageGrab  # Used to capture a region of the screen
import os  # For file and directory operations
import pytesseract  # Used for text recognition
import pyperclip  # Used for clipboard operations

# Set the path and configuration of Tesseract OCR (relative to the working directory)
tesseract_exe = r'.\Tesseract-OCR\tesseract.exe'
tessdata_dir = r'.\Tesseract-OCR\tessdata'
pytesseract.pytesseract.tesseract_cmd = tesseract_exe
tessdata_dir_config = '--tessdata-dir "{}"'.format(tessdata_dir)

# Full-screen, semi-transparent overlay for selecting the area to capture and recognize
class ScreenCaptureApp:
    def __init__(self, root):
        # Initialize the root window of the application
        self.root = root
        self.root.attributes('-fullscreen', True) # Set the window to full screen
        self.root.attributes('-alpha', 0.1) # Set window transparency
        self.root.title("Area capture") # Set window title

        # Get the width and height of the screen
        self.screen_width, self.screen_height = pyautogui.size()

        # Create a Canvas control to display the screenshot area
        self.canvas = Canvas(self.root, cursor="cross")
        self.canvas.pack(fill=BOTH, expand=YES)

        # Bind mouse event handler function
        self.canvas.bind("<ButtonPress-1>", self.on_press)
        self.canvas.bind("<B1-Motion>", self.on_drag)
        self.canvas.bind("<ButtonRelease-1>", self.on_release)

        # Initialize the selection coordinates and related state
        self.start_x = None
        self.start_y = None
        self.end_x = None
        self.end_y = None
        self.rect = None
        self.mask_rect = None
        self.image_path = None

        # Update window size
        self.update_window_size()

    # Update window size
    def update_window_size(self):
        screen_width = self.root.winfo_screenwidth()
        screen_height = self.root.winfo_screenheight()
        self.root.geometry("%dx%d" % (screen_width, screen_height))

    # Mouse press event handler
    def on_press(self, event):
        self.start_x = self.canvas.canvasx(event.x)
        self.start_y = self.canvas.canvasy(event.y)
        if self.rect:
            self.canvas.delete(self.rect)
        if self.mask_rect:
            self.canvas.delete(self.mask_rect)
        self.rect = self.canvas.create_rectangle(self.start_x, self.start_y, self.start_x, self.start_y, outline="blue", fill="blue",
                                                 stipple='gray25', width=3)

    # Mouse drag event handler
    def on_drag(self, event):
        cur_x = self.canvas.canvasx(event.x)
        cur_y = self.canvas.canvasy(event.y)
        self.canvas.coords(self.rect, self.start_x, self.start_y, cur_x, cur_y)
        self.update_mask(cur_x, cur_y)

    # Mouse release event handler
    def on_release(self, event):
        self.end_x = self.canvas.canvasx(event.x)
        self.end_y = self.canvas.canvasy(event.y)
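        # Calculate the screenshot area so that a drag in any direction works,
        # converting the canvas coordinates to integer pixels
        left = int(min(self.start_x, self.end_x))
        top = int(min(self.start_y, self.end_y))
        right = int(max(self.start_x, self.end_x))
        bottom = int(max(self.start_y, self.end_y))

        # Ignore a click without a drag, since an empty area cannot be captured
        if right <= left or bottom <= top:
            return

        # Use ImageGrab.grab to capture the selected screen region
        screenshot = ImageGrab.grab(bbox=(left, top, right, bottom))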

        # Get the directory and current date and time of the current script
        script_directory = os.path.dirname(os.path.abspath(__file__))
        current_datetime = datetime.now().strftime("%Y-%m-%d-%H%M%S")

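        # Build the image file name and make sure the output folder exists
        file_name = f"xzlScreenshot-{current_datetime}.png"
        image_dir = os.path.join(script_directory, "image")
        os.makedirs(image_dir, exist_ok=True)
        self.image_path = os.path.join(image_dir, file_name)

        # Save screenshot to file
        screenshot.save(self.image_path)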

        # Configure Tesseract OCR parameters
        custom_config = r'--oem 3 --psm 6 -c preserve_interword_spaces=1'

        # Use pytesseract for text recognition; multiple languages are joined
        # with '+' and no spaces (for example 'eng+chi_sim')
        text = pytesseract.image_to_string(
            screenshot,
            lang='+'.join(['eng', 'chi_sim']),
            config=f'--tessdata-dir "{tessdata_dir}" {custom_config}',
        )

        # Copy the recognition results to the clipboard
        pyperclip.copy(text)

        # Read the configuration file and check whether the screenshot needs to be automatically deleted
        with open('config.txt', 'r', encoding="utf-8") as file:
            content = file.read()
        auto_delete_img_value = "0"  # Assumed default: keep the screenshot if the key is missing
        for line in content.strip().split('\n'):
            key, sep, value = line.partition('=')
            if sep and key.strip() == "autoDeleteImg":
                auto_delete_img_value = value.strip()
                break
        if auto_delete_img_value == "1":
            if self.image_path:
                os.remove(self.image_path)

        # Close the application window
        self.root.destroy()

    # Update mask effect
    def update_mask(self, cur_x, cur_y):
        if self.mask_rect:
            self.canvas.delete(self.mask_rect)
        self.mask_rect = self.canvas.create_rectangle(0, 0, self.root.winfo_screenwidth(), self.root.winfo_screenheight(), fill="black")
        self.canvas.tag_lower(self.mask_rect)
        self.canvas.coords(self.mask_rect, self.start_x, self.start_y, cur_x, cur_y)

# Main function, create application object and run
def main():
    root = Tk()
    app = ScreenCaptureApp(root)
    root.mainloop()

# Check if the script is running as the main program
if __name__ == "__main__":
    main()
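
The release handler has to turn the press and release points into a (left, top, right, bottom) box for ImageGrab.grab, whichever way the user dragged. As a minimal sketch, that step can be isolated into a pure function, which makes it testable without a display. The name `normalize_bbox` is hypothetical and not part of the original program, and returning `None` for an empty selection is an assumption about the desired behavior.

```python
def normalize_bbox(start_x, start_y, end_x, end_y):
    """Return (left, top, right, bottom) as ints, or None if the box is empty."""
    left, right = sorted((int(start_x), int(end_x)))
    top, bottom = sorted((int(start_y), int(end_y)))
    if left == right or top == bottom:
        return None  # a click without a drag selects nothing
    return (left, top, right, bottom)


# Dragging from bottom-right to top-left yields the same box as the reverse.
print(normalize_bbox(300.0, 200.0, 100.0, 50.0))  # (100, 50, 300, 200)
print(normalize_bbox(10, 10, 10, 40))              # None
```

The integer conversion matters because the Tkinter canvas reports coordinates as floats, while the capture box is expressed in whole pixels.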

4. Free download

The download package already bundles the required third-party modules, so they do not need to be installed separately.

The application's zip archive has been uploaded to a cloud server and can be downloaded from the public address: Screenshot extraction text application compressed package.zip
