Building OpenAI Whisper on an ARM-based smart box (T906G) running Ubuntu 20.04 for speech-to-text

Foreword

Working on the ARM architecture is really no fun. Baidu alone will not get you through the strange errors you run into; Google is a must, and don't shy away from foreign-language blogs.

Text

1. Hardware introduction

The smart box ships with Ubuntu 20.04, and its default Python is 3.8. It has an NVIDIA GPU, but I have not installed the driver, so for now this project runs on the CPU only. I will keep looking into using the GPU later.

2. Environment setup

2.1 Install python3.10.12

I first tried building directly on the built-in Python 3.8, but after installing torch and getting everything else ready, I found that the installed numpy 1.17 did not meet the requirements. numpy releases are tied to specific Python versions, and installing numpy 1.22 directly turned out to be incompatible, so I suggest installing a newer Python from the start. The version I use is Python 3.10.12.

2.1.1 Dependency installation

Before installing python, we need to install some necessary dependencies. Execute the following commands to install these dependencies:

# Refresh package directory
sudo apt update
# Install dependencies
sudo apt install build-essential zlib1g-dev libncurses5-dev libgdbm-dev libnss3-dev libssl-dev libreadline-dev libffi-dev libbz2-dev liblzma-dev sqlite3 libsqlite3-dev tk-dev uuid-dev libgdbm-compat-dev

2.1.2 Python installation

First, download the source code.

Download method 1 (not recommended): download the source code from the official Python website (Python Source Releases | Python.org). Find the version you need and click to download it.

Download method 2 (recommended): download with wget. Open a terminal in the directory where you want to keep the source and run the following command. For other versions of Python, you only need to replace the version number:

# Download Python 3.10.12
wget https://www.python.org/ftp/python/3.10.12/Python-3.10.12.tar.xz
# To download other versions, just replace the version number.
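
The "replace the version number" rule can be written down as a small helper. This is only a sketch: python_source_url is a hypothetical name, and the URL pattern is copied from the wget command above, so it is assumed to hold for final releases only.

```python
# Hypothetical helper: the URL pattern is taken from the wget command above
# and is assumed to apply to final releases such as 3.10.12.
def python_source_url(version: str) -> str:
    """Return the python.org source tarball URL for a given release."""
    return f"https://www.python.org/ftp/python/{version}/Python-{version}.tar.xz"


# Print the URL used in this article.
print(python_source_url("3.10.12"))
```

The printed URL can be passed to wget as-is.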

Next, extract the archive and change into its top-level directory. All subsequent commands are run from this directory:

# Extract the archive
tar -xf Python-3.10.12.tar.xz
# Enter the directory
cd Python-3.10.12/

Next, configure the build:

# Check dependencies and configure the build
sudo ./configure --enable-optimizations --with-lto --enable-shared

Three configuration items are used here, with the following meanings:

--enable-optimizations: Enable profile-guided optimization (PGO) using PROFILE_TASK
--with-lto: Enable link-time optimization (LTO) during compilation
--enable-shared: Build the shared Python library, libpython

More information about configuration items is detailed in the official Python documentation.

After running this step, the Makefile is automatically generated.
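
Once the interpreter has been built and installed in the later steps, you can ask it how it was configured. The sketch below uses the standard sysconfig module; build_summary is a hypothetical helper name, and the check simply looks for the three flags in the recorded configure arguments.

```python
import sysconfig


def build_summary() -> dict:
    """Report whether the running interpreter was configured with the three flags."""
    # CONFIG_ARGS holds the arguments passed to ./configure (may be missing on some platforms).
    config_args = sysconfig.get_config_var("CONFIG_ARGS") or ""
    return {
        "optimizations": "--enable-optimizations" in config_args,
        "lto": "--with-lto" in config_args,
        # Py_ENABLE_SHARED is 1 when libpython was built as a shared library.
        "shared": bool(sysconfig.get_config_var("Py_ENABLE_SHARED")),
    }


for name, enabled in build_summary().items():
    print(f"{name}: {'yes' if enabled else 'no'}")
```

Run it with python3.10 after installation; with the system Python it reports that interpreter's build instead.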

Then compile. The number after -j is the number of cores used for compilation. The smart box T906G has 8 cores, and I used 6. This step takes a long time, enough for a game or two of Honor of Kings:

# Compile; the number after -j is the number of CPU cores to use, so adjust it to your machine
sudo make -j 6
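
If you are unsure what number to pass to -j, a rough rule is to leave a couple of cores free for the rest of the system. The sketch below follows that rule; suggest_jobs and the reserve of 2 are my own assumptions, chosen so that 8 cores give the 6 used above.

```python
import os


def suggest_jobs(total_cores, reserve=2):
    """Suggest a make -j value that leaves `reserve` cores free, never below 1."""
    if not total_cores:
        # os.cpu_count() can return None when the count is unknown.
        return 1
    return max(1, total_cores - reserve)


print(f"make -j {suggest_jobs(os.cpu_count())}")
```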

Once compilation finishes, check the output carefully for errors. This step went smoothly for me, with no errors reported. If you do hit errors, search Baidu and Google.

Then install. Use altinstall rather than install: the system already has a Python, and running install can overwrite or mask the original version, which may break the environment the system was originally set up with.

sudo make altinstall

Finally, link the shared library. Because the build was configured with --enable-shared, running python3.10 at this point fails with an error saying that libpython3.10.so.1.0 cannot be found. Locate the .so file and copy it (or create a symbolic link to it) into /usr/lib/ by running the following two commands in order:

# Find the location of libpython
whereis libpython3.10.so.1.0
# The command above prints: libpython3.10.so.1: /usr/local/lib/libpython3.10.so.1.0
# Create a symbolic link to libpython under /usr/lib/
sudo ln -s /usr/local/lib/libpython3.10.so.1.0 /usr/lib/
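
To confirm that the dynamic loader can now find the library, you can try to load it by name. This is a minimal sketch using the standard ctypes module; can_load is a hypothetical helper, and it should be run with the system python3, since python3.10 itself will not start while the library is missing.

```python
import ctypes


def can_load(library_name: str) -> bool:
    """Return True if the dynamic loader can find and open the named library."""
    try:
        ctypes.CDLL(library_name)
    except OSError:
        # Raised when the shared object cannot be found or opened.
        return False
    return True


print("libpython3.10.so.1.0 loadable:", can_load("libpython3.10.so.1.0"))
```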

Run python3.10 -V. If the version number is printed, the installation succeeded.

At this point the system has several Python installations, and it is easy to get confused about which one is in use. Don't worry: the next step sets up a virtual environment, so you can use whichever one you want.

2.2 Virtual environment construction

2.2.1 Installation Tool

When it comes to virtual environments, Anaconda is usually the first thing that comes to mind. Unfortunately, I could not find an Anaconda version that supports arm64, so I use the python3-venv tool instead. To install it, run:

sudo apt-get install python3-venv

2.2.2 Build a virtual environment

Pick a directory. A short path is best, ideally under your home directory, so that there is less to type when activating the environment later. Create and activate a new virtual environment with the following commands:

# Create the virtual environment. The version after python is the environment's Python version; gg_env is the environment name and can be anything you like.
python3.10 -m venv gg_env
# Activate the virtual environment. From another terminal, include the path, e.g. source ~/Downloads/gg_env/bin/activate
source gg_env/bin/activate

The version in the python command determines the Python version inside the virtual environment. Note also that when you activate the environment from another terminal, you must give the path to it, for example source ~/Downloads/gg_env/bin/activate.
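
With several Pythons on the machine, it helps to check which interpreter is active and whether it belongs to a virtual environment. The sketch below relies on the standard library only; in_virtualenv is a hypothetical helper name.

```python
import sys


def in_virtualenv() -> bool:
    """Return True when the running interpreter belongs to a venv."""
    # Inside a venv, sys.prefix points at the environment while
    # sys.base_prefix still points at the Python it was created from.
    return sys.prefix != sys.base_prefix


print("interpreter:", sys.executable)
print("version:", ".".join(str(part) for part in sys.version_info[:3]))
print("inside a virtual environment:", in_virtualenv())
```

After source gg_env/bin/activate, the interpreter path should point into gg_env and the version should be 3.10.12.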

2.3 Installation dependencies

2.3.1 Install numpy

Since numpy caused me trouble earlier, a note here: I have forgotten which numpy version pip installs by default under Python 3.10.12, but it is sufficient. Make sure the virtual environment is activated before running the steps below, so that the dependencies are installed inside it. Run the following command to install numpy. To pin a version, append the version number, for example pip install numpy==1.22. Here I use the default:

pip install numpy

You can verify the installation with the following commands. If it succeeded, the import prints nothing:

# Start python and import numpy
python
import numpy

2.3.2 Install pytorch

To install PyTorch, go to the PyTorch website (Start Locally | PyTorch), select the configuration that matches your system, and copy the generated install command. The command I used:

pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu118

To check the installation, start python and import torch. It succeeded if the import finishes without errors or other output:

python
import torch

2.3.3 Install ffmpeg

Execute the following instructions to install ffmpeg, which is used to decode various audio files.

sudo apt install ffmpeg
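
The three checks in this section can be combined into one script run inside the virtual environment. This is a sketch under my own naming (installed_version and environment_report are hypothetical helpers); it reads package metadata instead of importing the packages, and looks for ffmpeg on the PATH.

```python
import shutil
from importlib import metadata


def installed_version(package: str):
    """Return the installed version of a pip package, or None if it is missing."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def environment_report(packages=("numpy", "torch")) -> dict:
    """Collect package versions plus the path of the ffmpeg executable."""
    report = {name: installed_version(name) for name in packages}
    # shutil.which returns None when ffmpeg is not on the PATH.
    report["ffmpeg"] = shutil.which("ffmpeg")
    return report


for name, value in environment_report().items():
    print(f"{name}: {value or 'MISSING'}")
```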

2.4 Install Whisper

2.4.1 Download source code

Go to the Whisper releases page (https://github.com/openai/whisper/releases), download the version you want, and extract it into a working directory. Alternatively, run the following commands to clone the repository and enter it:

git clone https://github.com/openai/whisper.git
cd whisper/

2.4.2 Installation

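After entering the directory, run the following commands to install:

# Install the dependencies listed in the repository
pip install -r requirements.txt
# Install the latest release from PyPI
pip install -U openai-whisper
# Or install the latest commit directly from GitHub
pip install git+https://github.com/openai/whisper.git
# Force a reinstall of the latest commit without touching the dependencies
pip install --upgrade --no-deps --force-reinstall git+https://github.com/openai/whisper.git
# Needed if the installation fails with: No module named 'setuptools_rust'
pip install setuptools-rust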

2.5 Using Whisper

Run whisper --help to see the meaning and usage of the command-line options. Here is an example speech recognition command:

# --model is the model to use. Models differ in size and accuracy.
# --language is the language spoken in the audio
# test.mp3 is the audio file to recognize; wav and other formats are also supported.
whisper test.mp3 --model tiny --language Chinese
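
The same recognition can also be driven from Python. The sketch below uses whisper.load_model and model.transcribe from the package's Python interface; transcribe_file and format_segments are hypothetical helpers of my own, and the choices of device="cpu", fp16=False and language="zh" are assumptions matching the CPU-only setup in this article.

```python
import sys


def format_segments(segments) -> str:
    """Render Whisper segments as '[start -> end] text' lines."""
    lines = []
    for segment in segments:
        text = segment["text"].strip()
        lines.append(f"[{segment['start']:.2f}s -> {segment['end']:.2f}s] {text}")
    return "\n".join(lines)


def transcribe_file(path: str, model_name: str = "tiny", language: str = "zh") -> dict:
    """Transcribe one audio file on the CPU and return Whisper's result dict."""
    import whisper  # imported here so the helper above works without Whisper installed

    model = whisper.load_model(model_name, device="cpu")
    # fp16=False because half precision is not used on the CPU.
    return model.transcribe(path, language=language, fp16=False)


# Usage: python transcribe.py test.mp3
if __name__ == "__main__" and len(sys.argv) > 1:
    result = transcribe_file(sys.argv[1])
    print(result["text"])
    print(format_segments(result["segments"]))
```

Saved as transcribe.py (a file name chosen here for illustration), it prints the full text followed by one line per segment with its start and end time.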

For the other options and features (translation and so on), see the README in the official repository (GitHub - openai/whisper: Robust Speech Recognition via Large-Scale Weak Supervision); they are not covered here.
