How to configure the CPU and GPU versions of the tensorflow library on Linux Ubuntu

This article introduces how to configure, on Linux Ubuntu, the new version of the Python deep learning library tensorflow so that it can run on either the CPU or the GPU.

In the articles Anaconda configures the new version of the Python tensorflow library (CPU, GPU common) method (https://blog.csdn.net/zhebushibiaoshifu/article/details/129285815) and New version of GPU-accelerated tensorflow library configuration method (https://blog.csdn.net/zhebushibiaoshifu/article/details/129291170), we introduced how to configure the CPU and GPU versions of the tensorflow library on the Windows platform; in this article, we introduce how to configure the CPU and GPU versions of the tensorflow library in the Linux Ubuntu environment.

This article is divided into two parts. Part 1 covers the configuration of the CPU version of the tensorflow library, and Part 2 covers the configuration of the GPU version; if your computer has a GPU, you can skip Part 1 and start directly from Part 2. Note that the Python version used in this article is 3.10, which is relatively new; if your Python is a different version, that is not a problem, as the overall configuration approach is the same.

1 CPU version

First of all, let’s introduce the configuration method of the CPU version of the tensorflow library.

Configuring the CPU version of the tensorflow library is very simple. First, I suggest that you set up the Anaconda environment according to the article How to Configure Anaconda and Python on Linux Ubuntu (https://blog.csdn.net/zhebushibiaoshifu/article/details/130807267); secondly, if you need to configure the tensorflow library in a virtual environment, you can create a virtual environment yourself and then carry out the follow-up operations, as sketched below. I configure it directly in the default environment, that is, the base environment. For creating virtual environments with Anaconda, you can refer to the article Creating, using and deleting Python virtual environment in Anaconda (https://blog.csdn.net/zhebushibiaoshifu/article/details/128334614); I will not go into details here.
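If you do want a separate virtual environment, a minimal sketch looks like the following; the environment name tf_env is just an illustrative choice.

conda create -n tf_env python=3.10    # create a virtual environment named tf_env with Python 3.10
conda activate tf_env                 # switch into the new environment before installing tensorflow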

We can view the environments in the current Anaconda installation by entering the following command in the terminal.

conda info -e

After running the above command, you will see the situation shown in the figure below. Because I have not created any virtual environment here, there is only one base environment.

Then, we enter the following code in the terminal to install the tensorflow library.

conda install tensorflow

Running the above command will automatically install the latest version of the tensorflow library supported by the current environment (that is, by the current Python version), as shown in the figure below.
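If you need a specific version rather than the latest (for example, to match a particular CUDA/cuDNN combination, as discussed later for the GPU version), you can pin it; the version number below is only an example, and which versions exist depends on the conda channel you use.

conda install tensorflow=2.12    # example: pin a specific tensorflow version (availability depends on the channel)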

After the installation is complete, the interface as shown in the figure below will appear.

So far, we have completed the configuration of the CPU version of the tensorflow library. Following the method mentioned in the article New version of GPU-accelerated tensorflow library configuration method (https://blog.csdn.net/zhebushibiaoshifu/article/details/129291170), we enter the following code in Python to check whether the current tensorflow library supports GPU operations.

import tensorflow as tf
print(tf.config.list_physical_devices("GPU"))

Run the above code; if you get an empty list [] as shown in the figure below, it means that the current tensorflow library does not support GPU operations. Of course, this is expected: what we configured here is the CPU version of the tensorflow library, so naturally it cannot run on the GPU.

At this point, the tensorflow library can be used normally, but it only supports CPU operations. It is worth mentioning that the tensorflow library configured through the aforementioned method actually also supports GPU operations in principle, because on Linux, since version 1.15 of the tensorflow library, there is no longer a distinction between CPU and GPU packages: as long as the tensorflow library is installed, both CPU and GPU are supported. The reason why the tensorflow library we have configured so far cannot run on the GPU is that we have not yet configured the other dependencies required for GPU operation (or the computer simply has no GPU).
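As an optional sanity check of the CPU installation (a minimal sketch, not required by the configuration itself), you can run a small computation in Python:

import tensorflow as tf

print(tf.__version__)                                 # show the installed tensorflow version
print(tf.reduce_sum(tf.random.normal([1000, 1000])))  # a small computation that runs on the CPU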

2 GPU version

Next, let’s introduce the configuration method of the GPU version of the tensorflow library.

2.1 NVIDIA Driver Configuration

First, we need to configure the NVIDIA driver. The NVIDIA driver is the software for NVIDIA graphics cards that controls the capabilities and performance of the card and ensures that it works correctly with the operating system and other software.

First, we can enter the following code in the terminal.

nvidia-smi

Then, under normal circumstances, the situation shown in the figure below should appear. If you see something else at this point, it means that either you have not installed any NVIDIA driver, or you have installed an NVIDIA driver but there is a problem with its version. We can ignore this for now; just continue reading.

Next, we will install the NVIDIA driver. Three different methods are provided here, but it is recommended that you use the last one.

2.1.1 Method 1 (not recommended)

With the first method, we can directly enter the following command in the terminal.

sudo ubuntu-drivers autoinstall

Under normal circumstances, this command will automatically download or update the drivers on our computer, and the NVIDIA driver will be downloaded or updated along with them. However, after I tried this method, I found that it did not work. Its success seems to depend on the state of your computer and is not guaranteed, so it is not recommended.

2.1.2 Method 2 (not recommended)

The second method is to download the NVIDIA driver directly from the official website; this method is more troublesome, so I do not recommend it either.

First, we open the official NVIDIA driver website (https://www.nvidia.cn/Download/index.aspx?lang=cn), and in the interface shown in the figure below, make the selections according to the model of the graphics card in your computer and your operating system.

Then, click the “Search” option; the most suitable NVIDIA driver will appear, and you can click “Download”.

Then, in the terminal, you can install the NVIDIA driver you just downloaded.
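The driver from the website is typically a .run installer; as a hedged illustration (the exact filename depends on the version you downloaded, and the installer usually asks you to stop the graphical session first, which is part of why this method is more troublesome), the installation looks something like this:

cd ~/Downloads                              # go to the directory containing the downloaded installer
sudo sh NVIDIA-Linux-x86_64-<version>.run   # replace <version> with the actual filename you downloaded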

2.1.3 Method 3 (recommended)

The third method is the one I recommend most.

First, enter the following command in the terminal.

ubuntu-drivers devices

Then, the interface shown in the figure below will appear; in it, the NVIDIA driver version marked recommended is the most suitable version for our computer. You need to record this version number, as it will be used later.

Next, we enter the following code in the terminal.

sudo apt install nvidia-driver-525

Here, 525 at the end of the above command is the version number recorded from the figure above; modify the command according to your actual situation. After running the command, the situation shown in the figure below will appear, that is, this version of the NVIDIA driver will begin downloading and installing.

If the subsequent download and installation go smoothly, you are done; but sometimes an error message as shown in the figure below will appear.

This indicates that the NVIDIA driver originally on our computer conflicts with the newly downloaded version, which prevents the new version from installing normally. In that case, we need to enter the following commands in the terminal in sequence; remember to enter one line at a time.

sudo apt-get purge nvidia*
sudo apt-get purge libnvidia*
sudo apt-get --purge remove nvidia-*
sudo dpkg --list | grep nvidia-*

In the above code, the first three lines delete the original NVIDIA driver and its related content, and the last line checks whether the original NVIDIA driver has been removed. If you see the situation shown in the figure below, that is, no output appears after entering the last line above, it means that the original NVIDIA driver has been deleted.

At this point, we can execute the following code again.

ubuntu-drivers devices

At this point, unlike before, you may see that the NVIDIA driver version marked recommended has changed; for example, mine is no longer the previous 525 but another version. We do not need to worry about this change here; we can still install the 525 version in the next step.

Next, we still run the following code.

sudo apt install nvidia-driver-525

Here, 525 at the end of the above command is my version number; remember to modify it to your own. At this point, we can download and install the specified version of the NVIDIA driver normally.

At this point, we enter the following code in the terminal again.

nvidia-smi

Then, under normal circumstances, the situation shown in the figure below should appear. Pay attention to the upper right corner of the figure, which indicates that CUDA versions up to 12.0 are supported and newer versions are not. What CUDA is and how to configure it will be covered next; just take note of this for now.

One more thing to note: if the situation shown in the figure below appears after entering the aforementioned command, it still means that the original NVIDIA driver on our computer conflicts with the newly downloaded version; please re-execute the three deletion commands given above to remove the original NVIDIA driver from the computer.

Subsequently, we can also enter the following code.

nvidia-settings

If the situation shown in the figure below appears, that is, a new window named “NVIDIA X Server Settings” opens, it means that there is no problem with our previous configuration.

So far, we have completed the configuration of the NVIDIA driver.

2.2 CUDA Configuration

Next, we configure CUDA; CUDA is a parallel computing platform and programming model invented by NVIDIA.

First, we need to go to the official website of the tensorflow library (https://www.tensorflow.org/install/source) and scroll down to the table matching tensorflow library versions with the corresponding CUDA and cuDNN versions. Combined with your own Python version, decide which version of the tensorflow library you need, and from that determine the versions of CUDA and cuDNN. As shown in the purple box in the figure below, since my Python version is 3.10, I can only choose the versions inside the purple box; and because I want a new version of the tensorflow library, I chose the CUDA and cuDNN versions corresponding to the first row.

Then, we go to the official CUDA website (https://developer.nvidia.com/cuda-downloads) and, following the method shown in the figure below, make the selections corresponding to your computer; note that for the last option you should choose runfile (local).

Then, the website will automatically display the latest version of CUDA according to our selection. But pay attention to what the website offers: the default is the latest version, while we need to use the tensorflow version matching table mentioned above to determine which version we actually need. For example, as shown in the first three purple boxes in the figure below, the CUDA version offered on the website is 12.1.1, while the version I need is 11.8, so I need to find the older version of CUDA through the “Archive of Previous CUDA Releases” option shown in the figure below.

As shown in the figure below, we find CUDA 11.8 here and click it.

Subsequently, the installation instructions for CUDA 11.8 will appear. We simply enter the two lines of commands displayed on the website into the terminal.
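For reference, at the time of writing those two lines for the 11.8.0 runfile looked roughly like the following; always copy the exact commands and filename from the website, since the driver version bundled into the filename can differ.

wget https://developer.download.nvidia.com/compute/cuda/11.8.0/local_installers/cuda_11.8.0_520.61.05_linux.run   # illustrative filename; copy the exact URL from the site
sudo sh cuda_11.8.0_520.61.05_linux.run                                                                           # launch the interactive installer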

Then, you can start installing CUDA. If, during installation, the prompt shown in the figure below appears, it is generally caused by an old version of CUDA already installed on the computer; just select the “Continue” option.

Then, pay attention: in the interface shown in the figure below, uncheck the box in front of Driver to skip the installation of the NVIDIA driver, because we have already installed this driver earlier. You can then select Install.

Next, we can start installing CUDA. After the installation is complete, the interface as shown in the figure below will appear.

So far, we have completed the installation of CUDA, but we need to further configure the corresponding environment variables. First, enter the following code in the terminal.

vim ~/.bashrc

This command opens the bashrc file for editing so that we can configure the environment variables. After running the above command, we will see an interface similar to the one shown below.

Then, we press the i key to enter insert mode and start editing the bashrc file. Move the cursor to the end of the bashrc file and add the following content.

export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/usr/local/cuda/lib64
export PATH=$PATH:/usr/local/cuda/bin
export CUDA_HOME=$CUDA_HOME:/usr/local/cuda

At this point, we will get the situation as shown in the figure below.

Next, we first press the Esc key to exit insert mode; then enter :wq to save and close the bashrc file. At this point, an interface like the one shown below should appear.

Next, we enter the following two lines of code in the terminal respectively.

source ~/.bashrc
nvcc --version

The first line reloads the bashrc file so that the environment variables we just modified take effect immediately; the second line verifies the CUDA installation. If the interface shown in the figure below appears after running these two lines, it means that our CUDA configuration and environment variable setup are complete.

So far, we have completed this part of the configuration work.

2.3 cuDNN configuration

Next, we start to configure cuDNN. cuDNN is a GPU-accelerated library of primitives for deep neural networks that implements standard routines (such as forward and backward convolution, pooling, normalization, and activation layers) in a highly optimized manner. Here we again need to consult the tensorflow version matching table mentioned above to determine which version of cuDNN we need to download.

First, we open the official cuDNN website (https://developer.nvidia.com/rdp/cudnn-download); before downloading cuDNN, we need to register an account, but registration is quick and can be completed in a few minutes.

Then, we find the corresponding version of cuDNN on the website. Note that if the cuDNN version we need is not the latest, we need to look for it under the “Archived cuDNN Releases” option shown in the figure below.

I need cuDNN 8.6 here, so I need to find the download link for this version at the location shown in the figure above and start the download.

After the download is complete, we first change to the download directory with the following command in the terminal; of course, if your download path is different, modify the command accordingly.

cd ~/Downloads

Then, enter the following command; note that the ${OS} and 8.x.x.x parts should match the actual filename of the installation package you downloaded. This command enables the local repository.

sudo dpkg -i cudnn-local-repo-${OS}-8.x.x.x_1.0-1_amd64.deb

Run the above code, as shown in the figure below.

Next, enter the commands shown below line by line. The 8.x.x.x part again needs to be replaced with the specific version number of the installation package you downloaded, and the X.Y part needs to be modified according to the CUDA version selected earlier; for example, the CUDA version I downloaded is 11.8, so my X.Y is 11.8. The three lines do the following: import the CUDA GPG key, refresh the repository metadata, and install the runtime library.

sudo cp /var/cudnn-local-repo-*/cudnn-local-*-keyring.gpg /usr/share/keyrings/
sudo apt-get update
sudo apt-get install libcudnn8=8.x.x.x-1+cudaX.Y

The first and second lines in the figure below show the specific content of the third command as I entered it here.
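As a concrete illustration (based on my own setup with cuDNN 8.6 and CUDA 11.8; substitute your own exact package version), the runtime library command takes a form like this:

sudo apt-get install libcudnn8=8.6.0.163-1+cuda11.8   # example version string; check available versions with: apt-cache policy libcudnn8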

Then, we continue with the following command in the terminal, again remembering to fill in your own version numbers. This command installs the developer library.

sudo apt-get install libcudnn8-dev=8.x.x.x-1+cudaX.Y

As shown in the figure below, it is the specific content I entered here.

Then, we continue with the following command in the terminal, again remembering to fill in your own version numbers. This command installs the code samples.

sudo apt-get install libcudnn8-samples=8.x.x.x-1+cudaX.Y

As shown in the figure below, it is the specific content I entered here.

That completes the installation of cuDNN. Next, we need to verify that it is installed correctly. This verification is slightly more involved, but it actually runs quickly. Enter the following commands line by line in the terminal.

cp -r /usr/src/cudnn_samples_v8/ $HOME
cd $HOME/cudnn_samples_v8/mnistCUDNN
sudo apt-get install libfreeimage3 libfreeimage-dev
make clean && make
./mnistCUDNN

If you run the above commands and get the result shown in the figure below, with the words Test passed!, it means that cuDNN has been configured successfully.

So far, cuDNN has been successfully configured.

2.4 tensorflow library configuration

Next, we finally reached the last step, which is the configuration of the tensorflow library.

We enter the following code in the terminal.

pip install tensorflow

Subsequently, the situation shown in the figure below will appear. Pay attention here to the text in the purple box in the figure below: if the tensorflow library that starts downloading at this point is the version we need, there is no problem; if it is an unsuitable version (that is, one that does not match the versions of CUDA and cuDNN), you can re-download the tensorflow library by specifying the version explicitly.
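For example (a hedged illustration: the exact version to pin depends on the row you selected in the matching table; in my case the row for CUDA 11.8 and cuDNN 8.6 corresponds to tensorflow 2.12), pinning a version with pip looks like this:

pip install tensorflow==2.12.0   # pin the tensorflow version matching the chosen CUDA/cuDNN combination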

After completing the configuration of the tensorflow library, we enter the following code in Python to check whether the current tensorflow library supports GPU operations.

import tensorflow as tf
print(tf.config.list_physical_devices("GPU"))

Run the above code; if you get the output shown in the purple box in the figure below, it means that our tensorflow library has been configured and can use the GPU to accelerate operations.
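As an optional further check (a minimal sketch, not part of the original steps), you can run a small computation explicitly on the GPU:

import tensorflow as tf

# place a small matrix multiplication on the first GPU and print the result's device
with tf.device("/GPU:0"):
    a = tf.random.normal([1000, 1000])
    b = tf.random.normal([1000, 1000])
    c = tf.matmul(a, b)
print(c.device)   # should report a GPU device such as .../device:GPU:0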

So far, you’re done.

Welcome to follow: Crazy Learning GIS
