Ubuntu 20.04 + CUDA 11.8 + cuDNN 8.6 + TensorRT 8.6

1. Install graphics card driver


Check which driver versions the local graphics card supports:

lu@host:/usr/local$ ubuntu-drivers devices
== /sys/devices/pci0000:00/0000:00:01.0/0000:01:00.0 ==
modalias : pci:v000010DEd000021C4sv000010DEsd000021C4bc03sc00i00
vendor: NVIDIA Corporation
model: TU116 [GeForce GTX 1660 SUPER]
driver : nvidia-driver-525-open - distro non-free
driver : nvidia-driver-450-server - distro non-free
driver : nvidia-driver-525 - distro non-free
driver : nvidia-driver-525-server - distro non-free
driver : nvidia-driver-535-open - distro non-free
driver : nvidia-driver-470-server - distro non-free
driver : nvidia-driver-535-server-open - distro non-free recommended
driver : nvidia-driver-535-server - distro non-free
driver : nvidia-driver-535 - distro non-free
driver : nvidia-driver-470 - distro non-free
driver : xserver-xorg-video-nouveau - distro free builtin

lu@host:/usr/local$

Install driver:

Here I installed nvidia-driver-535:

sudo apt install nvidia-driver-535

Restart the computer and check the nvidia driver:

lu@host:/usr/local$ nvidia-smi
Fri Nov 3 00:26:46 2023
+---------------------------------------------------------------------------------------+
| NVIDIA-SMI 535.113.01             Driver Version: 535.113.01     CUDA Version: 12.2   |
|-----------------------------------------+----------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |         Memory-Usage | GPU-Util  Compute M. |
|                                         |                      |               MIG M. |
|=========================================+======================+======================|
|   0  NVIDIA GeForce GTX 1660 ...    Off | 00000000:01:00.0 Off |                  N/A |
| 29%   34C    P8              N/A /  N/A |    348MiB /  6144MiB |     33%      Default |
|                                         |                      |                  N/A |
+-----------------------------------------+----------------------+----------------------+

+---------------------------------------------------------------------------------------+
| Processes:                                                                            |
|  GPU   GI   CI        PID   Type   Process name                            GPU Memory |
|        ID   ID                                                             Usage      |
|=======================================================================================|
|    0   N/A  N/A       890      G   /usr/lib/xorg/Xorg                           45MiB |
|    0   N/A  N/A      1432      G   /usr/lib/xorg/Xorg                          129MiB |
|    0   N/A  N/A      1667      G   /usr/bin/gnome-shell                         30MiB |
|    0   N/A  N/A      2352      G   ...3584735,16244303988823860755,262144      131MiB |
+---------------------------------------------------------------------------------------+
lu@host:/usr/local$ 

The output shows that driver 535 supports CUDA up to version 12.2, so the CUDA 11.8 installed here clearly meets the requirement (the CUDA version supported by the driver may be higher than the toolkit version you install).
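This compatibility check can be sketched as a small shell comparison (the version numbers are hard-coded here from the nvidia-smi output above):

```shell
# Compare the toolkit version we plan to install against the maximum
# CUDA version the driver reports in the nvidia-smi header.
required="11.8"     # toolkit we want to install
supported="12.2"    # "CUDA Version" shown by nvidia-smi
lowest=$(printf '%s\n%s\n' "$required" "$supported" | sort -V | head -n1)
if [ "$lowest" = "$required" ]; then
    echo "driver supports CUDA $required"
fi
```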

2. Install CUDA

Download cuda:

Link: CUDA Toolkit Archive | NVIDIA Developer

Downloading the .run installer is recommended; it can be installed by following the prompts after executing the following command:

sudo sh cuda_11.8.0_520.61.05_linux.run

When the selection screen appears, select Continue.

In the component list, move the cursor to Driver, CUDA Demo Suite 11.8, and CUDA Documentation 11.8 in turn and press the space bar to deselect them, so that only CUDA Toolkit 11.8 is installed.
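If you prefer a non-interactive install, the runfile also accepts command-line flags (check them with `sh cuda_11.8.0_520.61.05_linux.run --help`). A sketch, shown here as a dry-run echo so nothing is executed by accident:

```shell
# --silent skips the interactive UI; --toolkit installs only the toolkit
# (no driver, samples, or docs). Remove the echo to actually run it.
cmd="sudo sh cuda_11.8.0_520.61.05_linux.run --silent --toolkit"
echo "$cmd"
```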

Set cuda environment variables:

Open the .bashrc file in your home directory (mine is /home/lu/.bashrc) and add the following lines:
 
export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/usr/local/cuda-11.8/lib64
export PATH=$PATH:/usr/local/cuda-11.8/bin
export CUDA_HOME=/usr/local/cuda-11.8

Terminal run: source ~/.bashrc
 
Verify: nvcc --version
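A quick sanity check after sourcing .bashrc: confirm the CUDA paths were actually appended (paths taken from the export lines above):

```shell
# Append the CUDA paths exactly as .bashrc does, then confirm each one
# is present in the corresponding variable.
export PATH=$PATH:/usr/local/cuda-11.8/bin
export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/usr/local/cuda-11.8/lib64

case ":$PATH:" in
    *":/usr/local/cuda-11.8/bin:"*) echo "PATH OK" ;;
esac
case ":$LD_LIBRARY_PATH:" in
    *":/usr/local/cuda-11.8/lib64:"*) echo "LD_LIBRARY_PATH OK" ;;
esac
```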

3. Install cuDNN

Download the installation file:

Download the cudnn installation file as required: cuDNN Archive | NVIDIA Developer

Install cudnn:

Here I downloaded cudnn-linux-x86_64-8.6.0.163_cuda11-archive.tar.xz

Unzip the downloaded archive; it contains a cuda folder. Open a terminal in that directory and execute the following commands:

sudo cp cuda/include/cudnn* /usr/local/cuda/include/
sudo cp cuda/lib64/libcudnn* /usr/local/cuda/lib64/
sudo chmod a+r /usr/local/cuda/include/cudnn*
sudo chmod a+r /usr/local/cuda/lib64/libcudnn*

Verify whether the installation is successful:

cat /usr/local/cuda/include/cudnn_version.h | grep CUDNN_MAJOR -A 2
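The check above just greps the version macros out of the header. The same extraction, shown on an inline copy of the header so it runs anywhere (the macro values mirror cuDNN 8.6.0):

```shell
# The three #define lines below mirror what cudnn_version.h contains;
# on a real install, read /usr/local/cuda/include/cudnn_version.h instead.
header='#define CUDNN_MAJOR 8
#define CUDNN_MINOR 6
#define CUDNN_PATCHLEVEL 0'

major=$(printf '%s\n' "$header" | awk '/CUDNN_MAJOR/ {print $3}')
minor=$(printf '%s\n' "$header" | awk '/CUDNN_MINOR/ {print $3}')
patch=$(printf '%s\n' "$header" | awk '/CUDNN_PATCHLEVEL/ {print $3}')
echo "cuDNN $major.$minor.$patch"
```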

4. Install TensorRT

Download:

Website: NVIDIA Developer (login required)

Select TensorRT 8.6 GA; the version I downloaded is TensorRT-8.6.1.6.Linux.x86_64-gnu.cuda-11.8.tar.gz

Installation:

1. Unzip: tar -xzvf TensorRT-8.6.1.6.Linux.x86_64-gnu.cuda-11.8.tar.gz

2. Copy the extracted directory to the install location (I used /usr/local/): sudo cp -rf TensorRT-8.6.1.6 /usr/local/

Configuring TensorRT:

Open the .bashrc file in your home directory (mine is /home/lu/.bashrc) and add the following lines:
 
export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/usr/local/TensorRT-8.6.1.6/lib
export C_INCLUDE_PATH=$C_INCLUDE_PATH:/usr/local/TensorRT-8.6.1.6/include
export CPLUS_INCLUDE_PATH=$CPLUS_INCLUDE_PATH:/usr/local/TensorRT-8.6.1.6/include

Terminal run: source ~/.bashrc

Note: alternatively, as with cuDNN, you can copy the header files and libraries directly into the CUDA directory, in which case no environment variables need to be configured.
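A small check that the TensorRT library directory actually ended up on LD_LIBRARY_PATH (path taken from the export lines above):

```shell
# Append the TensorRT lib path exactly as .bashrc does, then confirm it.
export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/usr/local/TensorRT-8.6.1.6/lib

trt_lib=/usr/local/TensorRT-8.6.1.6/lib
case ":$LD_LIBRARY_PATH:" in
    *":$trt_lib:"*) echo "TensorRT lib path configured" ;;
    *)              echo "missing: add $trt_lib to LD_LIBRARY_PATH" ;;
esac
```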

Test whether TensorRT installation is successful

# Enter the /usr/local/TensorRT-8.6.1.6/samples/sampleOnnxMNIST directory
cd /usr/local/TensorRT-8.6.1.6/samples/sampleOnnxMNIST

# Execute make command to compile
make

# The sample_onnx_mnist file will be generated in the /usr/local/TensorRT-8.6.1.6/bin/ directory
/usr/local/TensorRT-8.6.1.6/bin/sample_onnx_mnist

After success, it is displayed as follows:

lu@host:/usr/local/TensorRT-8.6.1.6/samples/sampleOnnxMNIST$ /usr/local/TensorRT-8.6.1.6/bin/sample_onnx_mnist
&&&& RUNNING TensorRT.sample_onnx_mnist [TensorRT v8601] # /usr/local/TensorRT-8.6.1.6/bin/sample_onnx_mnist
[11/03/2023-00:57:21] [I] Building and running a GPU inference engine for Onnx MNIST
[11/03/2023-00:57:22] [I] [TRT] [MemUsageChange] Init CUDA: CPU +14, GPU +0, now: CPU 19, GPU 448 (MiB)
[11/03/2023-00:57:28] [I] [TRT] [MemUsageChange] Init builder kernel library: CPU +897, GPU +174, now: CPU 992, GPU 586 (MiB)
[11/03/2023-00:57:28] [I] [TRT] ----------------------------------------------------------------
[11/03/2023-00:57:28] [I] [TRT] Input filename: ../../data/mnist/mnist.onnx
[11/03/2023-00:57:28] [I] [TRT] ONNX IR version: 0.0.3
[11/03/2023-00:57:28] [I] [TRT] Opset version: 8
[11/03/2023-00:57:28] [I] [TRT] Producer name: CNTK
[11/03/2023-00:57:28] [I] [TRT] Producer version: 2.5.1
[11/03/2023-00:57:28] [I] [TRT] Domain: ai.cntk
[11/03/2023-00:57:28] [I] [TRT] Model version: 1
[11/03/2023-00:57:28] [I] [TRT] Doc string:
[11/03/2023-00:57:28] [I] [TRT] ----------------------------------------------------------------
[11/03/2023-00:57:28] [W] [TRT] onnx2trt_utils.cpp:374: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.
[11/03/2023-00:57:28] [I] [TRT] BuilderFlag::kTF32 is set but hardware does not support TF32. Disabling TF32.
[11/03/2023-00:57:28] [I] [TRT] Graph optimization time: 0.0039157 seconds.
[11/03/2023-00:57:28] [I] [TRT] BuilderFlag::kTF32 is set but hardware does not support TF32. Disabling TF32.
[11/03/2023-00:57:28] [I] [TRT] Local timing cache in use. Profiling results in this builder pass will not be stored.
[11/03/2023-00:57:29] [I] [TRT] Detected 1 inputs and 1 output network tensors.
[11/03/2023-00:57:29] [I] [TRT] Total Host Persistent Memory: 24224
[11/03/2023-00:57:29] [I] [TRT] Total Device Persistent Memory: 0
[11/03/2023-00:57:29] [I] [TRT] Total Scratch Memory: 0
[11/03/2023-00:57:29] [I] [TRT] [MemUsageStats] Peak memory usage of TRT CPU/GPU memory allocators: CPU 0 MiB, GPU 4 MiB
[11/03/2023-00:57:29] [I] [TRT] [BlockAssignment] Started assigning block shifts. This will take 6 steps to complete.
[11/03/2023-00:57:29] [I] [TRT] [BlockAssignment] Algorithm ShiftNTopDown took 0.017708ms to assign 3 blocks to 6 nodes requiring 32256 bytes.
[11/03/2023-00:57:29] [I] [TRT] Total Activation Memory: 31744
[11/03/2023-00:57:29] [I] [TRT] [MemUsageChange] TensorRT-managed allocation in building engine: CPU +0, GPU +4, now: CPU 0, GPU 4 (MiB)
[11/03/2023-00:57:29] [I] [TRT] Loaded engine size: 0 MiB
[11/03/2023-00:57:29] [I] [TRT] [MemUsageChange] TensorRT-managed allocation in engine deserialization: CPU +0, GPU +0, now: CPU 0, GPU 0 (MiB)
[11/03/2023-00:57:30] [I] [TRT] [MemUsageChange] TensorRT-managed allocation in IExecutionContext creation: CPU +0, GPU +0, now: CPU 0, GPU 0 (MiB)
[11/03/2023-00:57:30] [I] Input:
[11/03/2023-00:57:30] [I] @@@@@@@@@@@@@@@@@@@@@@@@@
@@@@@@@@@@@@@@@@@@@@@@@@@@
@@@@@@@@@@@@%.:@@@@@@@@@@@
@@@@@@@@@@@@@: *@@@@@@@@@@@@
@@@@@@@@@@@* =@@@@@@@@@@@@
@@@@@@@@@@% :@@@@@@@@@@@@@
@@@@@@@@@@@- *@@@@@@@@@@@@@
@@@@@@@@@# .@@@@@@@@@@@@@@
@@@@@@@@@@: #@@@@@@@@@@@@@@
@@@@@@@@@ + -@@@@@@@@@@@@@@
@@@@@@@@@: %@@@@@@@@@@@@@@@
@@@@@@@@ + + @@@@@@@@@@@@@@@
@@@@@@@@:.%@@@@@@@@@@@@@@@
@@@@@@@% -@@@@@@@@@@@@@@@@
@@@@@@@% -@@@@@@#..:@@@@@@@@
@@@@@@@% + @@@@@- :@@@@@@@
@@@@@@@% =@@@@%.#@@- + @@@@@@
@@@@@@@@..%@@@* + @@@@ :@@@@@@
@@@@@@@@= -%@@@@@@@@ :@@@@@@
@@@@@@@@@- .*@@@@@@ + + @@@@@@
@@@@@@@@@@ + .:- + -: .@@@@@@@
@@@@@@@@@@@@ + : :*@@@@@@@@
@@@@@@@@@@@@@@@@@@@@@@@@@@
@@@@@@@@@@@@@@@@@@@@@@@@@@
@@@@@@@@@@@@@@@@@@@@@@@@@@
@@@@@@@@@@@@@@@@@@@@@@@@@@
@@@@@@@@@@@@@@@@@@@@@@@@@@
@@@@@@@@@@@@@@@@@@@@@@@@@@

[11/03/2023-00:57:30] [I] Output:
[11/03/2023-00:57:30] [I] Prob 0 0.0000 Class 0:
[11/03/2023-00:57:30] [I] Prob 1 0.0000 Class 1:
[11/03/2023-00:57:30] [I] Prob 2 0.0000 Class 2:
[11/03/2023-00:57:30] [I] Prob 3 0.0000 Class 3:
[11/03/2023-00:57:30] [I] Prob 4 0.0000 Class 4:
[11/03/2023-00:57:30] [I] Prob 5 0.0000 Class 5:
[11/03/2023-00:57:30] [I] Prob 6 1.0000 Class 6: **********
[11/03/2023-00:57:30] [I] Prob 7 0.0000 Class 7:
[11/03/2023-00:57:30] [I] Prob 8 0.0000 Class 8:
[11/03/2023-00:57:30] [I] Prob 9 0.0000 Class 9:
[11/03/2023-00:57:30] [I]
&&&& PASSED TensorRT.sample_onnx_mnist [TensorRT v8601] # /usr/local/TensorRT-8.6.1.6/bin/sample_onnx_mnist
lu@host:/usr/local/TensorRT-8.6.1.6/samples/sampleOnnxMNIST$