System version compatibility requirements
CentOS 7.2: CUDA 9.0, cuDNN 7.4
CentOS 7.5: CUDA 9.2, cuDNN 7.4
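Before starting, it can help to confirm which of the combinations above applies to the machine. The following is only a sketch: both function names are made up for this example, and it assumes the release string has the usual "CentOS Linux release 7.x.yyyy (Core)" form found in /etc/centos-release.

```shell
#!/bin/sh
# Extract "major.minor" from a CentOS release string.
# Assumption: the string looks like "CentOS Linux release 7.2.1511 (Core)".
centos_major_minor() {
    printf '%s\n' "$1" | sed -n 's/.*release \([0-9][0-9]*\.[0-9][0-9]*\).*/\1/p'
}

# Map a CentOS version to the CUDA/cuDNN pair listed above.
# Versions that are not listed print "unknown" and return a non-zero status.
supported_stack() {
    case "$1" in
        7.2) echo "cuda9.0 cudnn7.4" ;;
        7.5) echo "cuda9.2 cudnn7.4" ;;
        *)   echo "unknown"; return 1 ;;
    esac
}

# Usage on a real host (not run here):
#   supported_stack "$(centos_major_minor "$(cat /etc/centos-release)")"
```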
Install gcc
yum -y install gcc gcc-c++ kernel-devel
Package manager overview: https://docs.nvidia.com/cuda/cuda-installation-guide-linux/index.html#package-manager-overview
1. Install the GPU driver
View the NVIDIA GPU information (nvidia-smi is only available once a driver is installed):
# nvidia-smi
2. Install nvidia-detect
2.1. Add the ELRepo repository
# rpm --import https://www.elrepo.org/RPM-GPG-KEY-elrepo.org
# rpm -Uvh https://www.elrepo.org/elrepo-release-7.0-3.el7.elrepo.noarch.rpm
2.2. Install nvidia-detect to check which driver the graphics card needs
yum install nvidia-detect
2.3. Run nvidia-detect
# nvidia-detect -v
Probing for supported NVIDIA devices...
[10de:15f8] NVIDIA Corporation Device 15f8
This device requires the current 410.78 NVIDIA driver kmod-nvidia
[10de:15f8] NVIDIA Corporation Device 15f8
This device requires the current 410.78 NVIDIA driver kmod-nvidia
[102b:0538] Matrox Electronics Systems Ltd. Device 0538
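If the driver version is needed in a script, it can be pulled out of the nvidia-detect output. This is a rough sketch, not part of nvidia-detect itself: the function name is hypothetical, and it assumes the recommendation lines keep the wording shown above ("This device requires the current <version> NVIDIA driver ...").

```shell
#!/bin/sh
# Read nvidia-detect output on stdin and print each distinct driver version
# it recommends, one per line.
# Assumption: recommendations are phrased as
# "This device requires the current <version> NVIDIA driver <package>".
required_driver_versions() {
    sed -n 's/.*requires the current \([0-9][0-9.]*\) NVIDIA driver.*/\1/p' | sort -u
}

# Usage on a real host (not run here):
#   nvidia-detect -v | required_driver_versions
```

With two identical cards, as in the output above, the version is printed once because duplicates are collapsed by sort -u.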
2.4. Edit the GRUB configuration file
vim /etc/default/grub
Add the following to GRUB_CMDLINE_LINUX:
rd.driver.blacklist=nouveau nouveau.modeset=0
The modified file looks like this:
GRUB_TIMEOUT=5
GRUB_DISTRIBUTOR="$(sed 's, release .*$,,g' /etc/system-release)"
GRUB_DEFAULT=saved
GRUB_DISABLE_SUBMENU=true
GRUB_TERMINAL_OUTPUT="console"
GRUB_CMDLINE_LINUX="crashkernel=auto rd.lvm.lv=centos/root rd.lvm.lv=centos/swap rd.driver.blacklist=nouveau nouveau.modeset=0 rhgb quiet"
GRUB_DISABLE_RECOVERY="true"
Then regenerate the GRUB configuration:
grub2-mkconfig -o /boot/grub2/grub.cfg
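The manual edit in step 2.4 can also be scripted. The sketch below is one possible way to do it, under the assumption that GRUB_CMDLINE_LINUX is a single double-quoted string on one line; the function name is invented for this example. It appends the two parameters at the end of the value rather than before "rhgb quiet", which should not matter because the order of kernel parameters is not significant here.

```shell
#!/bin/sh
# Append the nouveau-disabling parameters to GRUB_CMDLINE_LINUX in the given
# file, unless they are already there. A backup is written to <file>.bak.
# Assumption: the value is one double-quoted string on a single line.
add_nouveau_params() {
    file=$1
    params='rd.driver.blacklist=nouveau nouveau.modeset=0'
    if grep -q '^GRUB_CMDLINE_LINUX=.*rd\.driver\.blacklist=nouveau' "$file"; then
        return 0   # already configured, nothing to do
    fi
    cp "$file" "$file.bak" || return 1
    # Insert the parameters just before the closing quote of the value.
    sed "s/^\(GRUB_CMDLINE_LINUX=\".*\)\"[[:space:]]*\$/\1 $params\"/" "$file.bak" > "$file"
}

# Usage on a real host, as root (not run here):
#   add_nouveau_params /etc/default/grub
#   grub2-mkconfig -o /boot/grub2/grub.cfg
```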
2.5. Blacklist nouveau
vim /etc/modprobe.d/blacklist.conf
Add the following line:
blacklist nouveau
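Instead of opening an editor, the line can be appended from the shell. A minimal sketch follows, with a helper name chosen only for illustration; it adds the line only when it is missing, so running it twice does not produce duplicates.

```shell
#!/bin/sh
# Append a line to a file only if that exact line is not already present.
# $1 = file (created if missing), $2 = line to ensure
ensure_line() {
    touch "$1" || return 1
    grep -qxF -- "$2" "$1" || printf '%s\n' "$2" >> "$1"
}

# Usage on a real host, as root (not run here):
#   ensure_line /etc/modprobe.d/blacklist.conf 'blacklist nouveau'
```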
2.6. Rebuild the initramfs
Back up the current image, then build a new one:
mv /boot/initramfs-$(uname -r).img /boot/initramfs-$(uname -r)-nouveau.img
dracut /boot/initramfs-$(uname -r).img $(uname -r)
2.7. Restart
reboot
2.8. Confirm that nouveau is disabled
lsmod | grep nouveau
No output means nouveau has been disabled successfully.
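The same check can be written so that a script can branch on it. This is just a sketch with a hypothetical function name; it reads an lsmod-style listing on stdin and looks for nouveau in the first (module name) column.

```shell
#!/bin/sh
# Succeed (exit 0) if the lsmod-style listing on stdin has a module named
# nouveau in its first column, fail (exit 1) otherwise.
nouveau_loaded() {
    awk '$1 == "nouveau" { found = 1 } END { exit !found }'
}

# Usage on a real host (not run here):
#   if lsmod | nouveau_loaded; then
#       echo "nouveau is still loaded"
#   else
#       echo "nouveau is disabled"
#   fi
```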
3. Install CUDA
CUDA download address:
https://developer.nvidia.com/cuda-toolkit
# sh cuda_9.0.176_384.81_linux.run
If the installer warns that you appear to be running an X server and asks you to exit X before installing, run init 3 to switch to command-line mode, which stops the X server, and then run the installation command again.
===========
= Summary =
===========
Driver: Installed
Toolkit: Installed in /usr/local/cuda-9.0
Samples: Installed in /root, but missing recommended libraries
Please make sure that
- PATH includes /usr/local/cuda-9.0/bin
- LD_LIBRARY_PATH includes /usr/local/cuda-9.0/lib64, or, add /usr/local/cuda-9.0/lib64 to /etc/ld.so.conf and run ldconfig as root
To uninstall the CUDA Toolkit, run the uninstall script in /usr/local/cuda-9.0/bin
To uninstall the NVIDIA Driver, run nvidia-uninstall
Please see CUDA_Installation_Guide_Linux.pdf in /usr/local/cuda-9.0/doc/pdf for detailed information on setting up CUDA.
Logfile is /tmp/cuda_install_7874.log
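The summary asks for PATH and LD_LIBRARY_PATH to include the CUDA directories. One way to do that without adding the same entry twice is sketched below. The helper name is hypothetical, and putting the export lines into a shell startup file such as ~/.bashrc is an assumption, not something the installer does for you.

```shell
#!/bin/sh
# Print a colon-separated path list with a directory prepended, leaving the
# list unchanged if the directory is already in it.
# $1 = directory to add, $2 = current list (may be empty)
prepend_path() {
    case ":$2:" in
        *":$1:"*) printf '%s\n' "$2" ;;       # already present
        "::")     printf '%s\n' "$1" ;;       # list was empty
        *)        printf '%s\n' "$1:$2" ;;
    esac
}

# Usage, for example in ~/.bashrc (not run here):
#   export PATH=$(prepend_path /usr/local/cuda-9.0/bin "$PATH")
#   export LD_LIBRARY_PATH=$(prepend_path /usr/local/cuda-9.0/lib64 "${LD_LIBRARY_PATH:-}")
```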
Verify that CUDA 9.0 was installed successfully
In a terminal, run:
nvcc -V
This prints the CUDA version information.
Then try running one of the samples that ship with CUDA:
cd /usr/local/cuda-9.0/samples/1_Utilities/deviceQuery
make
./deviceQuery
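On success, the output ends with Result = PASS. The sample output here reports version 10.0, so it appears to have been captured on the CUDA 10.0 installation recorded in the appendix: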
deviceQuery, CUDA Driver = CUDART, CUDA Driver Version = 10.0, CUDA Runtime Version = 10.0, NumDevs = 2
Result = PASS
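For an automated check, the last lines of the deviceQuery output can be inspected from a script. The two helpers below are a sketch with invented names; they assume the output contains "Result = PASS" and "NumDevs = <n>" as in the sample above.

```shell
#!/bin/sh
# Succeed if the deviceQuery output on stdin contains "Result = PASS".
device_query_passed() {
    grep 'Result = PASS' > /dev/null
}

# Print the device count reported as "NumDevs = <n>" on stdin.
device_count() {
    sed -n 's/.*NumDevs = \([0-9][0-9]*\).*/\1/p'
}

# Usage on a real host (not run here):
#   ./deviceQuery > dq.log
#   device_query_passed < dq.log && echo "GPUs found: $(device_count < dq.log)"
```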
Uninstall
To uninstall the CUDA Toolkit, run the uninstall script in /usr/local/cuda-9.0/bin
To uninstall the NVIDIA Driver, run nvidia-uninstall
4. Install cuDNN v7
https://docs.nvidia.com/deeplearning/sdk/cudnn-install/index.html
After the download is complete, extract the archive and copy its files into the CUDA directory by running the following commands in order:
tar -xzvf cudnn-9.0-linux-x64-v7.4.1.5.tgz
sudo cp cuda/include/cudnn.h /usr/local/cuda/include
sudo cp cuda/lib64/libcudnn* /usr/local/cuda/lib64
sudo chmod a+r /usr/local/cuda/include/cudnn.h /usr/local/cuda/lib64/libcudnn*
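A quick way to confirm which cuDNN version was copied is to read the version macros from cudnn.h. The function below is a sketch under the assumption that the header defines CUDNN_MAJOR, CUDNN_MINOR and CUDNN_PATCHLEVEL as plain integers, as the cuDNN 7.x headers do; the function name is made up for this example.

```shell
#!/bin/sh
# Print the cuDNN version ("major.minor.patch") declared in a cudnn.h file.
# Assumption: CUDNN_MAJOR, CUDNN_MINOR and CUDNN_PATCHLEVEL are defined as
# plain integers. Returns non-zero if any of the three is missing.
cudnn_version() {
    awk '
        $1 == "#define" && $2 == "CUDNN_MAJOR"      { major = $3 }
        $1 == "#define" && $2 == "CUDNN_MINOR"      { minor = $3 }
        $1 == "#define" && $2 == "CUDNN_PATCHLEVEL" { patch = $3 }
        END {
            if (major == "" || minor == "" || patch == "") exit 1
            print major "." minor "." patch
        }
    ' "$1"
}

# Usage on a real host (not run here):
#   cudnn_version /usr/local/cuda/include/cudnn.h
```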
Run a small demo to verify the installation.
If the Examples and User Guide package is installed, the mnistCUDNN example can be found under /usr/src/cudnn_samples_v7.
Copy it to a writable directory, such as your home directory:
$ cp -r /usr/src/cudnn_samples_v7/ $HOME
Enter the mnistCUDNN directory:
$ cd $HOME/cudnn_samples_v7/mnistCUDNN
Compile:
$ make clean && make
Run:
$ ./mnistCUDNN
If the installation succeeded, you will see a result like this:
Test passed!
Alternatively, running cmake in your caffe/build directory is another quick way to test whether the installation succeeded.
5. Install the GPU version of TensorFlow (configure the pip mirror first)
$ sudo pip install tensorflow-gpu
As the root user, create a .pip directory in root's home directory and create the file pip.conf in it (/root/.pip/pip.conf) with the following content. The Tsinghua mirror used here is quite fast:
[global]
index-url=https://pypi.tuna.tsinghua.edu.cn/simple
Once this is configured, no further steps are needed: any package can be installed directly with pip install.
[Figure: output immediately after running pip install tensorflow]
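The pip.conf file described above can be created from the shell as well. This is a small sketch; the function name is hypothetical, and the target directory is passed as an argument so the same helper works for users other than root.

```shell
#!/bin/sh
# Write a pip.conf that points pip at the Tsinghua mirror.
# $1 = directory that should hold pip.conf (created if missing)
write_pip_conf() {
    mkdir -p "$1" &&
    printf '%s\n' '[global]' 'index-url=https://pypi.tuna.tsinghua.edu.cn/simple' > "$1/pip.conf"
}

# Usage for the root user, matching the path given above (not run here):
#   write_pip_conf /root/.pip
```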
6. Test TensorFlow
After all the hurdles above, we have finally reached the test step.
[root@gpuserver ~]# python
Python 2.7.5 (default, Nov 20 2015, 02:00:19)
[GCC 4.8.5 20150623 (Red Hat 4.8.5-4)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import tensorflow as tf
>>> hello = tf.constant('Hello, TensorFlow!')
>>> sess = tf.Session()
2018-12-12 17:10:51.572488: I tensorflow/core/platform/cpu_feature_guard.cc:141] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 AVX512F FMA
>>> sess = tf.Session()
>>> print(sess.run(hello))
Hello, TensorFlow!
>>>
If the small example above runs correctly, congratulations: the GPU version of TensorFlow has been installed successfully. Time to start building!
Installing pip on CentOS 7.2
yum install -y epel-release
yum install -y python-pip
Install kernel-devel
yum -y install kernel-devel
Configuring graphical startup on CentOS 7.2
# systemctl get-default
multi-user.target
# systemctl set-default graphical.target
Appendix:
1. CUDA installation log
Installing the NVIDIA display driver...
Installing the CUDA Toolkit in /usr/local/cuda-10.0 ...
Missing recommended library: libGLU.so
Missing recommended library: libX11.so
Missing recommended library: libXi.so
Missing recommended library: libXmu.so
Installing the CUDA Samples in /root ...
Copying samples to /root/NVIDIA_CUDA-10.0_Samples now...
Finished copying samples.
===========
= Summary =
===========
Driver: Installed
Toolkit: Installed in /usr/local/cuda-10.0
Samples: Installed in /root, but missing recommended libraries
Please make sure that
- PATH includes /usr/local/cuda-10.0/bin
- LD_LIBRARY_PATH includes /usr/local/cuda-10.0/lib64, or, add /usr/local/cuda-10.0/lib64 to /etc/ld.so.conf and run ldconfig as root
To uninstall the CUDA Toolkit, run the uninstall script in /usr/local/cuda-10.0/bin
To uninstall the NVIDIA Driver, run nvidia-uninstall
Please see CUDA_Installation_Guide_Linux.pdf in /usr/local/cuda-10.0/doc/pdf for detailed information on setting up CUDA.
Logfile is /tmp/cuda_install_16878.log