Reinstalling the NVIDIA graphics driver on Ubuntu

My computer is rather strange: whenever it has been powered off, the graphics driver stops working and has to be reinstalled. I wrote this post to record the reinstallation process.

1. Disable nouveau

After installing the dependency packages, you need to disable nouveau; the NVIDIA graphics driver can only be installed successfully once nouveau is disabled. To disable it, add blacklist entries to the /etc/modprobe.d/blacklist-nouveau.conf file. First open the file with the following command:

sudo gedit /etc/modprobe.d/blacklist-nouveau.conf

The file is empty when first opened. Write the following into it:

blacklist nouveau
options nouveau modeset=0

After saving, regenerate the initramfs:

sudo update-initramfs -u

Then restart the computer and run the following command to confirm. (When reinstalling the graphics driver, this confirmation is the only part of this step you need.)

lsmod | grep nouveau # no output means nouveau has been disabled successfully
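
The two checks in this step (the content of the blacklist file and the lsmod output) can be combined into one small helper. This is only a sketch: the function name nouveau_status and its optional file argument are my own additions, not part of any NVIDIA or Ubuntu tool.

```shell
# Hypothetical helper: reports whether nouveau is blacklisted in a given
# modprobe config file and whether the module is currently loaded.
nouveau_status() (
    conf="${1:-/etc/modprobe.d/blacklist-nouveau.conf}"
    status=0

    # Both lines from this step must be present in the file.
    if grep -qx 'blacklist nouveau' "$conf" 2>/dev/null &&
       grep -qx 'options nouveau modeset=0' "$conf" 2>/dev/null; then
        echo "config: blacklisted"
    else
        echo "config: NOT blacklisted"
        status=1
    fi

    # lsmod prints one module per line; the name is the first column.
    if lsmod 2>/dev/null | awk '{print $1}' | grep -qx 'nouveau'; then
        echo "module: loaded"
        status=1
    else
        echo "module: not loaded"
    fi
    exit "$status"
)
```

Calling nouveau_status with no argument checks the default file; it returns 0 only when the file contains both lines and the module is not loaded.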

2. Configure environment variables (this step can be skipped when reinstalling)

Open the configuration file, again with gedit (sudo is not needed for a file in your own home directory):

gedit ~/.bashrc

After opening, add the following two lines at the end of the file:

export LD_LIBRARY_PATH=/usr/lib/x86_64-linux-gnu:$LD_LIBRARY_PATH
export LD_LIBRARY_PATH=/lib/x86_64-linux-gnu:$LD_LIBRARY_PATH 

Save and exit, then reload the configuration:

source ~/.bashrc
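
Every time ~/.bashrc is sourced again, the two export lines prepend the same directories once more. A possible way around that, assuming a small helper of my own called prepend_once (it is not a standard command), is to add a directory only when it is not already listed:

```shell
# Hypothetical helper: print LIST with DIR prepended, unless DIR is
# already one of its colon-separated entries.
prepend_once() (
    dir="$1"
    list="$2"
    case ":$list:" in
        *":$dir:"*) printf '%s\n' "$list" ;;               # already present
        *)          printf '%s\n' "$dir${list:+:$list}" ;; # add in front
    esac
)

# Same effect as the two export lines in this step, but safe to repeat.
export LD_LIBRARY_PATH="$(prepend_once /usr/lib/x86_64-linux-gnu "${LD_LIBRARY_PATH:-}")"
export LD_LIBRARY_PATH="$(prepend_once /lib/x86_64-linux-gnu "${LD_LIBRARY_PATH:-}")"
```

Wrapping the list in colons before matching is what lets one pattern cover the first, middle and last entry alike.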

3. Install the graphics driver

Download the .run installer manually from the official NVIDIA website and install it yourself.

After the download is complete, first uninstall all previous drivers. Nouveau must already be disabled (done in step 1); verify that it is:

lsmod | grep nouveau

Press Ctrl + Alt + F1 to switch to a text console, log in, then stop the graphical interface, uninstall the old driver, and reboot:

sudo service lightdm stop

sudo ./NVIDIA-Linux-x86_64-390.59.run --uninstall

reboot

Install the driver

Enter the command line interface: press Ctrl + Alt + F1, then enter your user name and password to log in. (Ctrl + Alt + F7 leaves the command line interface.)

sudo service lightdm stop

This closes the graphical interface. From this point on, Ctrl + Alt + F7 can no longer return to the desktop until you run sudo service lightdm start.

Give the driver .run file execute permission (the installer is usually placed in the home directory):

sudo chmod a+x NVIDIA-Linux-x86_64-390.59.run
sudo ./NVIDIA-Linux-x86_64-390.59.run --no-opengl-files
  • --no-opengl-files: install only the driver files, not the OpenGL files. This parameter is the most important.
  • --no-x-check: do not check for a running X service when installing the driver.
  • --no-nouveau-check: do not check for nouveau when installing the driver.
    The latter two parameters are optional.

During the driver installation, just accept each prompt. If the error "The distribution-provided pre-install script failed!" is reported, ignore it and continue the installation. The most important prompt is the one asking whether to run nvidia-xconfig to update the X configuration file: choose yes here, otherwise the NVIDIA driver will not be used when the X window system starts. The one exception is a machine that also has integrated graphics; in that case choose no at the nvidia-xconfig prompt. The answer to the other prompts is basically always yes.

After the installation is complete, reboot

reboot

After restarting you should reach the graphical interface without the login loop problem.

To confirm that the driver is installed, run nvidia-smi. If it prints a table listing the GPU and the driver version, the NVIDIA driver is working normally:

nvidia-smi
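
If you only want the version number rather than the whole table, nvidia-smi can be queried for it. The wrapper below is a sketch: the function name driver_version and its optional argument are hypothetical, while --query-gpu and --format are regular nvidia-smi options.

```shell
# Sketch: print the installed driver version, or explain why it cannot.
# The optional first argument names the command to run (default: nvidia-smi).
driver_version() (
    smi="${1:-nvidia-smi}"
    if ! command -v "$smi" >/dev/null 2>&1; then
        echo "$smi not found: the driver is probably not installed" >&2
        exit 1
    fi
    # Prints one line per GPU containing only the driver version.
    "$smi" --query-gpu=driver_version --format=csv,noheader
)
```

A non-zero return value after a reboot is a quick sign that the driver has failed again and needs to be reinstalled.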

4. Install CUDA 9.0 (the procedure is similar)

How to uninstall CUDA

To uninstall the CUDA Toolkit, run the uninstallation script provided in the bin directory of the toolkit. By default, it is located in /usr/local/cuda-9.0/bin:

sudo /usr/local/cuda-9.0/bin/uninstall_cuda_9.0.pl

After uninstalling, some residual folders remain; they can be deleted as well:

cd /usr/local/
sudo rm -rf cuda-9.0/

To make the installer easy to find, move the downloaded CUDA installation file to your HOME directory. Then switch to text mode with Ctrl + Alt + F1 and log in with your account and password (Ctrl + Alt + F7 returns to graphical mode). After logging in in text mode, first stop the desktop service:

sudo service lightdm stop

If you now press Ctrl + Alt + F7 and find that graphical mode cannot be reached, the desktop service has been stopped successfully. This step is particularly important for any NVIDIA installation that follows: make sure the desktop service really is stopped.

cd to the directory containing the .run file and install CUDA:

sudo chmod +x cuda_9.0.176_384.81_linux.run
sudo sh cuda_9.0.176_384.81_linux.run --tmpdir=/tmp

Here cuda_9.0.176_384.81_linux.run is the name of my CUDA installation file; replace it with the name of your own. If you have forgotten the name, you can look it up with the ls command. That is another reason why I suggest moving the CUDA installation file into HOME.
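
Instead of reading the ls output by eye, the runfile can also be located with a glob. This is a sketch under the assumption that the file keeps NVIDIA's cuda_*_linux.run naming; the function name find_cuda_runfile is my own.

```shell
# Hypothetical helper: print the CUDA runfile(s) found in a directory
# (default: $HOME), one per line; return non-zero if there is none.
find_cuda_runfile() (
    dir="${1:-$HOME}"
    found=1
    for f in "$dir"/cuda_*_linux.run; do
        # If nothing matches, the pattern is left unexpanded, so test it.
        [ -f "$f" ] || continue
        printf '%s\n' "$f"
        found=0
    done
    exit "$found"
)
```

In text mode, something like sudo sh "$(find_cuda_runfile)" --tmpdir=/tmp then works without typing the version numbers, as long as only one runfile is present.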

Press q to skip through the license text, then type accept. The installer next asks whether to install the NVIDIA driver; if you already installed the driver in step 3, enter n. When asked whether to install the CUDA Toolkit, enter y, and for the remaining prompts use the default or y. When it finishes, the installer lists which components were installed successfully and which failed; generally there are no problems.

Walking through the prompts: the first one asks whether to install the graphics driver. Since the driver was already installed in the previous step it is not needed here, and the driver version bundled with the runfile is not the latest anyway. The transcript below was recorded from a CUDA 8.0 installation, so it shows 8.0 in the prompts and paths; the 9.0 installer asks the same kind of questions.

Do you accept the previously read EULA?
accept/decline/quit: accept

Install NVIDIA Accelerated Graphics Driver for Linux-x86_64 361.62?
(y)es/(n)o/(q)uit: n

Install the CUDA 8.0 Toolkit?
(y)es/(n)o/(q)uit: y

Enter Toolkit Location
[ default is /usr/local/cuda-8.0 ]:

Do you want to install a symbolic link at /usr/local/cuda?
(y)es/(n)o/(q)uit: y

Install the CUDA 8.0 Samples?
(y)es/(n)o/(q)uit: y

Enter CUDA Samples Location
[ default is /home/zhou ]:

Installing the CUDA Toolkit in /usr/local/cuda-8.0…
Missing recommended library: libGLU.so
Missing recommended library: libX11.so
Missing recommended library: libXi.so
Missing recommended library: libXmu.so

Installing the CUDA Samples in /home/zhou…
Copying samples to /home/zhou/NVIDIA_CUDA-8.0_Samples now…
Finished copying samples.

===========
=Summary=
===========
Driver: Not Selected
Toolkit: Installed in /usr/local/cuda-8.0
Samples: Installed in /home/xtu, but missing recommended libraries

Please make sure that
- PATH includes /usr/local/cuda-8.0/bin
- LD_LIBRARY_PATH includes /usr/local/cuda-8.0/lib64, or, add /usr/local/cuda-8.0/lib64 to /etc/ld.so.conf and run ldconfig as root

To uninstall the CUDA Toolkit, run the uninstall script in /usr/local/cuda-8.0/bin

Please see CUDA_Installation_Guide_Linux.pdf in /usr/local/cuda-8.0/doc/pdf for detailed information on setting up CUDA.

***WARNING: Incomplete installation! This installation did not install the CUDA Driver. A driver of version at least 361.00 is required for CUDA 8.0 functionality to work.
To install the driver using this installer, run the following command, replacing <CudaInstaller> with the name of this run file:
sudo <CudaInstaller>.run -silent -driver

Logfile is /tmp/cuda_install_18572.log

For the remaining options, enter y or accept the default path, and the installation starts. If the installation fails at this point, the likely causes are that the desktop service was not stopped or that an NVIDIA driver is already installed. After the installation is complete, restart:

reboot

After restarting, log in and configure the CUDA environment variables. As in step 2, open the configuration file with gedit:

gedit ~/.bashrc

Add the following two lines at the end of the file and save (reference: https://docs.nvidia.com/cuda/archive/9.0/cuda-installation-guide-linux/index.html):

export PATH=/usr/local/cuda-9.0/bin${PATH:+:${PATH}}

export LD_LIBRARY_PATH=/usr/local/cuda-9.0/lib64${LD_LIBRARY_PATH:+:${LD_LIBRARY_PATH}}

Make this configuration effective:

source ~/.bashrc
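
The two export lines rely on the ${VAR:+:${VAR}} idiom, which must be written without spaces inside the braces. It appends a colon and the old value only when the variable is already set and non-empty, so no stray colon is left behind. A small demonstration, where with_prefix and DEMO are throwaway names I made up for the illustration:

```shell
# Demonstrates ${VAR:+word}: it expands to word only when VAR is set and
# non-empty. DEMO stands in for PATH or LD_LIBRARY_PATH.
with_prefix() (
    DEMO="$1"
    printf '%s\n' "/usr/local/cuda-9.0/bin${DEMO:+:${DEMO}}"
)

with_prefix ""            # prints /usr/local/cuda-9.0/bin
with_prefix "/usr/bin"    # prints /usr/local/cuda-9.0/bin:/usr/bin
```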

5. Verify that CUDA 9.0 was installed successfully

Run the following commands in order:

cd /usr/local/cuda-9.0/samples/1_Utilities/deviceQuery

sudo make

./deviceQuery

If the output lists your GPU and ends with the line Result = PASS, CUDA has been installed successfully.
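
deviceQuery prints a line reading Result = PASS at the end of a successful run, so the check can also be scripted instead of read by eye. A minimal sketch, with a function name of my own choosing:

```shell
# Sketch: succeed only if the text on stdin contains a line starting
# with "Result = PASS", which deviceQuery prints after a good run.
device_query_passed() {
    grep -q '^Result = PASS'
}
```

Used as ./deviceQuery | device_query_passed && echo "CUDA OK", it prints the message only when the sample passed.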

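6. Install cuDNN (v7)

To remove an existing cuDNN, just delete the corresponding folder.

For v7 the library is named libcudnn.so.7.x.x; adjust the version number in the commands below to match your actual files.

Enter the include folder of the extracted cuDNN package and copy the header file: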

cd ~/cuda/include
sudo cp cudnn.h /usr/local/cuda/include/ #copy header file

Then enter the cuda/lib64 folder from the command line and run the following commands (the commands are the same for CUDA 9.0):

cd ~/cuda/lib64
sudo cp lib* /usr/local/cuda/lib64/ #copy dynamic link library
cd /usr/local/cuda/lib64/
sudo rm -rf libcudnn.so libcudnn.so.7 # delete the existing symbolic links
sudo ln -s libcudnn.so.7.0.5 libcudnn.so.7 # create the symbolic link
sudo ln -s libcudnn.so.7 libcudnn.so # create the symbolic link
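
Because the version number changes with every cuDNN download, the delete-and-link commands can be wrapped in a function that takes the version as a parameter. This is only a sketch; link_cudnn is a name I invented, and it assumes the real library file libcudnn.so.<version> has already been copied into the target directory.

```shell
# Hypothetical helper: recreate the libcudnn symlink chain in DIR for a
# given full version string, mirroring the rm/ln commands in this step.
link_cudnn() (
    dir="$1"
    version="$2"                      # e.g. 7.0.5
    major="${version%%.*}"            # text before the first dot, e.g. 7
    cd "$dir" || exit 1
    if [ ! -f "libcudnn.so.$version" ]; then
        echo "libcudnn.so.$version not found in $dir" >&2
        exit 1
    fi
    # -f replaces an existing link, -n avoids following an old link.
    ln -sfn "libcudnn.so.$version" "libcudnn.so.$major"
    ln -sfn "libcudnn.so.$major" libcudnn.so
)
```

Since /usr/local/cuda/lib64 belongs to root, the function has to be run from a root shell, for example link_cudnn /usr/local/cuda/lib64 7.0.5 after sudo -s.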

Next, add the path /usr/local/cuda/lib64 to the dynamic library search path, in two steps:

1) Install vim. Enter:

sudo apt-get install vim-gtk

2) Enter:

sudo vim /etc/ld.so.conf.d/cuda.conf

Press i to enter insert mode and add the line:

/usr/local/cuda/lib64

Then press Esc and type (note the colon):

:wq # save and exit

Then run sudo ldconfig in the terminal to make the configuration take effect.

Afterwards you can use sudo ldconfig -v to check whether it worked: the output should list the /usr/local/cuda/lib64 folder. The ldconfig command searches for shared libraries (files named like lib*.so*) and creates the links and cache file needed by the dynamic loader (ld.so).
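
The ldconfig -v output is long, so it can be easier to ask the cache directly with ldconfig -p, which lists every cached library on its own indented line. A sketch of such a check, using a helper name of my own:

```shell
# Sketch: read `ldconfig -p` output on stdin and succeed if a library
# whose name starts with the given prefix is listed in the cache.
# The prefix is used as a regular expression, so "." matches any character.
in_ld_cache() {
    grep -q "^[[:space:]]*$1"
}
```

For example, ldconfig -p | in_ld_cache libcudnn.so && echo "cuDNN is in the cache".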

After the installation is complete, you can use the nvcc -V command to verify it. Output like the following means the installation succeeded (this sample was captured with CUDA 8.0; the release line should show your installed version):

yhao@yhao-X550VB:~$ nvcc -V
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2016 NVIDIA Corporation
Built on Tue_Jan_10_13:22:03_CST_2017
Cuda compilation tools, release 8.0, V8.0.61
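
To compare the installed version against the one you expect, the release number can be cut out of the last line of the nvcc -V output. A minimal sketch (the function name nvcc_release is hypothetical):

```shell
# Sketch: extract the release number (e.g. 8.0) from `nvcc -V` output
# supplied on stdin. Lines without "release X.Y," print nothing.
nvcc_release() {
    sed -n 's/.*release \([0-9][0-9.]*\),.*/\1/p'
}
```

Run as nvcc -V | nvcc_release; for the sample output above it prints 8.0.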

7. Recompile Caffe

First clone the repository into the directory where you want to install it:

git clone https://github.com/BVLC/caffe.git

If the download is slow, you can download the zip file instead and unzip it. Enter the caffe directory and copy Makefile.config.example to Makefile.config, either by hand or with the following command run in the caffe directory:

sudo cp Makefile.config.example Makefile.config

The copy is needed because compiling Caffe requires Makefile.config; Makefile.config.example is only the sample configuration shipped with Caffe and is not read by the build.

Then open Makefile.config in the caffe directory to modify it:

sudo gedit Makefile.config

When reinstalling, there is no need to edit Makefile.config and Makefile again: simply copy back the backups of your original files.

Now you can start compiling. Run in the caffe directory:

cd ~/caffe
make all -j8
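
The -j8 above assumes eight parallel jobs suit your machine. If you prefer to derive the number from the CPU, nproc (part of GNU coreutils) reports the available cores; the wrapper below is a sketch with a made-up name.

```shell
# Sketch: choose the job count from the number of available CPU cores
# instead of hard-coding 8; fall back to a single job if nproc is missing.
make_jobs() {
    nproc 2>/dev/null || echo 1
}
```

The build command then becomes make all -j"$(make_jobs)".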

After the compilation succeeds, run the tests:

sudo make runtest -j8

When reinstalling, recompiling like this is all that is needed.

If every test is reported as passed, Caffe has been installed successfully.

8. Configure the pycaffe interface environment

After the previous step Caffe is installed, and you can use it to train on data sets or run predictions, but only through the caffe command on the command line. Installing pycaffe and configuring a notebook environment in this step just makes Caffe more convenient to use: in practice most people drive Caffe from Python, and a notebook uses the browser as its interface, which makes writing and running Python code easier.

First compile pycaffe:

cd caffe

sudo make pycaffe -j8

After pycaffe compiles successfully, add its path to the environment variables (set the path according to your actual situation; note that the variable name PYTHONPATH must be upper case):

echo 'export PYTHONPATH="$HOME/caffe/python"' >> ~/.bashrc

source ~/.bashrc
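
Since I go through this reinstallation repeatedly, appending with >> each time would leave duplicate lines in ~/.bashrc. One way to avoid that, assuming a helper of my own named append_once, is to append a line only when it is not already in the file:

```shell
# Hypothetical helper: append LINE to FILE only if that exact line is
# not already there, so repeated runs do not pile up duplicates.
append_once() (
    line="$1"
    file="$2"
    # -x: match the whole line, -F: treat the text literally, not as a regex
    grep -qxF -- "$line" "$file" 2>/dev/null || printf '%s\n' "$line" >> "$file"
)
```

For example: append_once 'export PYTHONPATH="$HOME/caffe/python"' ~/.bashrc. Python reads the upper-case name PYTHONPATH, and the single quotes keep $HOME unexpanded until .bashrc is sourced.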

To verify that the caffe package can be imported in Python, first start the Python interpreter:

python

Then import caffe:

>>> import caffe
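
The same import test can be run without opening the interpreter, which is handy after each reinstallation. A sketch, where the function name can_import and its optional interpreter argument are my own assumptions:

```shell
# Sketch: check non-interactively whether a Python module can be imported.
# The optional second argument names the interpreter (default: python).
can_import() (
    module="$1"
    python_bin="${2:-python}"
    "$python_bin" -c "import $module" >/dev/null 2>&1
)
```

For example: can_import caffe && echo "pycaffe is ready". A non-zero result usually means PYTHONPATH does not point at the caffe/python directory or pycaffe was not compiled.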