Offline installation of CUDA and cuDNN on Linux, conda environment configuration, and environment packaging and migration

cuda installation

cuda version adaptation

Check which CUDA version your machine supports [you can skip this step if you are installing the CUDA toolkit on a supercomputing platform]
Download the CUDA Toolkit installer (.run file) from the official website
Upload the downloaded .run file to the platform for offline installation

$ cd /uploaded_directory
$ chmod +x cuda_12.2.2_535.104.05_linux.run   # make the .run file executable; replace the file name with your own
$ ./cuda_12.2.2_535.104.05_linux.run

At the license prompt, type accept and press Enter to accept the agreement

Keep only the CUDA Toolkit selected and deselect everything else. **To deselect an item, move to it with the up and down keys and press Enter once.** Then select Options and press Enter.

Use the up and down keys to select Toolkit Options and press Enter, then change the installation path. With the default path the installation fails because you do not have the required permissions, so you must install into your own home directory (create the folder first with mkdir). Inside Toolkit Options, use the up and down keys and Enter to deselect the other options, then select Change Toolkit Install Path and press Enter

Delete the default path that is shown, type the path of the folder you created in your home directory, then press Enter

Select Done to go back, then select Install to start the installation. When it finishes, the installer prints a summary
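
As an alternative to the menu steps above, the runfile installer can also be driven from the command line. The following is a minimal sketch, assuming your installer version accepts the --silent, --toolkit and --toolkitpath options (check the installer's help output first); the function name install_cuda_toolkit and the example folder in the usage line are hypothetical.

```shell
# Install only the CUDA toolkit into a folder you own, without the menu.
# Usage: install_cuda_toolkit <runfile> <install_dir>
install_cuda_toolkit() {
    runfile="$1"
    install_dir="$2"
    if [ ! -f "$runfile" ]; then
        echo "installer not found: $runfile" >&2
        return 1
    fi
    mkdir -p "$install_dir" || return 1
    # --silent: no interactive menu
    # --toolkit: install the toolkit only (no driver)
    # --toolkitpath: install location instead of the default system path
    sh "$runfile" --silent --toolkit --toolkitpath="$install_dir"
}
```

For example: install_cuda_toolkit ./cuda_12.2.2_535.104.05_linux.run "$HOME/cuda-12.2"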

Next, configure the environment variables. Open ~/.bashrc with the following command, which can be run from any directory.

vi ~/.bashrc

Add these two lines at the end of the file

export PATH="/your_cuda_install_folder/bin:$PATH"
export LD_LIBRARY_PATH="/your_cuda_install_folder/lib64:$LD_LIBRARY_PATH"
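
The same two lines can also be appended from the command line instead of editing the file by hand. This is only a sketch: the helper name add_cuda_paths and the marker comment it writes are my own assumptions, used so that running it twice does not duplicate the lines.

```shell
# Append the CUDA paths to a shell startup file unless they are already there.
# Usage: add_cuda_paths <rc_file> <cuda_install_folder>
add_cuda_paths() {
    rc_file="$1"
    cuda_home="$2"
    marker="# CUDA toolkit paths"
    # Skip if a previous run already added the block
    if grep -qF "$marker" "$rc_file" 2>/dev/null; then
        return 0
    fi
    {
        echo "$marker"
        echo "export PATH=\"$cuda_home/bin:\$PATH\""
        echo "export LD_LIBRARY_PATH=\"$cuda_home/lib64:\$LD_LIBRARY_PATH\""
    } >> "$rc_file"
}
```

For example: add_cuda_paths ~/.bashrc /your_cuda_install_folder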

The basic vi editing commands are as follows

vi XXX   # open the file for editing
Press i to enter insert mode
After editing the text, press Esc to leave insert mode
Save and exit: type a colon followed by wq (write, quit) or x (same as wq) and press Enter; alternatively, after pressing Esc, type ZZ (hold Shift and press z twice)
Exit when nothing has been changed: Esc, then colon + q
Exit without saving changes: Esc, then colon + q!

After the above operation is completed, use the following command to make the environment variables take effect

source ~/.bashrc

After completion, use the following command to test whether the installation is successful

nvcc -V

If the command prints the CUDA compiler version, the installation was successful!
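
If nvcc is not found, the usual cause is the PATH line in ~/.bashrc. The check can be wrapped in a small helper that says which case you are in; this is a sketch, and the function name check_nvcc is hypothetical.

```shell
# Report whether nvcc is reachable through PATH and, if so, print its version.
check_nvcc() {
    if command -v nvcc >/dev/null 2>&1; then
        echo "nvcc found at: $(command -v nvcc)"
        nvcc -V
    else
        echo "nvcc not found; check the PATH line in ~/.bashrc" >&2
        return 1
    fi
}
```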

cudnn installation

Download cuDNN from the official website
Make sure to download a version that is compatible with your CUDA version. I installed CUDA 12.2 above, so I also choose a build for 12.2 here. Several builds are listed for 12.2; that does not matter, as long as the one you pick matches your own CUDA version. I choose the latest build for 12.2
Next, select the package that matches the target machine. The school supercomputer runs CentOS, so I choose the Linux x86 package
Then upload the downloaded package to the school's supercomputing platform and extract it with the command below. There is no need to extract it into a specific folder, because the relevant files are copied into the CUDA folder in the following steps. The file name in the command is an example; replace it with the archive you downloaded

tar -zxvf cudnn-10.2-linux-x64-v7.6.5.32.tgz

Then copy the extracted files into your CUDA installation path with the following commands. Note: replace your_cudnn_route and your_cuda_route with your own cuDNN and CUDA paths; the subfolder paths after them are fixed.

cp your_cudnn_route/include/cudnn* your_cuda_route/include/
cp your_cudnn_route/lib64/libcudnn* your_cuda_route/lib64/

Modify permissions on copied files

chmod a+r your_cuda_route/include/cudnn* your_cuda_route/lib64/libcudnn*   # the * wildcard works because the cuDNN files all start with cudnn or libcudnn
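
The copy and chmod steps above can be combined into one helper. This is a sketch under assumptions: the function name install_cudnn is hypothetical, and it assumes the extracted archive keeps its libraries in either lib64 (as in the commands above) or lib (as some other cuDNN archives do). It also uses cp -P so that symbolic links between library versions are copied as links.

```shell
# Copy cuDNN headers and libraries into an existing CUDA toolkit folder.
# Usage: install_cudnn <extracted_cudnn_dir> <cuda_dir>
install_cudnn() {
    cudnn_dir="$1"
    cuda_dir="$2"
    # Assumption: libraries live in lib64/ or in lib/ inside the archive
    if [ -d "$cudnn_dir/lib64" ]; then
        lib_dir="$cudnn_dir/lib64"
    elif [ -d "$cudnn_dir/lib" ]; then
        lib_dir="$cudnn_dir/lib"
    else
        echo "no lib64/ or lib/ folder under $cudnn_dir" >&2
        return 1
    fi
    mkdir -p "$cuda_dir/include" "$cuda_dir/lib64" || return 1
    cp "$cudnn_dir"/include/cudnn* "$cuda_dir/include/" || return 1
    # -P copies symbolic links as links instead of duplicating the files
    cp -P "$lib_dir"/libcudnn* "$cuda_dir/lib64/" || return 1
    # Make the copied files readable for all users
    chmod a+r "$cuda_dir"/include/cudnn* "$cuda_dir"/lib64/libcudnn*
}
```

For example: install_cudnn your_cudnn_route your_cuda_route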

After the installation is complete, you can delete the extracted cuDNN folder and the installation packages.

All subsequent installations rely more or less on conda, so Anaconda or Miniconda must be installed first

The supercomputing platform already provides the installer in [/storage/public/apps/installpkg/conda]. If you do not want to use the platform's copy, you can download the installer yourself from the official website
Here we use the platform's installer as an example. First create a new folder named soft in your home directory, then copy the installer into it.

mkdir -p /storage/public/home/your_student_ID/soft
cp /storage/public/apps/installpkg/conda/Miniconda3-py310_23.3.1-0-Linux-x86_64.sh /storage/public/home/your_student_ID/soft

Change into the soft folder and make the installer executable

chmod +x Miniconda3-py310_23.3.1-0-Linux-x86_64.sh

Run the installation package program and press Enter

./Miniconda3-py310_23.3.1-0-Linux-x86_64.sh

Press the Enter key
Keep pressing Enter to page through the license until you are asked whether you accept the license terms
Type yes
Now choose the installation path for Miniconda. The default is fine; installing under your own home directory is generally enough.
Option 1: press Enter to install directly to the path “/storage/public/home/your_student_ID/anaconda3”;
Option 2: enter the absolute path of the location you want. For example, to install into a subfolder of your home directory, enter:
“/storage/public/home/your_student_ID/xxxx/anaconda3”
When asked whether to initialize conda, which adds the conda setup to your ~/.bashrc, answer yes and let the installer configure it automatically.
Run source ~/.bashrc to make the environment variables take effect;
The installation is complete
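
The prompts above can also be skipped by running the installer in batch mode. The following sketch uses the installer's -b (batch) and -p (prefix) options; the function name install_miniconda and the folder in the usage line are hypothetical.

```shell
# Run the Miniconda installer without prompts.
# Usage: install_miniconda <installer.sh> <install_dir>
install_miniconda() {
    installer="$1"
    prefix="$2"
    if [ ! -f "$installer" ]; then
        echo "installer not found: $installer" >&2
        return 1
    fi
    # -b: batch mode (accepts the license and asks no questions)
    # -p: installation prefix; the installer expects this folder not to exist yet
    bash "$installer" -b -p "$prefix"
}
```

For example: install_miniconda ./Miniconda3-py310_23.3.1-0-Linux-x86_64.sh "$HOME/miniconda3". Batch mode does not perform the initialization step, so afterwards run the conda init bash command from the bin folder of the install directory and then source ~/.bashrc.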

cudatoolkit installation

Download the cudatoolkit package from its download page and upload it to the platform
Activate the conda virtual environment that you want to configure. If that environment is already active, skip this step. If you do not have a dedicated environment yet, it is recommended to create one, which also prepares for the environment migration described later. Because the supercomputing platform has no network access, we create the virtual environment offline;

conda create --offline -n environment_name python=3.10   # --offline creates the environment without network access
conda activate environment_name   # conda install only installs into the currently active environment, so activate the new one first

Install cudatoolkit

conda install --offline cudatoolkit-10.2.89-hfd86e86_1.tar.bz2

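Test that the installation succeeded

conda activate environment_name   # activate the virtual environment

Start the Python interpreter and run the following code to check that PyTorch and CUDA are installed correctly (this assumes PyTorch is already installed in the environment).

import torch
print(torch.version.cuda)          # CUDA version PyTorch was built with (None for CPU-only builds)
print(torch.cuda.is_available())   # True only if a usable GPU and driver are detected

If a CUDA version is printed, the test was successful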

Package an existing Python environment and move it to the remote server, with no Internet connection or environment setup needed on the target

Prerequisites for package migration

The source host and target host must have the same platform and operating system, that is, both need to be on Linux systems

Offline packaging tool conda-pack

conda-pack is a command line tool for packaging a conda environment, including all binaries for packages installed in that environment

Tool installation

conda install -c conda-forge conda-pack #conda
pip install conda-pack #pypi

To package the local environment, first change to the directory of the environment to be packaged and run the command below; the archive is written to the current directory

conda pack -n your_env_name -o env_name.tar.gz --ignore-editable-packages

Here your_env_name is the name of the environment you want to package and env_name.tar.gz is the name of the output archive; env_name will also be the name of your future virtual environment.

Note: --ignore-editable-packages tells conda-pack to skip packages that were installed locally in editable (development) mode or that depend on other locally built code. Without --ignore-editable-packages, conda-pack reports an error when such packages are present. In that case you can usually ignore them and install them locally again in the new environment.

If the error below appears, don't panic. It happens because we are used to installing packages with pip, and when some package cannot be installed that way we install it with conda instead. As a result, the pip and conda installations of some packages conflict in their versions. So I use the second command below, which adds --ignore-missing-files: ignore all the problems and just package, we don't care about anything else, just package, hahaha! Cure for all diseases

This is usually due to `pip` uninstalling or clobbering conda managed files,
resulting in an inconsistent environment. Please check your environment for
conda/pip conflicts using `conda list`, and fix the environment by ensuring
only one version of each package is installed (conda preferred).

conda pack -n your_env_name -o env_name.tar.gz --ignore-editable-packages --ignore-missing-files
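
The two conda pack commands above can be chained so that the more permissive one only runs when the first one fails. A minimal sketch, assuming conda-pack is installed; the function name pack_env is hypothetical, and the archive is written to the current directory.

```shell
# Package a conda environment, falling back to --ignore-missing-files
# only if the stricter command fails.
# Usage: pack_env <env_name>
pack_env() {
    env_name="$1"
    archive="$env_name.tar.gz"
    if conda pack -n "$env_name" -o "$archive" --ignore-editable-packages; then
        echo "packed $archive"
    else
        echo "strict packing failed; retrying with --ignore-missing-files" >&2
        # Remove any partial archive left by the failed attempt
        rm -f "$archive"
        conda pack -n "$env_name" -o "$archive" \
            --ignore-editable-packages --ignore-missing-files
    fi
}
```

Trying the strict command first means you still see the conflict message and can decide later whether the environment needs cleaning up.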

That completes the packaging of the virtual environment on the source server. Our task now is to deploy it to the school supercomputer. Well, I could copy it to a USB disk first and then use Xftp to upload it to the platform. No, no, no, that would slow you down. Since both machines run Linux, we can transfer the file directly from one to the other with scp, right? Change the following command to match your own setup; do not copy it verbatim!

scp env_name.tar.gz remote_username@remote_ip:remote_folder

One more thing along the way: how do you pull files down from the remote host? It's very simple. Just swap the source and the destination in the command above.

scp remote_username@remote_ip:remote_folder/hhh.tar.gz   /storage/public/apps/installpkg/   # note the space that separates the source from the destination

On the target machine, creating the virtual environment simply means creating a new folder inside the environments folder (envs) of your conda installation.

mkdir env_name

Unzip the files you just packed into the newly created folder.

tar -zxvf nnUNet.tar.gz -C /storage/public/home/202322XxX/xxx/env_name

Execute source ~/.bashrc to make the environment variables take effect;
conda activate env_name
Now you can use this virtual environment normally! If packages turn out to be missing later, download the corresponding .whl file (or conda package), upload it to the supercomputing platform, and install it offline with pip or conda.
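
One detail worth knowing: archives made by conda-pack contain a conda-unpack script in their bin folder, which rewrites the path prefixes recorded on the source machine so that they point to the new location. The following is a sketch of the extraction step with that cleanup included. The function name unpack_env is hypothetical, and putting the environment's bin folder first on PATH (so that conda-unpack runs with the environment's own Python) is my assumption about a reasonable way to call it.

```shell
# Unpack a conda-pack archive into a folder and repair its path prefixes.
# Usage: unpack_env <archive.tar.gz> <target_dir>
unpack_env() {
    archive="$1"
    target="$2"
    mkdir -p "$target" || return 1
    tar -xzf "$archive" -C "$target" || return 1
    if [ -x "$target/bin/conda-unpack" ]; then
        # Run in a subshell so the PATH change does not leak into your session
        ( PATH="$target/bin:$PATH"; export PATH; conda-unpack )
    else
        echo "bin/conda-unpack not found under $target; prefixes left unchanged" >&2
    fi
}
```

For example: unpack_env nnUNet.tar.gz /storage/public/home/202322XxX/xxx/env_name, followed by conda activate env_name as described above.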