UniAD (https://github.com/OpenDriveLab/UniAD) is a planning-oriented unified large model for autonomous driving. It integrates multiple task modules in one network: perception (object detection and tracking), mapping (not mapping the environment as SLAM does, but real-time panoptic segmentation of roads and dividers in the images), trajectory planning, and occupancy prediction. The official installation instructions are based on the older versions the authors used, CUDA 11.1.1 and PyTorch 1.9.1, with a correspondingly old mmcv 1.4. The NVIDIA NGC docker environment on our work server already ships newer versions of these tools, so we wanted to run UniAD in our own environment. We hit some pitfalls during installation but solved them one by one, and our tests show that UniAD runs normally on CUDA 11.6 + PyTorch 1.12.0 + mmcv 1.6 + mmseg 0.24.0 + mmdet 2.24 + mmdet3d 1.0.0rc4.
The steps to install and troubleshoot are as follows:
1. Pull the NVIDIA NGC docker image using CUDA11.6 and create a container as the UniAD operating environment
2. Install pytorch and torchvision:
pip install torch==1.12.0+cu116 torchvision==0.13.0+cu116 torchaudio==0.12.0 --extra-index-url https://download.pytorch.org/whl/cu116
3. Check whether the CUDA_HOME environment variable has been set, if not, set it:
export CUDA_HOME=/usr/local/cuda
4. Because of the CUDA and PyTorch version constraints, mmcv must be newer than 1.4; we install 1.6.0. (For the version correspondence among OpenMMLab's mm-series packages, see the CSDN blog post on the correspondence between OpenMMLab's mmcv and mmdet, mmdet3d, and mmseg versions.)
pip install mmcv-full==1.6.0 -f https://download.openmmlab.com/mmcv/dist/cu116/torch1.12.0/index.html
According to that version correspondence, install mmdet 2.24.0 and mmseg 0.24.0:
pip install mmdet==2.24.0
pip install mmsegmentation==0.24.0
Download the mmdetection3d source code and switch to version v1.0.0rc4:
git clone https://github.com/open-mmlab/mmdetection3d.git
cd mmdetection3d
git checkout v1.0.0rc4
Install the support package and compile and install mmdet3d from source:
pip install scipy==1.7.3
pip install scikit-image==0.20.0
# Modify the numba version in requirements/runtime.txt (the old pin is commented out):
numba==0.53.1
#numba==0.53.0
pip install -v -e .
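Before moving on, it can help to confirm that the packages installed so far match the combination reported as working in this article. The following is a minimal sketch, assuming the distribution names used in the pip commands above (for mmdet3d, the name it registers when installed from source); `find_mismatches` and `installed_version` are illustrative helpers, not part of any of these libraries.

```python
from importlib import metadata

# Versions reported as working in this article.
EXPECTED = {
    "torch": "1.12.0",
    "mmcv-full": "1.6.0",
    "mmdet": "2.24.0",
    "mmsegmentation": "0.24.0",
    "mmdet3d": "1.0.0rc4",
}


def installed_version(dist_name):
    """Return the installed version string, or None if the package is absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


def find_mismatches(expected, lookup=installed_version):
    """Map each package whose installed version differs to (expected, found)."""
    mismatches = {}
    for name, want in expected.items():
        found = lookup(name)
        # Ignore a local build tag such as "+cu116" when comparing.
        if found is None or found.split("+")[0] != want:
            mismatches[name] = (want, found)
    return mismatches


if __name__ == "__main__":
    for name, (want, found) in find_mismatches(EXPECTED).items():
        print(f"{name}: expected {want}, found {found}")
```

If the script prints nothing, every listed package matches; otherwise each printed line names a package to reinstall.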
After installing the support environment, download and install UniAD:
git clone https://github.com/OpenDriveLab/UniAD.git
cd UniAD
# Modify the numpy version in requirements.txt (the old pin is commented out) and install the relevant support packages:
#numpy==1.20.0
numpy==1.22.0
pip install -r requirements.txt
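The two manual pin changes (numba in mmdetection3d's requirements/runtime.txt, numpy in UniAD's requirements.txt) can also be scripted. This is a small sketch under the assumption that each pin sits on its own line in the form `package==version`; `repin` is a hypothetical helper written for this article.

```python
import re


def repin(requirements_text, package, version):
    """Replace `package==<old>` with `package==<version>`, line by line."""
    pattern = re.compile(rf"^{re.escape(package)}==\S+\s*$")
    lines = [
        f"{package}=={version}" if pattern.match(line) else line
        for line in requirements_text.splitlines()
    ]
    return "\n".join(lines) + "\n"
```

To apply it, read the file with `pathlib.Path("requirements.txt").read_text()`, pass the text through `repin(text, "numpy", "1.22.0")`, and write the result back with `write_text()`. Unlike the manual edit, this replaces the old pin instead of leaving it commented out.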
#Download relevant pre-training weight files
mkdir ckpts && cd ckpts
wget https://github.com/zhiqi-li/storage/releases/download/v1.0/bevformer_r101_dcn_24ep.pth
wget https://github.com/OpenDriveLab/UniAD/releases/download/v1.0/uniad_base_track_map.pth
wget https://github.com/OpenDriveLab/UniAD/releases/download/v1.0.1/uniad_base_e2e.pth
Follow the instructions at https://github.com/OpenDriveLab/UniAD/blob/main/docs/DATA_PREP.md to download and expand the NuScenes data set
Download data infos file:
cd UniAD/data
mkdir infos && cd infos
wget https://github.com/OpenDriveLab/UniAD/releases/download/v1.0/nuscenes_infos_temporal_train.pkl # train_infos
wget https://github.com/OpenDriveLab/UniAD/releases/download/v1.0/nuscenes_infos_temporal_val.pkl # val_infos
Assume that the NuScenes dataset and the data infos files have been downloaded, decompressed, and stored under ./data/ in the following structure:
UniAD
├── projects/
├── tools/
├── ckpts/
│   ├── bevformer_r101_dcn_24ep.pth
│   ├── uniad_base_track_map.pth
│   ├── uniad_base_e2e.pth
├── data/
│   ├── nuscenes/
│   │   ├── can_bus/
│   │   ├── maps/
│   │   │   ├── 36092f0b03a857c6a3403e25b4b7aab3.png
│   │   │   ├── 37819e65e09e5547b8a3ceaefba56bb2.png
│   │   │   ├── 53992ee3023e5494b90c316c183be829.png
│   │   │   ├── 93406b464a165eaba6d9de76ca09f5da.png
│   │   │   ├── basemap
│   │   │   ├── expansion
│   │   │   ├── prediction
│   │   ├── samples/
│   │   ├── sweeps/
│   │   ├── v1.0-test/
│   │   ├── v1.0-trainval/
│   ├── infos/
│   │   ├── nuscenes_infos_temporal_train.pkl
│   │   ├── nuscenes_infos_temporal_val.pkl
│   ├── others/
│   │   ├── motion_anchor_infos_mode6.pkl
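A quick way to confirm this layout before running anything is to check that the expected entries exist. The sketch below takes its paths from the tree above, relative to the UniAD repository root; `missing_paths` is an illustrative helper, and the list can be trimmed or extended as needed.

```python
from pathlib import Path

# Paths taken from the tree above, relative to the UniAD repository root.
REQUIRED = [
    "ckpts/uniad_base_track_map.pth",
    "ckpts/uniad_base_e2e.pth",
    "data/nuscenes/can_bus",
    "data/nuscenes/maps/basemap",
    "data/nuscenes/maps/expansion",
    "data/nuscenes/maps/prediction",
    "data/nuscenes/samples",
    "data/nuscenes/sweeps",
    "data/nuscenes/v1.0-trainval",
    "data/infos/nuscenes_infos_temporal_train.pkl",
    "data/infos/nuscenes_infos_temporal_val.pkl",
    "data/others/motion_anchor_infos_mode6.pkl",
]


def missing_paths(root, required=REQUIRED):
    """Return the entries of `required` that do not exist under `root`."""
    root = Path(root)
    return [rel for rel in required if not (root / rel).exists()]


if __name__ == "__main__":
    for rel in missing_paths("."):
        print("missing:", rel)
```

Run it from the UniAD directory; an empty output means every listed file and directory is in place.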
Note: The three directories basemap, expansion, and prediction, which appear after expanding the map expansion (v1.3) archive, must be placed inside the maps directory, not at the same level as samples, sweeps, and the other directories. After all NuScenes train data archives are expanded, each subdirectory under samples contains 34149 images, and the number of images in the subdirectories under sweeps varies, for example: 163881, 164274, 164166, 161453, 160856, 164266, and so on. After the unlabeled test data archive is expanded in the nuscenes directory, the images in its samples and sweeps subdirectories are merged into the corresponding subdirectories of nuscenes/samples and nuscenes/sweeps. Counting again, each subdirectory under samples now contains 40157 images, and the subdirectories under sweeps contain 193153, 189171, 189905, 193082, 193168, 192699, and so on.
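The per-directory counts quoted in the note can be reproduced with a few lines of Python. This is a minimal sketch that counts the regular files directly inside each subdirectory; `count_files_per_subdir` is an illustrative helper, and the path in the usage example assumes the layout shown earlier.

```python
from pathlib import Path


def count_files_per_subdir(directory):
    """Return {subdirectory name: number of regular files directly inside it}."""
    counts = {}
    for sub in sorted(Path(directory).iterdir()):
        if sub.is_dir():
            counts[sub.name] = sum(1 for f in sub.iterdir() if f.is_file())
    return counts


if __name__ == "__main__":
    samples = Path("data/nuscenes/samples")
    if samples.is_dir():
        for name, n in count_files_per_subdir(samples).items():
            print(f"{name}: {n}")
```

Pointing the same function at data/nuscenes/sweeps gives the sweep counts, which makes it easy to spot an archive that was only partially extracted.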
Execute the following command to try it out:
./tools/uniad_dist_eval.sh ./projects/configs/stage1_track_map/base_track_map.py ./ckpts/uniad_base_track_map.pth 8
The last parameter is the number of GPUs. My working environment matches the authors', with 8 A100 cards, so I followed the instructions as they are. If you have fewer cards, change this parameter, for example to 1; it will still run, just more slowly.
You may encounter the following problems when running the above command for the first time:
1. partially initialized module ‘cv2’ has no attribute ‘_registerMatType’ (most likely due to a circular import)
This happens because the opencv-python version in the environment is too new and incompatible; mine was 4.8.1.78. Searching online showed that it needs to be downgraded to 4.5. Run the following command to reinstall opencv-python 4.5:
pip install opencv-python==4.5.4.58
2. ImportError: libGL.so.1: cannot open shared object file: No such file or directory
Just install libgl:
sudo apt-get update && sudo apt-get install libgl1
3. AssertionError: MMCV==1.6.0 is used but incompatible. Please install mmcv>=(1, 3, 13, 0, 0, 0), <=(1, 5, 0, 0, 0, 0)
Traceback (most recent call last):
  File "tools/create_data.py", line 4, in <module>
    from data_converter import uniad_nuscenes_converter as nuscenes_converter
  File "/workspace/workspace_fychen/UniAD/tools/data_converter/uniad_nuscenes_converter.py", line 13, in <module>
    from mmdet3d.core.bbox.box_np_ops import points_cam2img
  File "/workspace/workspace_fychen/mmdetection3d/mmdet3d/__init__.py", line 5, in <module>
    import mmseg
  File "/opt/conda/lib/python3.8/site-packages/mmseg/__init__.py", line 58, in <module>
    assert (mmcv_min_version <= mmcv_version <= mmcv_max_version), \
AssertionError: MMCV==1.6.0 is used but incompatible. Please install mmcv>=(1, 3, 13, 0, 0, 0), <=(1, 5, 0, 0, 0, 0).
This error is thrown by python3.8/site-packages/mmseg/__init__.py and means that the installed mmseg is incompatible with mmcv 1.6.0: it requires mmcv 1.3 to 1.5, which shows that mmseg itself is too old. The cause is that the mmsegmentation version installed at the start was out of date; installing mmseg 0.24.0 instead fixes it.
If other framework packages run into version issues, handle them the same way.
4. KeyError: 'DiceCost is already registered in Match Cost'
Traceback (most recent call last):
  File "./tools/test.py", line 16, in <module>
    from projects.mmdet3d_plugin.datasets.builder import build_dataloader
  File "/workspace/workspace_fychen/UniAD/projects/mmdet3d_plugin/__init__.py", line 3, in <module>
    from .core.bbox.match_costs import BBox3DL1Cost, DiceCost
  File "/workspace/workspace_fychen/UniAD/projects/mmdet3d_plugin/core/bbox/match_costs/__init__.py", line 2, in <module>
    from .match_cost import BBox3DL1Cost, DiceCost
  File "/workspace/workspace_fychen/UniAD/projects/mmdet3d_plugin/core/bbox/match_costs/match_cost.py", line 32, in <module>
    class DiceCost(object):
  File "/opt/conda/lib/python3.8/site-packages/mmcv/utils/registry.py", line 337, in _register
    self._register_module(module=module, module_name=name, force=force)
  File "/opt/conda/lib/python3.8/site-packages/mmcv/utils/misc.py", line 340, in new_func
    output = old_func(*args, **kwargs)
  File "/opt/conda/lib/python3.8/site-packages/mmcv/utils/registry.py", line 272, in _register_module
    raise KeyError(f'{name} is already registered '
KeyError: 'DiceCost is already registered in Match Cost'
This duplicate class registration occurs because both UniAD's mmdet3d_plugin and the mmdetection I installed (python3.8/site-packages/mmdet/core/bbox/match_costs/match_cost.py) define a class named DiceCost (the older mmdetection used by the UniAD authors presumably does not have this problem). Reading the registration code in mmcv's python3.8/site-packages/mmcv/utils/registry.py shows that the problem can be solved by setting the parameter force=True:
@deprecated_api_warning(name_dict=dict(module_class='module'))
def _register_module(self, module, module_name=None, force=False):
    if not inspect.isclass(module) and not inspect.isfunction(module):
        raise TypeError('module must be a class or a function, '
                        f'but got {type(module)}')

    if module_name is None:
        module_name = module.__name__
    if isinstance(module_name, str):
        module_name = [module_name]
    for name in module_name:
        if not force and name in self._module_dict:
            raise KeyError(f'{name} is already registered '
                           f'in {self.name}')
        self._module_dict[name] = module
To make sure the UniAD code runs correctly, force the registration of UniAD's own DiceCost class: modify the decorator of the DiceCost class in UniAD/projects/mmdet3d_plugin/core/bbox/match_costs/match_cost.py and add the force=True parameter:
@MATCH_COST.register_module(force=True)
class DiceCost(object):
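To see why force=True resolves the clash, the toy registry below mimics the name check shown in the mmcv snippet. It is a simplified stand-in written for illustration, not mmcv's implementation: `MiniRegistry` and the `source` attributes are hypothetical, and only the "raise unless force" logic is borrowed from the code quoted above.

```python
class MiniRegistry:
    """Simplified name -> class registry (illustrative, not mmcv's Registry)."""

    def __init__(self, name):
        self.name = name
        self._module_dict = {}

    def register_module(self, force=False):
        def decorator(cls):
            key = cls.__name__
            # Same rule as in the quoted mmcv code: a duplicate name is an
            # error unless the caller explicitly asks to overwrite it.
            if not force and key in self._module_dict:
                raise KeyError(f"{key} is already registered in {self.name}")
            self._module_dict[key] = cls
            return cls
        return decorator

    def get(self, key):
        return self._module_dict[key]


MATCH_COST = MiniRegistry("Match Cost")


@MATCH_COST.register_module()
class DiceCost:  # plays the role of the class shipped with mmdet
    source = "mmdet"


first = MATCH_COST.get("DiceCost")


@MATCH_COST.register_module(force=True)
class DiceCost:  # plays the role of UniAD's plugin class; replaces the entry
    source = "uniad"
```

After the second registration the registry resolves the name "DiceCost" to the later class, which is the behavior the fix relies on: configs that refer to DiceCost by name pick up UniAD's version.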
5. TypeError: cannot pickle 'dict_keys' object
  File "./tools/test.py", line 261, in <module>
    main()
  File "./tools/test.py", line 231, in main
    outputs = custom_multi_gpu_test(model, data_loader, args.tmpdir,
  File "/workspace/workspace_fychen/UniAD/projects/mmdet3d_plugin/uniad/apis/test.py", line 88, in custom_multi_gpu_test
    for i, data in enumerate(data_loader):
  File "/opt/conda/lib/python3.8/site-packages/torch/utils/data/dataloader.py", line 438, in __iter__
    return self._get_iterator()
  File "/opt/conda/lib/python3.8/site-packages/torch/utils/data/dataloader.py", line 384, in _get_iterator
    return _MultiProcessingDataLoaderIter(self)
  File "/opt/conda/lib/python3.8/site-packages/torch/utils/data/dataloader.py", line 1048, in __init__
    w.start()
  File "/opt/conda/lib/python3.8/multiprocessing/process.py", line 121, in start
    self._popen = self._Popen(self)
  File "/opt/conda/lib/python3.8/multiprocessing/context.py", line 224, in _Popen
    return _default_context.get_context().Process._Popen(process_obj)
  File "/opt/conda/lib/python3.8/multiprocessing/context.py", line 284, in _Popen
    return Popen(process_obj)
  File "/opt/conda/lib/python3.8/multiprocessing/popen_spawn_posix.py", line 32, in __init__
    super().__init__(process_obj)
  File "/opt/conda/lib/python3.8/multiprocessing/popen_fork.py", line 19, in __init__
    self._launch(process_obj)
  File "/opt/conda/lib/python3.8/multiprocessing/popen_spawn_posix.py", line 47, in _launch
    reduction.dump(process_obj, fp)
  File "/opt/conda/lib/python3.8/multiprocessing/reduction.py", line 60, in dump
    ForkingPickler(file, protocol).dump(obj)
TypeError: cannot pickle 'dict_keys' object
For the solution, see the CSDN blog post "How to locate the cause of the TypeError: cannot pickle 'dict_keys' object error that occurs during multi-process training or testing on the NuScenes dataset, and how to solve it".
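The traceback shows the worker process being started through popen_spawn_posix, which pickles the object handed to the worker. The general mechanism behind the error can be reproduced in isolation: a `dict_keys` view cannot be pickled, while a list built from it can. The sketch below only illustrates that mechanism; the `class_ids` dictionary is made up, and where exactly such a view is held in the UniAD or NuScenes code is covered in the linked post, not here.

```python
import pickle

class_ids = {"car": 0, "truck": 1, "bus": 2}

# A keys() view is tied to its dictionary and is not picklable.
try:
    pickle.dumps(class_ids.keys())
    view_pickles = True
except TypeError as err:
    view_pickles = False
    print(err)

# Materializing the view as a list makes it an ordinary picklable object.
payload = pickle.dumps(list(class_ids.keys()))
restored = pickle.loads(payload)
print(restored)
```

So the usual remedy is to wrap the offending `.keys()` call in `list(...)` at the point where the value is stored on an object that later crosses a process boundary.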
6. protobuf reports TypeError: Descriptors cannot be created directly
Traceback (most recent call last):
  File "./tools/test.py", line 16, in <module>
    from projects.mmdet3d_plugin.datasets.builder import build_dataloader
  File "/workspace/workspace_fychen/UniAD/projects/mmdet3d_plugin/__init__.py", line 5, in <module>
    from .datasets.pipelines import (
  File "/workspace/workspace_fychen/UniAD/projects/mmdet3d_plugin/datasets/pipelines/__init__.py", line 6, in <module>
    from .occflow_label import GenerateOccFlowLabels
  File "/workspace/workspace_fychen/UniAD/projects/mmdet3d_plugin/datasets/pipelines/occflow_label.py", line 5, in <module>
    from projects.mmdet3d_plugin.uniad.dense_heads.occ_head_plugin import calculate_birds_eye_view_parameters
  File "/workspace/workspace_fychen/UniAD/projects/mmdet3d_plugin/uniad/__init__.py", line 2, in <module>
    from .dense_heads import *
  File "/workspace/workspace_fychen/UniAD/projects/mmdet3d_plugin/uniad/dense_heads/__init__.py", line 4, in <module>
    from .occ_head import OccHead
  File "/workspace/workspace_fychen/UniAD/projects/mmdet3d_plugin/uniad/dense_heads/occ_head.py", line 16, in <module>
    from .occ_head_plugin import MLP, BevFeatureSlicer, SimpleConv2d, CVT_Decoder, Bottleneck, UpsamplingAdd, \
  File "/workspace/workspace_fychen/UniAD/projects/mmdet3d_plugin/uniad/dense_heads/occ_head_plugin/__init__.py", line 1, in <module>
    from .metrics import *
  File "/workspace/workspace_fychen/UniAD/projects/mmdet3d_plugin/uniad/dense_heads/occ_head_plugin/metrics.py", line 10, in <module>
    from pytorch_lightning.metrics.metric import Metric
  File "/opt/conda/lib/python3.8/site-packages/pytorch_lightning/__init__.py", line 29, in <module>
    from pytorch_lightning.callbacks import Callback  # noqa: E402
  File "/opt/conda/lib/python3.8/site-packages/pytorch_lightning/callbacks/__init__.py", line 25, in <module>
    from pytorch_lightning.callbacks.swa import StochasticWeightAveraging
  File "/opt/conda/lib/python3.8/site-packages/pytorch_lightning/callbacks/swa.py", line 26, in <module>
    from pytorch_lightning.trainer.optimizers import _get_default_scheduler_config
  File "/opt/conda/lib/python3.8/site-packages/pytorch_lightning/trainer/__init__.py", line 18, in <module>
    from pytorch_lightning.trainer.trainer import Trainer
  File "/opt/conda/lib/python3.8/site-packages/pytorch_lightning/trainer/trainer.py", line 30, in <module>
    from pytorch_lightning.loggers import LightningLoggerBase
  File "/opt/conda/lib/python3.8/site-packages/pytorch_lightning/loggers/__init__.py", line 18, in <module>
    from pytorch_lightning.loggers.tensorboard import TensorBoardLogger
  File "/opt/conda/lib/python3.8/site-packages/pytorch_lightning/loggers/tensorboard.py", line 25, in <module>
    from torch.utils.tensorboard import SummaryWriter
  File "/opt/conda/lib/python3.8/site-packages/torch/utils/tensorboard/__init__.py", line 12, in <module>
    from .writer import FileWriter, SummaryWriter  # noqa: F401
  File "/opt/conda/lib/python3.8/site-packages/torch/utils/tensorboard/writer.py", line 9, in <module>
    from tensorboard.compat.proto.event_pb2 import SessionLog
  File "/opt/conda/lib/python3.8/site-packages/tensorboard/compat/proto/event_pb2.py", line 17, in <module>
    from tensorboard.compat.proto import summary_pb2 as tensorboard_dot_compat_dot_proto_dot_summary__pb2
  File "/opt/conda/lib/python3.8/site-packages/tensorboard/compat/proto/summary_pb2.py", line 17, in <module>
    from tensorboard.compat.proto import tensor_pb2 as tensorboard_dot_compat_dot_proto_dot_tensor__pb2
  File "/opt/conda/lib/python3.8/site-packages/tensorboard/compat/proto/tensor_pb2.py", line 16, in <module>
    from tensorboard.compat.proto import resource_handle_pb2 as tensorboard_dot_compat_dot_proto_dot_resource__handle__pb2
  File "/opt/conda/lib/python3.8/site-packages/tensorboard/compat/proto/resource_handle_pb2.py", line 16, in <module>
    from tensorboard.compat.proto import tensor_shape_pb2 as tensorboard_dot_compat_dot_proto_dot_tensor__shape__pb2
  File "/opt/conda/lib/python3.8/site-packages/tensorboard/compat/proto/tensor_shape_pb2.py", line 36, in <module>
    _descriptor.FieldDescriptor(
  File "/opt/conda/lib/python3.8/site-packages/google/protobuf/descriptor.py", line 561, in __new__
    _message.Message._CheckCalledFromGeneratedFile()
TypeError: Descriptors cannot be created directly.
If this call came from a _pb2.py file, your generated code is out of date and must be regenerated with protoc >= 3.19.0.
If you cannot immediately regenerate your protos, some other possible workarounds are:
 1. Downgrade the protobuf package to 3.20.x or lower.
 2. Set PROTOCOL_BUFFERS_PYTHON_IMPLEMENTATION=python (but this will use pure-Python parsing and will be much slower).
My protobuf version, 4.24.4, was too new; downgrading to 3.20 fixes the problem:
pip install protobuf==3.20
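Whether an environment is affected can be checked up front by comparing the installed protobuf version with the "3.20.x or lower" bound quoted in the error message. The function below is a minimal sketch of that comparison; `protobuf_needs_downgrade` is a hypothetical helper and assumes a plain dotted version string.

```python
def protobuf_needs_downgrade(version):
    """True if `version` is newer than the 3.20.x series."""
    major, minor = (int(part) for part in version.split(".")[:2])
    return (major, minor) > (3, 20)


if __name__ == "__main__":
    from importlib import metadata

    try:
        found = metadata.version("protobuf")
        print(found, "-> downgrade needed:", protobuf_needs_downgrade(found))
    except metadata.PackageNotFoundError:
        print("protobuf is not installed")
```

For the version mentioned above, `protobuf_needs_downgrade("4.24.4")` is true, while any 3.20.x release passes the check.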
7. TypeError: expected str, bytes or os.PathLike object, not _io.BufferedReader
Traceback (most recent call last):
  File "/opt/conda/lib/python3.8/site-packages/mmcv/utils/registry.py", line 69, in build_from_cfg
    return obj_cls(**args)
  File "/workspace/workspace_fychen/UniAD/projects/mmdet3d_plugin/datasets/nuscenes_e2e_dataset.py", line 78, in __init__
    super().__init__(*args, **kwargs)
  File "/workspace/workspace_fychen/mmdetection3d/mmdet3d/datasets/nuscenes_dataset.py", line 131, in __init__
    super().__init__(
  File "/workspace/workspace_fychen/mmdetection3d/mmdet3d/datasets/custom_3d.py", line 88, in __init__
    self.data_infos = self.load_annotations(open(local_path, 'rb'))
  File "/workspace/workspace_fychen/UniAD/projects/mmdet3d_plugin/datasets/nuscenes_e2e_dataset.py", line 152, in load_annotations
    data = pickle.loads(self.file_client.get(ann_file))
  File "/opt/conda/lib/python3.8/site-packages/mmcv/fileio/file_client.py", line 1014, in get
    return self.client.get(filepath)
  File "/opt/conda/lib/python3.8/site-packages/mmcv/fileio/file_client.py", line 535, in get
    with open(filepath, 'rb') as f:
TypeError: expected str, bytes or os.PathLike object, not _io.BufferedReader
The problem originates in the newer mmdetection3d/mmdet3d/datasets/custom_3d.py, which supports reading files through local_path and therefore passes an open file handle into load_annotations():
def __init__(self,
             data_root,
             ann_file,
             pipeline=None,
             classes=None,
             modality=None,
             box_type_3d='LiDAR',
             filter_empty_gt=True,
             test_mode=False,
             file_client_args=dict(backend='disk')):
    super().__init__()
    self.data_root = data_root
    self.ann_file = ann_file
    self.test_mode = test_mode
    self.modality = modality
    self.filter_empty_gt = filter_empty_gt
    self.box_type_3d, self.box_mode_3d = get_box_type(box_type_3d)

    self.CLASSES = self.get_classes(classes)
    self.file_client = mmcv.FileClient(**file_client_args)
    self.cat2id = {name: i for i, name in enumerate(self.CLASSES)}

    # load annotations
    if not hasattr(self.file_client, 'get_local_path'):
        with self.file_client.get_local_path(self.ann_file) as local_path:
            self.data_infos = self.load_annotations(open(local_path, 'rb'))
    else:
        warnings.warn(
            'The used MMCV version does not have get_local_path. '
            f'We treat the {self.ann_file} as local paths and it '
            'might cause errors if the path is not a local path. '
            'Please use MMCV>= 1.3.16 if you meet errors.')
        self.data_infos = self.load_annotations(self.ann_file)
The root cause is that load_annotations() in UniAD/projects/mmdet3d_plugin/datasets/nuscenes_e2e_dataset.py only supports ann_file as a string path. The workaround here is therefore to modify mmdetection3d/mmdet3d/datasets/custom_3d.py so that it always calls self.data_infos = self.load_annotations(self.ann_file).
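The mismatch is between a caller that passes an open file and a loader that expects a path. As an illustration of the alternative direction (making the loader tolerant rather than patching the caller), here is a minimal sketch of a function that accepts both. `load_infos` is a hypothetical helper, not UniAD or mmdet3d code; UniAD's real load_annotations() goes through self.file_client.get(ann_file), as the traceback shows, and this sketch assumes a plain local pickle file instead.

```python
import pickle


def load_infos(ann_file):
    """Load a pickled infos object from a path or from an open binary file."""
    if hasattr(ann_file, "read"):
        # Already an open file-like object (what the newer mmdet3d passes in).
        return pickle.load(ann_file)
    # Otherwise treat it as a path (what UniAD's loader expects).
    with open(ann_file, "rb") as f:
        return pickle.load(f)
```

The article's fix keeps UniAD untouched and changes mmdet3d instead; either direction removes the type mismatch, and which one is preferable depends on which code base you are more willing to patch.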
8. RuntimeError: DataLoader worker (pid 33959) is killed by signal: Killed
Once the previous 7 problems are solved, and provided the NuScenes dataset is complete and in the right location, the following commands should run:
./tools/uniad_dist_eval.sh ./projects/configs/stage1_track_map/base_track_map.py ./ckpts/uniad_base_track_map.pth 8
./tools/uniad_dist_eval.sh ./projects/configs/stage2_e2e/base_e2e.py ./ckpts/uniad_base_e2e.pth 8
However, an error may occur while the data is being read in a loop, with the dataloader worker process being killed:
Traceback (most recent call last):
  File "/opt/conda/lib/python3.8/site-packages/torch/utils/data/dataloader.py", line 1134, in _try_get_data
    data = self._data_queue.get(timeout=timeout)
  File "/opt/conda/lib/python3.8/multiprocessing/queues.py", line 107, in get
    if not self._poll(timeout):
  File "/opt/conda/lib/python3.8/multiprocessing/connection.py", line 257, in poll
    return self._poll(timeout)
  File "/opt/conda/lib/python3.8/multiprocessing/connection.py", line 424, in _poll
    r = wait([self], timeout)
  File "/opt/conda/lib/python3.8/multiprocessing/connection.py", line 936, in wait
    timeout = deadline - time.monotonic()
  File "/opt/conda/lib/python3.8/site-packages/torch/utils/data/_utils/signal_handling.py", line 66, in handler
    _error_if_any_worker_fails()
RuntimeError: DataLoader worker (pid 33959) is killed by signal: Killed.
After checking, we found that the workers_per_gpu=8 setting in the configuration files projects/configs/stage1_track_map/base_track_map.py and projects/configs/stage2_e2e/base_e2e.py is too high for our server. After changing it to 2, the above commands ran successfully.
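If the same change has to be applied to several config files, it can be done on the config source text. The function below is a sketch that assumes the setting appears literally as `workers_per_gpu=<integer>` in the file; `set_workers_per_gpu` is a hypothetical helper, and editing the two files by hand works just as well.

```python
import re


def set_workers_per_gpu(config_text, workers):
    """Rewrite every `workers_per_gpu=<int>` assignment in a config's source text."""
    return re.sub(r"(workers_per_gpu\s*=\s*)\d+", rf"\g<1>{workers}", config_text)
```

Read each config with `pathlib.Path(path).read_text()`, pass it through `set_workers_per_gpu(text, 2)`, and write the result back. A suitable value depends on the server's CPU and memory; 2 is simply what worked in our case.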