Paper address https://arxiv.org/pdf/2309.11331.pdf
Code address https://github.com/huawei-noah/Efficient-Computing
Table of Contents
01 Introduction to the paper
01 Abstract
02 Model training process
01 Installation environment
02 Modify parameters in train.py
01 Modify the --data-path parameter
02 Modify the --conf-file parameter
03 Other parameter settings
03 Training
04 Troubleshooting
02 Model verification process
01 Parameter modification
01 Modify --data, --weights
02 --task task mode
03 Others
02 Verification
03 Model inference process
01 Parameter modification
01 Modify --weights, --source, --yaml
02 --save-txt
03 Others
02 Solve the error
03 Inference
01 Paper Introduction
Gold-YOLO is Huawei's solution, published in September 2023 and accepted at NeurIPS 2023, to the information-transmission problem of the FPN structure that is common throughout the YOLO series.
01 Abstract
Over the past few years, the YOLO family of models has become a leading method in real-time object detection. Many studies have pushed the baseline higher by modifying the architecture, adding data, and designing new loss functions. However, previous models still suffer from an information-fusion problem, even though Feature Pyramid Network (FPN) and Path Aggregation Network (PANet) alleviate it to some extent. This study therefore proposes an advanced gather-and-distribute mechanism (GD mechanism), implemented through convolution and self-attention operations. The newly designed model, called Gold-YOLO, improves multi-scale feature-fusion capability and achieves an ideal balance between latency and accuracy across all model scales. In addition, the paper implements MAE-style pre-training in the YOLO series for the first time, allowing YOLO-series models to benefit from unsupervised pre-training. Gold-YOLO-N achieves an excellent 39.9% AP on the COCO val2017 dataset at 1030 FPS on a T4 GPU, surpassing the previous SOTA model YOLOv6-3.0-N by 2.4% AP at a similar FPS.
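To make the gather-and-distribute idea concrete, here is a minimal, simplified sketch in PyTorch. This is not the authors' implementation (Gold-YOLO uses dedicated low-stage convolutional and high-stage attention branches); the class name SimpleGatherDistribute and all of its details are illustrative assumptions only.
import torch
import torch.nn as nn
import torch.nn.functional as F

class SimpleGatherDistribute(nn.Module):
    """Toy gather-and-distribute: align all levels, fuse globally, inject back."""
    def __init__(self, channels_per_level, fused_channels=128):
        super().__init__()
        # fuse the concatenated multi-scale features into one global representation
        self.fuse = nn.Conv2d(sum(channels_per_level), fused_channels, kernel_size=1)
        # one 1x1 conv per level to inject the global feature back into that level
        self.inject = nn.ModuleList(
            nn.Conv2d(c + fused_channels, c, kernel_size=1) for c in channels_per_level
        )

    def forward(self, feats):
        # gather: resize every level to the middle level's resolution and concatenate
        target = feats[len(feats) // 2].shape[-2:]
        gathered = torch.cat(
            [F.interpolate(f, size=target, mode='bilinear', align_corners=False) for f in feats],
            dim=1)
        fused = self.fuse(gathered)
        # distribute: resize the fused feature back to each level and inject it
        outs = []
        for f, inject in zip(feats, self.inject):
            g = F.interpolate(fused, size=f.shape[-2:], mode='bilinear', align_corners=False)
            outs.append(inject(torch.cat([f, g], dim=1)))
        return outs
For example, SimpleGatherDistribute([128, 256, 512]) takes three pyramid feature maps and returns three maps of the same shapes, each carrying information from all scales; the paper refines this idea with separate convolution-based and attention-based fusion branches.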
02 Model Training Process
Download the code from the repository linked above and open the Gold-YOLO directory in PyCharm.
01 Installation Environment
In the terminal, install the dependencies:
pip install -r requirements.txt
02 Modify parameters in train.py
Find the get_args_parser function and modify the parameters defined inside it:
def get_args_parser(add_help=True):
    parser = argparse.ArgumentParser(description='YOLOv6 PyTorch Training', add_help=add_help)
01 Modify the --data-path parameter
Choose the format that matches your dataset. Mine is in VOC format, so I copied voc.yaml under the data directory and renamed the copy pole.yaml.
parser.add_argument('--data-path', default='./data/coco.yaml', type=str, help='path of dataset')
Change the line above to
parser.add_argument('--data-path', default='./data/pole.yaml', type=str, help='path of dataset')
Open data/pole.yaml:
# Please insure that your custom_dataset are put in same parent dir with YOLOv6_DIR
train: VOCdevkit/voc_07_12/images/train # train images
val: VOCdevkit/voc_07_12/images/val # val images
test: VOCdevkit/voc_07_12/images/val # test images (optional)

# whether it is coco dataset, only coco dataset should be set to True.
is_coco: False

# Classes
nc: 20  # number of classes
names: ['aeroplane', 'bicycle', 'bird', 'boat', 'bottle', 'bus', 'car', 'cat', 'chair', 'cow',
        'diningtable', 'dog', 'horse', 'motorbike', 'person', 'pottedplant', 'sheep', 'sofa', 'train', 'tvmonitor']  # class names
Modify the paths (I used absolute paths), the number of classes nc, and the class names. Either relative or absolute paths work; my version is below.
# Please insure that your custom_dataset are put in same parent dir with YOLOv6_DIR
train: D:\DBSY\Efficient-Computing-master\Detection\Gold-YOLO\data\VOCdevkit\images/train # train images
val: D:\DBSY\Efficient-Computing-master\Detection\Gold-YOLO\data\VOCdevkit\images/train # val images
test: D:\DBSY\Efficient-Computing-master\Detection\Gold-YOLO\data\VOCdevkit\images/train # test images (optional)

is_coco: False
nc: 8  # number of classes
names: ['powerdirty', 'powerdirtyl', 'powerdirtys', 'lightdirty', 'lightdirtys', 'powerbreakb', 'powerbreak', 'powerbreakt']  # class names
# 0: powerdirty
# 1: powerdirtyl
# 2: powerdirtys
# 3: lightdirty
# 4: lightdirtys
# 5: powerbreakb
# 6: powerbreak
# 7: powerbreakt
For data placement, I created a VOCdevkit folder under data/, with labels in YOLO txt format. VOCdevkit contains two subfolders, images and labels; each of them contains train, val and test folders, so the images and the corresponding label files are kept separate.
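For reference, a layout consistent with the description above looks roughly like this (folder names follow my setup; adjust them to yours):
data/VOCdevkit/
    images/
        train/   # training images (.jpg/.png)
        val/     # validation images
        test/    # test images (optional)
    labels/
        train/   # one .txt label file per training image
        val/
        test/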
02 Modify the --conf-file parameter
This parameter selects the model configuration used for training. There are several config files under the configs directory; I chose gold_yolo-m.py.
parser.add_argument('--conf-file', default='./configs/yolov6n.py', type=str, help='experiments description file')
Change it to an absolute path as well:
parser.add_argument('--conf-file', default=r'D:\DBSY\Efficient-Computing-master\Detection\Gold-YOLO\configs\gold_yolo-m.py', type=str, help='experiments description file')
03 Other parameter settings
The remaining parameters are largely up to you.
--batch-size depends on your machine; set it to something like 4, 8, or 16.
parser.add_argument('--batch-size', default=4, type=int, help='total batch size for all GPUs')
--epochs: I left it at the default and let it run for 400 epochs to see how it goes.
parser.add_argument('--epochs', default=400, type=int, help='number of total epochs to run')
--workers: the number of data-loading workers. Most machines don't benefit from a high value, so keep it low; I set it to 2.
parser.add_argument('--workers', default=2, type=int, help='number of data loading workers (default: 8)')
--device: the GPU to use; the default 0 means the first GPU.
parser.add_argument('--device', default='0', type=str, help='cuda device, i.e. 0 or 0,1,2,3 or cpu')
--eval-interval: evaluation interval in epochs; by default the model is evaluated every 20 epochs.
parser.add_argument('--eval-interval', default=20, type=int, help='evaluate at every interval epochs')
03 Training
The script is tools/train.py. Click Run on train.py in PyCharm to start training, or launch it from the terminal as shown below.
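If you prefer the terminal, the same settings can be passed as flags instead of editing the defaults in the file (paths are the ones from my setup; adjust them to yours):
python tools/train.py --data-path ./data/pole.yaml --conf-file ./configs/gold_yolo-m.py --batch-size 4 --epochs 400 --workers 2 --device 0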
04 Troubleshooting
If the following error occurs, it is because the config assumes multi-GPU training and a parameter needs to be changed.
File "D:\DBSY\Efficient-Computing-masteryolo\toolstrain.py"line 130, in <module> main(args) File"D:\DBSY\Efficient-Computtine 120, in maintrainer.train()File "D:\DBSY\Efficient-Computing-masten DetectionlGold-YOLOlyolov6 corelenginepy", line 109, in trainself.train_in_loop (self.epoch) File "D:IDBSYIEfficient-Comouting-masteriDetectioniold-YOLO volovo core enaine.py". line 127, in train-in_loo self.print_details() File "D:\DBSY\Efficient-Computing-master\Detection\Gold-YOLO yolov6\coreengine.py"line 339,in print_detailsself.mean_loss = (self.mean_loss * self.step + self. loss_items) (self.step + 1)AttributeError:'Trainer' object has no attribute loss_items'
Open the config you selected (for me, configs/gold_yolo-m.py). SyncBN only works in multi-GPU (distributed) training, so for a single GPU change the value of type from
norm_cfg=dict(type='SyncBN', requires_grad=True),
to
norm_cfg=dict(type='BN', requires_grad=True),
02 Model Verification Process
The script is tools/eval.py.
01 Parameter modification
01 Modify --data, --weights
Change --data and --weights to absolute paths:
parser.add_argument('--data', type=str, default=r'D:\DBSY\Efficient-Computing-master\Detection\Gold-YOLO\data\pole.yaml', help='dataset.yaml path')
parser.add_argument('--weights', type=str, default=r'D:\DBSY\Efficient-Computing-master\Detection\Gold-YOLO\weights/best_ckpt.pt', help='model.pt path(s)')
02 --task task mode
Choose between val, test, and speed.
parser.add_argument('--task', default='val', help='val, test, or speed')
03 Others
--batch-size should be small; I set it to 4.
--img-size: input image size, default 640.
--conf-thres: confidence threshold, keep the default.
--iou-thres: IoU (intersection over union) threshold, keep the default.
--device: default 0.
--half: FP16 half precision instead of the normal FP32; accuracy may be slightly worse.
--save_dir: where the results are saved; keep the default or set the absolute path you want.
--name: the name of the folder the results are saved under.
--test_load_size: the image size loaded during testing.
--letterbox_return_int: return the letterbox box offsets as integers.
--scale_exact: use the exact scale when rescaling coordinates, keep the default.
02 Verification
Run eval.py directly in PyCharm, or from the terminal as shown below.
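A rough command-line equivalent, assuming the paths used above (adjust them to yours):
python tools/eval.py --data data/pole.yaml --weights weights/best_ckpt.pt --task val --batch-size 4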
03 Model Inference Process
The script is tools/infer.py.
01 Parameter modification
01 Modify --weights, --source, --yaml
These are the model weight file, the path of the images to run inference on, and the dataset configuration file. Set all of them to absolute paths:
parser.add_argument('--weights', type=str, default=r'D:\DBSY\Efficient-Computing-master\Detection\Gold-YOLO\weights/best_ckpt.pt', help='model path(s) for inference.')
parser.add_argument('--source', type=str, default=r'D:\dataset\Hebei data\10.25\10.23\Minjiang Road-side back\sm_y', help='the source path, e.g. image-file/dir.')
parser.add_argument('--yaml', type=str, default=r'D:\DBSY\Efficient-Computing-master\Detection\Gold-YOLO\data\pole.yaml', help='data yaml file.')
02 --save-txt
--save-txt writes the prediction results to txt files; add default=True to the line below. These txt files can later be compared against the ground-truth labels for evaluation.
parser.add_argument('--save-txt', default=True, action='store_true', help='save results to *.txt.')
03 Others
Most of these were covered above, so I won't repeat them.
--max-det: maximum number of detections per image, keep the default.
--not-save-img: do not save the visualized inference results (the annotated images).
--view-img: display the predicted images; by default it is True and the results pop up automatically.
--classes: restrict predictions to the given classes, useful when there are classes you do not want in the output.
--hide-conf: hide the confidence score, keeping only the box without the score.
02 Solve the error
File "D:\DBSY\Efficient-Computing-master\Detection\Gold-YOLO\tools\infer.py", line 103, in run inferer = Inferer(source, weights, device, yaml, img_size, half) TypeError: __init__() missing 2 required positional arguments: 'img_size' and 'half'
The error points at the following line of code:
inferer = Inferer(source, weights, device, yaml, img_size, half)
Look at the parameter definitions of class Inferer in yolov6/core/inferer.py; its __init__ takes 8 parameters in total.
class Inferer:
    def __init__(self, source, webcam, webcam_addr, weights, device, yaml, img_size, half):
The call is missing two arguments, webcam and webcam_addr. Add them:
inferer = Inferer(source, webcam, webcam_addr, weights, device, yaml, img_size, half)
Then, in tools/infer.py, add these two parameters to the run() signature with a default of False:
@torch.no_grad()
def run(weights=osp.join(ROOT, 'yolov6s.pt'),
        source=osp.join(ROOT, 'data/images'),
        yaml=None,
        img_size=640,
        conf_thres=0.4,
        iou_thres=0.45,
        max_det=1000,
        device='',
        save_txt=False,
        not_save_img=False,
        save_dir=None,
        view_img=True,
        classes=None,
        agnostic_nms=False,
        project=osp.join(ROOT, 'runs/inference'),
        name='exp',
        hide_labels=False,
        hide_conf=False,
        half=False,
        webcam=False,
        webcam_addr=False,
        ):
Click Run again and the next error appears:
File "D:\DBSY\Efficient-Computing-master\Detection\Gold-YOLO\yolov6\core\inferer.py", line 260, in font_check assert osp.exists(font), f'font path not exists: {font}' AssertionError: font path not exists: ./yolov6/utils/Arial.ttf
Follow the error path, again in yolov6/core/inferer.py; the relative font path cannot be resolved:
def font_check(font='./yolov6/utils/Arial.ttf', size=10):
Write this path as an absolute path.
def font_check(font=r'D:\DBSY\Efficient-Computing-master\Detection\Gold-YOLO\yolov6\utils/Arial.ttf', size=10):
03 Inference
Run infer.py directly in PyCharm, or from the terminal as shown below.
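A rough command-line equivalent, assuming the paths used above (the image folder is a placeholder; adjust everything to your setup):
python tools/infer.py --weights weights/best_ckpt.pt --source path/to/your/images --yaml data/pole.yaml --save-txt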