Gold-YOLO: the latest model in the YOLO series

Paper: https://arxiv.org/pdf/2309.11331.pdf

Code: https://github.com/huawei-noah/Efficient-Computing

Table of Contents

01 Paper Introduction
    01 Abstract
02 Model Training Process
    01 Installation Environment
    02 Modify Parameters in train.py
        01 Modify the --data-path parameter
        02 Modify the --conf-file parameter
        03 Other parameter settings
    03 Training
    04 Troubleshooting
03 Model Verification Process
    01 Parameter modification
        01 Modify --data, --weights
        02 --task task mode
        03 Others
    02 Verification
04 Model Inference Process
    01 Parameter modification
        01 Modify --weights, --source, --yaml
        02 --save-txt
        03 Others
    02 Solving an error
    03 Inference


01 Paper Introduction

Gold-YOLO is Huawei's solution, released in September 2023 and accepted at NeurIPS 2023, to the information-transmission problem in FPN that is common across the YOLO series.

01 Abstract

Over the past few years, the YOLO family of models has become the leading approach in real-time object detection. Many studies have pushed the baseline higher by modifying the architecture, augmenting the data, and designing new loss functions. However, previous models still suffer from an information-fusion problem, even though the Feature Pyramid Network (FPN) and Path Aggregation Network (PANet) alleviate it to some extent. This study therefore proposes an advanced gather-and-distribute mechanism (GD mechanism), implemented with convolution and self-attention operations. The newly designed model, called Gold-YOLO, improves multi-scale feature-fusion capability and achieves an ideal balance between latency and accuracy across all model scales. In addition, this work applies MAE-style pre-training to the YOLO series for the first time, allowing YOLO models to benefit from unsupervised pre-training. Gold-YOLO-N achieves an excellent 39.9% AP on the COCO val2017 dataset at 1030 FPS on a T4 GPU, surpassing the previous SOTA model YOLOv6-3.0-N by 2.4% AP at a similar FPS.
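To make the gather-and-distribute idea concrete, here is a minimal PyTorch sketch of the concept (my simplification for illustration, not the paper's actual modules): features from all pyramid levels are resized to a common scale and fused globally, then the fused feature is injected back into every level, so each scale sees information from all the others.

import torch
import torch.nn as nn
import torch.nn.functional as F

class SimpleGatherDistribute(nn.Module):
    # Conceptual sketch only: the paper's GD mechanism uses dedicated
    # align/fuse/inject modules, including a self-attention variant.
    def __init__(self, channels):
        super().__init__()
        total = sum(channels)
        # "gather": all levels are concatenated at one scale and fused globally
        self.fuse = nn.Conv2d(total, total, 3, padding=1)
        # "distribute": the global feature is injected back into each level
        self.inject = nn.ModuleList(nn.Conv2d(total + c, c, 1) for c in channels)

    def forward(self, feats):
        size = feats[1].shape[-2:]  # align everything to the middle scale
        gathered = torch.cat([F.interpolate(f, size=size, mode='bilinear',
                                            align_corners=False) for f in feats], dim=1)
        fused = self.fuse(gathered)
        outs = []
        for f, inj in zip(feats, self.inject):
            g = F.interpolate(fused, size=f.shape[-2:], mode='bilinear',
                              align_corners=False)
            outs.append(inj(torch.cat([f, g], dim=1)))
        return outs

# e.g. three feature maps with 64/128/256 channels at strides 8/16/32
gd = SimpleGatherDistribute([64, 128, 256])
outs = gd([torch.randn(1, 64, 80, 80), torch.randn(1, 128, 40, 40),
           torch.randn(1, 256, 20, 20)])  # outputs keep the input shapes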

02 Model Training Process

Download the code from the repository linked above and open the Gold-YOLO directory in PyCharm.

01 Installation Environment

Type the following in the terminal to install the dependencies:

pip install -r requirements.txt
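As a quick sanity check before training (my addition, not part of the original steps), confirm that PyTorch was installed with CUDA support:

import torch

print(torch.__version__)           # installed PyTorch version
print(torch.cuda.is_available())   # should print True if you plan to train with --device 0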

02 Modify parameters in train.py

Find the get_args_parser function and modify the parameters inside it:

def get_args_parser(add_help=True):
    parser = argparse.ArgumentParser(description='YOLOv6 PyTorch Training', add_help=add_help)

01 Modify the --data-path parameter

Choose your dataset format. Mine is in VOC format, so I copied voc.yaml under the data directory and renamed the copy pole.yaml.

parser.add_argument('--data-path', default='./data/coco.yaml', type=str, help='path of dataset')

Change the line above to:

 parser.add_argument('--data-path', default='./data/pole.yaml', type=str, help='path of dataset')

Open data/pole.yaml:

# Please insure that your custom_dataset are put in same parent dir with YOLOv6_DIR
train: VOCdevkit/voc_07_12/images/train # train images
val: VOCdevkit/voc_07_12/images/val # val images
test: VOCdevkit/voc_07_12/images/val # test images (optional)
# whether it is coco dataset, only coco dataset should be set to True.
is_coco: False
#Classes
nc: 20 # number of classes
names: ['aeroplane', 'bicycle', 'bird', 'boat', 'bottle', 'bus', 'car', 'cat', 'chair', 'cow', 'diningtable', 'dog',
        'horse', 'motorbike', 'person', 'pottedplant', 'sheep', 'sofa', 'train', 'tvmonitor'] # class names

Modify the paths (I used absolute paths; relative paths also work), set nc to your number of classes, and list your class names:

# Please insure that your custom_dataset are put in same parent dir with YOLOv6_DIR
train: D:\DBSY\Efficient-Computing-master\Detection\Gold-YOLO\data\VOCdevkit\images/train # train images
val: D:\DBSY\Efficient-Computing-master\Detection\Gold-YOLO\data\VOCdevkit\images/train # val images
test: D:\DBSY\Efficient-Computing-master\Detection\Gold-YOLO\data\VOCdevkit\images/train # test images (optional)
is_coco: False
nc: 8 # number of classes
names: ['powerdirty', 'powerdirtyl', 'powerdirtys', 'lightdirty', 'lightdirtys', 'powerbreakb', 'powerbreak', 'powerbreakt'] # class names
# 0: powerdirty
# 1: powerdirtyl
# 2: powerdirtys
# 3: lightdirty
# 4: lightdirtys
# 5: powerbreakb
# 6: powerbreak
# 7: powerbreakt

For data placement, I created a VOCdevkit folder under the data path, and the labels are in YOLO txt format.

It contains two subfolders, images and labels, and each of those contains train, val and test subfolders, as sketched below.
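Based on the description above, the expected layout looks like this (a sketch of my setup; note that the stock voc.yaml expects an extra voc_07_12 level, so point the yaml paths at whatever nesting you actually use):

data/VOCdevkit/
├── images/
│   ├── train/   # training images
│   ├── val/
│   └── test/
└── labels/
    ├── train/   # one YOLO-format .txt per image
    ├── val/
    └── test/

Each label file contains one line per object in the form class_id x_center y_center width height, with coordinates normalized to [0, 1].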

02 Modify the --conf-file parameter

This is the model configuration file loaded for training. There are several configs under the configs directory; I chose gold_yolo-m.py.

parser.add_argument('--conf-file', default='./configs/yolov6n.py', type=str, help='experiments description file')

Modify it, again using an absolute path:

 parser.add_argument('--conf-file', default=r'D:\DBSY\Efficient-Computing-master\Detection\Gold-YOLO\configs\gold_yolo-m.py', type=str, help='experiments description file')

03 Other parameter settings

The remaining parameters are largely a matter of preference.

--batch-size depends on your machine; I set it to 4, 8, or 16.

parser.add_argument('--batch-size', default=4, type=int, help='total batch size for all GPUs')

--epochs: I left this unchanged and let it run for 400 epochs to see how it goes.

parser.add_argument('--epochs', default=400, type=int, help='number of total epochs to run')

--workers is the number of data-loading threads; most machines don't have many to spare, so keep it low. I set it to 2.

parser.add_argument('--workers', default=2, type=int, help='number of data loading workers (default: 8)')

--device is the GPU to use; the default '0' means the first GPU.

parser.add_argument('--device', default='0', type=str, help='cuda device, i.e. 0 or 0,1,2,3 or cpu')

--eval-interval is the number of epochs between evaluations; the default evaluates once every 20 epochs.

parser.add_argument('--eval-interval', default=20, type=int, help='evaluate at every interval epochs')
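Equivalently, instead of editing the defaults in train.py, the same settings can be passed on the command line (a sketch assembled from the argparse flags above):

python tools/train.py --data-path ./data/pole.yaml --conf-file ./configs/gold_yolo-m.py --batch-size 4 --epochs 400 --workers 2 --device 0 --eval-interval 20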

03 Training

The path is in tools/train.py. Click Run directly in train.py to start training.

04 Troubleshooting

If the following error occurs, it is because the config is set up for multi-GPU training and a parameter needs to be changed.

File "D:\DBSY\Efficient-Computing-masteryolo\toolstrain.py"line 130, in <module>
main(args)
File"D:\DBSY\Efficient-Computtine 120, in maintrainer.train()File "D:\DBSY\Efficient-Computing-masten DetectionlGold-YOLOlyolov6 corelenginepy", line 109, in trainself.train_in_loop (self.epoch)
File "D:IDBSYIEfficient-Comouting-masteriDetectioniold-YOLO volovo core enaine.py". line 127, in train-in_loo
self.print_details()
File "D:\DBSY\Efficient-Computing-master\Detection\Gold-YOLO yolov6\coreengine.py"line 339,in print_detailsself.mean_loss = (self.mean_loss * self.step + self. loss_items) (self.step + 1)AttributeError:'Trainer' object has no attribute loss_items'

Open whichever config you passed to --conf-file (here configs/gold_yolo-m.py) and change the value of type in norm_cfg: SyncBN only works under multi-GPU distributed training, so switch it to BN.

norm_cfg=dict(type='SyncBN', requires_grad=True),  # before: multi-GPU only
norm_cfg=dict(type='BN', requires_grad=True),      # after: works on a single GPU

03 Model Verification Process

The script is at tools/eval.py.

01 Parameter modification

01 Modify --data, --weights

Change --data and --weights to absolute paths.

parser.add_argument('--data', type=str, default=r'D:\DBSY\Efficient-Computing-master\Detection\Gold-YOLO\data\pole.yaml', help='dataset.yaml path')
parser.add_argument('--weights', type=str, default=r'D:\DBSY\Efficient-Computing-master\Detection\Gold-YOLO\weights/best_ckpt.pt', help='model.pt path(s)')

02 --task task mode

Choose among val, test, and speed modes.

 parser.add_argument('--task', default='val', help='val, test, or speed')

03 Others

--batch-size: keep it small; I set it to 4.
--img-size: input image size, default 640.
--conf-thres: confidence threshold; I kept the default.
--iou-thres: IoU (intersection-over-union) threshold; I kept the default.
--device: default 0.
--half: half precision; the default is full 32-bit precision, and FP16 may be slightly less accurate.
--save_dir: save directory; keep the default or set the absolute path you want.
--name: name of the subfolder the results are saved under.
--test_load_size: the size at which images are loaded during testing.
--letterbox_return_int: return integer box offsets from letterboxing.
--scale_exact: rescale coordinates using the exact scale; I kept the default.
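As with training, these can also be passed on the command line instead of editing the defaults (a sketch using the flags above):

python tools/eval.py --data data/pole.yaml --weights weights/best_ckpt.pt --task val --batch-size 4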

02 Verification

Run eval.py directly.

04 Model Inference Process

The script is at tools/infer.py.

01 Parameter modification

01 Modify --weights, --source, --yaml

These are the model weights file, the path to the images to run inference on, and the dataset configuration file. Set all of them to absolute paths:

 parser.add_argument('--weights', type=str, default=r'D:\DBSY\Efficient-Computing-master\Detection\Gold-YOLO\weights/best_ckpt.pt', help='model path(s) for inference.')
    parser.add_argument('--source', type=str, default=r'D:\dataset\Hebei data\10.25\10.23\Minjiang Road-side back\sm_y', help='the source path, e.g. image-file/dir.')
    parser.add_argument('--yaml', type=str, default=r'D:\DBSY\Efficient-Computing-master\Detection\Gold-YOLO\data\pole.yaml', help='data yaml file.')

02 --save-txt

--save-txt makes inference write the predictions to txt files, which can then be evaluated against the labels. Add default=True to the following line:

 parser.add_argument('--save-txt', default=True, action='store_true', help='save results to *.txt.')

03 Others

Most of these were covered above, so I won't repeat them.

--max-det: maximum number of detections per image; I kept the default.
--not-save-img: don't save the visualized inference results (the rendered images).
--view-img: display the predicted image results; it defaults to True, and the windows pop up automatically.
--classes: the classes to predict. If there are negative samples you don't want predicted, set this to only the classes you want.
--hide-conf: hide the confidence score, keeping only the box without the score.
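The equivalent command-line invocation, if you prefer not to edit the defaults (a sketch built from the flags above; substitute your own paths):

python tools/infer.py --weights weights/best_ckpt.pt --source path/to/images --yaml data/pole.yaml --save-txt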

02 Solving an error

File "D:\DBSY\Efficient-Computing-master\Detection\Gold-YOLO\tools\infer.py", line 103, in run
    inferer = Inferer(source, weights, device, yaml, img_size, half)
TypeError: __init__() missing 2 required positional arguments: 'img_size' and 'half'

The error points to the following line of code:

 inferer = Inferer(source, weights, device, yaml, img_size, half)

Look at the parameter definitions of class Inferer in yolov6/core/inferer.py; its __init__ takes eight parameters in total:

class Inferer:
    def __init__(self, source, webcam, webcam_addr, weights, device, yaml, img_size, half):

The call is missing two arguments, webcam and webcam_addr. Add them:

 inferer = Inferer(source, webcam, webcam_addr, weights, device, yaml, img_size, half)

Then, in the run definition in tools/infer.py, give these two parameters a default of False:

@torch.no_grad()
def run(weights=osp.join(ROOT, 'yolov6s.pt'),
        source=osp.join(ROOT, 'data/images'),
        yaml=None,
        img_size=640,
        conf_thres=0.4,
        iou_thres=0.45,
        max_det=1000,
        device='',
        save_txt=False,
        not_save_img=False,
        save_dir=None,
        view_img=True,
        classes=None,
        agnostic_nms=False,
        project=osp.join(ROOT, 'runs/inference'),
        name='exp',
        hide_labels=False,
        hide_conf=False,
        half=False,
        webcam=False,
        webcam_addr=False,
        ):

Click Run again and the next error appears:

File "D:\DBSY\Efficient-Computing-master\Detection\Gold-YOLO\yolov6\core\inferer.py", line 260, in font_check
    assert osp.exists(font), f'font path not exists: {font}'
AssertionError: font path not exists: ./yolov6/utils/Arial.ttf

Follow the error path, which again leads to yolov6/core/inferer.py. The relative font path here cannot be resolved:

 def font_check(font='./yolov6/utils/Arial.ttf', size=10):

Write this path as an absolute path.

 def font_check(font=r'D:\DBSY\Efficient-Computing-master\Detection\Gold-YOLO\yolov6\utils/Arial.ttf', size=10):
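A more portable alternative (my variant, not from the original fix) is to resolve the font relative to inferer.py itself, since that file lives in yolov6/core/ and the font in yolov6/utils/:

import os.path as osp

# inferer.py is in yolov6/core/, so two dirname() calls reach the yolov6/ package root
FONT_PATH = osp.join(osp.dirname(osp.dirname(osp.abspath(__file__))), 'utils', 'Arial.ttf')

def font_check(font=FONT_PATH, size=10):
    ...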

03 Inference

Run infer.py directly.
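By default, the results are saved under runs/inference/exp, per the project and name defaults in the run signature shown earlier.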