[YOLO & Pytorch] YOLO series project theory based on Pytorch starting from scratch + practical deployment 03 Annotating a video data set with the VoTT annotation tool + the whole YOLOX model deployment and training process (all pitfalls encountered and successfully resolved)

Annotating data sets with the VoTT annotation tool, and YOLOX model deployment and training

  • Download and install VoTT
    • Download and install
    • Use
  • Training with the YOLOX model
    • Download and configure the YOLOX model
    • Train and test with your own data set
      • Training preparation: put our own data set in place
      • Training preparation: modify training configuration parameters
      • Start training
      • Testing

The first two articles in the series:


[YOLO & Pytorch] YOLO series project theory based on Pytorch starting from scratch + practical deployment 01 Configuration and installation of pytorch environment


[YOLO & Pytorch] YOLO series project theory based on Pytorch from scratch + practical deployment 02 Editor configuration

This article is a complete walkthrough of annotating your own video data set with the VoTT tool and then deploying the YOLOX model locally for training and testing. I document every pitfall I hit along the way, together with how I solved it. I hope it helps you get your own project deployed end to end.

Download and install VoTT

VoTT is an open source data annotation tool. Data annotation tools in general cover four kinds of data: images, video, text, and audio, and there are several tools available in each category.

This article introduces the installation and use of VoTT.

Download and install

Click the VoTT link (the GitHub repository, https://github.com/microsoft/VoTT) to open the page below.



The page contains an introduction and a usage tutorial for VoTT. You don’t need to read the tutorial carefully; this article explains everything.

Click Releases on the page above, then download the installer.

The downloaded .exe installer can be run directly. I put it in the Program Files folder on my D drive.

Use

First, create two new folders under the project folder:
one named source for the source files, and one named target for the processed output.

The VoTT interface looks like this:

We click the New Project icon on the left.

Display name: name it whatever you like.
Security Token: keep the default.
For Source Connection, click Add Connection next to it.



Click Save Connection

Configure the Target Connection in the same way.

After configuring, don’t forget to select the new connections in the drop-down boxes.

The Video Settings below set the FPS of the video, i.e., how many frames per second to extract. For smooth playback a video usually runs at a minimum of 30 frames per second, so at 30 FPS the tool would cut each second of video into 30 images for us to annotate one by one. In that case, once the video is at all long, the annotation workload becomes very heavy. For now we keep the default of 15.

As for the Tags below, we can add them now or add them later; we will add them later.

Click Save project
Let’s enter our project, but there is a problem here:
the prompt Unable to load assets means the resources cannot be loaded.
I searched around, and many posts said it was a video format issue: VoTT cannot read videos that are not in MP4 format. But my videos were clearly .mp4. I downloaded the Format Factory conversion software intending to give it a try.

It turned out my original video file used the MPEG-4 video codec; after converting it to H.264, everything worked!
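If you prefer the command line to a GUI converter, ffmpeg can do the same re-encode (the file names here are placeholders):

# re-encode the video stream to H.264, copy the audio stream unchanged
ffmpeg -i input_mpeg4.mp4 -c:v libx264 -c:a copy output_h264.mp4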
One more thing to watch out for: the source and target folder paths must contain no spaces or Chinese characters.

You can now see our two videos in the upper left corner. An orange label marks the video currently open and playing, and a green label marks a video that has already been tagged. Our first video has not been tagged yet, so we click it in the upper left corner to start annotating it.

When you click it, it looks like this:

First look at the control I’ve marked 1: click the play icon and the video starts playing.
Area 2 has two arrows, and behind them you can see the yellow vertical tick marks on the progress bar. Remember the FPS we set when creating the project? The video is split into that many frames per second, so the total number of frames to annotate is the video length in seconds times the FPS (for example, a 100-second clip at 15 FPS gives 1500 frames), and they are annotated frame by frame.
Then look at 3: the Tags counter on the left shows how many frames of the current video have been tagged, and the eye-shaped Visited counter on the right shows how many frames have been visited (played through).

Click the plus sign in the upper right corner to add a tag name.
We added car.

Click the rectangular box tool above to draw annotation boxes on each frame of the video.

Select the object to label in the middle pane by dragging the bounding box over it, then click car on the right to apply the tag, and click the right arrow below to move to the next frame. There is also a shortcut: after selecting a target, press Ctrl + 1, where the number is the tag’s index. Since car is our first tag, Ctrl + 1 tags the selected target as car. The keyboard’s left and right arrow keys move to the previous and next frame.

We have labeled 1500 frames.
Next, let’s export the annotation data.

Before exporting the project, first set the export options


The first setting is the export format. Usually I choose JSON, but because we will use the YOLOX model for training next, I choose to export in Pascal VOC format.

Then, for which assets to export, we select Only tagged assets, i.e., only the frames we marked.
The split ratio between the training data set and the test data set is generated automatically below.

Then click this button to export our annotated data set.

It shows that the export is in progress; let’s go to the target folder and take a look.

You can see that a folder named “<VoTT project name>-PascalVOC-export” has been generated.


Then let’s go into the folder and take a look


Friends who have used the Pascal VOC data set will find this folder directory very familiar.
The export takes a fairly long time; we can do something else while we wait.
OK, everything is exported. Let’s take a look at each folder.
The Annotations folder holds one XML file per image:
this is the annotation data saved in XML form.
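For reference, each exported annotation follows the standard Pascal VOC layout; a representative file (with a hypothetical file name and coordinates) looks roughly like this:

<annotation>
    <folder>JPEGImages</folder>
    <filename>frame_0001.jpg</filename>
    <size>
        <width>1280</width>
        <height>720</height>
        <depth>3</depth>
    </size>
    <object>
        <name>car</name>
        <bndbox>
            <xmin>412.3</xmin>
            <ymin>198.7</ymin>
            <xmax>655.0</xmax>
            <ymax>334.2</ymax>
        </bndbox>
    </object>
</annotation>

Note the floating-point box coordinates: VoTT can emit these, and they will come back to bite us during evaluation later.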

The ImageSets folder splits the data set into a training set and a validation set, hence the generated train.txt and val.txt.
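Each of these .txt files simply lists image IDs (file names without extensions), one per line, for example (hypothetical names):

frame_0001
frame_0002
frame_0003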


Then there is the JPEGImages folder, which holds the data set images, i.e., the extracted frames.

Training with the YOLOX model

Download and configure the YOLOX model

Search for YOLOX on GitHub and open the project with the most stars (Megvii-BaseDetection/YOLOX).
Click to download the project.

Then put the project folder into the PyCharm project path.

Create a new virtual environment:

conda create --prefix=D:\Anaconda3\envs\YOLOX python=3.8

Then activate this environment

conda activate D:\Anaconda3\envs\YOLOX

Open requirements.txt in VS Code to see the environment requirements, then install them with a mirror:

pip install -r requirements.txt -i https://pypi.douban.com/simple

Installation complete.

Then set this environment as the project interpreter, and try running the demo (-c points at the pre-trained yolox_s.pth weights, which I downloaded into a weights folder):

python tools/demo.py image -f exps/default/yolox_s.py -c ./weights/yolox_s.pth --path assets/dog.jpg --conf 0.3 --nms 0.65 --tsize 640 --save_result --device gpu

This failed because demo.py could not find the yolox package, so add the following to the top of tools/demo.py to put the project root on sys.path:

import sys
import os

# put the project root on sys.path so that `import yolox` resolves
curPath = os.path.abspath(os.path.dirname(__file__))
rootPath = os.path.split(curPath)[0]
sys.path.append(rootPath)

Then run it again from the tools directory:

>python demo.py image -f exps/default/yolox_s.py -c ./weights/yolox_s.pth --path assets/dog.jpg --conf 0.3 --nms 0.65 --tsize 640 --save_result --device gpu

An error was reported, apparently related to Pillow. Check the installed Pillow version:

pip show Pillow


Then I wondered whether it was because matplotlib was not installed:

pip install matplotlib -i https://pypi.tuna.tsinghua.edu.cn/simple

I installed it, but it still didn’t work.
I closed PyCharm, restarted it, and reinstalled Pillow with a pinned older version:

>pip install Pillow==9.3.0 -i https://pypi.tuna.tsinghua.edu.cn/simple

Run the command again:

python demo.py image -f exps/default/yolox_s.py -c ./weights/yolox_s.pth --path assets/dog.jpg --conf 0.3 --nms 0.65 --tsize 640 --save_result --device gpu

It works this time.
Predictions for the sample image are generated under YOLOX_outputs.

Train and test with your own data set

Place the files from the target folder generated by VoTT into the YOLOX project.

Training preparation: put our own data set in place

Training preparation: modify training configuration parameters

1. Modify the labels and the number of classes
Note the path.
My path is: YOLOX-main/yolox/data/datasets/voc_classes.py

Modify the class labels here.
We have only one class, car,
so change the file to the following form, being careful not to forget the comma after car.
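For reference, the whole single-class voc_classes.py boils down to this (the trailing comma keeps the one-element tuple a tuple):

VOC_CLASSES = (
    "car",
)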


Next go to this path: YOLOX-main/exps/example/yolox_voc/yolox_voc_s.py

Change self.num_classes under the Exp class to 1.
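For orientation, a sketch of the relevant part of yolox_voc_s.py after the change (the surrounding lines are assumed from the stock file):

from yolox.exp import Exp as MyExp

class Exp(MyExp):
    def __init__(self):
        super(Exp, self).__init__()
        self.num_classes = 1  # was 20, the standard VOC class count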

Then open YOLOX-main/yolox/exp/yolox_base.py

and likewise change self.num_classes = 80 to 1.
2. Modify the training set information
In YOLOX-main/exps/example/yolox_voc/yolox_voc_s.py:

Change data_dir to the path where the data set is placed,
and change max_labels below it to 1.

The other change is where the txt file is read: in YOLOX-main/yolox/data/datasets/voc.py,
find line 127 and modify it accordingly.

3. Modify the validation set information
In YOLOX-main/exps/example/yolox_voc/yolox_voc_s.py,
starting at line 44, make the corresponding changes.

4. Modify the network parameters
In YOLOX-main/yolox/exp/yolox_base.py, line 24:

self.depth = 0.33
self.width = 0.50

(0.33 and 0.50 are the depth and width multipliers of YOLOX-s, matching the yolox_s.pth weights we load.)

5. Modify other parameters
In the YOLOX-main/yolox/data/datasets/voc.py file, line 244, make the corresponding change;
the evaluation results will then be written to a results folder under the datasets folder.

In the YOLOX-main/yolox/exp/yolox_base.py file, lines 92 and 95,
set the interval to one epoch, i.e., run validation on the validation set once per epoch.
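Concretely, the intent is roughly this inside Exp.__init__ (attribute names taken from the stock yolox_base.py; which attribute each of lines 92 and 95 refers to is my assumption):

self.print_interval = 1  # assumption: log every iteration
self.eval_interval = 1   # run validation once per epoch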

6. Modify the evaluation-related information
In YOLOX-main/yolox/data/datasets/voc.py,
change lines 278, 279, and 283 accordingly, and then line 289 as well.

After these modifications, execute python setup.py install to reinstall the package so the modified code takes effect:

python setup.py install
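As an aside, an editable install avoids having to re-run setup.py after every source edit; this is an alternative I did not use here:

pip install -v -e .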

Start training

The following is the key step. Enter the command in the terminal to start training.

python tools/train.py -f exps/example/yolox_voc/yolox_voc_s.py -d 1 -b 8 -c weights/yolox_s.pth

Let’s go through the arguments one by one:
tools/train.py is the training script.
-f exps/example/yolox_voc/yolox_voc_s.py is the VOC experiment file.
-d 1 is the number of devices (GPUs); I have one, so I write 1.
-b 8 is the batch size.
-c weights/yolox_s.pth is the pre-trained model.

An error was reported:

AttributeError: 'VOCDetection' object has no attribute 'cache'

I reconfigured the environment and project files, and two errors were reported.

Searching online, I found a post describing the fix. Change the command line to:

python train.py -f exps/example/yolox_voc/yolox_voc_s.py -d 1 -b 8 -c weights/yolox_s.pth

This solved the problem: previously I had pasted absolute paths for -f and -c, which was the cause.

Two more errors were reported.

The first error: the data set split txt files had to be regenerated. My script is below for your reference.

import os
import random

# regenerate the VOC split files under ImageSets/Main
trainval_percent = 0.5  # fraction of all samples that goes into trainval
train_percent = 0.9     # fraction of trainval sampled again below
xmlfilepath = 'VOCdevkit/VOC2007/Annotations'
txtsavepath = 'VOCdevkit/VOC2007/ImageSets/Main'
total_xml = os.listdir(xmlfilepath)

num = len(total_xml)
indices = range(num)
tv = int(num * trainval_percent)
tr = int(tv * train_percent)
trainval = random.sample(indices, tv)
train = random.sample(trainval, tr)

ftrainval = open('VOCdevkit/VOC2007/ImageSets/Main/trainval.txt', 'w')
ftest = open('VOCdevkit/VOC2007/ImageSets/Main/test.txt', 'w')
ftrain = open('VOCdevkit/VOC2007/ImageSets/Main/train.txt', 'w')
fval = open('VOCdevkit/VOC2007/ImageSets/Main/val.txt', 'w')

for i in indices:
    name = total_xml[i][:-4] + '\n'  # file name without the .xml extension
    if i in trainval:
        ftrainval.write(name)  # trainval.txt is what training reads
        if i in train:
            ftest.write(name)
        else:
            fval.write(name)   # val.txt is what evaluation reads
    else:
        ftrain.write(name)

ftrainval.close()
ftrain.close()
fval.close()
ftest.close()

The second error!!! Many people have run into this one: go into dataset_wrapper.py and comment out the three lines of the function the traceback points to.

Later, another error was reported.
In YOLOX-main/yolox/data/datasets/voc.py, line 138:

# path_filename = [(self._imgpath % self.ids[i]).split(self.root + "/")[1] for i in range(self.num_imgs)]

Comment it out and change it to:

path_filename = []
for i in range(self.num_imgs):
    path_filename.append((self._imgpath % self.ids[i]).split(self.root + "/")[0])

Then run train.py again, and it’s OK.

Let’s look at what the problem with that last piece of code was:

(self._imgpath % self.ids[i])

This is a Python expression that splices an object attribute value into a string.

It uses old-style string formatting (also called string interpolation) with the % operator: placeholders such as %s or %d inside the template string are replaced by the value supplied on the right-hand side of the %.

Here, self._imgpath is a template string containing a placeholder, and self.ids[i] is the value (an integer or a string) that gets inserted at the placeholder’s position in self._imgpath.

So the whole expression replaces the placeholder (such as %s) in the self._imgpath string with self.ids[i] and yields a new string.

For example, if self._imgpath were 'JPEGImages/%s.jpg' and self.ids[i] were 'image1', then (self._imgpath % self.ids[i]) would return 'JPEGImages/image1.jpg'.
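The same mechanism in a standalone snippet (the template here is hypothetical):

# old-style %-formatting: %s is replaced by the value to the right of %
imgpath = 'JPEGImages/%s.jpg'
print(imgpath % 'image1')  # prints: JPEGImages/image1.jpg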

Then, when validation started in the tenth training epoch, the following errors were reported:

One is AP = 0, and the other is ValueError: invalid literal for int() with base 10: '571.5333215440583'.
I checked online and found that many people have encountered this problem, for example this blog post:
ValueError: invalid literal for int() with base 10: solution

Reason: Python cannot directly convert a string containing a decimal point into an integer, and since the raw data’s format is often inconsistent, the type conversion raises a ValueError.
Solution: first convert the string to a float, then convert the float to an integer.
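Both behaviors in one quick snippet:

int('571.5333215440583')         # raises ValueError
int(float('571.5333215440583'))  # -> 571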

We locate the program YOLOX-main/yolox/evaluators/voc_eval.py:

obj_struct["bbox"] = [
    int(bbox.find("xmin").text),
    int(bbox.find("ymin").text),
    int(bbox.find("xmax").text),
    int(bbox.find("ymax").text),
]

These are obviously the four bounding-box coordinates.
We change them to:

obj_struct["bbox"] = [
            int(float(bbox.find("xmin").text)),
            int(float(bbox.find("ymin").text)),
            int(float(bbox.find("xmax").text)),
            int(float(bbox.find("ymax").text)),
        ]

As for the AP = 0 problem, many people have encountered it too; this article is comprehensive and worth consulting:

https://blog.csdn.net/weixin_42166222/article/details/119637797

The problem evidently lies in our data set configuration, so let’s check item by item.
1. Is VOC2007 under the datasets folder laid out in VOC format, and are there train.txt and test.txt files under ImageSets/Main? We have them.

2. Does yolox/data/datasets/voc_classes.py contain our own class name? We have it: car.

3. yolox/exp/yolox_base.py has self.num_classes = 1; this is also correct.

4. The _do_python_eval method under yolox/data/datasets/voc.py has been modified. Mine is correct here too.

5. exps/example/yolox_voc/yolox_voc_s.py
First, self.num_classes = 1.

Then image_sets under VOCDetection in get_data_loader:
image_sets=[('2007', 'trainval')],

And this is where I finally found the problem!!!
The image_sets under VOCDetection in get_eval_loader must NOT be 'test'! You can see mine was wrong.

It must be changed to image_sets=[('2007', 'val')], because the data we validate on is listed in val.txt, not test.txt. This is also why AP stayed at 0.

Then we tried again, and sure enough it worked.

To visualize the training process, enter this in the PyCharm terminal:

tensorboard --logdir=YOLOX-main/YOLOX_outputs/yolox_voc_s/train_log.txt

An error was reported. First I tried using the absolute path:

>tensorboard --logdir=E:\PycharmProjects\YOLOX\YOLOX-main\YOLOX_outputs\yolox_voc_s\train_log.txt

Still no luck.

The problem is that --logdir expects a directory containing the event files, not a log file, so change it to:

tensorboard --logdir=E:\PycharmProjects\YOLOX\YOLOX-main\YOLOX_outputs\yolox_voc_s\

OK, that works.

Then just wait for training to finish.
Under the YOLOX_outputs path you can see the saved checkpoint files, including best_ckpt.pth, which we will use for testing.

Testing

First find YOLOX-main/tools/demo.py
Line 15

from yolox.data.datasets import COCO_CLASSES


Change to:

from yolox.data.datasets import voc_classes

Line 105

class Predictor(object):
    def __init__(
        self,
        model,
        exp,
        cls_names=COCO_CLASSES,
        trt_file=None,
        decoder=None,
        device="cpu",
        fp16=False,
        legacy=False,
    ):

Change to:

cls_names=voc_classes.VOC_CLASSES,

And line 306

predictor = Predictor(
    model, exp, COCO_CLASSES, trt_file, decoder,
    args.device, args.fp16, args.legacy,
)


Change COCO_CLASSES to voc_classes.VOC_CLASSES.

After the modification, it looks like this:

predictor = Predictor(
    model, exp, voc_classes.VOC_CLASSES, trt_file, decoder,
    args.device, args.fp16, args.legacy,
)

Then go to the YOLOX_outputs path, find the trained weights file,

and copy it into the weights folder.


Then copy the demo.py file into the root directory of the YOLOX project, which avoids some path errors.

Check again:
YOLOX-main/exps/example/yolox_voc/yolox_voc_s.py
Is VOCDetection there on line 28? And on lines 43 and 54?

No problem.
Then check

YOLOX-main/yolox/data/datasets/__init__.py

It contains

from .voc import VOCDetection

No problem.

Then YOLOX-main/yolox/evaluators/__init__.py

It contains

from .voc_evaluator import VOCEvaluator

No problem. Add to it:

from .voc_class import VOC_CLASSES

At last!!! We are ready to start testing. There are three methods.
1. Via the command line
As mentioned before, demo.py has been placed in the project root directory.

python demo.py video -f exps/example/yolox_voc/yolox_voc_s.py -c weights/best_ckpt.pth --path assets/bayland_occlusion_plane_gimbal_1_camera_image_raw_compressed_Cut01.mp4 --conf 0.3 --nms 0.65 --tsize 640 --save_result --device gpu

An error was reported.

Go to YOLOX-main/yolox/evaluators/__init__.py and change the newly added import to:

from yolox.data.datasets.voc_classes import VOC_CLASSES