Contents
1. Introduction
2. PASCAL VOC dataset: XML -> YOLO txt format
2.1 Path setting
2.2 Function to read the XML file
2.3 XML -> YOLO txt
2.4 YOLO's label file
2.6 Results
2.7 Code
3. Custom YOLO dataset
3.1 Preparations
3.2 Open labelimg
3.3 Drawing boxes
The code is based on the Bilibili uploader's tutorial: 3.2 YOLOv3 SPP source code analysis (PyTorch version)
Link to PASCAL VOC dataset: http://host.robots.ox.ac.uk/pascal/VOC/voc2012/
The converted YOLO-format dataset is split into two parts, since a single archive would be too large to upload:
Training set: YOLO-format training set for PASCAL VOC object detection
Validation set: YOLO-format validation set for PASCAL VOC object detection
1. Introduction
The label files for object detection differ from those for classification and segmentation. Generally speaking, in classification tasks, images of the same category are placed in the same directory, and the directory name identifies the category. In segmentation tasks, each training image corresponds to a mask image, i.e. the input is an image and the label is also an image.
An object-detection label carries two kinds of information. One is the category of the target to be detected, such as cat or dog. The other is the position of the target, marked with a bounding box, usually a rectangle given by xmin, xmax, ymin, ymax.
Usually, object-detection labels are stored in XML files.
For example, in the image below there are two categories, horse and person, and the four parameters under each category node are the bounding-box information.
However, such XML files do not match the format the YOLO algorithm expects, so an XML-to-YOLO conversion is required.
In the converted label below, 12 is the detection category index, and the next four parameters are the x, y, w, h of the bounding box.
YOLO's bounding box is given by its center coordinates and its width and height w, h, all relative to the size of the whole image.
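As a quick worked example (the numbers are the horse box from the sample annotation later in this post, in a 500x442 image; the helper function name is mine), the conversion from corner coordinates to YOLO's relative center format looks like this:

```python
def voc_box_to_yolo(xmin, ymin, xmax, ymax, img_w, img_h):
    """Convert a (xmin, ymin, xmax, ymax) corner box to YOLO's
    relative (xcenter, ycenter, w, h), rounded to 6 decimal places."""
    xcenter = (xmin + xmax) / 2 / img_w  # box center x, relative to image width
    ycenter = (ymin + ymax) / 2 / img_h  # box center y, relative to image height
    w = (xmax - xmin) / img_w            # box width, relative to image width
    h = (ymax - ymin) / img_h            # box height, relative to image height
    return round(xcenter, 6), round(ycenter, 6), round(w, 6), round(h, 6)

# horse box from the sample xml: (53, 87, 471, 420) in a 500x442 image
print(voc_box_to_yolo(53, 87, 471, 420, 500, 442))
# -> (0.524, 0.573529, 0.836, 0.753394)
```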
2. PASCAL VOC dataset: XML -> YOLO txt format
This chapter only covers the data conversion.
Initially, my_yolo_dataset and my_data_label.names do not exist; they are generated by trans_voc2yolo.py, which converts the data in VOCdevkit.
2.1 Path Setting
The VOC dataset is split by task; here only the object-detection part is used:
- Annotations holds the XML label files for object detection
- train.txt and val.txt hold the file names of the training and validation sets
- JPEGImages holds all VOC images
2.2 Function to read xml file
The code is implemented recursively; I don't fully understand it, but knowing how to use it is enough.
Reading one XML file returns its information as a dictionary:
{'annotation': {'folder': 'VOC2012', 'filename': '2008_000008.jpg', 'source': {'database': 'The VOC2008 Database', 'annotation': 'PASCAL VOC2008', 'image': 'flickr'}, 'size': {'width': '500', 'height': '442', 'depth': '3'}, 'segmented': '0', 'object': [{'name': 'horse', 'pose': 'Left', 'truncated': '0', 'occluded': '1', 'bndbox': {'xmin': '53', 'ymin': '87', 'xmax': '471', 'ymax': '420'}, 'difficult': '0'}, {'name': 'person', 'pose': 'Unspecified', 'truncated': '1', 'occluded': '0', 'bndbox': {'xmin': '158', 'ymin': '44', 'xmax': '289', 'ymax': '167'}, 'difficult': '0'}]}}
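To see what the recursion produces, here is a self-contained sketch on a toy annotation (shortened to two fields; I use the standard library's xml.etree.ElementTree here instead of lxml, since the recursive logic only relies on len(), .tag, .text, and iteration, which both libraries share):

```python
import xml.etree.ElementTree as ET

def parse_xml_to_dict(xml):
    if len(xml) == 0:  # leaf node: return its tag and text directly
        return {xml.tag: xml.text}
    result = {}
    for child in xml:
        child_result = parse_xml_to_dict(child)  # recurse into each child tag
        if child.tag != 'object':
            result[child.tag] = child_result[child.tag]
        else:
            # there may be several objects, so collect them in a list
            result.setdefault(child.tag, []).append(child_result[child.tag])
    return {xml.tag: result}

xml_str = """<annotation>
  <filename>2008_000008.jpg</filename>
  <object><name>horse</name></object>
  <object><name>person</name></object>
</annotation>"""
data = parse_xml_to_dict(ET.fromstring(xml_str))["annotation"]
print(data)
# -> {'filename': '2008_000008.jpg', 'object': [{'name': 'horse'}, {'name': 'person'}]}
```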
2.3 XML -> YOLO txt
Next, traverse the bounding boxes stored under the 'object' key.
Note that index starts from 0; below are the values of index and obj for the first object.
Finally, convert each bounding box to center coordinates plus width and height, and normalize them to values relative to the whole image.
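Combining the traversal and the conversion, a minimal sketch (boxes taken from the dictionary above; class_dict is an assumed two-entry excerpt of pascal_voc_classes.json, whose ids start from 1) produces the label lines like this:

```python
# boxes from the sample annotation: (name, (xmin, ymin, xmax, ymax))
objects = [("horse", (53, 87, 471, 420)), ("person", (158, 44, 289, 167))]
class_dict = {"horse": 13, "person": 15}  # assumed excerpt; ids in the json are 1-based
img_w, img_h = 500, 442

lines = []
for index, (name, (xmin, ymin, xmax, ymax)) in enumerate(objects):
    class_index = class_dict[name] - 1  # YOLO target ids start from 0
    xc = round((xmin + xmax) / 2 / img_w, 6)  # relative center x
    yc = round((ymin + ymax) / 2 / img_h, 6)  # relative center y
    w = round((xmax - xmin) / img_w, 6)       # relative width
    h = round((ymax - ymin) / img_h, 6)       # relative height
    lines.append(" ".join(str(v) for v in (class_index, xc, yc, w, h)))

print("\n".join(lines))
# 12 0.524 0.573529 0.836 0.753394
# 14 0.447 0.238688 0.262 0.278281
```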
2.4 YOLO's label file
The implementation code is as follows:
This part is also very simple: take the keys of the VOC class dictionary and write them out.
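A minimal sketch of this step (the three class names are an illustrative subset of the twenty VOC classes):

```python
class_dict = {"aeroplane": 1, "bicycle": 2, "bird": 3}  # illustrative subset

# write one class name per line, with no newline after the last one
with open("my_data_label.names", "w") as f:
    f.write("\n".join(class_dict.keys()))

with open("my_data_label.names") as f:
    print(f.read())  # aeroplane, bicycle, bird on separate lines
```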
2.6 Results
The operation process is as follows
The generated yolo dataset directory is as follows:
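From the paths the script creates (save_file_root/train and save_file_root/val, each with images and labels subfolders), the generated directory has this shape:

```
my_yolo_dataset/
├── train/
│   ├── images/   # training images copied from JPEGImages
│   └── labels/   # one .txt label file per image
└── val/
    ├── images/
    └── labels/
```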
YOLO's label information:
2.7 Code
The full conversion code is as follows:
""" This script has two functions: 1. Convert the voc dataset annotation information (.xml) to yolo annotation format (.txt), and copy the image file to the corresponding folder 2. According to the json label file, generate the corresponding names label (my_data_label.names) """ import os from tqdm import tqdm from lxml import etree import json import shut-off # Read the xml file information and return it in dictionary form def parse_xml_to_dict(xml): """ Parse the xml file into a dictionary, refer to recursive_parse_xml_to_dict of tensorflow Args: xml: xml tree obtained by parsing XML file contents using lxml.etree Returns: Python dictionary holding XML contents. """ if len(xml) == 0: # Traverse to the bottom layer and directly return the information corresponding to the tag return {xml. tag: xml. text} result = {} for child in xml: child_result = parse_xml_to_dict(child) # Recursively traverse label information if child.tag != 'object': result[child.tag] = child_result[child.tag] else: if child.tag not in result: # Because there may be multiple objects, they need to be put in the list result[child. tag] = [] result[child.tag].append(child_result[child.tag]) return {xml. 
tag: result} # Convert the xml file to yolo's txt file def translate_info(file_names: list, save_root: str, class_dict: dict, train_val='train'): """ :param file_names: path of all training set/validation set images :param save_root: corresponding yolo file with save :param class_dict: json label of voc data :param train_val: Determine whether the input is a training set or a verification set """ save_txt_path = os.path.join(save_root, train_val, "labels") # save yolo's txt label file if os.path.exists(save_txt_path) is False: os.makedirs(save_txt_path) save_images_path = os.path.join(save_root, train_val, "images") # save the training image file of yolo if os.path.exists(save_images_path) is False: os.makedirs(save_images_path) for file in tqdm(file_names, desc="translate {} file...".format(train_val)): # Check if the image file exists img_path = os.path.join(voc_images_path, file + ".jpg") assert os.path.exists(img_path), "file:{} not exist...".format(img_path) # Check if the xml file exists xml_path = os.path.join(voc_xml_path, file + ".xml") assert os.path.exists(xml_path), "file:{} not exist...".format(xml_path) # read xml with open(xml_path) as fid: xml_str = fid. read() xml = etree. fromstring(xml_str) data = parse_xml_to_dict(xml)["annotation"] # read xml file information img_height = int(data["size"]["height"]) # read in the h of the image img_width = int(data["size"]["width"]) # read in the w of the image # Determine whether the xml has ground truth assert "object" in data.keys(), "file: '{}' lack of object key.".format(xml_path) if len(data["object"]) == 0: # If there is no target in the xml file, return the image path and ignore the sample print("Warning: in '{}' xml, there are no objects.".format(xml_path)) continue # Create a new yolo txt annotation file corresponding to xml, and write it with open(os. path. 
join(save_txt_path, file + ".txt"), "w") as f: for index, obj in enumerate(data["object"]): # index is the index starting from 0, obj is the dictionary file of object # Get the box information of each object xmin = float(obj["bndbox"]["xmin"]) xmax = float(obj["bndbox"]["xmax"]) ymin = float(obj["bndbox"]["ymin"]) ymax = float(obj["bndbox"]["ymax"]) class_name = obj["name"] # Get the classification of the bounding box class_index = class_dict[class_name] - 1 # target id starts from 0 # Further check the data, some label information may have w or h as 0, such data will cause the calculation regression loss to be nan if xmax <= xmin or ymax <= ymin: print("Warning: in '{}' xml, there are some bbox w/h <=0".format(xml_path)) continue # Convert box information to yolo format xcenter = xmin + (xmax - xmin) / 2 # center point coordinates ycenter = ymin + (ymax - ymin) / 2 w = xmax - xmin # w and h of the bounding box h = ymax - ymin # Convert absolute coordinates to relative coordinates, save 6 decimal places xcenter = round(xcenter / img_width, 6) ycenter = round(ycenter / img_height, 6) w = round(w / img_width, 6) h = round(h / img_height, 6) info = [str(i) for i in [class_index, xcenter, ycenter, w, h]] if index == 0: f.write(" ".join(info)) else: # automatic line break f.write("\\ " + " ".join(info)) # Copy the image to the corresponding set path_copy_to = os.path.join(save_images_path, img_path.split(os.sep)[-1]) if os.path.exists(path_copy_to) is False: shutil. copyfile(img_path, path_copy_to) # Create a label file for yolo def create_class_names(class_dict: dict): keys = class_dict.keys() with open("./data/my_data_label.names", "w") as w: for index, k in enumerate(keys): if index + 1 == len(keys): w. write(k) else: w.write(k + "\\ ") def main(): # Read the json label file of the original voc data json_file = open(label_json_path, 'r') class_dict = json. 
load(json_file) # Read all the line information in the training set path file train.txt of the voc dataset, and delete the blank line with open(train_txt_path, "r") as r: train_file_names = [i for i in r.read().splitlines() if len(i.strip()) > 0] # voc information to yolo, and copy the image file to the corresponding folder translate_info(train_file_names, save_file_root, class_dict, "train") # Read all the line information in the voc dataset path file val.txt and delete the blank lines with open(val_txt_path, "r") as r: val_file_names = [i for i in r.read().splitlines() if len(i.strip()) > 0] # voc information to yolo, and copy the image file to the corresponding folder translate_info(val_file_names, save_file_root, class_dict, "val") # Create my_data_label.names file create_class_names(class_dict) if __name__ == "__main__": # voc dataset root directory and version voc_root = "VOCdevkit" voc_version = "VOC2012" # Converted training set and validation set correspond to txt files train_txt = "train.txt" val_txt = "val.txt" # Converted file save directory, yolo format save_file_root = "./my_yolo_dataset" if os.path.exists(save_file_root) is False: os.makedirs(save_file_root) # The label tag corresponds to the json file label_json_path = './data/pascal_voc_classes.json' voc_images_path = os.path.join(voc_root, voc_version, "JPEGImages") # voc training image path voc_xml_path = os.path.join(voc_root, voc_version, "Annotations") # xml tag file path of voc train_txt_path = os.path.join(voc_root, voc_version, "ImageSets", "Main", train_txt) # voc training set path file val_txt_path = os.path.join(voc_root, voc_version, "ImageSets", "Main", val_txt) # voc validation set path file # Check if the file/folder exists assert os.path.exists(voc_images_path), "VOC images path not exist..." assert os.path.exists(voc_xml_path), "VOC xml path not exist..." assert os.path.exists(train_txt_path), "VOC train txt file not exist..." 
assert os.path.exists(val_txt_path), "VOC val txt file not exist..." assert os.path.exists(label_json_path), "label_json_path does not exist..." # start conversion main()
3. Custom YOLO dataset
Labelimg is used here; install it as follows:
pip install labelimg
Then type labelimg in the terminal to launch it. The interface looks like this:
3.1 Preparations
Create a new demo folder and place these three items in it:
- annotation: the folder where the YOLO bounding-box files will be saved
- img: the folder of images
- labels.txt: the label file
The labels are stored in labels.txt, one class name per line:
3.2 Open labelimg
Open a terminal in the demo folder; the first argument is the image folder and the second is the path to the labels file.
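With the demo layout above, the launch command looks like this (labelImg's positional arguments are the image directory and the predefined-classes file):

```shell
cd demo
# first argument: image folder; second argument: class-names file
labelimg img labels.txt
```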
3.3 Drawing
After opening, it will look like this. First change the save format to YOLO, then select the annotation folder via Change Save Dir.
On the right is the file list of the img folder, which holds two images here.
When drawing a box, just select the corresponding category.
The final result looks like this: