Farm newborn piglet monitoring system based on YOLOv5 with ODConv & ConvNeXt

1. Research background and significance


Agriculture is one of the important pillars of human society, and the breeding industry occupies an important position within it. With population growth and economic development, the demand for agricultural products keeps increasing. However, farm management faces many challenges, one of which is how to effectively monitor and manage the health and growth of farmed animals.

In the breeding industry, pigs are among the most important farmed animals, and the health and growth of piglets directly affect the efficiency and output of a farm. Therefore, developing an efficient and accurate piglet birth monitoring system is of great significance for improving the management level and economic benefits of the breeding industry.

Traditionally, piglet births are discovered through manual inspection and observation, which has several problems. First, manual inspection requires substantial human resources and time, making it costly. Second, manual inspection is prone to omissions and misjudgments and cannot accurately monitor piglet health and growth in real time. In addition, traditional monitoring lacks in-depth analysis of piglet behavior and activity and cannot provide valuable data and information.

Therefore, piglet monitoring systems based on deep learning and computer vision technology have become a research hotspot. Among them, YOLOv5 is a fast and accurate deep learning model for object detection. ODConv (Omni-Dimensional Dynamic Convolution) and ConvNeXt are architectural modules that can be integrated into YOLOv5 to further improve its performance.

This study aims to develop a newborn piglet monitoring system based on a YOLOv5 model enhanced with ODConv and ConvNeXt, to achieve real-time and accurate monitoring of piglet health and growth. Specifically, this research will achieve its goals through the following aspects:

  1. Data collection and preprocessing: Use cameras or sensors and other equipment to collect images and video data of piglets, and perform preprocessing, including image denoising, image enhancement, etc.

  2. Target detection and tracking: A YOLOv5 model based on ODConv and ConvNeXt is used to detect and track piglets in the image and video data, enabling real-time monitoring and localization of the piglets (a minimal sketch of this step follows the list).

  3. Behavior analysis and early warning: By analyzing the behavior and activities of piglets, valuable data and information can be extracted, such as activity frequency and feeding behavior, and early warnings and predictions about piglet health and growth can be issued based on preset rules and models.

  4. Data visualization and management: Visually display the monitored data and information, and establish a database and management system so that farm managers can query and analyze the data, supporting more scientific and accurate decision-making.
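The following is a minimal sketch of steps 1 and 2, assuming a YOLOv5 model trained to detect a 'piglet' class; the weights path 'best.pt' and the camera index are placeholders, and the model is loaded through the public Ultralytics torch.hub interface:

import cv2
import torch

# Load a custom-trained YOLOv5 model (weights path is a placeholder).
model = torch.hub.load('ultralytics/yolov5', 'custom', path='best.pt')
model.conf = 0.4  # confidence threshold, set here for noisy barn footage

cap = cv2.VideoCapture(0)  # or the RTSP URL of a farrowing-pen camera
while cap.isOpened():
    ok, frame = cap.read()
    if not ok:
        break
    rgb = cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)  # hub models expect RGB input
    results = model(rgb)
    detections = results.pandas().xyxy[0]  # one row per detected object
    piglets = detections[detections['name'] == 'piglet']
    print(f'{len(piglets)} piglet(s) in frame')  # counts feed steps 3 and 4
cap.release()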

By developing a farm newborn piglet monitoring system based on YOLOv5 with ODConv and ConvNeXt, real-time and accurate monitoring of piglet births can be achieved, improving the management level and economic benefits of the breeding industry. At the same time, the system can provide valuable data and information to farm managers, helping them make scientific decisions and adjust breeding strategies. In addition, the system has potential for wider application and can serve as a reference for the monitoring and management of other farmed animals.

2. Picture demonstration

2.png

3.png

4.png

3. Video demonstration

Farm newborn piglet monitoring system based on YOLOv5 with ODConv & ConvNeXt (Bilibili)

4. Introduction to YOLOv5 algorithm

YOLOv5 is a deep learning model for object detection. It has made significant breakthroughs in the field of computer vision and is widely used in various application scenarios. The model is known for its excellent performance and efficient inference speed, and it is particularly suitable for tasks requiring real-time object detection. The following discusses the characteristics and network structure of YOLOv5, and its application in the farm newborn piglet monitoring system, in more detail.

Key features of YOLOv5
  1. Target detection capability
    YOLOv5 is able to detect multiple objects in an image or video and provide each object with the location of its bounding box and corresponding class label. This makes it very useful in monitoring systems as it can be used to track piglets and identify their status. By identifying piglets in real time in each video frame, farm managers can take timely steps to ensure their health and growth.

  2. High speed and accuracy
    YOLOv5 stands out for its superior speed and accuracy. Compared with many other object detection models, YOLOv5 can achieve faster inference speed when processing large-scale data sets, which is crucial for real-time monitoring and tracking tasks. Farm environments are often complex, and YOLOv5 can maintain high detection accuracy in this complexity.

  3. Scalability
    The network structure of YOLOv5 is scalable and can adapt to different input sizes and tasks. This means it can be easily adapted to the output of various surveillance cameras or sensors and used in different farm environments. This flexibility makes it a versatile tool suitable for different types of farms.

YOLOv5 network structure

The network structure of YOLOv5 adopts a lightweight convolutional neural network (CNN) architecture whose stacked convolution and pooling layers extract image features for object recognition. In this project, YOLOv5 is augmented with ODConv and ConvNeXt modules, which further optimize the performance of the model. ODConv (Omni-Dimensional Dynamic Convolution) introduces a dynamic convolution structure that learns attention over the spatial, input-channel, output-channel, and kernel-number dimensions of the convolution kernel, helping to improve detection accuracy. ConvNeXt contributes a modernized convolution block (a 7×7 depthwise convolution followed by LayerNorm and an inverted-bottleneck pair of pointwise layers, as implemented in fit.py below) that further enhances the model's representational power.
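To make the ODConv idea concrete, here is a heavily simplified sketch with a single candidate kernel: sigmoid attentions, conditioned on a pooled view of the input, modulate the kernel along its input-channel, output-channel, and spatial dimensions. The class name and layout are ours, not the official implementation; the real ODConv additionally learns an attention over several candidate kernels:

import torch
import torch.nn as nn
import torch.nn.functional as F

class SimpleODConv(nn.Module):
    # Simplified ODConv sketch: one candidate kernel, three of the four
    # attention dimensions (input channels, output channels, spatial).
    def __init__(self, c_in, c_out, k=3, reduction=4):
        super().__init__()
        self.k, self.c_in, self.c_out = k, c_in, c_out
        self.weight = nn.Parameter(torch.randn(c_out, c_in, k, k) * 0.02)
        hidden = max(c_in // reduction, 4)
        self.fc = nn.Sequential(nn.Linear(c_in, hidden), nn.ReLU())
        self.attn_cin = nn.Linear(hidden, c_in)
        self.attn_cout = nn.Linear(hidden, c_out)
        self.attn_spatial = nn.Linear(hidden, k * k)

    def forward(self, x):
        n = x.shape[0]
        ctx = self.fc(x.mean(dim=(2, 3)))                    # pooled context (N, hidden)
        a_cin = self.attn_cin(ctx).sigmoid()                 # (N, c_in)
        a_cout = self.attn_cout(ctx).sigmoid()               # (N, c_out)
        a_sp = self.attn_spatial(ctx).sigmoid().view(n, 1, 1, self.k, self.k)
        w = (self.weight.unsqueeze(0)                        # per-sample kernels
             * a_cout.view(n, -1, 1, 1, 1)
             * a_cin.view(n, 1, -1, 1, 1)
             * a_sp)
        # Grouped-convolution trick: fold the batch into groups so each
        # sample is convolved with its own dynamically weighted kernel.
        out = F.conv2d(x.reshape(1, -1, *x.shape[2:]),
                       w.reshape(-1, self.c_in, self.k, self.k),
                       padding=self.k // 2, groups=n)
        return out.reshape(n, self.c_out, *x.shape[2:])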

The network structure of YOLOv5 also includes an output layer for regressing the target bounding box position and predicting the category. These output layers generate coordinates and category information for each detected target, allowing the system to accurately identify and track piglets.
image.png (YOLOv5 network structure)
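For concreteness, YOLOv5's Detect head decodes the raw regression outputs of each grid cell roughly as follows; this is a sketch, with the function name and argument layout ours rather than the repository's API:

import torch

def decode_yolov5_box(raw, grid_xy, anchor_wh, stride):
    # raw:       tensor (..., 4) of raw (tx, ty, tw, th) network outputs
    # grid_xy:   (cx, cy) cell offsets within the feature map
    # anchor_wh: anchor width/height in pixels for this detection head
    # stride:    downsampling factor of the head (8, 16, or 32)
    p = raw.sigmoid()
    xy = (p[..., 0:2] * 2 - 0.5 + grid_xy) * stride  # box centre in pixels
    wh = (p[..., 2:4] * 2) ** 2 * anchor_wh          # box width and height
    return torch.cat((xy, wh), dim=-1)               # (x, y, w, h)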

As an advanced object detection model, YOLOv5 provides powerful technical support for the farm's newborn piglet monitoring system. Its speed, accuracy, scalability, and improved network structure make it an important innovation in the agricultural sector. By monitoring piglet births in real time, this system is expected to improve the management level and economic benefits of the breeding industry and to provide valuable data and information to farm managers. It also has broad promotion potential: it can be applied to the monitoring and management of other farmed animals and can promote the modernization and sustainable development of agriculture. This research is of great significance to the agricultural field and will bring more technological innovation and efficiency improvements to agriculture.

5. Core code explanation

5.1 fit.py

import torch
import torch.nn as nn
import torch.nn.functional as F

class LayerNorm_s(nn.Module):
    def __init__(self, normalized_shape, eps=1e-6, data_format="channels_last"):
        super().__init__()
        self.weight = nn.Parameter(torch.ones(normalized_shape))
        self.bias = nn.Parameter(torch.zeros(normalized_shape))
        self.eps = eps
        self.data_format = data_format
        if self.data_format not in ["channels_last", "channels_first"]:
            raise NotImplementedError
        self.normalized_shape = (normalized_shape,)

    def forward(self, x):
        if self.data_format == "channels_last":
            return F.layer_norm(x, self.normalized_shape, self.weight, self.bias, self.eps)
        elif self.data_format == "channels_first":
            u = x.mean(1, keepdim=True)
            s = (x - u).pow(2).mean(1, keepdim=True)
            x = (x - u) / torch.sqrt(s + self.eps)
            x = self.weight[:, None, None] * x + self.bias[:, None, None]
            return x

class ConvNextBlock(nn.Module):
    def __init__(self, dim, drop_path=0., layer_scale_init_value=1e-6):
        super().__init__()
        self.dwconv = nn.Conv2d(dim, dim, kernel_size=7, padding=3, groups=dim) # depthwise conv
        self.norm = LayerNorm_s(dim, eps=1e-6)
        self.pwconv1 = nn.Linear(dim, 4 * dim)
        self.act = nn.GELU()
        self.pwconv2 = nn.Linear(4 * dim, dim)
        self.gamma = nn.Parameter(layer_scale_init_value * torch.ones((dim)),
                                  requires_grad=True) if layer_scale_init_value > 0 else None
        self.drop_path = DropPath(drop_path) if drop_path > 0. else nn.Identity()

    def forward(self, x):
        input = x
        x = self.dwconv(x)
        x = x.permute(0, 2, 3, 1) # (N, C, H, W) -> (N, H, W, C)
        x = self.norm(x)
        x = self.pwconv1(x)
        x = self.act(x)
        x = self.pwconv2(x)
        if self.gamma is not None:
            x = self.gamma * x
        x = x.permute(0, 3, 1, 2) # (N, H, W, C) -> (N, C, H, W)

        x = input + self.drop_path(x)
        return x

class DropPath(nn.Module):
    def __init__(self, drop_prob=None):
        super(DropPath, self).__init__()
        self.drop_prob = drop_prob

    def forward(self, x):
        return drop_path_f(x, self.drop_prob, self.training)

def drop_path_f(x, drop_prob: float = 0., training: bool = False):
    if drop_prob == 0. or not training:
        return x
    keep_prob = 1 - drop_prob
    shape = (x.shape[0],) + (1,) * (x.ndim - 1) # work with diff dim tensors, not just 2D ConvNets
    random_tensor = keep_prob + torch.rand(shape, dtype=x.dtype, device=x.device)
    random_tensor.floor_() # binarize
    output = x.div(keep_prob) * random_tensor  # scale kept paths to preserve expectation
    return output

The fit.py file in this project defines several classes, including LayerNorm_s, ConvNextBlock, DropPath and CNeB.

The LayerNorm_s class is a custom normalization layer that inherits from the nn.Module class. The initialization method of this class contains weight parameters and bias parameters, and can specify the normalized dimensions and data format. The forward propagation method uses different normalization methods depending on the data format.

The ConvNextBlock class is a ConvNeXt convolution block, which inherits from the nn.Module class. Its initialization method sets up a depthwise convolution, normalization, pointwise (fully connected) layers, and a GELU activation. In the forward method, the depthwise convolution is applied first, followed by normalization and the two pointwise layers with the activation in between; the result is optionally scaled by a learnable layer-scale parameter, passed through DropPath, and added to the input as a residual connection.

The DropPath class is a Drop Path layer, which inherits from the nn.Module class. The initialization method of this class contains the probability of Drop Path. The forward propagation method performs Drop Path operations based on the probability of Drop Path.

The CNeB class is a CSP ConvNextBlock class, which inherits from the nn.Module class. Its initialization method builds multiple instances of the ConvNextBlock class and accepts the number of input channels, the number of output channels, the number of blocks, whether to use a shortcut connection, the number of groups, and the expansion coefficient. In the forward method, two different convolutions are first applied to the input; one branch is processed through the stacked ConvNextBlock modules, the two branches are concatenated, and a final convolution is applied.

Generally speaking, the fit.py file defines some commonly used neural network layers and modules that can be used to build deep learning models.
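As a quick smoke test of the blocks above (shapes chosen arbitrarily), note that ConvNextBlock is a residual block, so it preserves its input shape; this is what allows it to be dropped into a YOLOv5 backbone stage:

import torch

block = ConvNextBlock(dim=64, drop_path=0.1)
x = torch.randn(2, 64, 80, 80)  # (batch, channels, height, width)
y = block(x)
assert y.shape == x.shape       # residual block keeps the shape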

5.2 models\experimental.py
import math

import numpy as np
import torch
import torch.nn as nn

from models.common import Conv
from utils.downloads import attempt_download

class CrossConv(nn.Module):
    # Cross Convolution Downsample
    def __init__(self, c1, c2, k=3, s=1, g=1, e=1.0, shortcut=False):
        # ch_in, ch_out, kernel, stride, groups, expansion, shortcut
        super().__init__()
        c_ = int(c2 * e) # hidden channels
        self.cv1 = Conv(c1, c_, (1, k), (1, s))
        self.cv2 = Conv(c_, c2, (k, 1), (s, 1), g=g)
        self.add = shortcut and c1 == c2

    def forward(self, x):
        return x + self.cv2(self.cv1(x)) if self.add else self.cv2(self.cv1(x))


class Sum(nn.Module):
    # Weighted sum of 2 or more layers https://arxiv.org/abs/1911.09070
    def __init__(self, n, weight=False): # n: number of inputs
        super().__init__()
        self.weight = weight # apply weights boolean
        self.iter = range(n - 1) # iter object
        if weight:
            self.w = nn.Parameter(-torch.arange(1.0, n) / 2, requires_grad=True) # layer weights

    def forward(self, x):
        y = x[0] # no weight
        if self.weight:
            w = torch.sigmoid(self.w) * 2
            for i in self.iter:
                y = y + x[i + 1] * w[i]
        else:
            for i in self.iter:
                y = y + x[i + 1]
        return y


class MixConv2d(nn.Module):
    # Mixed Depth-wise Conv https://arxiv.org/abs/1907.09595
    def __init__(self, c1, c2, k=(1, 3), s=1, equal_ch=True): # ch_in, ch_out, kernel, stride, ch_strategy
        super().__init__()
        n = len(k) # number of convolutions
        if equal_ch: # equal c_ per group
            i = torch.linspace(0, n - 1E-6, c2).floor() # c2 indices
            c_ = [(i == g).sum() for g in range(n)] # intermediate channels
        else: # equal weight.numel() per group
            b = [c2] + [0] * n
            a = np.eye(n + 1, n, k=-1)
            a -= np.roll(a, 1, axis=1)
            a *= np.array(k) ** 2
            a[0] = 1
            c_ = np.linalg.lstsq(a, b, rcond=None)[0].round() # solve for equal weight indices, ax = b

        self.m = nn.ModuleList(
            [nn.Conv2d(c1, int(c_), k, s, k // 2, groups=math.gcd(c1, int(c_)), bias=False) for k, c_ in zip(k, c_) ])
        self.bn = nn.BatchNorm2d(c2)
        self.act = nn.SiLU()

    def forward(self, x):
        return self.act(self.bn(torch.cat([m(x) for m in self.m], 1)))


class Ensemble(nn.ModuleList):
    # Ensemble of models
    def __init__(self):
        super().__init__()

    def forward(self, x, augment=False, profile=False, visualize=False):
        y = []
        for module in self:
            y.append(module(x, augment, profile, visualize)[0])
        # y = torch.stack(y).max(0)[0] # max ensemble
        # y = torch.stack(y).mean(0) # mean ensemble
        y = torch.cat(y, 1) # nms ensemble
        return y, None # inference, train output


def attempt_load(weights, map_location=None, inplace=True, fuse=True):
    from models.yolo import Detect, Model

    # Loads an ensemble of models weights=[a,b,c] or a single model weights=[a] or weights=a
    model = Ensemble()
    for w in weights if isinstance(weights, list) else [weights]:
        ckpt = torch.load(attempt_download(w), map_location=map_location) # load
        if fuse:
            model.append(ckpt['ema' if ckpt.get('ema') else 'model'].float().fuse().eval()) # FP32 model
        else:
            model.append(ckpt['ema' if ckpt.get('ema') else 'model'].float().eval()) # without layer fuse

    # Compatibility updates
    for m in model.modules():
        if type(m) in [nn.Hardswish, nn.LeakyReLU, nn.ReLU, nn.ReLU6, nn.SiLU, Detect, Model]:
            m.inplace = inplace #pytorch 1.7.0 compatibility
            if type(m) is Detect:
                if not isinstance(m.anchor_grid, list): # new Detect Layer compatibility
                    delattr(m, 'anchor_grid')
                    setattr(m, 'anchor_grid', [torch.zeros(1)] * m.nl)
        elif type(m) is Conv:
            m._non_persistent_buffers_set = set() # pytorch 1.6.0 compatibility

    if len(model) == 1:
        return model[-1] # return model
    else:
        print(f'Ensemble created with {weights}\n')
        for k in ['names']:
            setattr(model, k, getattr(model[-1], k))
        model.stride = model[torch.argmax(torch.tensor([m.stride.max() for m in model])).int()].stride # max stride
        return model # return ensemble

This program file is the experimental module of YOLOv5. It contains some experimental models and features.

The following classes are defined in the file:

  1. CrossConv: cross-convolution downsampling module. It applies a 1×k convolution followed by a k×1 convolution to the input tensor, optionally adding the result back to the input via a shortcut connection (only possible when the input and output channel counts match).

  2. Sum: weighted sum of two or more layers. It takes a list of tensors and sums them, optionally with learnable weights.

  3. MixConv2d: mixed depthwise convolution module. It applies several parallel convolutions with different kernel sizes to the input tensor and concatenates their results along the channel dimension.

  4. Ensemble: model collection module. It is a list of models that passes input tensors to each model in the list and concatenates their outputs together.

In addition, the file also defines an auxiliary function attempt_load for loading model weights.

Overall, this program file contains some experimental models and functions for the development and research of YOLOv5.
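For example (the weight paths are placeholders), attempt_load returns a single model or an NMS ensemble depending on what it is given:

model = attempt_load('best.pt', map_location='cpu')                  # single model
ensemble = attempt_load(['best.pt', 'last.pt'], map_location='cpu')  # NMS ensemble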

6. Overall system structure

Based on the above analysis, this program is a farm newborn piglet monitoring system based on YOLOv5 with ODConv & ConvNeXt. Its overall function is to use the YOLOv5 model to perform object detection on images from the farm in order to monitor and identify newborn piglets.

The following is a summary of the functions of each file:


| File path | Function |
| --- | --- |
| fit.py | Defines custom neural network layers and modules |
| torch_utils.py | PyTorch utility functions and classes |
| train.py | Script for training the model |
| ui.py | PyQt5 interface of the farm newborn piglet monitoring system |
| models\common.py | Common modules of the YOLOv5 model |
| models\experimental.py | Experimental models and functions |
| models\tf.py | TensorFlow implementation of the model |
| models\yolo.py | Implementation of the YOLO model |
| models\__init__.py | Initialization file of the models module |
| utils\activations.py | Activation function implementations |
| utils\augmentations.py | Data augmentation implementations |
| utils\autoanchor.py | Automatic anchor box computation |
| utils\autobatch.py | Automatic batch size selection |
| utils\callbacks.py | Callback implementations |
| utils\datasets.py | Dataset loading and processing |
| utils\downloads.py | Downloading of datasets and weights |
| utils\general.py | General utility functions and classes |
| utils\loss.py | Loss function implementations |
| utils\metrics.py | Evaluation metric implementations |
| utils\plots.py | Plotting functions |
| utils\torch_utils.py | PyTorch utility functions and classes |
| utils\__init__.py | Initialization file of the utils module |
| utils\aws\resume.py | Model resume functionality on AWS |
| utils\aws\__init__.py | Initialization file of the AWS module |
| utils\flask_rest_api\example_request.py | Example request for the Flask REST API |
| utils\flask_rest_api\restapi.py | Flask REST API implementation |
| utils\loggers\__init__.py | Initialization file of the logging module |
| utils\loggers\wandb\log_dataset.py | Dataset logging with WandB |
| utils\loggers\wandb\sweep.py | Hyperparameter sweeps with WandB |
| utils\loggers\wandb\wandb_utils.py | WandB utility functions and classes |
| utils\loggers\wandb\__init__.py | Initialization file of the WandB module |

7. Improvement module

ConvNeXt

Since ViT (Vision Transformer) began to shine in the field of computer vision, more and more researchers have embraced the Transformer. Over the past year, most papers published in the CV field have been Transformer-based, such as Swin Transformer, the ICCV 2021 best paper, while convolutional neural networks have slowly faded from center stage. Will convolutional neural networks be replaced by Transformers? Perhaps in the near future. In January 2022, Facebook AI Research and UC Berkeley published "A ConvNet for the 2020s", which proposed ConvNeXt, a pure convolutional neural network, and benchmarked it against the highly popular Swin Transformer. A series of experimental comparisons showed that, at the same FLOPs, ConvNeXt achieves faster inference speed and higher accuracy than Swin Transformer; ConvNeXt-XL reaches 87.8% top-1 accuracy on ImageNet-1K when pre-trained on ImageNet-22K (see the figure below, Table 12 of the original paper). The proposal of ConvNeXt has given convolutional neural networks a new lease of life.
image.png (ConvNeXt vs. Swin Transformer comparison, Table 12 of the original paper)

ConvNeXt is a convolutional neural network model jointly proposed by Facebook AI Research and UC Berkeley. It is a pure convolutional network composed of standard CNN modules, characterized by high accuracy, high efficiency, strong scalability, and a very simple design. ConvNeXt was published at CVPR 2022 in the paper “A ConvNet for the 2020s”. It was trained on the ImageNet-1K and ImageNet-22K datasets and achieves excellent performance on multiple tasks. The training code and pre-trained models of ConvNeXt are publicly available on GitHub.
ConvNeXt is improved from ResNet-50. Like Swin Transformer, it has 4 stages; the difference is that ConvNeXt changes the per-stage block ratio from ResNet-50's 3:4:6:3 to 3:3:9:3, matching Swin Transformer's 1:1:3:1 ratio. In addition, for feature map downsampling, ConvNeXt adopts a “patchify” stem consistent with Swin Transformer: a 4×4 convolution with a stride of 4.
image.png (ConvNeXt network structure)
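A minimal sketch of these two macro-design changes, using ConvNeXt-T hyper-parameters (96 stem channels) and reusing the channels-first LayerNorm_s defined in fit.py above; the variable names are ours:

import torch
import torch.nn as nn

stage_depths = (3, 3, 9, 3)  # ResNet-50's 3:4:6:3 -> Swin-like 1:1:3:1 ratio

stem = nn.Sequential(
    nn.Conv2d(3, 96, kernel_size=4, stride=4),      # "patchify": 4x4 conv, stride 4
    LayerNorm_s(96, data_format="channels_first"),  # from fit.py above
)

x = torch.randn(1, 3, 224, 224)
print(stem(x).shape)  # torch.Size([1, 96, 56, 56]): 4x spatial downsampling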

8. Advantages and disadvantages of ConvNeXt

Advantages of ConvNeXt include:
  1. It is a pure convolutional neural network composed of standard CNN modules, with high accuracy, high efficiency, strong scalability, and a very simple design.
  2. Trained on the ImageNet-1K and ImageNet-22K datasets, it achieves excellent performance on multiple tasks.
  3. It adopts advanced ideas from Transformer networks to adjust and improve the classic ResNet-50/200 architecture, introducing recent Transformer techniques into existing CNN modules; this combines the advantages of both network families and improves CNN performance.

Disadvantages of ConvNeXt include:
  1. It makes no major innovation in the overall network framework or construction ideas; it only adjusts and improves the classic ResNet-50/200 architecture based on ideas borrowed from Transformer networks.
  2. Compared with other CNN models, ConvNeXt requires more computing resources in some cases.

9. System integration

The complete source code, an environment deployment video tutorial, and the custom UI interface are shown below.
1.png

Reference blog “Farm newborn piglet monitoring system based on YOLOv5 with ODConv & ConvNeXt”
