[Object Detection] Deploying YOLOv5 on a Server with Flask + Docker

Foreword

I have seen many articles on deploying YOLOv5 with Flask, but they mostly stop at running locally. So I decided to go one step further and use Docker to deploy YOLOv5 on a cloud server, making it available to others.

Code repository: https://github.com/zstar1003/yolov5-flask

Local deployment

The local project is mainly based on robmarkcole's project [1]. The original was released more than a year ago and presumably targets an earlier version of YOLOv5, so downloading it as-is leads to problems. I therefore refactored it against YOLOv5-5.0.

Project structure

The overall project structure is shown below:

  • models: model-construction code, copied directly from YOLOv5-5.0
  • utils: plotting, data loading, and other utilities, copied directly from YOLOv5-5.0
  • static: static files such as front-end assets and the rendered results
  • templates: front-end HTML pages
  • webapp.py: the entry point

Overall, the project is fairly simple as a demo. There is some redundancy in utils and models: many of the utilities serve training and testing, while only inference is needed here. I considered trimming them further, but the functions turned out to be tightly coupled, so I left them as they are.

Quick start

The yolov5s.pt weights are already stored in the repository, so there is no need to download the model separately.
Run python webapp.py in the terminal, wait a moment, and then visit http://127.0.0.1:5000


Select a file on the homepage and upload it to get the model's prediction. The predicted image is saved in the static folder.

Brief code analysis

Core code:

@app.route("/", methods=["GET", "POST"])
def predict():
    if request.method == "POST":
        if "file" not in request.files:
            return redirect(request.url)
        file = request.files["file"]
        if not file:
            return  # empty upload: returning None surfaces as an error page

        img_bytes = file.read()
        img = Image.open(io.BytesIO(img_bytes))
        img = cv2.cvtColor(np.asarray(img), cv2.COLOR_RGB2BGR)

        if img is not None:
            showimg = img
            with torch.no_grad():
                img = letterbox(img, new_shape=imgsz)[0]
                # Convert: BGR to RGB, HWC to CHW
                img = img[:, :, ::-1].transpose(2, 0, 1)
                img = np.ascontiguousarray(img)
                img = torch.from_numpy(img).to(device)
                # Note: model.half() converts the model in place and is always truthy,
                # so this line forces FP16 inference (the source of the CPU error fixed below)
                img = img.half() if model.half() else img.float()  # uint8 to fp16/32
                img /= 255.0  # 0 - 255 to 0.0 - 1.0
                if img.ndimension() == 3:
                    img = img.unsqueeze(0)
                # Inference
                pred = model(img)[0]

                # Apply NMS
                pred = non_max_suppression(pred, conf_thres, iou_thres)
                # Process detections
                for i, det in enumerate(pred): # detections per image
                    if det is not None and len(det):
                        # Rescale boxes from img_size to im0 size
                        det[:, :4] = scale_coords(
                            img.shape[2:], det[:, :4], showimg.shape).round()
                        # Write results
                        for *xyxy, conf, cls in reversed(det):
                            label = '%s %.2f' % (names[int(cls)], conf)
                            plot_one_box(
                                xyxy, showimg, label=label, color=colors[int(cls)], line_thickness=2)

        imgFile = "static/img.jpg"
        cv2.imwrite(imgFile, showimg)

        return redirect(imgFile)

    return render_template("index.html")

The front end submits the image to the back end via POST. The handler first checks whether a file was actually uploaded; if not, it returns None, which surfaces as an error page. Otherwise, file.read() reads the image as a byte string, PIL.Image decodes it, and the result is converted to OpenCV's BGR format for compatibility with the inference code that follows.
The inference code is copied almost verbatim from YOLOv5's detect.py. The annotated image is saved to disk first and then returned to the front end for display.
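For reference, the route above depends on some module-level setup that the snippet omits. Below is a minimal sketch of that initialization as it might look in webapp.py; the module paths follow YOLOv5-5.0, while the weight path, image size, and thresholds are illustrative assumptions:

import io
import random

import cv2
import numpy as np
import torch
from PIL import Image
from flask import Flask, redirect, render_template, request

# YOLOv5-5.0 module layout
from models.experimental import attempt_load
from utils.datasets import letterbox
from utils.general import non_max_suppression, scale_coords
from utils.plots import plot_one_box

app = Flask(__name__)

# Inference settings (illustrative values)
imgsz = 640
conf_thres = 0.25
iou_thres = 0.45

# Load the model once at startup
device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
model = attempt_load("yolov5s.pt", map_location=device)  # FP32 model
names = model.module.names if hasattr(model, "module") else model.names
colors = [[random.randint(0, 255) for _ in range(3)] for _ in names]

if __name__ == "__main__":
    # 0.0.0.0 so the app is reachable from outside the container later
    app.run(host="0.0.0.0", port=5000)

Together with the route above, this forms the skeleton of the entry program.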

Cloud deployment

There are many options for server deployment. The most obvious is to build a Python environment directly on the server, but since that means installing heavyweight libraries such as torch, a lot can go wrong, so deploying with Docker is more convenient.

Put simply, Docker is like a container that ships with its own virtual environment and program: you just pull the package onto the server and run it directly.

Generate requirements.txt

The first step is to generate a dependency list, requirements.txt, so that the required packages can be installed inside the Docker image.
The usual approach is to generate it like this:

pip freeze > requirements.txt

Then you can quickly install it in the new environment like this:

pip install -r requirements.txt

A big problem with this approach is that it dumps every library in the environment, including ones the project never uses.

To avoid this, someone developed the pipreqs library, which scans the project and outputs only the libraries (and versions) it actually uses.

pipreqs can be installed in two ways.
Method one:

pip install pipreqs

Method two:
If pip fails, clone the project from GitHub and run setup.py:

git clone https://github.com/bndr/pipreqs.git
cd pipreqs
python setup.py install

After installation, run it in the project's root directory:

pipreqs . --encoding=utf8 --force

This generates requirements.txt:

coremltools==5.2.0
Flask==2.2.2
matplotlib==3.5.2
numpy==1.21.5
onnx==1.12.0
pafy==0.5.5
pandas==1.3.5
Pillow==9.2.0
PyYAML==6.0
requests==2.28.1
scipy==1.7.3
seaborn==0.11.2
setuptools==63.2.0
torch==1.11.0
torchvision==0.12.0
tqdm==4.64.0
opencv-python==4.6.0.66

Note that the generated file needs checking. For example, if the cloud server has no GPU, manually change torch to a CPU build.
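A minimal sketch of that change, assuming the official PyTorch CPU wheel index is reachable from the build environment (the +cpu builds are not on the Aliyun mirror used later): pin the CPU builds in requirements.txt,

torch==1.11.0+cpu
torchvision==0.12.0+cpu

and add the extra index to the install command:

pip install -r requirements.txt --extra-index-url https://download.pytorch.org/whl/cpu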

Building the Dockerfile

A Dockerfile is a build script containing all the environment configuration steps, such as installing libraries.

Commonly used instructions include the following [2]:

FROM        # base image; everything is built on top of this
MAINTAINER  # image author: name + email
RUN         # command to run while building the image
ADD         # copy files into the image; local archives are extracted automatically
WORKDIR     # working directory inside the image
VOLUME      # directory to be mounted
EXPOSE      # port to expose, the same idea as -p at run time
CMD         # command run when the container starts; only the last CMD takes effect, and it can be overridden
ENTRYPOINT  # command run when the container starts; run-time arguments are appended to it
ONBUILD     # instruction triggered when this image is used as the base of another build
COPY        # like ADD, copies files into the image (without extraction)
ENV         # set environment variables at build time

The Dockerfile for this project is as follows:

FROM python:3.7-slim-buster

RUN apt-get update
RUN apt-get install ffmpeg libsm6 libxext6 -y

WORKDIR /app
ADD . /app
RUN pip install -r requirements.txt -i https://mirrors.aliyun.com/pypi/simple/

EXPOSE 5000

CMD ["python", "webapp.py"]

First, the public image python:3.7-slim-buster provides the Python environment; then the package index is updated and three system libraries that OpenCV needs (ffmpeg, libsm6, libxext6) are installed.

After that, WORKDIR sets the working directory to /app and ADD copies the project into it. This path matters: it will be used later to patch files inside the container.

Then pip installs all the dependencies listed in requirements.txt; the Aliyun mirror is specified here to speed up downloads.

Port 5000 is then exposed, since the app will be accessed through it later.

Finally, CMD specifies the command executed when the container starts: python webapp.py.

Building and pushing the Docker image

Before this step, you need to install Docker locally and register a Docker Hub account. On Windows you can install Docker Desktop, which I won't cover here.

After opening the client, the Docker service runs locally. Then open a terminal in the project directory and enter:

docker build --tag zstar1003/yolov5-flask .

Note the dot at the end: it tells Docker to use the current directory as the build context. --tag specifies the image name, which must be prefixed with your Docker Hub user name, otherwise you won't be able to push and pull it later. If you forget the prefix, you can rename the image after building:

docker tag <original-name> zstar1003/yolov5-flask

For more detail on Docker's naming rules, see this article [3].

The build takes a while, since the dependencies above have to be downloaded. Once it completes, enter docker images in the terminal to list all local images.

Once the image exists locally, push it to the public registry so it can be pulled later. Log in first if necessary (docker login), then execute:

docker push zstar1003/yolov5-flask

The upload also takes a while, depending mainly on the image size and network speed. When it finishes, the image is visible in the Docker Hub client.

Pulling the Docker image

The following operations run on the cloud server; I recommend connecting to it with FinalShell.

First, install Docker on the cloud server. My server runs CentOS 7.6.
Install some prerequisite packages first:

sudo yum install -y yum-utils device-mapper-persistent-data lvm2

Set up the stable repository:

sudo yum-config-manager --add-repo https://download.docker.com/linux/centos/docker-ce.repo

Install the community edition of Docker:

sudo yum install docker-ce docker-ce-cli containerd.io

After installation, start Docker

systemctl start docker
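Optionally, make Docker start on boot as well:

sudo systemctl enable docker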

Now you can pull the image uploaded earlier:

docker pull zstar1003/yolov5-flask

After pulling, enter docker images -a to check whether the image is there; if it is, the pull succeeded.

Creating and starting the container

With the image pulled, the next step is to create a container from it. The main command is docker run, with the following common options; see [4] for more.

docker run [options] image

# Option descriptions
--name="name"   assign a name to the container
-d              run in the background (detached)
-it             run interactively and attach to the container
-p              publish a container port:
    -p ip:hostPort:containerPort   map a host port on a specific IP
    -p hostPort:containerPort      map a host port (most common)
    -p containerPort               expose only the container port
-P              publish all exposed ports to random host ports
-e              set environment variables
-v              mount a data volume

So I entered docker run -p 5000:5000 zstar1003/yolov5-flask to create a container from the image. The result was an error: port 5000 was already occupied. I had run other projects on this server before, and another process was listening on port 5000.
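If the old process is still needed, the simpler fix is to map a different host port, for example:

docker run -p 5001:5000 zstar1003/yolov5-flask

and access the app on port 5001 instead. Here I chose to free up port 5000.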

In this case, lsof can tell you which process is holding the conflicting port.

First install lsof

yum -y install lsof

Then enter

lsof -i :5000

As shown in the figure below, port 5000 is being used by a gunicorn process, so kill it.
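For example, using the PID reported by lsof (<PID> is a placeholder):

kill -9 <PID>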

After killing it, start the container again. First enter docker ps -a to find the ID of the container created earlier.


Its ID is 34960ff95951, so start it:

docker start 34960ff95951

After starting, if the terminal shows the following output, the program is running normally.

Now visit the server's public IP on port 5000, and the front-end page is displayed.

Troubleshooting

However, when I uploaded an image and clicked the button, an error occurred:

RuntimeError: "slow_conv2d_cpu" not implemented for 'Half'

I found the answer in the Real-ESRGAN FAQ on GitHub [5], which hits the same error. The original answer is as follows:

Q: Error "slow_conv2d_cpu" not implemented for 'Half'
A: In order to save GPU memory consumption and speed up inference, Real-ESRGAN uses half precision (fp16) during inference by default. However, some operators for half inference are not implemented in CPU mode. You need to add the --fp32 option for the commands. For example, python inference_realesrgan.py -n RealESRGAN_x4plus.pth -i inputs --fp32

Translated to our case: to speed up inference, the app converts the model to half precision (fp16) via model.half(). That only works with a GPU build of PyTorch; the CPU build raises this error.

Therefore, we need to modify the file inside the container and remove the half conversion.

Remember the path specified in the Dockerfile? The working directory was set to /app, so the file can be copied out of the container with:

docker cp 34960ff95951:/app/webapp.py /home/torch/


As shown in the figure below, change the two half() calls so that the image tensor (and the model) use float() instead.
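Roughly, the edit looks like this (a sketch against the inference code shown earlier; the two spots are the model conversion and the tensor conversion):

# before: forces FP16, which CPU PyTorch cannot execute
img = img.half() if model.half() else img.float()

# after: stay in FP32 on a CPU-only server
model.float()
img = img.float()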


After modifying it, copy the file back; this overwrites the original and completes the change.

docker cp /home/torch/webapp.py 34960ff95951:/app/

After modification, restart the container:

docker restart 34960ff95951

However, on restarting I ran into another error:

AttributeError: 'Upsample' object has no attribute 'recompute_scale_factor'


Some searching shows this is an incompatibility with the PyTorch version: in torch/nn/modules/upsampling.py, Upsample.forward passes a redundant recompute_scale_factor argument that older serialized models do not carry.

Similar to the operation above, copy the file out, modify it, and copy it back. Note that this file belongs to an installed package, so you need to adjust its read/write permissions before overwriting it.
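The usual workaround is to stop passing the attribute in Upsample.forward. A sketch of the patched method (the file lives under the container's site-packages, e.g. /usr/local/lib/python3.7/site-packages/torch/nn/modules/upsampling.py; that exact path is an assumption for this image):

def forward(self, input: Tensor) -> Tensor:
    # dropped: recompute_scale_factor=self.recompute_scale_factor
    return F.interpolate(input, self.size, self.scale_factor, self.mode, self.align_corners)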

Final result

After eliminating these two errors, restart the container once more, upload an image, and the inference result is displayed correctly!

Summary

This Docker deployment hit quite a few obstacles. Next time, if the target server is CPU-only, it would be best to run the project locally on the CPU first and only package the image once it runs smoothly.

References

[1] https://github.com/robmarkcole/yolov5-flask
[2] https://liuhuanhuan.blog.csdn.net/article/details/123256877
[3] https://www.zsythink.net/archives/4302
[4] https://blog.csdn.net/weixin_45698637/article/details/124213429
[5] https://github.com/xinntao/Real-ESRGAN/blob/master/docs/FAQ.md