One demo covering front-end and back-end implementations of large file multipart upload, resumable (breakpoint) upload, and instant upload.

1 Foreword

File uploading is very common in project development. Most projects involve uploading pictures, audio, video, or other files. A simple form is usually enough for small files, but for large files (say, over 1 GB), or when the user's network is slow, a plain file upload is not practical: the user may spend dozens of minutes uploading only to see the upload fail. The user experience of such a system is very poor.

Likewise, if a user quits the application halfway through an upload and comes back later, it would be unreasonable to make them start over from the beginning. This article uses a demo with real front-end and back-end code to walk through the implementation of small file upload, large file multipart upload, resumable upload, and instant upload.

2 Small file upload

Uploading small files is very simple. This project uses Spring Boot 3.1.2 + JDK 17 on the back end and native JavaScript + spark-md5.min.js on the front end.
Backend code
pom.xml uses Spring Boot 3.1.2; the Java version is JDK 17:

<parent>
    <groupId>org.springframework.boot</groupId>
    <artifactId>spring-boot-starter-parent</artifactId>
    <version>3.1.2</version>
    <relativePath/> <!-- lookup parent from repository -->
</parent>
<groupId>com.example</groupId>
<artifactId>uploadDemo</artifactId>
<version>0.0.1-SNAPSHOT</version>
<name>uploadDemo</name>
<description>uploadDemo</description>
<properties>
    <java.version>17</java.version>
</properties>
<dependencies>
    <dependency>
        <groupId>org.springframework.boot</groupId>
        <artifactId>spring-boot-starter-web</artifactId>
    </dependency>
</dependencies>
<build>
    <plugins>
        <plugin>
            <groupId>org.springframework.boot</groupId>
            <artifactId>spring-boot-maven-plugin</artifactId>
        </plugin>
    </plugins>
</build>

Java upload interface:

@RestController
public class UploadController {

    public static final String UPLOAD_PATH = "D:\\upload\\";

    @RequestMapping("/upload")
    public ResponseEntity<Map<String, String>> upload(@RequestParam MultipartFile file) throws IOException {
        File dstFile = new File(UPLOAD_PATH, String.format("%s.%s", UUID.randomUUID(), StringUtils.getFilename(file.getOriginalFilename())));
        file.transferTo(dstFile);
        return ResponseEntity.ok(Map.of("path", dstFile.getAbsolutePath()));
    }

}

Front-end code

<!DOCTYPE html>
<html lang="en">
<head>
    <meta charset="UTF-8">
    <title>upload</title>
</head>
<body>
upload

<form enctype="multipart/form-data">
    <input type="file" name="fileInput" id="fileInput">
    <input type="button" value="Upload" onclick="uploadFile()">
</form>

Upload results
<span id="uploadResult"></span>

<script>
    var uploadResult = document.getElementById("uploadResult")
    function uploadFile() {
        var fileInput = document.getElementById('fileInput');
        var file = fileInput.files[0];
        if (!file) return; // No file selected

        var xhr = new XMLHttpRequest();
        // Handle upload progress
        xhr.upload.onprogress = function(event) {
            var percent = 100 * event.loaded / event.total;
            uploadResult.innerHTML = 'Upload progress: ' + percent + '%';
        };
        // Called when the upload is complete
        xhr.onload = function() {
            if (xhr.status === 200) {
                uploadResult.innerHTML = 'Upload successful ' + xhr.responseText;
            }
        }
        xhr.onerror = function() {
            uploadResult.innerHTML = 'Upload failed';
        }
        // Send the request
        xhr.open('POST', '/upload', true);
        var formData = new FormData();
        formData.append('file', file);
        xhr.send(formData);
    }
</script>

</body>
</html>


Notes
When uploading larger files, a file size limit error will be reported. There are three configuration parameters that need to be set:

org.apache.tomcat.util.http.fileupload.impl.SizeLimitExceededException: the request was rejected because its size (46302921) exceeds the configured maximum (10485760)

You need to add the max-file-size and max-request-size configuration items in Spring Boot's application.properties or application.yml. Their defaults are 1 MB and 10 MB respectively, which will not meet our upload needs.

spring.servlet.multipart.max-file-size=1024MB
spring.servlet.multipart.max-request-size=1024MB

If you are behind Nginx and get a 413 Request Entity Too Large, note that Nginx limits the request body to 1 MB by default. Add the configuration item client_max_body_size 1024m; to the http { } block in nginx.conf.
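For reference, the corresponding nginx.conf fragment might look like this (the 1024m value is an assumption chosen to match the Spring limits above; adjust it to your needs):

```nginx
http {
    # Allow request bodies up to 1 GB (Nginx default is 1m)
    client_max_body_size 1024m;
}
```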

3 Large file multipart upload

Front end
Front-end upload process
The front end performs three main steps for large file multipart upload:

The front-end upload code uses the spark-md5 library to calculate the file's MD5 value, which is straightforward to use. A brief note on why the MD5 is calculated: errors may occur while transferring and writing the file, so the final assembled file may differ from the original. Comparing the MD5 computed on the front end with the MD5 computed on the back end ensures the consistency of the uploaded data.

<!DOCTYPE html>
<html lang="en">
<head>
    <meta charset="UTF-8">
    <title>Multiple upload</title>
    <script src="https://cdn.bootcdn.net/ajax/libs/spark-md5/3.0.2/spark-md5.min.js"></script>
</head>
<body>
Multipart upload

<form enctype="multipart/form-data">
    <input type="file" name="fileInput" id="fileInput">
    <input type="button" value="Calculate File MD5" onclick="calculateFileMD5()">
    <input type="button" value="Upload" onclick="uploadFile()">
    <input type="button" value="Check file integrity" onclick="checkFile()">
</form>

<p>
    File MD5:
    <span id="fileMd5"></span>
</p>
<p>
    Upload results:
    <span id="uploadResult"></span>
</p>
<p>
    Check file integrity:
    <span id="checkFileRes"></span>
</p>


<script>
    // Size of each chunk
    var chunkSize = 1 * 1024 * 1024;
    var uploadResult = document.getElementById("uploadResult")
    var fileMd5Span = document.getElementById("fileMd5")
    var checkFileRes = document.getElementById("checkFileRes")
    var fileMd5;


    function calculateFileMD5() {
        var fileInput = document.getElementById('fileInput');
        var file = fileInput.files[0];
        getFileMd5(file).then((md5) => {
            console.info(md5)
            fileMd5 = md5;
            fileMd5Span.innerHTML = md5;
        })
    }

    function uploadFile() {
        var fileInput = document.getElementById('fileInput');
        var file = fileInput.files[0];
        if (!file) return;
        if (!fileMd5) return;

        // Split the file into chunks
        let fileArr = sliceFile(file);
        // Save the file name
        let fileName = file.name;

        fileArr.forEach((e, i) => {
            // Create a FormData object per chunk
            let data = new FormData();
            data.append("totalNumber", fileArr.length)
            data.append("chunkSize", chunkSize)
            data.append("chunkNumber", i)
            data.append("md5", fileMd5)
            data.append("file", new File([e], fileName));
            upload(data);
        })
    }

    /**
     * Calculate the file's MD5 value
     */
    function getFileMd5(file) {
        return new Promise((resolve, reject) => {
            let fileReader = new FileReader()
            fileReader.onload = function (event) {
                let fileMd5 = SparkMD5.ArrayBuffer.hash(event.target.result)
                resolve(fileMd5)
            }
            fileReader.readAsArrayBuffer(file)
        })
    }


    function upload(data) {
        var xhr = new XMLHttpRequest();
        // Called when the upload is complete
        xhr.onload = function () {
            if (xhr.status === 200) {
                uploadResult.append('Upload successful chunk: ' + data.get("chunkNumber") + '\t');
            }
        }
        xhr.onerror = function () {
            uploadResult.innerHTML = 'Upload failed';
        }
        // Send the request
        xhr.open('POST', '/uploadBig', true);
        xhr.send(data);
    }

    function checkFile() {
        var xhr = new XMLHttpRequest();
        // Called when the check completes
        xhr.onload = function () {
            if (xhr.status === 200) {
                checkFileRes.innerHTML = 'File integrity check successful: ' + xhr.responseText;
            }
        }
        xhr.onerror = function () {
            checkFileRes.innerHTML = 'Failed to check file integrity';
        }
        // Send the request
        xhr.open('POST', '/checkFile', true);
        let data = new FormData();
        data.append("md5", fileMd5)
        xhr.send(data);
    }

    function sliceFile(file) {
        const chunks = [];
        let start = 0;
        let end;
        while (start < file.size) {
            end = Math.min(start + chunkSize, file.size);
            chunks.push(file.slice(start, end));
            start = end;
        }
        return chunks;
    }

</script>

</body>
</html>

Front-end considerations
The front end calls the /uploadBig interface with four metadata parameters (totalNumber, chunkSize, chunkNumber, md5) plus the chunk data itself.

Calculating the MD5 of a large file can be slow. This can be optimized at the process level: for example, compute the MD5 asynchronously while uploading, or, instead of hashing the entire file, compute an MD5 per chunk to verify the consistency of each chunk individually.
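As a sketch of the per-chunk variant, the following standalone Java helper (class and method names are illustrative, not part of the demo) hashes each fixed-size chunk independently with the JDK's MessageDigest. With per-chunk hashes, each chunk can be verified as it arrives rather than hashing the whole file at the end:

```java
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.ArrayList;
import java.util.List;

public class ChunkMd5 {

    // Hex-encode a digest.
    static String hex(byte[] digest) {
        StringBuilder sb = new StringBuilder();
        for (byte b : digest) sb.append(String.format("%02x", b));
        return sb.toString();
    }

    // MD5 per fixed-size chunk instead of one MD5 over the whole file.
    public static List<String> chunkMd5s(byte[] data, int chunkSize) {
        try {
            List<String> md5s = new ArrayList<>();
            for (int start = 0; start < data.length; start += chunkSize) {
                int len = Math.min(chunkSize, data.length - start);
                MessageDigest md = MessageDigest.getInstance("MD5");
                md.update(data, start, len);
                md5s.add(hex(md.digest()));
            }
            return md5s;
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

The trade-off: per-chunk MD5s let the server reject a corrupted chunk immediately, but the client must send one hash per chunk instead of a single hash for the file.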
Backend
The backend has two interfaces: /uploadBig uploads each chunk, and /checkFile verifies the file's MD5.
/uploadBig interface design ideas
Overall interface process:

Things to note here:

  • The first time a chunk arrives for a given MD5, an empty MD5.conf file is created. byte[] bytes = new byte[totalNumber]; initializes every byte to 0; counting from position 0, the Nth byte records the upload status of the Nth chunk (0 = not uploaded, 1 = uploaded). After each chunk upload succeeds, randomAccessConfFile.seek(chunkNumber) positions to that byte and it is set to 1.
  • randomAccessFile.seek(chunkNumber * chunkSize); moves the cursor to the chunk's offset in the file before writing. Every upload request carries a different chunkNumber, so each request writes only its own block; multiple threads can therefore write the same file without thread-safety issues.
  • RandomAccessFile can be slow when writing large files. MappedByteBuffer (memory mapping) can speed up writes. Note, however, that deleting a memory-mapped file may fail: the file on disk is removed, but the mapping still holds it in memory.
    Writing a chunk with MappedByteBuffer:
FileChannel fileChannel = randomAccessFile.getChannel();
MappedByteBuffer mappedByteBuffer = fileChannel.map(FileChannel.MapMode.READ_WRITE, chunkNumber * chunkSize, fileData.length);
mappedByteBuffer.put(fileData);
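To make the .conf status-byte scheme concrete, here is a minimal, self-contained sketch (class and method names are illustrative, and a temp file stands in for the real MD5.conf): one byte per chunk, flipped to 1 via seek(), with completeness checked by scanning for any remaining 0:

```java
import java.io.File;
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.file.Files;

public class ChunkConf {

    // Create a conf file with one zeroed status byte per chunk.
    public static File create(int totalNumber) throws IOException {
        File conf = File.createTempFile("upload", ".conf");
        Files.write(conf.toPath(), new byte[totalNumber]);
        return conf;
    }

    // Mark chunk N as uploaded by writing 1 at offset N.
    public static void markUploaded(File conf, int chunkNumber) throws IOException {
        try (RandomAccessFile raf = new RandomAccessFile(conf, "rw")) {
            raf.seek(chunkNumber);
            raf.write(1);
        }
    }

    // The upload is complete when no status byte is still 0.
    public static boolean isComplete(File conf) throws IOException {
        for (byte b : Files.readAllBytes(conf.toPath())) {
            if (b == 0) return false;
        }
        return true;
    }
}
```

Because each chunk touches only its own byte, concurrent chunk uploads can update the conf file independently, mirroring how the chunks themselves are written at independent offsets.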

/checkFile interface design ideas
/checkFile interface process:

Complete Java code for large file upload:

@RestController
public class UploadController {

    public static final String UPLOAD_PATH = "D:\\upload\\";

    /**
     * @param chunkSize   size of each chunk
     * @param totalNumber total number of chunks
     * @param chunkNumber current chunk index
     * @param md5         MD5 of the whole file
     * @param file        current chunk data
     * @return
     * @throws IOException
     */
    @RequestMapping("/uploadBig")
    public ResponseEntity<Map<String, String>> uploadBig(@RequestParam Long chunkSize, @RequestParam Integer totalNumber, @RequestParam Long chunkNumber, @RequestParam String md5, @RequestParam MultipartFile file) throws IOException {
        // File storage location
        String dstFile = String.format("%s\\%s\\%s.%s", UPLOAD_PATH, md5, md5, StringUtils.getFilenameExtension(file.getOriginalFilename()));
        // Location of the chunk status file
        String confFile = String.format("%s\\%s\\%s.conf", UPLOAD_PATH, md5, md5);
        // On the first chunk, create the directory and the chunk status file
        File dir = new File(dstFile).getParentFile();
        if (!dir.exists()) {
            dir.mkdirs();
            // Initialize all chunk statuses to 0
            byte[] bytes = new byte[totalNumber];
            Files.write(Path.of(confFile), bytes);
        }
        // Write the chunk at its offset within the target file
        try (RandomAccessFile randomAccessFile = new RandomAccessFile(dstFile, "rw");
             RandomAccessFile randomAccessConfFile = new RandomAccessFile(confFile, "rw");
             InputStream inputStream = file.getInputStream()) {
            // Seek to the chunk's offset
            randomAccessFile.seek(chunkNumber * chunkSize);
            // Write the chunk data
            randomAccessFile.write(inputStream.readAllBytes());
            // Seek to this chunk's status byte
            randomAccessConfFile.seek(chunkNumber);
            // Mark this chunk as uploaded
            randomAccessConfFile.write(1);
        }
        return ResponseEntity.ok(Map.of("path", dstFile));
    }


    /**
     * Return the chunk upload statuses and verify the file's MD5
     *
     * @param md5
     * @return
     * @throws Exception
     */
    @RequestMapping("/checkFile")
    public ResponseEntity<Map<String, String>> checkFile(@RequestParam String md5) throws Exception {
        String uploadPath = String.format("%s\\%s\\%s.conf", UPLOAD_PATH, md5, md5);
        Path path = Path.of(uploadPath);
        // If the MD5 directory does not exist, the file has never been uploaded
        if (!Files.exists(path.getParent())) {
            return ResponseEntity.ok(Map.of("msg", "File not uploaded"));
        }
        // Determine whether all chunks have been uploaded
        StringBuilder stringBuilder = new StringBuilder();
        byte[] bytes = Files.readAllBytes(path);
        for (byte b : bytes) {
            stringBuilder.append(String.valueOf(b));
        }
        // All chunks uploaded: verify the assembled file's MD5
        if (!stringBuilder.toString().contains("0")) {
            File file = new File(String.format("%s\\%s", UPLOAD_PATH, md5));
            File[] files = file.listFiles();
            String filePath = "";
            for (File f : files) {
                // Compare the assembled file's MD5 with the expected one
                if (!f.getName().contains("conf")) {
                    filePath = f.getAbsolutePath();
                    try (InputStream inputStream = new FileInputStream(f)) {
                        String md5pwd = DigestUtils.md5DigestAsHex(inputStream);
                        if (!md5pwd.equalsIgnoreCase(md5)) {
                            return ResponseEntity.ok(Map.of("msg", "File upload failed"));
                        }
                    }
                }
            }
            return ResponseEntity.ok(Map.of("path", filePath));
        } else {
            // Not all chunks are uploaded yet: return the per-chunk statuses so
            // the front end can re-upload only the missing chunks
            return ResponseEntity.ok(Map.of("chunks", stringBuilder.toString()));
        }

    }

}

To test multipart upload with the front-end demo, click the buttons in the following order:


Resumable upload
With the above design, resumable upload is relatively simple. The back-end code does not need to change; only the front-end upload flow is adjusted:


Call the /checkFile interface first. If some chunks have not finished uploading, the interface returns the chunks field with a 0 at each missing position. The front end then uploads only those chunks and calls /checkFile again afterwards, completing the resumable upload.

{
    "chunks": "111111111100000000001111111111111111111111111"
}
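The front end's job on resume boils down to parsing that status string. A hypothetical helper (shown in Java for brevity; the demo's front end would do the same in JavaScript) that extracts the chunk numbers still needing upload:

```java
import java.util.ArrayList;
import java.util.List;

public class ResumeHelper {

    // Given the per-chunk status string from /checkFile ('1' = uploaded,
    // '0' = missing), return the chunk numbers that still need uploading.
    public static List<Integer> missingChunks(String chunks) {
        List<Integer> missing = new ArrayList<>();
        for (int i = 0; i < chunks.length(); i++) {
            if (chunks.charAt(i) == '0') {
                missing.add(i);
            }
        }
        return missing;
    }
}
```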

Instant upload

Instant upload is also relatively simple and only requires changes to the front-end flow. For example, if Zhang San uploads a file and Li Si later uploads a file with the same content, files with the same MD5 value can be treated as identical. (Distinct files with the same MD5 do exist, but the probability is tiny: the chance that any two of 100,000 different files share an MD5 is on the order of 1/100,000,000,000,000,000,000,000,000,000, while the chance of winning a lottery jackpot is typically about 1/1,000,000. For the detailed calculation, see analyses of MD5 collision probability.) So the probability of an MD5 conflict can be ignored.

When Li Si calls the /checkFile interface, the back end directly returns the path of the already-uploaded file, and Li Si's upload completes instantly. Most cloud drives' instant upload likely works the same way, except that the file hash algorithm is more elaborate and the path returned to the user is protected so that others cannot derive it.
Front-end flow for instant upload:

4 Summary

This article introduced the design and implementation of large file multipart upload, resumable upload, and instant upload from both the front-end and back-end sides. All code has been tested by the author and can be used directly.