[springboot practice] Resumable upload, done like this (code included)

Table of Contents

    • Background
    • Get started
      • RandomAccessFile
        • API
    • Code
      • File chunking
      • Resumable upload and instant transfer
      • Chunked upload and file merging
    • Summary

What I want to share with you today is another hands-on article, based on a problem I ran into recently in a side project. The all-knowing Internet showed me a way, so let me pass it on.

Background

Recently I received a new requirement: uploading a video file of about 2 GB. I tried it with OSS in the test environment, and the upload alone took more than ten minutes. Given the company's resource constraints, that approach was decisively abandoned.

When it comes to uploading large files, the first thing that comes to mind is network disks. Everyone likes to upload their favorite **"small movies"** to a network disk for storage. Network disks generally support resumable uploads and instant file transfer, which reduces the impact of network fluctuations and bandwidth limits and greatly improves the user experience.

Speaking of which, let’s first understand these concepts:

  • "File chunking": split the large file into small chunks, upload/download the chunks, and finally assemble the chunks back into the large file (see the sketch after this list);
  • "Breakpoint resume" (resumable transfer): building on chunking, each chunk is uploaded/downloaded in its own thread; if the network fails, the transfer continues from the chunks that are not yet finished instead of starting over from the beginning;
  • "Instant transfer": the file already exists on the resource server, so when someone else uploads it, the server directly returns the file's URI.
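
To make the chunk arithmetic concrete, here is a minimal Java sketch (the chunk size and file name are illustrative, not taken from the project code) that computes the number of chunks and the byte range of each one:

import java.io.File;

public class ChunkMath {
    // Illustrative chunk size: 5 MB per chunk
    static final long CHUNK_SIZE = 5 * 1024 * 1024;

    public static void main(String[] args) {
        File file = new File("movie.mp4"); // hypothetical input file
        long fileSize = file.length();
        // Number of chunks, rounding up so the last partial chunk is counted
        long chunks = (fileSize + CHUNK_SIZE - 1) / CHUNK_SIZE;
        for (long i = 0; i < chunks; i++) {
            long start = i * CHUNK_SIZE;
            long end = Math.min(fileSize, start + CHUNK_SIZE);
            System.out.printf("chunk %d: bytes [%d, %d)%n", i, start, end);
        }
    }
}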

Get started

RandomAccessFile

Usually we use IO streams such as FileInputStream, FileOutputStream, FileReader and FileWriter to read and write files; today let's take a look at RandomAccessFile.

It is a standalone class that directly extends Object, and it implements the DataInput and DataOutput interfaces. This class supports random access to a file, which behaves like a large byte array stored in the file system.

Its implementation is based on a **"file pointer"** (a cursor, or an index into that implied array). The file pointer can be read with the getFilePointer method and set with the seek method.

On input, bytes are read starting at the file pointer, and the pointer advances past the bytes read. An output operation that writes past the current end of the implied array extends the array. The class has four open modes to choose from:

  • r: open the file read-only; any write operation throws an IOException;
  • rw: open the file for reading and writing; if the file does not exist, try to create it;
  • rws: open the file for reading and writing, and require every update to the file's content or metadata to be written synchronously to the underlying storage device;
  • rwd: open the file for reading and writing, and require every update to the file's content to be written synchronously to the underlying storage device;

In rw mode a buffer is used by default; data is actually written out to the file only when the buffer is full or when the stream is closed with RandomAccessFile.close().
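
As a quick illustration (a minimal sketch with a hypothetical file name), "rw" creates the file if it does not exist yet, and try-with-resources guarantees the close() that flushes the write:

import java.io.IOException;
import java.io.RandomAccessFile;

public class ModeDemo {
    public static void main(String[] args) throws IOException {
        // "rw" creates demo.bin if it does not exist yet
        try (RandomAccessFile raf = new RandomAccessFile("demo.bin", "rw")) {
            raf.writeInt(42);               // buffered write
        }                                   // close() flushes to disk
        try (RandomAccessFile raf = new RandomAccessFile("demo.bin", "r")) {
            System.out.println(raf.readInt()); // prints 42
            // raf.writeInt(1);             // would throw IOException in "r" mode
        }
    }
}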

API

1. void seek(long pos): sets the file-pointer offset for the next read or write. In plain terms, it specifies the position at which the file will be read or written next.

Note: the offset can be set beyond the end of the file. Setting it beyond the end does not by itself change the file length; the length changes only if data is written after such a seek.

2. native long getFilePointer(): returns the current position of the file pointer;

3. native long length(): returns the length of the current file;

4. **"Read"** methods: read(), read(byte[] b), readByte(), readInt() and the other readXxx variants;

5. **"Write"** methods: write(int b), write(byte[] b), writeByte(int v), writeInt(int v) and the other writeXxx variants;

6. readFully(byte[] b): fills the buffer b with the file's content starting at the file pointer; the read blocks until the buffer is filled, and if the end of the stream is reached before that, an EOFException is thrown;

7. FileChannel getChannel(): returns the unique FileChannel object associated with this file;

8. int skipBytes(int n): attempts to skip over n bytes of input, discarding the skipped bytes;

Note: most uses of RandomAccessFile have been superseded since JDK 1.4 by NIO's **"memory-mapped"** files, which map the file into memory before operating on it and so avoid frequent disk IO.
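
A minimal sketch (hypothetical file name) tying these APIs together: seek past the end of the file, write to extend it, then read the byte back:

import java.io.IOException;
import java.io.RandomAccessFile;

public class SeekDemo {
    public static void main(String[] args) throws IOException {
        try (RandomAccessFile raf = new RandomAccessFile("seek-demo.bin", "rw")) {
            System.out.println(raf.length());         // 0 for a fresh file
            raf.seek(1024);                           // beyond EOF: length unchanged...
            raf.write(Byte.MAX_VALUE);                // ...until this write extends it
            System.out.println(raf.length());         // 1025
            System.out.println(raf.getFilePointer()); // 1025, advanced past the write

            raf.seek(1024);                           // jump back to the written byte
            byte[] buf = new byte[1];
            raf.readFully(buf);                       // blocks until buf is filled
            System.out.println(buf[0]);               // 127
        }
    }
}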

Code

File chunking

File chunking is handled on the front end; you can use a powerful js library or a ready-made component to do the splitting. You need to decide the chunk size and the number of chunks, and then assign an index to each chunk.

To prevent the chunks of the uploaded file from getting mixed up with those of other files, the file's md5 value is used to tell them apart. The same value can also be used to check whether the file already exists on the server, and to query the file's upload status:

  • If the file exists, return the file address directly;
  • If the file does not exist but there is an upload status, i.e. some chunks were uploaded successfully, return the array of indexes of the chunks not yet uploaded;
  • If the file does not exist and the upload status is empty, all chunks need to be uploaded.

fileRederInstance.readAsBinaryString(file);
fileRederInstance.addEventListener("load", (e) => {
    let fileBolb = e.target.result;
    fileMD5 = md5(fileBolb);
    const formData = new FormData();
    formData.append("md5", fileMD5);
    axios
        .post(http + "/fileUpload/checkFileMd5", formData)
        .then((res) => {
            if (res.data.message == "The file already exists") {
                // The file already exists: skip chunking and return the file address to the page directly
                success && success(res);
            } else {
                // The file does not exist. Two cases: data is null (nothing uploaded yet),
                // or data is [xx, xx], the indexes of the chunks not yet uploaded
                if (res.data.data) {
                    // Some chunks are missing: resume from the breakpoint
                    chunkArr = res.data.data;
                }
                readChunkMD5();
            }
        })
        .catch((e) => {});
});

Before calling the upload interface, use the slice method to cut out the chunk that corresponds to the given index:

const getChunkInfo = (file, currentChunk, chunkSize) => {
    // Get the file fragment at the given index
    let start = currentChunk * chunkSize;
    let end = Math.min(file.size, start + chunkSize);
    // Cut the chunk out of the file
    let chunk = file.slice(start, end);
    return { start, end, chunk };
};

Then call the upload interface to complete the upload.
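
The original post does not show the upload endpoint itself. Below is a minimal Spring Boot sketch of what it might look like; the endpoint path and the MultipartFileDTO fields (md5, chunk, chunks, file) are assumptions inferred from the merge code further down, and Result is the project's own wrapper type reused here:

import org.springframework.web.bind.annotation.PostMapping;
import org.springframework.web.bind.annotation.RequestMapping;
import org.springframework.web.bind.annotation.RestController;
import org.springframework.web.multipart.MultipartFile;

@RestController
@RequestMapping("/fileUpload")
public class FileUploadController {

    // DTO inferred from the merge code below: one chunk plus its metadata
    public static class MultipartFileDTO {
        private String md5;          // md5 of the whole file
        private int chunk;           // index of this chunk
        private int chunks;          // total number of chunks
        private MultipartFile file;  // the chunk's bytes
        public String getMd5() { return md5; }
        public int getChunk() { return chunk; }
        public int getChunks() { return chunks; }
        public MultipartFile getFile() { return file; }
        // setters omitted for brevity; Spring needs them for binding
    }

    @PostMapping("/uploadChunk") // hypothetical endpoint name
    public Result uploadChunk(MultipartFileDTO multipartFileDTO) {
        // 1. write the chunk into the target file at offset chunk * CHUNK_SIZE (see below)
        // 2. mark the chunk as done in the .conf file and update Redis
        return Result.ok("chunk received");
    }
}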

Resumable upload and instant transfer

The back end is developed with Spring Boot and uses redis to store the upload status of the file and the address of the uploaded file.

If the file is completely uploaded, return the file path; if part of it is uploaded, return the array of chunks not yet uploaded; if nothing has been uploaded, return a prompt message.

Note: two files are generated during chunk upload: the file body itself and a temporary file. The temporary file can be regarded as an array with one byte per chunk; whenever a chunk finishes uploading, its byte is set to 127 (Byte.MAX_VALUE).

Two values are used when verifying the md5 value:

  • File upload status: non-empty as long as the file has ever been uploaded; true if the upload completed, false if only part of it was uploaded;
  • File upload address: if the upload completed, the file path; if only partially uploaded, the temporary file path.

/**
 * Check the md5 of the file
 **/
public Result checkFileMd5(String md5) throws IOException {
    // Upload status: as long as the file has ever been uploaded, this value exists
    Object processingObj = stringRedisTemplate.opsForHash().get(UploadConstants.FILE_UPLOAD_STATUS, md5);
    if (processingObj == null) {
        return Result.ok("This file has not been uploaded before");
    }
    boolean processing = Boolean.parseBoolean(processingObj.toString());
    // When the upload is complete this is the file path; otherwise it is the path of the
    // temporary file (one byte per chunk, set to 127 when that chunk is done)
    String value = stringRedisTemplate.opsForValue().get(UploadConstants.FILE_MD5_KEY + md5);
    // true when the whole file is uploaded, false otherwise
    if (processing) {
        return Result.ok(value, "The file already exists");
    } else {
        File confFile = new File(value);
        byte[] completeList = FileUtils.readFileToByteArray(confFile);
        List<Integer> missChunkList = new LinkedList<>();
        for (int i = 0; i < completeList.length; i++) {
            if (completeList[i] != Byte.MAX_VALUE) {
                // this chunk has not been uploaded yet; record its index
                missChunkList.add(i);
            }
        }
        return Result.ok(missChunkList, "Part of the file was uploaded");
    }
}

At this point you are bound to ask: once all the chunks of a file are uploaded, how do we get the complete file back? Next, let's talk about merging the chunks.

Chunked upload and file merging

As mentioned above, the md5 value of the file maintains the relationship between the chunks and the file, so chunks with the same md5 value are merged into one file. Since every chunk has its own index, we insert each chunk into the file at its index position, just like filling in an array, to form the complete file.

When uploading in chunks, the chunk size, the number of chunks and the index of the current chunk must match what the front end used, so that they can be relied on when merging the file. Here we use **"memory mapping"** to merge the file.

// Open for both reading and writing
RandomAccessFile tempRaf = new RandomAccessFile(tmpFile, "rw");
// The unique NIO FileChannel associated with this file
FileChannel fileChannel = tempRaf.getChannel();

// Offset of this chunk within the file: chunk size * chunk index
long offset = CHUNK_SIZE * multipartFileDTO.getChunk();
// Bytes of this chunk
byte[] fileData = multipartFileDTO.getFile().getBytes();
// Map this region of the file directly into memory and write the chunk into it
MappedByteBuffer mappedByteBuffer = fileChannel.map(FileChannel.MapMode.READ_WRITE, offset, fileData.length);
mappedByteBuffer.put(fileData);
// Release the mapping
FileMD5Util.freedMappedByteBuffer(mappedByteBuffer);
fileChannel.close();
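
The FileMD5Util.freedMappedByteBuffer helper is referenced but not shown in the original. Below is one possible sketch of it (assuming Java 8), based on the well-known reflection trick for releasing a mapping eagerly; without it, the mapping is only freed when the buffer is garbage collected, which can keep the file locked:

import java.lang.reflect.Method;
import java.nio.MappedByteBuffer;
import java.security.AccessController;
import java.security.PrivilegedAction;

public class FileMD5Util {

    // Assumed implementation, not the original project's code
    public static void freedMappedByteBuffer(final MappedByteBuffer mappedByteBuffer) {
        if (mappedByteBuffer == null) {
            return;
        }
        AccessController.doPrivileged((PrivilegedAction<Void>) () -> {
            try {
                // DirectByteBuffer exposes a cleaner() method; invoke it reflectively
                Method getCleanerMethod = mappedByteBuffer.getClass().getMethod("cleaner");
                getCleanerMethod.setAccessible(true);
                Object cleaner = getCleanerMethod.invoke(mappedByteBuffer);
                cleaner.getClass().getMethod("clean").invoke(cleaner);
            } catch (Exception e) {
                throw new RuntimeException("Failed to unmap MappedByteBuffer", e);
            }
            return null;
        });
    }
}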

Whenever a chunk finishes uploading, we also check the overall upload progress to see whether the whole file is now complete.

RandomAccessFile accessConfFile = new RandomAccessFile(confFile, "rw");
// Mark this chunk as complete by writing 127 at its index in the .conf file
accessConfFile.setLength(multipartFileDTO.getChunks());
accessConfFile.seek(multipartFileDTO.getChunk());
accessConfFile.write(Byte.MAX_VALUE);

// Check completeList: the file is complete only if every byte is Byte.MAX_VALUE
// (i.e. every chunk was uploaded successfully)
byte[] completeList = FileUtils.readFileToByteArray(confFile);
byte isComplete = Byte.MAX_VALUE;
for (int i = 0; i < completeList.length && isComplete == Byte.MAX_VALUE; i++) {
    // AND operation: if any chunk is missing, isComplete stops being Byte.MAX_VALUE
    isComplete = (byte) (isComplete & completeList[i]);
}
accessConfFile.close();

Then update the file's upload progress in Redis.

// Update the status in redis: "true" means the whole large file has been uploaded
if (isComplete == Byte.MAX_VALUE) {
    stringRedisTemplate.opsForHash().put(UploadConstants.FILE_UPLOAD_STATUS, multipartFileDTO.getMd5(), "true");
    stringRedisTemplate.opsForValue().set(UploadConstants.FILE_MD5_KEY + multipartFileDTO.getMd5(), uploadDirPath + "/" + fileName);
} else {
    if (!stringRedisTemplate.opsForHash().hasKey(UploadConstants.FILE_UPLOAD_STATUS, multipartFileDTO.getMd5())) {
        stringRedisTemplate.opsForHash().put(UploadConstants.FILE_UPLOAD_STATUS, multipartFileDTO.getMd5(), "false");
    }
    if (!stringRedisTemplate.hasKey(UploadConstants.FILE_MD5_KEY + multipartFileDTO.getMd5())) {
        stringRedisTemplate.opsForValue().set(UploadConstants.FILE_MD5_KEY + multipartFileDTO.getMd5(), uploadDirPath + "/" + fileName + ".conf");
    }
}
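
One detail the snippets above leave implicit: when the last chunk arrives, the Redis value switches from the .conf path to the final file path. A common final step (an assumption, not shown in the original) is to promote the temporary file to its final name once isComplete is reached, for example:

// Assumed final step: rename the temporary file once the upload is complete
// (the "_tmp" naming is hypothetical; use whatever the project's tmpFile convention is)
if (isComplete == Byte.MAX_VALUE) {
    File tmpFile = new File(uploadDirPath, fileName + "_tmp");
    File finalFile = new File(uploadDirPath, fileName);
    if (tmpFile.renameTo(finalFile)) {
        confFile.delete(); // the .conf bookkeeping file is no longer needed
    }
}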

Summary

That's about it for today. Give it a try yourself, and follow this column for more hands-on practice; let's learn together.