Implementing chunked file upload with resumable (breakpoint) support in Java

Tip: the examples below are for reference.

First, the core problem: if a large file's upload is interrupted partway through, restarting from scratch is very time-consuming, and you have no way of knowing which parts were already uploaded before the connection dropped. Splitting the large file into chunks first avoids both problems.

Front-end code:

<!DOCTYPE html>
<html>
<head>
    <title>File upload example</title>
    <meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
</head>
<body>
<form>
    <input type="file" id="fileInput" multiple>
    <button type="button" onclick="upload()" >Upload</button>
</form>
<script>
    function upload() {
        var fileInput = document.getElementById('fileInput');
        var files = fileInput.files;
        var fileName = files[0].name;
        var chunkSize = 1024 * 10; // 10 KB per chunk, deliberately small to simulate many chunks
        var totalChunks = Math.ceil(files[0].size / chunkSize);
        var currentChunk = 0;

        
        function uploadChunk() {
            var xhr = new XMLHttpRequest();
            var formData = new FormData();

            // Metadata the server needs to place this chunk.
            formData.append('currentChunk', currentChunk);
            formData.append('totalChunks', totalChunks);
            formData.append('fileName', fileName);

            // Slice the current window out of the file.
            var start = currentChunk * chunkSize;
            var end = Math.min(files[0].size, start + chunkSize);
            var chunk = files[0].slice(start, end);
            formData.append('chunk', chunk);

            // Send the chunk to the back end.
            xhr.open('POST', '/file/upload');
            xhr.send(formData);

            xhr.onload = function() {
                // Move on to the next chunk once this one has been accepted.
                currentChunk++;
                if (currentChunk < totalChunks) {
                    uploadChunk();
                } else {
                    // All chunks sent: ask the server to merge them.
                    mergeChunks(fileName);
                }
            }
        }

        
        // Ask the server to merge all uploaded chunks into the final file.
        function mergeChunks(fileName) {
            var xhr = new XMLHttpRequest();
            xhr.open("POST", "/file/merge", true);
            xhr.setRequestHeader("Content-type", "application/x-www-form-urlencoded");
            xhr.onreadystatechange = function() {
                if (xhr.readyState === 4) {
                    if (xhr.status === 200) {
                        console.log("File upload completed:", xhr.responseText);
                    } else {
                        console.error(xhr.responseText);
                    }
                }
            };
            xhr.send("fileName=" + fileName);
        }

        
        // Kick off the upload with the first chunk.
        uploadChunk();
    }
</script>
</body>
</html>

ps: The above demo is plain HTML + JS, with requests sent via XMLHttpRequest; the address in xhr.open should be your own local interface address. Since testing does not require genuinely large files, the chunk size is set to 10 KB to simulate a large-file upload.

Backend code:

@RestController
@RequestMapping("/file")
public class FileController {
    @Autowired
    private ResourceLoader resourceLoader;

    @Value("${my.config.savePath}")
    private String uploadPath;

    private Map<String, List<File>> chunksMap = new ConcurrentHashMap<>();

    @PostMapping("/upload")
    public void upload(@RequestParam int currentChunk, @RequestParam int totalChunks,
                       @RequestParam MultipartFile chunk,@RequestParam String fileName) throws IOException {

        
        // Save the chunk to disk, named after the target file plus the chunk index.
        // (The MultipartFile's own original filename is typically just "blob" for a
        // Blob slice, so it cannot be used to name the chunk.)
        String chunkName = fileName + "." + currentChunk;
        File chunkFile = new File(uploadPath, chunkName);
        chunk.transferTo(chunkFile);

        
        // Record that this chunk has arrived; the list is checked and merged later.
        List<File> chunkList = chunksMap.get(fileName);
        if (chunkList == null) {
            chunkList = new ArrayList<>(totalChunks);
            chunksMap.put(fileName, chunkList);
        }
        chunkList.add(chunkFile);
    }

    @PostMapping("/merge")
    public String merge(@RequestParam String fileName) throws IOException {

        
        // Refuse to merge if no chunks were recorded for this file.
        List<File> chunkList = chunksMap.get(fileName);
        if (chunkList == null || chunkList.size() == 0) {
            throw new RuntimeException("The chunks do not exist");
        }

        File outputFile = new File(uploadPath, fileName);
        // Append each chunk to the output file in upload order, deleting it afterwards.
        try (FileChannel outChannel = new FileOutputStream(outputFile).getChannel()) {
            for (int i = 0; i < chunkList.size(); i++) {
                try (FileChannel inChannel = new FileInputStream(chunkList.get(i)).getChannel()) {
                    inChannel.transferTo(0, inChannel.size(), outChannel);
                }
                chunkList.get(i).delete();
            }
        }

        chunksMap.remove(fileName);
        // Return a URI the client can use to reach the merged file.
        Resource resource =
           resourceLoader.getResource("file:" + uploadPath + fileName);
        return resource.getURI().toString();
    }
}

ps: A map records which chunks have been uploaded. Here the chunks are stored in a local folder; after all of them have arrived they are merged and the chunk files deleted. ConcurrentHashMap is used instead of HashMap because it is safe under concurrent access.
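Note that the check-then-put in upload() is not atomic when two chunks of the same file arrive concurrently. A minimal sketch of an atomic variant (same behavior, using computeIfAbsent and a synchronized list):

List<File> chunkList = chunksMap.computeIfAbsent(
        fileName, k -> Collections.synchronizedList(new ArrayList<>(totalChunks)));
chunkList.add(chunkFile);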

The above is only a simple chunked-upload example, but with a few further modifications it can solve the problems discussed below.

1. How to avoid a large number of hard disk reads and writes

One drawback of the code above is that chunk contents are written to a local folder, and the merge step also reads every chunk back from that folder to decide whether the upload is complete. Heavy disk I/O is not just slow; under load it can bring the server down. The code below therefore stores chunk data in Redis to avoid excessive disk reads and writes. (You could also keep this state in MySQL or other middleware; since frequent binary reads and writes are a poor fit for MySQL, Redis is used here.)
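Concretely, the code in section 3 below keeps each upload in a single Redis hash, one field per chunk (a sketch of the layout; the key prefix matches the FILE_UPLOAD_PREFIX constant used there):

// Redis layout (one hash per file):
//   key:   "file_upload:" + fileId
//   field: chunk index as a string ("0", "1", "2", ...)
//   value: the raw chunk bytes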

2. What to do when the upload of a large file is interrupted partway through

Store the chunk contents in Redis. After a disconnection, the already-uploaded chunks remain in Redis, so when the user retries, the back end can check which chunk indexes Redis already holds and skip them.
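For example, the back end can expose a query endpoint that returns the chunk indexes already present in Redis, so the front end only uploads the rest. This is a sketch, not part of the original code: the /uploaded path is an assumption, and it reuses the RedisConnection setup from section 3 below.

    // Hypothetical resume-check endpoint: lists the chunk indexes already stored in Redis.
    @GetMapping("/uploaded")
    public ResponseEntity<List<Integer>> uploadedChunks(@RequestParam("fileId") String fileId) {
        RedisConnection connection = redisConnectionThreadLocal.get();
        Set<byte[]> fields = connection.hKeys((FILE_UPLOAD_PREFIX + fileId).getBytes());
        List<Integer> uploaded = new ArrayList<>();
        for (byte[] field : fields) {
            uploaded.add(Integer.parseInt(new String(field)));
        }
        return ResponseEntity.ok(uploaded);
    }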

3. How to detect that the uploaded data is inconsistent with the original file

When the front end calls the upload interface, it first computes a checksum of the chunk and sends the chunk together with the checksum to the back end. The back end recomputes the checksum over the received bytes and compares the two: if they are equal, the data is consistent; if not, an error is returned and the front end re-uploads that chunk. The JS checksum code:

    // Compute the SHA-256 hash of a chunk using the built-in Web Crypto API.
    function calculateHash(fileChunk) {
        return new Promise((resolve, reject) => {
            const blob = new Blob([fileChunk]);
            const reader = new FileReader();
            reader.readAsArrayBuffer(blob);
            reader.onload = () => {
                const arrayBuffer = reader.result;
                const crypto = window.crypto || window.msCrypto;
                crypto.subtle.digest("SHA-256", arrayBuffer)
                    .then(hash => {
                        // Render the digest as a lowercase hex string.
                        const hashArray = Array.from(new Uint8Array(hash));
                        const hashHex = hashArray.map(b => b.toString(16).padStart(2, '0')).join('');
                        resolve(hashHex);
                    })
                    .catch(reject);
            };
            reader.onerror = () => {
                reject(new Error('Failed to calculate hash'));
            };
        });
    }
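A usage sketch (an assumption, not part of the original demo): since calculateHash returns a Promise, the checksum has to be computed before the chunk is sent, e.g. inside uploadChunk:

    calculateHash(chunk).then(checksum => {
        formData.append('chunkChecksum', checksum);
        formData.append('chunk', chunk);
        xhr.send(formData);
    });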

public static String calculateHash(byte[] fileChunk) throws Exception {
    MessageDigest md = MessageDigest.getInstance("SHA-256");
    md.update(fileChunk);
    byte[] hash = md.digest();
    // Render the digest as lowercase hex so it matches the front-end result.
    ByteBuffer byteBuffer = ByteBuffer.wrap(hash);
    StringBuilder hexString = new StringBuilder();
    while (byteBuffer.hasRemaining()) {
        hexString.append(String.format("%02x", byteBuffer.get()));
    }
    return hexString.toString();
}

Note:

  1. The front end and back end must use the same checksum algorithm, otherwise the results will never match.

  2. The example above uses the browser's built-in Web Crypto API (window.crypto.subtle), so no extra library is needed. If you use crypto-js instead, introduce it with a script tag or download it directly:

<script src="https://cdn.bootcss.com/crypto-js/3.1.9-1/crypto-js.min.js"></script>

crypto-js download address: [1]. If GitHub cannot be opened, you can also install it via npm.

4. After a disconnection, how to determine which chunks have not been uploaded

Use Redis to detect which chunk indexes are missing: every index that is not present in the Redis hash is collected into a list, and the list is returned to the front end.

boolean allChunksUploaded = true;
List<Integer> missingChunkIndexes = new ArrayList<>();
// Note: using hashMap.size() as the expected count only detects gaps in the
// middle of the sequence; to also catch missing trailing chunks, compare
// against a totalChunks value recorded at upload time.
for (int i = 0; i < hashMap.size(); i++) {
    if (!hashMap.containsKey(String.valueOf(i))) {
        allChunksUploaded = false;
        missingChunkIndexes.add(i);
    }
}
if (!allChunksUploaded) {
    return ResponseEntity.status(HttpStatus.BAD_REQUEST).body(missingChunkIndexes);
}

1. Introduce dependencies

<dependency>
    <groupId>io.lettuce</groupId>
    <artifactId>lettuce-core</artifactId>
    <version>6.1.4.RELEASE</version>
</dependency>
<dependency>
    <groupId>org.springframework.boot</groupId>
    <artifactId>spring-boot-starter-data-redis</artifactId>
</dependency>

lettuce is a Redis client; spring-boot-starter-data-redis already includes it, so you can also omit the explicit dependency and simply use RedisTemplate.

2. Front-end code

The original demo page (titled "File Upload Demo") follows the same pattern as the first example, extended to compute a SHA-256 checksum for every chunk and to carry the server-assigned fileId through each request.
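A minimal sketch of that upload flow (a reconstruction under stated assumptions, not the original code; it uses fetch and async/await for brevity, the field names and endpoint paths match the controller below, and calculateHash is the function shown earlier):

async function upload() {
    const file = document.getElementById('fileInput').files[0];
    const chunkSize = 1024 * 10;                    // 10 KB per chunk, as in the first demo
    const totalChunks = Math.ceil(file.size / chunkSize);
    let fileId = '';                                // assigned by the server on the first chunk

    for (let i = 0; i < totalChunks; i++) {
        const chunk = file.slice(i * chunkSize, Math.min(file.size, (i + 1) * chunkSize));
        const formData = new FormData();
        formData.append('chunk', chunk);
        formData.append('chunkIndex', i);
        formData.append('chunkSize', chunkSize);
        formData.append('chunkChecksum', await calculateHash(chunk));
        formData.append('fileId', fileId);
        const resp = await fetch('/file2/upload', { method: 'POST', body: formData });
        fileId = await resp.text();
    }

    // All chunks sent: ask the server to merge them.
    const merged = await fetch('/file2/merge', {
        method: 'POST',
        headers: { 'Content-Type': 'application/x-www-form-urlencoded' },
        body: 'fileId=' + fileId + '&fileName=' + encodeURIComponent(file.name)
    });
    console.log(await merged.text());
}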




3. Backend interface code

@RestController
@RequestMapping("/file2")
public class File2Controller {

    private static final String FILE_UPLOAD_PREFIX = "file_upload:";

    @Autowired
    private ResourceLoader resourceLoader;

    @Value("${my.config.savePath}")
    private String uploadPath;
    @Autowired
    private ThreadLocal<RedisConnection> redisConnectionThreadLocal;

    @PostMapping("/upload")
    public ResponseEntity<?> uploadFile(@RequestParam("chunk") MultipartFile chunk,
                                        @RequestParam("chunkIndex") Integer chunkIndex,
                                        @RequestParam("chunkSize") Integer chunkSize,
                                        @RequestParam("chunkChecksum") String chunkChecksum,
                                        @RequestParam("fileId") String fileId) throws Exception {
        // A blank fileId means this is the first chunk; assign a new id for the file.
        if (StringUtils.isBlank(fileId)) {
            fileId = UUID.randomUUID().toString();
        }
        String key = FILE_UPLOAD_PREFIX + fileId;
        byte[] chunkBytes = chunk.getBytes();
        String actualChecksum = calculateHash(chunkBytes);
        if (!chunkChecksum.equals(actualChecksum)) {
            return ResponseEntity.status(HttpStatus.BAD_REQUEST).body("Chunk checksum does not match");
        }

        // Store the chunk in a Redis hash keyed by fileId, one field per chunk index;
        // skip the write if this chunk is already present (e.g. after a resumed upload).
        RedisConnection connection = redisConnectionThreadLocal.get();
        Boolean flag = connection.hExists(key.getBytes(), String.valueOf(chunkIndex).getBytes());
        if (!Boolean.TRUE.equals(flag)) {
            connection.hSet(key.getBytes(), String.valueOf(chunkIndex).getBytes(), chunkBytes);
        }

        return ResponseEntity.ok(fileId);
    }

    public static String calculateHash(byte[] fileChunk) throws Exception {
        MessageDigest md = MessageDigest.getInstance("SHA-256");
        md.update(fileChunk);
        byte[] hash = md.digest();
        // Lowercase hex rendering; must match the front end's output format.
        ByteBuffer byteBuffer = ByteBuffer.wrap(hash);
        StringBuilder hexString = new StringBuilder();
        while (byteBuffer.hasRemaining()) {
            hexString.append(String.format("%02x", byteBuffer.get()));
        }
        return hexString.toString();
    }

    @PostMapping("/merge")
    public ResponseEntity<?> mergeFile(@RequestParam("fileId") String fileId, @RequestParam("fileName") String fileName) throws IOException {
        String key = FILE_UPLOAD_PREFIX + fileId;
        RedisConnection connection = redisConnectionThreadLocal.get();
        try {
            Map<byte[], byte[]> chunkMap = connection.hGetAll(key.getBytes());

            if (chunkMap.isEmpty()) {
                return ResponseEntity.status(HttpStatus.NOT_FOUND).body("File not found");
            }

            // Convert the byte[] field names to Strings for easy index lookups.
            Map<String, byte[]> hashMap = new HashMap<>();
            for (Map.Entry<byte[], byte[]> entry : chunkMap.entrySet()) {
                hashMap.put(new String(entry.getKey()), entry.getValue());
            }
            
            // Check that no chunk index is missing before merging.
            // (Using hashMap.size() as the expected count only detects gaps in the
            // middle; comparing against a recorded totalChunks would also catch
            // missing trailing chunks.)
            boolean allChunksUploaded = true;
            List<Integer> missingChunkIndexes = new ArrayList<>();
            for (int i = 0; i < hashMap.size(); i++) {
                if (!hashMap.containsKey(String.valueOf(i))) {
                    allChunksUploaded = false;
                    missingChunkIndexes.add(i);
                }
            }
            if (!allChunksUploaded) {
                return ResponseEntity.status(HttpStatus.BAD_REQUEST).body(missingChunkIndexes);
            }

            File outputFile = new File(uploadPath, fileName);
            boolean flag = mergeChunks(hashMap, outputFile);
            Resource resource = resourceLoader.getResource("file:" + uploadPath + fileName);


            if (flag) {
                // Merge succeeded: drop the chunk data from Redis and return the file URI.
                connection.del(key.getBytes());
                return ResponseEntity.ok().body(resource.getURI().toString());
            } else {
                return ResponseEntity.status(555).build();
            }
        } catch (Exception e) {
            e.printStackTrace();
            return ResponseEntity.status(HttpStatus.INTERNAL_SERVER_ERROR).body(e.getMessage());
        }
    }

    private boolean mergeChunks(Map<String, byte[]> chunkMap, File destFile) {
        try (FileOutputStream outputStream = new FileOutputStream(destFile)) {
            // Write the chunks in index order to rebuild the original file.
            for (int i = 0; i < chunkMap.size(); i++) {
                byte[] chunkBytes = chunkMap.get(String.valueOf(i));
                outputStream.write(chunkBytes);
            }
            return true;
        } catch (IOException e) {
            e.printStackTrace();
            return false;
        }
    }
}

4. redis configuration

@Configuration
public class RedisConfig {
    @Value("${spring.redis.host}")
    private String host;

    @Value("${spring.redis.port}")
    private int port;

    @Value("${spring.redis.password}")
    private String password;

    @Bean
    public RedisConnectionFactory redisConnectionFactory() {
        RedisStandaloneConfiguration config = new RedisStandaloneConfiguration();
        config.setHostName(host);
        config.setPort(port);
        config.setPassword(RedisPassword.of(password));
        return new LettuceConnectionFactory(config);
    }
    @Bean
    public ThreadLocal<RedisConnection> redisConnectionThreadLocal(RedisConnectionFactory redisConnectionFactory) {
        return ThreadLocal.withInitial(() -> redisConnectionFactory.getConnection());
    }
}


The redisConnectionThreadLocal bean gives every thread its own RedisConnection, so a new connection does not have to be established on each request, which would be very time-consuming.
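For completeness, a minimal application.properties sketch for the placeholders referenced above (the property names match the @Value annotations; the values are examples to adapt):

my.config.savePath=/data/upload/
spring.redis.host=127.0.0.1
spring.redis.port=6379
spring.redis.password=yourpassword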

The above is the complete code for this feature. When using it, remember to point uploadPath at a directory that actually exists, otherwise the code will not find the path. As a final improvement, you can compute a checksum over the entire file, store the checksum, file name, file size, and file type in MySQL, and check for that checksum before the next large upload; if it already exists, skip the upload to save time and space.
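A sketch of that pre-upload check, assuming Spring's JdbcTemplate and a hypothetical uploaded_file table (the table and column names are assumptions):

    @Autowired
    private JdbcTemplate jdbcTemplate;

    // Hypothetical "instant upload" check: skip the upload entirely if the
    // whole-file checksum is already recorded in the database.
    @GetMapping("/exists")
    public ResponseEntity<Boolean> fileExists(@RequestParam("checksum") String checksum) {
        Integer count = jdbcTemplate.queryForObject(
                "SELECT COUNT(*) FROM uploaded_file WHERE checksum = ?", Integer.class, checksum);
        return ResponseEntity.ok(count != null && count > 0);
    }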

Reference materials

[1] crypto-js releases: https://github.com/brix/crypto-js/releases/tag/4.1.0
