Efficient solution: inserting 300,000 records into MySQL in just 13 seconds



Source: blog.csdn.net/qq_35427589/article/details/129665307

  • Inserting 300,000 records into the database for verification

  • Entity class, mapper and configuration file definition

    • User entity

    • mapper interface

    • mapper.xml file

    • jdbc.properties

    • sqlMapConfig.xml

  • Inserting all at once (no batching)

  • Inserting one by one in a loop

  • Inserting 300,000 records with MyBatis

  • Inserting 300,000 records with JDBC

  • Summary


This article walks through inserting a large volume of data via MyBatis and JDBC, comparing the approaches and their measured results.

Inserting 300,000 records into the database for verification


The table structure used for verification is as follows:

CREATE TABLE `t_user` (
  `id` int(11) NOT NULL AUTO_INCREMENT COMMENT 'user id',
  `username` varchar(64) DEFAULT NULL COMMENT 'username',
  `age` int(4) DEFAULT NULL COMMENT 'age',
  PRIMARY KEY (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8 COMMENT='User Information Table';

Without further ado, let’s start!


Entity class, mapper and configuration file definition

User entity

import lombok.Data;

/**
 * <p>User entity</p>
 *
 * @author zjq
 */
@Data // Lombok generates getters, setters, equals/hashCode and toString
public class User {

    private int id;
    private String username;
    private int age;

}

mapper interface

public interface UserMapper {

    /**
     * Batch insert users
     * @param userList the users to insert
     */
    void batchInsertUser(@Param("list") List<User> userList);

}

mapper.xml file

<!-- Batch insert user information -->
<insert id="batchInsertUser" parameterType="java.util.List">
    insert into t_user(username, age) values
    <foreach collection="list" item="item" index="index" separator=",">
        (
        #{item.username},
        #{item.age}
        )
    </foreach>
</insert>
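For reference, the <foreach> above expands into a single multi-row INSERT, so the whole list travels to MySQL as one statement, roughly of this shape (values illustrative):

-- Shape of the statement MyBatis generates
insert into t_user(username, age) values
  ('Share a drink without 1', 42),
  ('Share a drink without 2', 17),
  -- ... one tuple per element of the list ...
  ('Share a drink without 300000', 66);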

jdbc.properties

jdbc.driver=com.mysql.jdbc.Driver
jdbc.url=jdbc:mysql://localhost:3306/test
jdbc.username=root
jdbc.password=root
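A note on driver versions (an addition here, not part of the original setup): com.mysql.jdbc.Driver is the Connector/J 5.x class name. On Connector/J 8+, the class is com.mysql.cj.jdbc.Driver and the URL usually needs an explicit timezone, for example:

jdbc.driver=com.mysql.cj.jdbc.Driver
jdbc.url=jdbc:mysql://localhost:3306/test?serverTimezone=UTC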

sqlMapConfig.xml

<?xml version="1.0" encoding="UTF-8" ?>
<!DOCTYPE configuration PUBLIC "-//mybatis.org//DTD Config 3.0//EN" "http://mybatis.org/dtd/mybatis-3-config.dtd">
<configuration>

    <!--Load external properties file through properties tag-->
    <properties resource="jdbc.properties"></properties>


    <!--Custom Alias-->
    <typeAliases>
        <typeAlias type="com.zjq.domain.User" alias="user"></typeAlias>
    </typeAliases>


    <!--Data source environment-->
    <environments default="development">
        <environment id="development">
            <transactionManager type="JDBC"></transactionManager>
            <dataSource type="POOLED">
                <property name="driver" value="${jdbc.driver}"/>
                <property name="url" value="${jdbc.url}"/>
                <property name="username" value="${jdbc.username}"/>
                <property name="password" value="${jdbc.password}"/>
            </dataSource>
        </environment>
    </environments>


    <!--Load mapping file-->
    <mappers>
        <mapper resource="com/zjq/mapper/UserMapper.xml"></mapper>
    </mappers>


</configuration>


Inserting all at once (no batching)

First attempt: hand MyBatis all 300,000 records in a single insert. The code is as follows:

@Test
public void testBatchInsertUser() throws IOException {
    InputStream resourceAsStream =
            Resources.getResourceAsStream("sqlMapConfig.xml");
    SqlSessionFactory sqlSessionFactory = new SqlSessionFactoryBuilder().build(resourceAsStream);
    SqlSession session = sqlSessionFactory.openSession();
    System.out.println("===== start inserting data =====");
    long startTime = System.currentTimeMillis();
    try {
        List<User> userList = new ArrayList<>();
        for (int i = 1; i <= 300000; i++) {
            User user = new User();
            user.setId(i);
            user.setUsername("Share a drink without " + i);
            user.setAge((int) (Math.random() * 100));
            userList.add(user);
        }
        // hand the whole list to one multi-row insert
        session.insert("batchInsertUser", userList);
        session.commit();

        long spendTime = System.currentTimeMillis() - startTime;
        System.out.println("Successfully inserted 300,000 records, took " + spendTime + " ms");
    } finally {
        session.close();
    }
}

You can see the console output:

Cause: com.mysql.jdbc.PacketTooBigException: Packet for query is too large (27759038 > 4194304). You can change this value on the server by setting the max_allowed_packet' variable.


The statement exceeds MySQL's maximum packet size. You could raise max_allowed_packet to allow a larger payload, but 300,000 rows in one statement overshoots any sensible limit, so going all-in is not advisable.
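For reference, here is how you would inspect and raise the limit (the new value is illustrative; this still does not make the single-statement approach a good idea):

-- Check the current limit (the 4194304 in the error is the default 4 MB)
SHOW VARIABLES LIKE 'max_allowed_packet';
-- Raise it for new connections, e.g. to 64 MB
SET GLOBAL max_allowed_packet = 67108864;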

Since all-in doesn't work, what about inserting the rows one by one in a loop?

Inserting one by one in a loop

Add a single-row insert to the mapper interface and mapper file:

/**
 * Insert a single user
 * @param user the user to insert
 */
void insertUser(User user);

<!-- Insert user information -->
<insert id="insertUser" parameterType="user">
    insert into t_user(username, age) values
        (
        #{username},
        #{age}
        )
</insert>

Adjust the execution code as follows:

@Test
public void testCirculateInsertUser() throws IOException {
    InputStream resourceAsStream =
            Resources.getResourceAsStream("sqlMapConfig.xml");
    SqlSessionFactory sqlSessionFactory = new SqlSessionFactoryBuilder().build(resourceAsStream);
    SqlSession session = sqlSessionFactory.openSession();
    System.out.println("===== start inserting data =====");
    long startTime = System.currentTimeMillis();
    try {
        for (int i = 1; i <= 300000; i++) {
            User user = new User();
            user.setId(i);
            user.setUsername("Share a drink without " + i);
            user.setAge((int) (Math.random() * 100));
            // insert and commit one row at a time
            session.insert("insertUser", user);
            session.commit();
        }

        long spendTime = System.currentTimeMillis() - startTime;
        System.out.println("Successfully inserted 300,000 records, took " + spendTime + " ms");
    } finally {
        session.close();
    }
}

Once this runs, disk I/O usage spikes and stays high.


And then we wait. And wait. And wait some more.

It is far too slow, so let's leave it running, do something else, and check the result later.

Two thousand years later…

The console output shows a total run time of 14,909,367 milliseconds, roughly 4 hours and 8 minutes. Far too slow. Each iteration pays a full network round trip plus a transaction commit, and that per-row overhead is what kills the throughput.

Let's go back and optimize the earlier batch-processing scheme.

Inserting 300,000 records with MyBatis

First clear out the table, then run the batch insert in chunks:

-- Clear the user table
TRUNCATE TABLE t_user;

The following MyBatis code inserts the 300,000 records in batches:

/**
 * Batch insertion in chunks, with a pause between batches
 */
@Test
public void testBatchInsertUser() throws IOException {
    InputStream resourceAsStream =
            Resources.getResourceAsStream("sqlMapConfig.xml");
    SqlSessionFactory sqlSessionFactory = new SqlSessionFactoryBuilder().build(resourceAsStream);
    SqlSession session = sqlSessionFactory.openSession();
    System.out.println("===== start inserting data =====");
    long startTime = System.currentTimeMillis();
    int waitTime = 10;
    try {
        List<User> userList = new ArrayList<>();
        for (int i = 1; i <= 300000; i++) {
            User user = new User();
            user.setId(i);
            user.setUsername("Share a drink without " + i);
            user.setAge((int) (Math.random() * 100));
            userList.add(user);
            if (i % 1000 == 0) {
                session.insert("batchInsertUser", userList);
                // commit a transaction every 1000 records
                session.commit();
                userList.clear();

                // pause between batches
                Thread.sleep(waitTime * 1000);
            }
        }
        // insert any remaining records
        if (!CollectionUtils.isEmpty(userList)) {
            session.insert("batchInsertUser", userList);
            session.commit();
        }

        long spendTime = System.currentTimeMillis() - startTime;
        System.out.println("Successfully inserted 300,000 records, took " + spendTime + " ms");
    } catch (Exception e) {
        e.printStackTrace();
    } finally {
        session.close();
    }
}

With MyBatis batching, inserting 1,000 records per batch noticeably improves speed. Choose the batch size and the wait time between batches carefully to keep memory usage in check, and make sure the connection pool and database parameters in the configuration are tuned sensibly for the load.


In the example above, we commit a batch every 1,000 inserted rows and then wait 10 seconds. This keeps memory usage under control and lets the insert run smoothly.


The run finished in about fifty minutes, and most of that time was spent waiting (300 batches at 10 seconds each is 50 minutes of sleep).

If you run it during off-peak hours and there is enough CPU and disk headroom, you can drop the waits and batch straight through:

/**
 * Batch insertion in chunks, no pause
 */
@Test
public void testBatchInsertUser() throws IOException {
    InputStream resourceAsStream =
            Resources.getResourceAsStream("sqlMapConfig.xml");
    SqlSessionFactory sqlSessionFactory = new SqlSessionFactoryBuilder().build(resourceAsStream);
    SqlSession session = sqlSessionFactory.openSession();
    System.out.println("===== start inserting data =====");
    long startTime = System.currentTimeMillis();
    try {
        List<User> userList = new ArrayList<>();
        for (int i = 1; i <= 300000; i++) {
            User user = new User();
            user.setId(i);
            user.setUsername("Share a drink without " + i);
            user.setAge((int) (Math.random() * 100));
            userList.add(user);
            if (i % 1000 == 0) {
                session.insert("batchInsertUser", userList);
                // commit a transaction every 1000 records
                session.commit();
                userList.clear();
            }
        }
        // insert any remaining records
        if (!CollectionUtils.isEmpty(userList)) {
            session.insert("batchInsertUser", userList);
            session.commit();
        }

        long spendTime = System.currentTimeMillis() - startTime;
        System.out.println("Successfully inserted 300,000 records, took " + spendTime + " ms");
    } catch (Exception e) {
        e.printStackTrace();
    } finally {
        session.close();
    }
}

The insert now completes in 24 seconds:


As expected, CPU and disk usage spike for the short duration of the run.

Increase the batch size to 5,000 and run again:


All 300,000 records were inserted in just 13 seconds. Now we're flying!
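As an aside, MyBatis also offers a batch executor that buffers single-row statements and flushes them in chunks. A minimal sketch, reusing the insertUser statement defined earlier and assuming the same sqlSessionFactory (the 5,000-row flush interval is illustrative):

import org.apache.ibatis.session.ExecutorType;

// Batch executor: statements are buffered and sent to MySQL in chunks
SqlSession session = sqlSessionFactory.openSession(ExecutorType.BATCH, false);
try {
    for (int i = 1; i <= 300000; i++) {
        User user = new User();
        user.setUsername("Share a drink without " + i);
        user.setAge((int) (Math.random() * 100));
        session.insert("insertUser", user);
        if (i % 5000 == 0) {
            session.flushStatements(); // push buffered statements to MySQL
            session.commit();
        }
    }
    session.flushStatements(); // flush and commit any remainder
    session.commit();
} finally {
    session.close();
}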

Inserting 300,000 records with JDBC

Inserting one row at a time with JDBC behaves much like the MyBatis loop above, so we won't repeat it here.

Below is sample Java code that uses JDBC batching to insert the 300,000 records. It illustrates the idea; adapt the details to your own environment.

/**
 * JDBC batch insert in chunks
 */
@Test
public void testJDBCBatchInsertUser() {
    Connection connection = null;
    PreparedStatement preparedStatement = null;

    String databaseURL = "jdbc:mysql://localhost:3306/test";
    String user = "root";
    String password = "root";

    try {
        connection = DriverManager.getConnection(databaseURL, user, password);
        // Turn off auto-commit and commit manually per batch
        connection.setAutoCommit(false);
        System.out.println("===== start inserting data =====");
        long startTime = System.currentTimeMillis();
        String sqlInsert = "INSERT INTO t_user (username, age) VALUES (?, ?)";
        preparedStatement = connection.prepareStatement(sqlInsert);

        Random random = new Random();
        for (int i = 1; i <= 300000; i++) {
            preparedStatement.setString(1, "Share a drink without " + i);
            preparedStatement.setInt(2, random.nextInt(100));
            // add to the current batch
            preparedStatement.addBatch();

            if (i % 1000 == 0) {
                // execute and commit every 1000 records
                preparedStatement.executeBatch();
                connection.commit();
                System.out.println("Inserted " + i + " records so far");
            }

        }
        // flush any remaining records
        preparedStatement.executeBatch();
        connection.commit();
        long spendTime = System.currentTimeMillis() - startTime;
        System.out.println("Successfully inserted 300,000 records, took " + spendTime + " ms");
    } catch (SQLException e) {
        System.out.println("Error: " + e.getMessage());
    } finally {
        if (preparedStatement != null) {
            try {
                preparedStatement.close();
            } catch (SQLException e) {
                e.printStackTrace();
            }
        }

        if (connection != null) {
            try {
                connection.close();
            } catch (SQLException e) {
                e.printStackTrace();
            }
        }
    }
}


In the sample code above, we connect to MySQL through JDBC and insert the data in batches. The steps are:

  • Get a database connection and disable auto-commit.

  • Precompile the SQL with a PreparedStatement and bind the parameters for each row.

  • Accumulate rows with addBatch() and execute each chunk with executeBatch(), committing after each one.

  • Flush and commit the remaining rows.

  • Close the PreparedStatement and Connection in finally.

setAutoCommit(false) disables automatic commits, and we commit manually after each batch. A single PreparedStatement is created once and reused for every row, and executeBatch() is invoked every 1,000 rows.

In addition, tune the connection pool and database configuration for your workload to avoid problems such as connection timeouts.
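One MySQL-specific tuning knob worth mentioning (an addition here, not part of the original test): by default, MySQL Connector/J sends each statement in a JDBC batch to the server individually. Setting rewriteBatchedStatements=true on the connection URL lets the driver rewrite a batch into multi-row INSERTs, which typically speeds up exactly this kind of load:

// Sketch: enable batch rewriting on the JDBC URL
String databaseURL =
        "jdbc:mysql://localhost:3306/test?rewriteBatchedStatements=true";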

Summary

Efficient large-scale insertion comes from combining the following optimization strategies:

1. Batch processing: submitting SQL in batches reduces network round trips and per-statement overhead. In Java, accumulate statements with the addBatch() method of Statement or PreparedStatement, then send them in one go with executeBatch().

  • Use an appropriate wait time and batch size when inserting in a loop, to avoid problems such as excessive memory usage:

    • Batch size: how many rows go into one insert operation. Too small, and the operation frequency stays high; too large, and memory usage climbs. A batch of 1,000 to 5,000 rows is a reasonable starting point.

    • Wait time: the pause between batch operations. Too short, and memory pressure builds; too long, and the overall insert slows down. A few seconds to a few tens of seconds usually keeps the run smooth.

    • Memory optimization techniques, such as an in-memory database or cursor-based insertion, can further reduce memory usage.

  • In short, a sensible batch size and wait time keep the insert running smoothly without exhausting memory.

2. Indexes: temporarily drop secondary indexes before a bulk load and rebuild them afterwards, which avoids updating the index on every write (see the sketch after this list).

3. Connection pooling: a database connection pool avoids the cost of repeatedly opening and closing connections. If you don't use one, remember to close the connection in finally.

4. Database parameter tuning: increase the MySQL buffer sizes, and provision fast disks and I/O.
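To illustrate point 2, a hedged sketch. The idx_username index is hypothetical, since our t_user table only defines a primary key:

-- Hypothetical secondary index on t_user
ALTER TABLE t_user DROP INDEX idx_username;             -- drop before the bulk load
-- ... run the batch insert ...
ALTER TABLE t_user ADD INDEX idx_username (username);   -- rebuild afterwards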
