MySQL master-slave replication (based on GTID–transaction ID method)

1. GTID related concepts

1.What is GTID?

Introduction to gitd

It is supported starting from MySQL-5.6.5 and will be improved after MySQL-5.6.10;

MySQL database has added a GDIT-based replication method starting from 5.6.5. GDIT ensures that each transaction submitted on the main database has a unique ID in the cluster. This method strengthens the main-standby consistency, failure recovery and fault tolerance of the database.

GTID (Global Transaction ID) Each transaction ID is globally unique in the entire transaction process, and the ID will not be the same in the entire master-slave replication architecture.

GTID is actually composed of UUID + TID. Where UUID is the unique identifier of a MySQL instance. The TID represents the number of transactions that have been committed on this instance, and increases monotonically as transactions are committed.

server_uuid: server_uuid is a uuid (128-bit) random value automatically generated during the first startup of Mysql. After generation, the value will be stored in auto.cnf in the data directory. Because it is a random value, the server_uuid of Mysql on different servers is different.

transaction_id (tid): represents the number of transactions that have been submitted on this instance. It is an integer with an initial value of 1. It is assigned to this transaction and incremented by 1 each time a transaction is submitted.

GTID is used to replace traditional replication. The biggest difference between GTID replication and ordinary replication mode is that there is no need to specify the binary file name and location.

2. Concept of GTID master-slave replication method

The emergence of GTID-based master-slave replication is mainly used to replace the traditional log point replication method. GTID can ensure that each transaction submitted by the main library has a unique transaction ID in the cluster.

It strengthens the consistency of the master-slave database and the fault-tolerance ability of fault recovery data. When the master database goes down and a master-slave switchover occurs, the GTID method allows other slave databases to automatically find the copy location of the new master database.

Moreover, GTID can ignore transactions that have already been executed, reducing the probability of data errors.

A GTID is only executed once on a server to avoid data confusion or master-slave inconsistency caused by repeated execution;

GTID is used to replace the traditional replication method. MASTER_LOG_FILE + MASTER_LOG_POS is no longer used to start replication. Instead, use MASTER_AUTO_POSTION=1 to start copying;

The binlog on the [slave] side of GTID must be opened in order to record the executed GTID (mandatory).

3. Advantages and disadvantages of GTID

Advantages:

Based on the GTID, you can quickly determine on which instance the transaction was originally committed.

Easier to set up master-slave replication to ensure that each transaction will only be executed once.

A transaction corresponds to a unique ID, and a GTID will only be executed once on a server.

GTID is used to replace the traditional replication method. The biggest difference between GTID replication and ordinary replication mode is that there is no need to specify the binary file name and location.

Simple implementation of failover, no need to look for log_file and log_pos as before.

Reduce manual intervention and reduce service failure time. When the host hangs up, software can promote a standby machine from many standby machines as the host.

Disadvantages:

The table storage engines of the master-slave database must be consistent

Inconsistency in the table storage engines of the master-slave database will lead to data inconsistency. If the storage engines of the master-slave database are inconsistent, for example, one is a transactional storage engine and the other is a non-transactional storage engine, the one-to-one relationship between transactions and GTIDs will be destroyed, and as a result, GTID-based replication will not run correctly;

master: performs a multi-sql update on an innodb table, the effect is to generate a GTID.

slave: Assuming that the corresponding table is a MYISAM engine, an error will be reported after executing the first statement of this GTID, because a SQL statement in a non-transaction engine is a transaction.

When the slave library reports an error, simply stop slave; start slave; to ignore the error.

However, at this time, there has been a problem with the master-slave consistency, and the data that is different from the slave needs to be manually filled in. Here, the engine needs to be adjusted to be the same, and the slave is also changed to a transaction engine.

One SQL is not allowed to update a transaction engine and a non-transaction engine table at the same time

Mixing multiple storage engines in a transaction will generate multiple GTIDs. When using GTID, if the update includes non-transaction engine (such as MyISAM) and transaction engine (such as InnoDB) table operations in the same transaction, it may result in multiple GTIDs, inconsistent master-slave replication data, and slave database replication. Interruptions etc.

In a replication group, GTID must be turned on or off uniformly.

Create table….select statement replication is not supported (the main library reports an error directly);

The create table xxx as select statement will be split into two parts, the create statement and the insert statement. However, if you want to do it all at once, MySQL will throw the following error.

ERROR 1786 (HY000): Statement violates GTID consistency: CREATE TABLE … SELECT.

The method of create table xxx as select can be split into two parts, as follows:

create table xxxx like data_mgr;

insert into xxxx select *from data_mgr;

2. How GTID works

When a transaction is executed and submitted on the main library side, a GTID is generated and recorded in the binlog.

After the binlog is transferred to the slave and stored in the slave’s relaylog, the value of the GTID is read and the gtid_next variable is set, which tells the Slave the next GTID value to be executed.

The sql thread obtains the GTID from the relay log, and then compares the binlog on the slave side to see whether the GTID exists.

If there is a record, it means that the transaction with this GTID has been executed, and the slave will ignore it.

If there is no record, the slave will execute the GTID transaction and record the GTID to its own binlog.

How to determine whether the copy method is GTID or pos
Show slave statusView the Auto_Position field: 0 is pos mode, 1 is gtid mode.

3. Deploy master-slave replication

Architecture diagram

1. Prepare the environment for two machines, turn off the firewall and selinux; the environments of the two machines must be consistent; the time must also be consistent! ! !
192.168.221.136 mysql-master
192.168.221.138 mysql-slave

Time setting:
[root@localhost ~]# timedatectl set-timezone Asia/Shanghai
//Check the time difference with the ntp server (requires external network access, if there is an ntpd server on the internal network, replace the domain name with the address of the service)
[root@localhost ~]# yum -y install ntpdate
[root@localhost ~]# ntpdate -d cn.pool.ntp.org
//Synchronize time with ntp server
[root@localhost ~]# ntpdate cn.pool.ntp.org

?2. It is recommended to use the same installation method to install mysql5.7 (omitted) on two machines.

Notice:
Existing databases in the main database will not be automatically synchronized.
Before master-slave synchronization, there is a database backup on the master database, which needs to be manually imported and synchronized on the slave database.
Points to note when exporting and importing data after turning on GTID
Warning: A partial dump from a server that has GTIDs will by default include the GTIDs of all transactions, even those that changed suppressed parts of the database. If you don’t want to restore GTIDs, pass –set-gtid-purged= OFF. To make a complete dump, pass –all-databases –triggers –routines –events
#Means: The GTID function is enabled in the current database instance. In the database instance with the GTID function enabled, exporting any of the libraries will prompt this line of information if the –set-gtid-purged parameter is not explicitly specified. This means that by default, the exported library contains GTID information. If you do not want to export the database containing GTID information, you need to add the –set-gtid-purged=OFF parameter explicitly.
mysqldump -uroot -p'' --set-gtid-purged=OFF --single-transaction --all-databases > /path/dbname.bak
mysqldump -uroot -p'' --set-gtid-purged=OFF --single-transaction -B library name 1 library name 2 > /path/dbname.bak
Import a single database on the slave database.
mysql -uroot -p'' library name < /path/dbname.bak
Import multiple databases from the database.
mysql -uroot -p'' < /path/dbname.bak

master operation:

[root@mysql-master ~]# vim /etc/my.cnf //Add the following content under [mysqld]
server-id=1
log-bin = mylog //Enable binlog logs. If the absolute path is not specified, the yum installation method is generated in /var/lib/mysql/, and the binary installation method is generated in /usr/local/mysql/data/
gtid_mode = on //Enable gtid mode on both master and slave servers
enforce_gtid_consistency=1 //Force GTID consistency consistency: consistency
sync_binlog = 1 //Force gtid
[root@mysql-master ~]# systemctl restart mysqld

Create an account on the main server:

mysql> grant replication slave,reload,super on *.* to 'slave'@'%' identified by 'JiannLt@123';
//Note: The password in the production environment uses a high-level password. In the actual production environment, replace '%' with the slave's IP.
mysql> flush privileges;

?Note: If unsuccessful, delete the previous binlog log
replication slave Permission: This permission is used to grant replication permissions to the slave server account. The slave server can read binary logs from the master server.
super permission: allows users to use the SET statement and CHANGE MASTER statement to modify global variables
reload Permission: You must have reload permission before you can execute flush [tables | logs | privileges]

slave operation:

[root@mysql-slave ~]# vim /etc/my.cnf //[mysqld]Add the following configuration
server-id=2 //server-id cannot be the same for each server and is used to identify different MySQL server instances.
gtid_mode = ON //Open gtid
enforce_gtid_consistency=1 //Force GTID consistency
master-info-repository=TABLE //Save the master server information in the mysql.slave_master_info table
relay-log-info-repository=TABLE //Save relay log information to the mysql.slave_relay_log_info table
[root@mysql-slave ~]# systemctl restart mysqld
[root@mysql-slave ~]# mysql -uroot -p'JiannLt@123' //Log in to mysql
mysql> change master to #Set the master-slave relationship
master_host='192.168.221.136', #Specify the synchronization host
master_user='slave', #Specify the synchronized account
master_password='JiannLt@123', #Specify slave password
master_auto_position=1; #Turn on automatic synchronization position mode
Query OK, 0 rows affected, 2 warnings (0.02 sec)

mysql> start slave; #Start the slave role
Query OK, 0 rows affected (0.00 sec)
mysql> show slave status\G #Check the status and verify whether sql and IO are yes.
Description synchronization successful
Slave_IO_Running: Yes
Slave_SQL_Running: Yes

4. Test synchronization

1. Create a new database on the main database

#View local IP
mysql> select substring_index(host,':',1) as ip, count(*) from information_schema.processlist group by ip;
 +----------------+---------+
| ip | count(*) |
 +----------------+---------+
| 192.168.221.136| 1 |
| localhost | 1 |
 +----------------+---------+
2 rows in set (0.00 sec)
?
mysql> create database testdb;
Query OK, 1 row affected (0.00 sec)
?
mysql> show databases;
 + -------------------- +
| Database |
 + -------------------- +
| information_schema |
| mysql |
| performance_schema |
| sys |
| testdb |
 + -------------------- +
5 rows in set (0.00 sec)

2. Check whether the synchronization is successful from the library

mysql> show databases;
 + -------------------- +
| Database |
 + -------------------- +
| information_schema |
| mysql |
| performance_schema |
| sys |
| testdb |
 + -------------------- +
5 rows in set (0.00 sec)

5. Reset the slave library

#All executed from the library
mysql> stop slave;
mysql> reset slave;
mysql> reset master;
?
#The binlog of the slave library is no longer valid, so execute this command to clear the binlog.
mysql> change master to
master_host='192.168.221.136',
master_port=3306,
master_user='slave',
master_password='JiannLt@123',
master_auto_position=1;
?#master_auto_position=1; Indicates whether to automatically obtain the binary log coordinate points of the master library. Set to 1 to indicate yes.
?
mysql> start slave; #Start the slave role
Query OK, 0 rows affected (0.00 sec)
?
mysql> show slave status\G #Check the status and verify whether sql and IO are yes.

6. Common faults

Common fault 1:
Slave has more GTIDs than the master has, using the master’s SERVER_UUID
This problem means that the GTID obtained from the slave library exceeds the main library. For example, the main library modifies the system host name without specifying the binlog file name, causing all binlogs to be modified, and the slave library will fail to judge;
The solution is as follows:

#Reset the slave library
mysql> stop slave;
mysql> reset slave;
mysql> reset master;
mysql> change master to
master_host='192.168.221.136',
master_port=3306,
master_user='slave',
master_password='JiannLt@123',
master_auto_position=1;

mysql> start slave;

?Common faults 2
If the slave library does not specify relaylog and the system host name is modified, you only need to perform synchronization again on the slave library.

mysql> stop slave;
mysql> reset slave;
mysql> change master to
master_host='192.168.221.136',
master_port=3306,
master_user='slave',
master_password='JiannLt@123',
master_auto_position=1;

mysql> start slave;

?Common faults 3
Master_has_purged_require_gtids
The main library deleted the binlog that has not been synchronized in advance.
The solution is as follows:

#View master information on the main library
mysql> show master status\G;
*************************** 1. row ********************* *******
             File: mylog.000001
         Position: 605
     Binlog_Do_DB:
 Binlog_Ignore_DB:
Executed_Gtid_Set: cc9a32c5-4bc7-11ee-bdae-000c295e8b83:1-2
1 row in set (0.00 sec)

#Manually specify the binary log file master_log_file and the location master_log_pos on the slave library to be consistent with those on the master
mysql > stop slave;
mysql > change master to
master_host='192.168.221.136',
master_user='slave',
master_password='JiannLt@123',
master_log_file='mylog.000001',
master_log_pos=605,
master_auto_position=0;

mysql > start slave;

7. Failover

MySQL master-slave, host failure or downtime, how to switch?
1) Execute in salve:

mysql> stop slave;
mysql> reset master;

2) Check whether it is in read-only mode. If it is read-only, you need to turn off read-only: show variables like 'read_only';
Read-only mode requires modifying the my.cnf file, commenting read-only=1 and restarting the mysql service.
Or use the command to turn off read-only without restarting, but it will be invalid after the next restart: set global read_only=off;

3) View

show slave status \G
show master status \G

4) Change the original main library IP address to the current slave library IP address in the program and test whether the application connection is normal.

8. Some questions about GTID

1. Why does GTID-based synchronization also require bin-log to be turned on?

Question: Why does the master-slave replication based on gtid also need to open the bin-log log? Isn’t it based on gtid replication? Why do I still need to open bin-log?
In MySQL, the global transaction ID (GTID) is indeed used in master-slave replication to ensure that each transaction on the master server (master) is executed only once on all slave servers (slave). However, the GTID is not the log record itself, but a unique identifier generated on the master server for each transaction.
Even when using GTID, we need binary log (binlog). Here are the main reasons:

GTID itself does not save the specific content of binlog events. It only records the globally unique identification number of each event. These modifications are still recorded in the binary log.
The main library also needs to open the binlog, because the GTID information is stored in the binlog. If the main library turns off the binlog, the GTID cannot be recorded, and therefore GTID-based replication cannot be performed.
GTID only adds a global transaction ID, and essentially replicates positions (based on log offsets) based on the binary log.
If the slave library wants to truly perform synchronous replication, it also needs to pull the corresponding binlog events from the host based on the GTID.
Therefore, the recording and transmission of binlog logs on the main library are based on GTID replication. The master needs to write the binlog to disk and pass it to the slave through the binlog protocol.
The advantage of being based on GTID is that even if the binlog log is cleared, synchronization can be restored from any location through the GTID identification.
Some operations used to restore disaster recovery, such as slave POS (position), also need to locate the appropriate position in the binlog.
Certain database operations, such as switching read-write nodes, also need to complete data consistency based on actual events in the binlog.

Therefore, even when using GTID-based master-slave replication, binary logging needs to be turned on to record and replicate data changes, support failure recovery, and store GTID information.

2. Comparison between GTID and Bin-log methods

MySQL's binary log (binlog) and global transaction identifier (GTID) are two different mechanisms in MySQL replication. They have different characteristics and applicable scenarios:

Recording methods are different:

binlog locates data events through log file + log pos.

GTID uses a globally unique transaction ID to identify events.

Distribution methods are different:

binlog is allocated sequentially.

GTID can be assigned randomly and does not depend on the service startup order.

Failure recovery is different:

Binlog needs to be restored using backup + replay of all logs.

GTID can replay only the log of lost transaction segments.

The reference relationship is different:

binlog cannot determine the order of events and dependencies.

GTID can clarify the sequence of events and dependencies between transactions.

Duplicate data is treated differently:

binlog may replay duplicate data.

GTID can identify duplicate data and automatically skip it.

Usage scenarios:

Binlog supports all versions of MySQL, is more compatible, and is suitable for regular master-slave replication.

GTID can make fault recovery more convenient, and sharding databases and tables is also simpler, making it suitable for scenarios that require higher data consistency.

In summary, GTID is more powerful in transaction management, but requires MySQL 5.6 or above. The two can be used together to give full play to their respective advantages.

Directory