Article directory
- Requirements
- Preparation
- nameserver
- dashboard
- Broker
- exporter
- Problems encountered
- Written at the end
Requirements
- RocketMQ version is 5.1.0;
- Build a cluster of 3 masters and 3 slaves, deployed crosswise (so that no two machines are each other's master and slave) to save machine resources;
- 3 nameservers, 1 exporter, 1 dashboard;
- Support automatic failover, with the controller embedded in the nameservers;
- Asynchronous disk flushing;
- No message loss during a master-slave switch.
Preparation
Machine preparation:
- 172.24.30.192
- 172.24.30.193
- 172.24.30.194
Deployment planning:
service name | IP | port
---|---|---
nameserver (controller-n0) | 172.24.30.192 | 19876 (controller: 19878)
nameserver (controller-n1) | 172.24.30.193 | 19876 (controller: 19878)
nameserver (controller-n2) | 172.24.30.194 | 19876 (controller: 19878)
broker-a | 172.24.30.192 | 13210
broker-a-s | 172.24.30.193 | 13210
broker-b | 172.24.30.193 | 13220
broker-b-s | 172.24.30.194 | 13220
broker-c | 172.24.30.194 | 13230
broker-c-s | 172.24.30.192 | 13230
dashboard | 172.24.30.193 | 18281
exporter | 172.24.30.192 | 18282
Download the binary package: https://rocketmq.apache.org/download/
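Once the package is downloaded, each service in the planning table gets its own copy of the extracted directory, so homes, configs, stores and logs stay separate. A minimal sketch of this layout step (directory names are illustrative; the article installs under /neworiental/rocketmq-5.1.0, and a stub directory stands in for the real extracted package so the sketch runs as-is):

```shell
#!/bin/sh
# Lay out one copy of the extracted package per service on a node.
set -e
SRC=rocketmq-all-5.1.0-bin-release
mkdir -p "$SRC/bin" "$SRC/conf"   # stub standing in for the real extracted package
BASE=rocketmq-5.1.0               # the article uses /neworiental/rocketmq-5.1.0
mkdir -p "$BASE"
# services planned for the 172.24.30.192 node
for svc in rocketmq-nameserver broker-a broker-c-s; do
  cp -r "$SRC" "$BASE/$svc"
done
ls "$BASE"
```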
nameserver
Configuration file:
This is just to demonstrate the configuration, so only one nameserver's config is shown; on the other nodes, change controllerDLegerSelfId to that node's own ID. Note: this value must be unique across the controller group.
```
# nameserver settings
listenPort = 19876
rocketmqHome = /neworiental/rocketmq-5.1.0/rocketmq-nameserver
useEpollNativeSelector = true
orderMessageEnable = true
serverPooledByteBufAllocatorEnable = true
kvConfigPath = /neworiental/rocketmq-5.1.0/rocketmq-nameserver/store/namesrv/kvConfig.json
configStorePath = /neworiental/rocketmq-5.1.0/rocketmq-nameserver/conf/nameserver.conf
# controller settings
enableControllerInNamesrv = true
controllerDLegerGroup = littleCat-Controller
controllerDLegerPeers = n0-172.24.30.192:19878;n1-172.24.30.193:19878;n2-172.24.30.194:19878
controllerDLegerSelfId = n0
controllerStorePath = /neworiental/rocketmq-5.1.0/rocketmq-controller/store
enableElectUncleanMaster = false
notifyBrokerRoleChanged = true
```
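Since the three nameserver configs differ only in controllerDLegerSelfId, it can help to generate them from a single template. A small sketch under that assumption (the template is a trimmed-down fragment of the config above, and the file names are hypothetical):

```shell
#!/bin/sh
# Generate one config per controller node from a shared template;
# only controllerDLegerSelfId differs and must be unique per node.
set -e
cat > nameserver.template <<'EOF'
listenPort = 19876
enableControllerInNamesrv = true
controllerDLegerGroup = littleCat-Controller
controllerDLegerPeers = n0-172.24.30.192:19878;n1-172.24.30.193:19878;n2-172.24.30.194:19878
controllerDLegerSelfId = __SELF_ID__
EOF
for id in n0 n1 n2; do
  sed "s/__SELF_ID__/$id/" nameserver.template > "nameserver-$id.conf"
done
grep controllerDLegerSelfId nameserver-*.conf
```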
- enableElectUncleanMaster: whether a node outside the SyncStateSet may be elected master; if set to true, messages may be lost.
- Modify rmq.namesrv.logback.xml in the conf directory, mainly to batch-change the log paths; skip this if the default paths are acceptable.
  - Tip: the log directory can be a relative path, which settles this once and for all; just make sure each service uses a different directory.
- Modify runserver.sh in the bin directory, mainly the GC log file path and the JVM heap options (-Xms/-Xmx). Skip this if the defaults are acceptable; the default heap is 8G, so if you are building a pseudo-cluster for testing, beware that the machine may not be able to handle it.
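For the heap change, a sed one-liner is enough. A sketch on a stand-in file, since the exact JAVA_OPT line varies by version (check your own runserver.sh first; the sizes below are illustrative):

```shell
#!/bin/sh
# Shrink -Xms/-Xmx for a small test box by editing the JAVA_OPT line.
# runserver-demo.sh is a stand-in for the real bin/runserver.sh.
set -e
cat > runserver-demo.sh <<'EOF'
JAVA_OPT="${JAVA_OPT} -server -Xms8g -Xmx8g"
EOF
# replace whatever sizes are present with values the machine can afford
sed -i 's/-Xms[0-9]*[gm]/-Xms1g/; s/-Xmx[0-9]*[gm]/-Xmx1g/' runserver-demo.sh
cat runserver-demo.sh
```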
Startup script:
```
#!/bin/sh
. /etc/profile
nohup sh /neworiental/rocketmq-5.1.0/rocketmq-nameserver/bin/mqnamesrv -c /neworiental/rocketmq-5.1.0/rocketmq-nameserver/conf/nameserver.conf >/dev/null 2>&1 &
echo "startup nameserver..."
```
Stop the script:
If multiple nameservers are deployed on one machine, do not stop them with `sh /neworiental/rocketmq-5.1.0/rocketmq-nameserver/bin/mqshutdown namesrv`: that command stops every nameserver on the machine.
```
#!/bin/bash
. /etc/profile
PID=`ps -ef | grep '/neworiental/rocketmq-5.1.0/rocketmq-nameserver' | grep -v grep | awk '{print $2}'`
if [[ "" != "$PID" ]]; then
  echo "killing rocketmq-nameserver: $PID"
  kill $PID
fi
```
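The reason this grep pattern is safe is that each service has its own home directory, so the pattern selects exactly one process. A toy demonstration with faked ps output (the process lines are invented):

```shell
#!/bin/sh
# With one install directory per service, grepping the directory path
# selects exactly one process. The ps -ef style lines below are faked.
PS_OUTPUT='root 101 1 java -c /neworiental/rocketmq-5.1.0/rocketmq-nameserver/conf/nameserver.conf
root 102 1 java -c /neworiental/rocketmq-5.1.0/broker-a/conf/broker.conf
root 103 1 java -c /neworiental/rocketmq-5.1.0/broker-c-s/conf/broker.conf'
PID=$(printf '%s\n' "$PS_OUTPUT" | grep '/rocketmq-nameserver' | awk '{print $2}')
echo "$PID"   # prints 101, the nameserver process only
```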
Make sure all nameservers have started successfully before starting any broker.
dashboard
Omitted here; see another article: https://blog.csdn.net/sinat_14840559/article/details/129737390?spm=1001.2014.3001.5501
Broker
The official documentation notes that in this mode you do not need to specify brokerId or brokerRole: set brokerRole to SLAVE and brokerId to -1 on all nodes (roles switch back and forth between master and slave, so manually configuring 0 is effectively meaningless). The broker that registers successfully first becomes the master.
For verification, it is best to start one group of brokers first, and start the rest only after confirming that everything behaves as expected.
Thought: in this mode there is effectively no master-slave asynchronous replication (ASYNC_MASTER); master-slave synchronization is implemented on top of the Raft protocol.
broker-a master node:
```
brokerClusterName = littleCat
brokerName = broker-a
brokerId = -1
listenPort = 13210
namesrvAddr = 172.24.30.192:19876;172.24.30.193:19876;172.24.30.194:19876;
# enable controller support
enableControllerMode = true
controllerAddr = 172.24.30.192:19878;172.24.30.193:19878;172.24.30.194:19878;
deleteWhen = 04
fileReservedTime = 48
brokerRole = SLAVE
flushDiskType = ASYNC_FLUSH
autoCreateTopicEnable = false
autoCreateSubscriptionGroup = false
maxTransferBytesOnMessageInDisk = 65536
rocketmqHome = /neworiental/rocketmq-5.1.0/broker-a
storePathConsumerQueue = /neworiental/rocketmq-5.1.0/broker-a/store/consumequeue
brokerIP2 = 172.24.30.192
brokerIP1 = 172.24.30.192
aclEnable = false
storePathRootDir = /neworiental/rocketmq-5.1.0/broker-a/store
storePathCommitLog = /neworiental/rocketmq-5.1.0/broker-a/store/commitlog
# 3000 days: 3600*24*3000
timerMaxDelaySec = 259200000
traceTopicEnable = true
timerPrecisionMs = 1000
timerEnableDisruptor = true
```
Startup script:
```
#!/bin/bash
. /etc/profile
nohup sh /neworiental/rocketmq-5.1.0/broker-a/bin/mqbroker -c /neworiental/rocketmq-5.1.0/broker-a/conf/broker.conf >/dev/null 2>&1 &
echo "deploying broker-a..."
```
Stop the script:
```
#!/bin/bash
. /etc/profile
PID=`ps -ef | grep '/neworiental/rocketmq-5.1.0/broker-a' | grep -v grep | awk '{print $2}'`
if [[ "" != "$PID" ]]; then
  echo "killing broker-a: $PID"
  kill $PID
fi
```
broker-a slave node:
```
brokerClusterName = littleCat
brokerName = broker-a
brokerId = -1
listenPort = 13210
namesrvAddr = 172.24.30.192:19876;172.24.30.193:19876;172.24.30.194:19876;
# enable controller support
enableControllerMode = true
controllerAddr = 172.24.30.192:19878;172.24.30.193:19878;172.24.30.194:19878;
deleteWhen = 04
fileReservedTime = 48
brokerRole = SLAVE
flushDiskType = ASYNC_FLUSH
autoCreateTopicEnable = false
autoCreateSubscriptionGroup = false
maxTransferBytesOnMessageInDisk = 65536
rocketmqHome = /neworiental/rocketmq-5.1.0/broker-a-s1
storePathConsumerQueue = /neworiental/rocketmq-5.1.0/broker-a-s1/store/consumequeue
brokerIP2 = 172.24.30.193
brokerIP1 = 172.24.30.193
aclEnable = false
storePathRootDir = /neworiental/rocketmq-5.1.0/broker-a-s1/store
storePathCommitLog = /neworiental/rocketmq-5.1.0/broker-a-s1/store/commitlog
# 3000 days: 3600*24*3000
timerMaxDelaySec = 259200000
traceTopicEnable = true
timerPrecisionMs = 1000
timerEnableDisruptor = true
```
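Before starting a master/slave pair, it can help to diff the two configs and confirm they differ only in the expected keys (IPs and store paths). A sketch on trimmed-down stand-ins for the two broker.conf files above:

```shell
#!/bin/sh
# Diff a master/slave config pair; only brokerIP*/path keys should differ.
# These fragments are stand-ins for the full broker.conf files.
set -e
cat > broker-a.conf <<'EOF'
brokerName = broker-a
brokerId = -1
brokerIP1 = 172.24.30.192
EOF
cat > broker-a-s.conf <<'EOF'
brokerName = broker-a
brokerId = -1
brokerIP1 = 172.24.30.193
EOF
diff broker-a.conf broker-a-s.conf || true   # expect only the brokerIP1 lines
```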
After starting the two brokers, observe the dashboard: the master and slave nodes of broker-a have been identified.
After killing the master node, the slave is successfully switched to master.
Restarting the killed node, it becomes a slave of the current master, as expected. Then start broker-b, broker-b-s, broker-c and broker-c-s in the same way to complete the cluster.
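To check which planned ports are actually listening, before or after the rollout, a quick TCP probe is enough. A sketch using bash's /dev/tcp redirection (assuming bash is installed; the host and port come from the planning table):

```shell
#!/bin/sh
# Probe a host:port and report open/closed via bash's /dev/tcp device.
port_state() {
  if bash -c "exec 3<>/dev/tcp/$1/$2" 2>/dev/null; then
    echo "$1:$2 open"
  else
    echo "$1:$2 closed"
  fi
}
port_state 127.0.0.1 19876   # open or closed, depending on the nameserver
```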
exporter
Omitted here; see another article: https://blog.csdn.net/sinat_14840559/article/details/119782996
Since the cluster runs the latest version, it is best to download the new version of the exporter as well: https://github.com/apache/rocketmq-exporter
Problems encountered
- When multiple brokers are deployed on the same machine, the brokers started later fail to start and crash.
Reason: on the same machine, the port settings of the two brokers were too close together. A broker opens several extra ports for internal communication, by default 1 and 2 below the configured port (newer versions may occupy more), so set the ports as far apart as possible. This problem held things up for a long time, and the log never says the port is occupied, which is very unfriendly.
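The ports a broker occupies can be derived from listenPort. A sketch of a commonly cited default layout (fast remoting two below, HA one above; exact offsets vary by RocketMQ version, so verify against your own):

```shell
#!/bin/sh
# Ports derived from a broker's listenPort under commonly cited defaults;
# exact offsets vary by RocketMQ version, so verify against your own.
listenPort=13210
echo "remoting:      $listenPort"
echo "fast remoting: $((listenPort - 2))"
echo "HA:            $((listenPort + 1))"
```

With broker-a at 13210 and broker-b at 13220 on one machine, the derived ports cannot collide, which is the point of keeping the configured values well separated.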
- In one broker group, two slaves appeared at the same time; one broker's log was normal while the other kept reporting: Error happens when change sync state set.
Reason: the internally maintained SyncStateSet got into a bad state during the repeated master-slave switches. Stop both brokers, then start the one with the normal log first and the one with the error log second. If the broker with the error log is started first, master election fails with: CODE: 2012 DESC: The broker has not master, and this new registered broker can't be elected as master
Written at the end
I have used K8s to manage clusters for two years. This time, because the Operator I wrote earlier does not support the new version, I had to build a cluster by hand temporarily. Although I have done this many times before, I have to say that K8s really is far more convenient than a manual build: no matter how careful you are, mistakes creep in, and you still have to handle systemd, resource coordination, and so on.
Manually building a middleware cluster is delicate work: a small configuration error can drop you into a bottomless pit of troubleshooting for a long time. Preparation is therefore essential: get every file and configuration ready first, check them over, and complete the rollout in one go.
Relatively speaking, this article only walks through the main process, mainly to verify the new Controller mode; you will need to copy and adapt the detailed configuration files yourself. For the complete build process, see: https://blog.csdn.net/sinat_14840559/article/details/108391651