Background introduction
As its business grows rapidly, Company A pays close attention to the reputation and user experience of its products, and a good user experience depends on stable infrastructure. Company A currently runs a single data center architecture. Because of this architecture's limitations, the business cannot be restored immediately after a data center failure, which directly affects user access. The operations team has always been in passive firefighting mode and has become the standing scapegoat for stability problems in the R&D center. In view of this, the director of the operations center proposed a dual-center construction project to the CTO in order to improve business stability. The two data centers jointly carry the business traffic and serve as hot backups for each other, so that when a failure occurs, traffic can be rescheduled and losses stopped as quickly as possible.
This time, the application operations team has taken on the R&D center's dual-center construction goal. Its core task is to solve global traffic scheduling across the two data centers.
Solution
After sorting out the business architecture, we found that the access layer architecture of the Layer 4 and Layer 7 services is basically the same. The calling links are as follows:
- Layer 4: client -> business domain name -> LVS cluster -> application
- Layer 7: client -> business domain name -> Nginx cluster -> application
Both links pass through the business domain name, so the domain name is used as the entry point for solving dual-center global traffic scheduling.
1. Domain name transformation
Standardized domain name structure
Before:
With a single center, the business domain name resolves directly to an IP address. For example, the A record of www.ops.com is 1.1.1.1.
After:
- Traffic is split by carrier line: China Unicom users are routed to idc1, and all other users (the default line) are routed to idc2.
- Add resolution records for the data center entrances:
  - idc1.ops.com, default line, A, 1.1.1.1
  - idc2.ops.com, default line, A, 2.2.2.2
- Point the business domain name at the data center entrances with CNAME records:
  - www.ops.com, China Unicom line, CNAME, idc1.ops.com
  - www.ops.com, default line, CNAME, idc2.ops.com
2. Traffic scheduling
The architecture of normal access business traffic is as follows:
Assume that idc1 fails. All traffic that was going to idc1 must be switched to idc2. The principle is to change the A record of idc1.ops.com to 2.2.2.2: the China Unicom line still follows its CNAME to idc1.ops.com, but that name now resolves to the address of idc2.
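The effect of the switch can be illustrated with a small offline model of the record table. This is only a sketch: the function names resolve_entry and resolve_www and the line name "unicom" are hypothetical, and nothing here queries real DNS.

```shell
# Toy model of the records described above; no real DNS lookup is performed.
IDC1_IP="1.1.1.1"
IDC2_IP="2.2.2.2"

# resolve_entry NAME: A record of a data center entrance domain.
resolve_entry() {
    case "$1" in
        idc1.ops.com) echo "${IDC1_IP}" ;;
        idc2.ops.com) echo "${IDC2_IP}" ;;
        *) return 1 ;;
    esac
}

# resolve_www LINE: follow the CNAME of www.ops.com for the given carrier line.
resolve_www() {
    case "$1" in
        unicom) resolve_entry idc1.ops.com ;;
        *)      resolve_entry idc2.ops.com ;;
    esac
}

echo "normal:   unicom=$(resolve_www unicom) default=$(resolve_www default)"

# Failover: only the A record of idc1.ops.com changes; the CNAMEs stay untouched.
IDC1_IP="${IDC2_IP}"
echo "failover: unicom=$(resolve_www unicom) default=$(resolve_www default)"
```

Because the business domain names only hold CNAME records, a single A record change moves every business domain that points at idc1.ops.com.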
Environment setup
Prerequisites:
- Register an Alibaba Cloud account
- Install the Alibaba Cloud CLI: see "Alibaba Cloud CLI" in the Alibaba Cloud Help Center
- Configure Alibaba Cloud DNS: the configuration is similar across cloud vendors
- API usage: understand the basic principles of calling an API
- jq command: see jqlang.github.io
Environment:
- Operating system: CentOS Linux 7 (Core)
Alibaba Cloud account registration
Omitted; register an account yourself.
Install Alibaba Cloud CLI
Open "How to Use the Installation Package to Install Alibaba Cloud CLI on Linux Platform" in the Alibaba Cloud Help Center, download the Alibaba Cloud CLI package, and upload it to the /opt/ directory of the server.
cd /opt/
tar xf aliyun-cli-linux-latest-amd64.tgz
chmod +x aliyun
mv aliyun /usr/local/bin
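After installation, it can be useful to confirm that the required tools are actually on the PATH before going further. The helper below is a minimal sketch; the function name missing_cmds is hypothetical and not part of the original article.

```shell
# missing_cmds CMD...: print every command that is not on PATH.
# Returns non-zero if at least one command is missing.
missing_cmds() {
    local status=0
    local cmd
    for cmd in "$@"; do
        if ! command -v "${cmd}" >/dev/null 2>&1; then
            echo "missing: ${cmd}"
            status=1
        fi
    done
    return "${status}"
}

# Check the two tools this article relies on.
missing_cmds aliyun jq || echo "install the missing tools before continuing"
```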
Create an AccessKey: log in to the Alibaba Cloud console and open the account information page.
Choose to continue using the AccessKey.
Create the AccessKey.
Save the generated AccessKeyID and AccessKeySecret.
Configure the aliyun CLI: see "How to configure credentials non-interactively" in the Alibaba Cloud CLI section of the Alibaba Cloud Help Center.
# Execute the following command to configure the CLI tool
aliyun configure set \
  --profile akProfile \
  --mode AK \
  --region cn-hangzhou \
  --access-key-id "<your AccessKeyId>" \
  --access-key-secret "<your AccessKeySecret>"
# Execute the following command; a JSON result indicates that the configuration is successful
root@ops-mgr-backup:~/.aliyun# aliyun ecs DescribeRegions
{
  "Regions": {
    "Region": [
      {
        "LocalName": "North China 1 (Qingdao)",
        "RegionEndpoint": "ecs.cn-qingdao.aliyuncs.com",
        "RegionId": "cn-qingdao"
      },
      ...omitted...
      {
        "LocalName": "Germany (Frankfurt)",
        "RegionEndpoint": "ecs.eu-central-1.aliyuncs.com",
        "RegionId": "eu-central-1"
      }
    ]
  },
  "RequestId": "E7BD8D2C-8406-5D10-8E33-D2238C93E25B"
}
Configure Alibaba Cloud DNS
In the console, search for Alibaba Cloud DNS and open the Alibaba Cloud DNS console.
Add the ops.com domain name (the domain has not been purchased, so it cannot actually be resolved and is used for demonstration only).
Open the resolution settings and configure the domain name records.
Configure the records listed in the domain name transformation section above: the A records for idc1 and idc2 and the two CNAME records for www.
Query the resolution records
Goal: Get the RecordId of the record to be modified
Documentation: "Call DescribeDomainRecords to obtain the resolution record list", Alibaba Cloud DNS, Alibaba Cloud Help Center
It is recommended to use the online debugging tool, which automatically generates CLI commands.
Fill in the domain name, record, and record type.
Execute the generated command on the command line. The "RecordId": "816320087127489536" field in the response is required when modifying the record value, so the script must obtain the RecordId automatically on every run.
root@ops-mgr-backup:~/.aliyun# aliyun alidns DescribeDomainRecords --region cn-hangzhou --DomainName 'ops.com' --RRKeyWord idc1 --TypeKeyWord A
{
  "DomainRecords": {
    "Record": [
      {
        "DomainName": "ops.com",
        "Line": "default",
        "Locked": false,
        "RR": "idc1",
        "RecordId": "816320087127489536",
        "Status": "ENABLE",
        "TTL": 600,
        "Type": "A",
        "Value": "1.1.1.1",
        "Weight": 1
      }
    ]
  },
  "PageNumber": 1,
  "PageSize": 20,
  "RequestId": "44D29484-A7AE-5B45-A0EA-C9BD26D38A64",
  "TotalCount": 1
}
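To obtain the RecordId automatically, the JSON response can be passed through jq. The sketch below works on a trimmed copy of the response shown above instead of calling the API, so it runs offline; the function name extract_record is hypothetical, and jq is assumed to be installed.

```shell
# Trimmed copy of the DescribeDomainRecords response shown above.
SAMPLE_JSON='{"DomainRecords":{"Record":[{"RR":"idc1","RecordId":"816320087127489536","Type":"A","Value":"1.1.1.1"}]},"TotalCount":1}'

# extract_record JSON: print "<RecordId> <Value>" of the first record in the response.
extract_record() {
    printf '%s' "$1" | jq -r '.DomainRecords.Record[0] | "\(.RecordId) \(.Value)"'
}

extract_record "${SAMPLE_JSON}"
```

Printing the ID and the current value on one line keeps them together, so a caller can split the pair with awk or read.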
Modify the resolution record
Goal: Modify the A record of idc1.ops.com from 1.1.1.1 to 2.2.2.2
Documentation: "Call UpdateDomainRecord to modify domain name resolution records", Alibaba Cloud DNS, Alibaba Cloud Help Center
Use online debugging to generate the CLI command automatically.
Copy the generated CLI command.
Execute the command. A response like the following indicates that the modification succeeded.
root@ops-mgr-backup:~/.aliyun# aliyun alidns UpdateDomainRecord --region cn-hangzhou --RecordId 816320087127489536 --RR idc1 --Type A --Value '2.2.2.2'
{
  "RecordId": "816320087127489536",
  "RequestId": "BCF3DFA3-2CD6-50C0-8F06-A7A79AF6D83D"
}
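Before running an update against a live zone, it can help to print the command first. The following dry-run sketch only builds the UpdateDomainRecord command line from the flags used above and skips it when the record already holds the target value. The function name build_update_cmd and the variables RECORD_ID, CURRENT_VALUE and TARGET_VALUE are hypothetical.

```shell
REGION="cn-hangzhou"

# build_update_cmd RECORD_ID RR VALUE: print the UpdateDomainRecord call instead of running it.
build_update_cmd() {
    printf 'aliyun alidns UpdateDomainRecord --region %s --RecordId %s --RR %s --Type A --Value %s\n' \
        "${REGION}" "$1" "$2" "$3"
}

# Example values taken from the query output earlier in this article.
RECORD_ID="816320087127489536"
CURRENT_VALUE="1.1.1.1"
TARGET_VALUE="2.2.2.2"

if [ "${CURRENT_VALUE}" = "${TARGET_VALUE}" ]; then
    echo "idc1 already points to ${TARGET_VALUE}, nothing to do"
else
    build_update_cmd "${RECORD_ID}" idc1 "${TARGET_VALUE}"
fi
```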
In the Alibaba Cloud DNS console, the idc1 record now shows 2.2.2.2.
The configuration is now complete, and the aliyun CLI can query and modify records normally, so the script logic can be written next.
Script content
#!/bin/bash
######################################
# Script name: switch_idc.sh
# Script version: v1.0
# Function description: Global traffic scheduling (stop loss, stress testing)
# Parameter description: sh switch_idc.sh {1to2|1to1|2to1|2to2}
#   1to2: Schedule the traffic of idc1 to idc2
#   1to1: Schedule the traffic of idc1 back to idc1
#   2to1: Schedule the traffic of idc2 to idc1
#   2to2: Schedule the traffic of idc2 back to idc2
# Core logic: Use the aliyun CLI tool to call the DNS interface to modify the record value
# Dependent tools: jq - [Linux json processing tool]
# Script author: shiyang.zhu
# Contact email: [email protected]
# Creation time: 2023-03-06
######################################

######################################
# Global variables:
#   PARAM              positional parameter
#   SCRIPT_DIR         directory where the script is located
#   SCRIPT_NAME        script name
#   LOG_DIR            log directory
#   LOG_FILE           log file
#   LOCK_FILE          lock file
#   DOMAIN_NAME        domain name
#   IDC1_RECORD_VALUE  idc1 record value
#   IDC2_RECORD_VALUE  idc2 record value
#   ALIYUN_CLI         aliyun command line tool
#   REGION             Alibaba Cloud Region
#   PARAM_LIST         parameter list
######################################
PARAM="$1"
SCRIPT_DIR=$(dirname "$(readlink -f "$0")")
# Use the base name so that LOG_DIR stays valid when the script is called with a path
SCRIPT_NAME=$(basename "$0")
LOG_DIR=/var/log/shell/${SCRIPT_NAME}
LOG_FILE=${LOG_DIR}/$(date +%Y-%m-%d).log
LOCK_FILE=/tmp/
DOMAIN_NAME="ops.com"
IDC1_RECORD_VALUE="1.1.1.1"
IDC2_RECORD_VALUE="2.2.2.2"
ALIYUN_CLI="/usr/local/bin/aliyun"
REGION="cn-hangzhou"
PARAM_LIST=(1to1 1to2 2to1 2to2)

[ ! -d "${LOG_DIR}" ] && mkdir -p "${LOG_DIR}"

# Error log function: log the message and exit
function err_log() {
    echo "[$(date +'%Y-%m-%dT%H:%M:%S%z')]: [ERROR] $*" | tee -a "${LOG_FILE}"
    exit 1
}

# Normal log function
function log() {
    echo "[$(date +'%Y-%m-%dT%H:%M:%S%z')]: [INFO] $*" | tee -a "${LOG_FILE}"
}

# Script prompt function
function usage() {
    echo "Usage: ${SCRIPT_NAME} {1to1|1to2|2to1|2to2}"
    exit 1
}

# Script parameter check
if [ "$#" -ne 1 ]; then
    usage
fi

######################################
# Dependency command check function
# Local variables:
#   CMDS: Dependency command list, separated by spaces
######################################
function check_cmd() {
    local CMDS="jq"
    local CMD
    for CMD in ${CMDS}; do
        # command -v also finds tools that were not installed through rpm
        if ! command -v "${CMD}" >/dev/null 2>&1; then
            err_log "${CMD} command does not exist, please install it manually: yum install -y ${CMD}!"
        fi
    done
}
check_cmd

######################################
# Query the RecordId and current value of a record, separated by a space
# Local variables:
#   IDC: Source IDC
# Return data: record ID and current record value, such as
#   816320087127489536 2.2.2.2
######################################
function queryRecordId() {
    local IDC="$1"
    ${ALIYUN_CLI} alidns DescribeDomainRecords \
        --region ${REGION} \
        --DomainName ${DOMAIN_NAME} \
        --RRKeyWord "${IDC}" \
        --TypeKeyWord A | jq -r '.DomainRecords.Record[]|.RecordId,.Value' | xargs -n2
}

######################################
# Modify the record value. If the current record value already equals the
# target value, no modification is made and a message is logged instead.
# Local variables:
#   RECORD_ID: ID of the record to be modified
#   IDC: Source IDC
#   RECORD_VALUE: Target record value
# Note: SRC_VALUE is set by the caller (main) and is visible here through
# bash dynamic scoping.
######################################
function modifyRecordValue() {
    local RECORD_ID="$1"
    local IDC="$2"
    local RECORD_VALUE="$3"
    if [ "${SRC_VALUE}" = "${RECORD_VALUE}" ]; then
        log "The target value is the same as the current record value and does not need to be modified"
    else
        ${ALIYUN_CLI} alidns UpdateDomainRecord \
            --region ${REGION} \
            --RecordId "${RECORD_ID}" \
            --RR "${IDC}" \
            --Type A \
            --Value "${RECORD_VALUE}"
    fi
}

# Main function, redundant, not optimized
main() {
    case ${PARAM} in
        "1to2")
            local QUERY_RESULT=$(queryRecordId "idc1")
            local RECORD_ID=$(echo ${QUERY_RESULT} | awk '{print $1}')
            local SRC_VALUE=$(echo ${QUERY_RESULT} | awk '{print $2}')
            modifyRecordValue "${RECORD_ID}" "idc1" "${IDC2_RECORD_VALUE}"
            log "idc1 traffic has been scheduled to idc2"
            ;;
        "1to1")
            local QUERY_RESULT=$(queryRecordId "idc1")
            local RECORD_ID=$(echo ${QUERY_RESULT} | awk '{print $1}')
            local SRC_VALUE=$(echo ${QUERY_RESULT} | awk '{print $2}')
            modifyRecordValue "${RECORD_ID}" "idc1" "${IDC1_RECORD_VALUE}"
            log "idc1 traffic has been scheduled back to idc1"
            ;;
        "2to1")
            local QUERY_RESULT=$(queryRecordId "idc2")
            local RECORD_ID=$(echo ${QUERY_RESULT} | awk '{print $1}')
            local SRC_VALUE=$(echo ${QUERY_RESULT} | awk '{print $2}')
            modifyRecordValue "${RECORD_ID}" "idc2" "${IDC1_RECORD_VALUE}"
            log "idc2 traffic has been scheduled to idc1"
            ;;
        "2to2")
            local QUERY_RESULT=$(queryRecordId "idc2")
            local RECORD_ID=$(echo ${QUERY_RESULT} | awk '{print $1}')
            local SRC_VALUE=$(echo ${QUERY_RESULT} | awk '{print $2}')
            modifyRecordValue "${RECORD_ID}" "idc2" "${IDC2_RECORD_VALUE}"
            log "idc2 traffic has been scheduled back to idc2"
            ;;
        *)
            usage
            ;;
    esac
}

# Call the main function
main ${PARAM}
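The four branches of main differ only in which entrance record is modified and which value it receives, so the mapping can be derived from the action name itself. The sketch below shows one possible way to do that; plan_switch is a hypothetical helper, not part of the original script, and it only prints the plan without calling the aliyun CLI.

```shell
IDC1_RECORD_VALUE="1.1.1.1"
IDC2_RECORD_VALUE="2.2.2.2"

# plan_switch ACTION: print "<record RR> <target value>" for an action such as 1to2.
# Returns non-zero for anything other than 1to1, 1to2, 2to1 or 2to2.
plan_switch() {
    case "$1" in
        [12]to[12]) ;;
        *) echo "Usage: switch_idc.sh {1to1|1to2|2to1|2to2}" >&2; return 1 ;;
    esac
    local src="${1%to*}"   # digit before "to": the entrance record to modify
    local dst="${1#*to}"   # digit after "to": the data center that receives the traffic
    local target
    case "${dst}" in
        1) target="${IDC1_RECORD_VALUE}" ;;
        2) target="${IDC2_RECORD_VALUE}" ;;
    esac
    echo "idc${src} ${target}"
}

plan_switch 1to2
plan_switch 2to2
```

With such a helper, main could query the record named in the first field and pass the second field to modifyRecordValue, replacing the four near-identical branches with one code path.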
Expected results
Simulate idc1 failure and execute sh switch_idc.sh 1to2
root@ops-mgr-backup:~# sh switch_idc.sh 1to2
{
  "RecordId": "816320087127489536",
  "RequestId": "25068F7A-53E3-5493-B115-1D51E543BABE"
}
[2023-03-07T16:16:23+0800]: [INFO] idc1 traffic has been scheduled to idc2
Simulate idc1 recovery and execute sh switch_idc.sh 1to1
root@ops-mgr-backup:~# sh switch_idc.sh 1to1
{
  "RecordId": "816320087127489536",
  "RequestId": "8CAA0E4C-A12B-5A2B-8543-63B4F6BC7C50"
}
[2023-03-07T16:16:25+0800]: [INFO] idc1 traffic has been scheduled back to idc1
Simulate idc2 failure and execute sh switch_idc.sh 2to1
root@ops-mgr-backup:~# sh switch_idc.sh 2to1
{
  "RecordId": "816320103958200320",
  "RequestId": "FFF101DE-C96D-5CD5-A50C-FEB1B4E3926A"
}
[2023-03-07T16:16:29+0800]: [INFO] idc2 traffic has been scheduled to idc1
Simulate idc2 recovery and execute sh switch_idc.sh 2to2
root@ops-mgr-backup:~# sh switch_idc.sh 2to2
{
  "RecordId": "816320103958200320",
  "RequestId": "4D948874-E66F-5A57-B84F-6DEBA95BA889"
}
[2023-03-07T16:16:31+0800]: [INFO] idc2 traffic has been scheduled back to idc2