DNS global traffic scheduling

Background

As its business grows rapidly, Company A cares a great deal about product reputation and user experience, and a good user experience depends on stable infrastructure. Company A currently runs on a single data center. Because of this architecture, when the data center fails the business cannot be restored immediately, which directly affects user access. The operations team is stuck in reactive firefighting mode and has become the R&D center's standing scapegoat for stability problems. In view of this, the director of the operations center proposed a dual-center construction project to the CTO to improve business stability. The two data centers will share business traffic and act as hot backups for each other, so that when one fails, traffic can be rescheduled immediately to stop the loss.

The application operations team has taken on the R&D center's dual-center construction goal. Its core task is to solve global traffic scheduling across the two data centers.

Solution

After sorting out the business architecture, we found that the access layers of the Layer 4 and Layer 7 services are essentially the same. The call chains are as follows:

  • Layer 4: client -> business domain name -> LVS cluster -> application
  • Layer 7: client -> business domain name -> Nginx cluster -> application

The business domain name is therefore the natural entry point for solving dual-center global traffic scheduling.

1. Domain name transformation

Standardize the domain name structure

Before:

With a single center, the business domain name resolves directly to an IP address. For example, the A record of www.ops.com is 1.1.1.1.

After:

  • Split access by carrier line: China Unicom users go to idc1, and all other users (the default line) go to idc2.
  • Add an entrance record for each data center
    • idc1.ops.com default A 1.1.1.1
    • idc2.ops.com default A 2.2.2.2
  • CNAME the business domain name to the data center entrances
    • www.ops.com China Unicom CNAME idc1.ops.com
    • www.ops.com default CNAME idc2.ops.com

2. Traffic scheduling

Under normal conditions, China Unicom users resolve www.ops.com to 1.1.1.1 (idc1) and all other users resolve it to 2.2.2.2 (idc2).

Suppose idc1 fails. To switch all idc1 traffic to idc2, change the A record of idc1.ops.com to 2.2.2.2.
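
To see why editing a single A record is enough, the record layout above can be modelled in a few lines of shell. This is only an illustrative sketch: the `lookup` and `resolve` functions, the `IDC1_A` and `IDC2_A` variables, and the `unicom` and `telecom` line labels are hypothetical names that imitate line-aware resolution. Nothing here queries a real DNS server.

```shell
#!/bin/sh
# Toy model of the record layout above; nothing here queries a real DNS server.
IDC1_A="1.1.1.1"   # A record of idc1.ops.com
IDC2_A="2.2.2.2"   # A record of idc2.ops.com

# lookup NAME LINE: print "TYPE VALUE" for one record, honouring the carrier line.
lookup() {
  case "$1|$2" in
    "www.ops.com|unicom") echo "CNAME idc1.ops.com" ;;
    "www.ops.com|"*)      echo "CNAME idc2.ops.com" ;;
    "idc1.ops.com|"*)     echo "A $IDC1_A" ;;
    "idc2.ops.com|"*)     echo "A $IDC2_A" ;;
    *) return 1 ;;
  esac
}

# resolve NAME LINE: follow CNAME records until an A record is reached.
resolve() {
  name="$1"
  while record=$(lookup "$name" "$2"); do
    case "$record" in
      "A "*)     echo "${record#A }"; return 0 ;;
      "CNAME "*) name="${record#CNAME }" ;;
    esac
  done
  return 1
}

resolve www.ops.com unicom    # China Unicom line, served by idc1
resolve www.ops.com telecom   # any other line, served by idc2
IDC1_A="$IDC2_A"              # failover: repoint only the idc1 entrance
resolve www.ops.com unicom    # China Unicom line is now served by idc2
```

Because every business domain name points at an entrance record through a CNAME, the failover touches one record no matter how many business domain names exist.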

Environment setup

Prerequisites:

  1. Register an Alibaba Cloud account
  2. Install the Alibaba Cloud CLI (see "Alibaba Cloud CLI" in the Alibaba Cloud Help Center)
  3. Configure Alibaba Cloud DNS; the configuration is similar for other cloud vendors
  4. API usage: understand the basics of calling an API
  5. Install the jq command (jqlang.github.io)

Base environment:

  • Operating system: CentOS Linux 7 (Core)

Alibaba Cloud account registration

Omitted; register an account yourself.

Install Alibaba Cloud CLI

Following "How to Use the Installation Package to Install Alibaba Cloud CLI on Linux" in the Alibaba Cloud Help Center, download the Alibaba Cloud CLI and upload it to the /opt/ directory on the server.

cd /opt/
tar xf aliyun-cli-linux-latest-amd64.tgz
chmod +x aliyun
mv aliyun /usr/local/bin

Create an AccessKey: in the Alibaba Cloud console, open Account Information.

Choose to continue using an AccessKey.

Create the AccessKey.

Save the generated AccessKeyID and AccessKeySecret.

Configure the aliyun CLI (see "How to configure credentials non-interactively" in the Alibaba Cloud Help Center):

# Execute the following command to configure the CLI tool
aliyun configure set \
  --profile akProfile \
  --mode AK \
  --region cn-hangzhou \
  --access-key-id "<your AccessKeyId>" \
  --access-key-secret "<your AccessKeySecret>"

# Run the following command; a JSON result indicates that the configuration succeeded.
root@ops-mgr-backup:~/.aliyun# aliyun ecs DescribeRegions
{
        "Regions": {
                "Region": [
                        {
                                "LocalName": "North China 1 (Qingdao)",
                                "RegionEndpoint": "ecs.cn-qingdao.aliyuncs.com",
                                "RegionId": "cn-qingdao"
                        },
                        ...omitted...
                        {
                                "LocalName": "Germany (Frankfurt)",
                                "RegionEndpoint": "ecs.eu-central-1.aliyuncs.com",
                                "RegionId": "eu-central-1"
                        }
                ]
        },
        "RequestId": "E7BD8D2C-8406-5D10-8E33-D2238C93E25B"
}
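
Typing the AccessKey pair directly on the command line leaves it in the shell history. One possible variation, sketched below, reads the pair from environment variables and passes it to the same `aliyun configure set` flags shown above. The variable names `ALIYUN_AK_ID` and `ALIYUN_AK_SECRET` and the function name `configure_aliyun_profile` are assumptions made for this example; they are not names the CLI itself reads.

```shell
# configure_aliyun_profile: run the same `aliyun configure set` call as above,
# taking the key pair from two (hypothetical) environment variables.
configure_aliyun_profile() {
  if [ -z "${ALIYUN_AK_ID:-}" ] || [ -z "${ALIYUN_AK_SECRET:-}" ]; then
    echo "ALIYUN_AK_ID and ALIYUN_AK_SECRET must both be set" >&2
    return 1
  fi
  aliyun configure set \
    --profile akProfile \
    --mode AK \
    --region cn-hangzhou \
    --access-key-id "$ALIYUN_AK_ID" \
    --access-key-secret "$ALIYUN_AK_SECRET"
}
```

This keeps the secret out of the history file, but the values are still visible in the process list while the command runs, so it is a convenience rather than a hardening measure.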

Configure Alibaba Cloud DNS

In the console, search for Alibaba Cloud DNS and open the Alibaba Cloud DNS console.

Add the ops.com domain name. (The domain has not been purchased, so it cannot actually be resolved and is used here for demonstration only.)

Open the resolution settings and configure the domain name records.

Configure the records described in the domain name transformation section: the A records for idc1 and idc2, and the line-specific CNAME records for www.

Get resolution records

Goal: get the RecordId of the record to be modified.

Documentation: "Call DescribeDomainRecords to obtain the resolution record list" (Alibaba Cloud DNS, Alibaba Cloud Help Center)

The online debugging tool is recommended because it generates the CLI command automatically.

Fill in the domain name, host record, and record type.

Run the generated command on the command line. The "RecordId": "816320087127489536" field in the output is required when modifying the record value, so the script must fetch the RecordId automatically on every run.

root@ops-mgr-backup:~/.aliyun# aliyun alidns DescribeDomainRecords --region cn-hangzhou --DomainName 'ops.com' --RRKeyWord idc1 --TypeKeyWord A
{
        "DomainRecords": {
                "Record": [
                        {
                                "DomainName": "ops.com",
                                "Line": "default",
                                "Locked": false,
                                "RR": "idc1",
                                "RecordId": "816320087127489536",
                                "Status": "ENABLE",
                                "TTL": 600,
                                "Type": "A",
                                "Value": "1.1.1.1",
                                "Weight": 1
                        }
                ]
        },
        "PageNumber": 1,
        "PageSize": 20,
        "RequestId": "44D29484-A7AE-5B45-A0EA-C9BD26D38A64",
        "TotalCount": 1
}
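
Since the RecordId has to be fetched on every run, the JSON above needs to be parsed programmatically. A minimal sketch with jq follows; the function name `extract_record` is made up for this example. It filters on an exact `RR` match, on the assumption that `--RRKeyWord` performs a keyword search and could therefore return more than one record (for example both `idc1` and `idc10`).

```shell
# extract_record RR: read DescribeDomainRecords JSON on stdin and print
# "RecordId Value" for the A record whose RR equals the argument exactly.
extract_record() {
  jq -r --arg rr "$1" '
    .DomainRecords.Record[]
    | select(.RR == $rr and .Type == "A")
    | "\(.RecordId) \(.Value)"'
}

# Possible usage, piping the command shown above into the helper:
#   aliyun alidns DescribeDomainRecords --region cn-hangzhou \
#       --DomainName 'ops.com' --RRKeyWord idc1 --TypeKeyWord A | extract_record idc1
```

Fed the sample output above, the helper would print the RecordId and the current value 1.1.1.1 on one line, which is the shape the script later splits with awk.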

Modify the resolution record

Goal: change the A record of idc1.ops.com from 1.1.1.1 to 2.2.2.2.

Documentation: "Call UpdateDomainRecord to modify domain name resolution records" (Alibaba Cloud DNS, Alibaba Cloud Help Center)

Use online debugging to generate the CLI command automatically.

Copy the generated CLI command.

Run the command. Output like the following indicates that the modification succeeded.

root@ops-mgr-backup:~/.aliyun# aliyun alidns UpdateDomainRecord --region cn-hangzhou --RecordId 816320087127489536 --RR idc1 --Type A --Value '2.2.2.2'
{
        "RecordId": "816320087127489536",
        "RequestId": "BCF3DFA3-2CD6-50C0-8F06-A7A79AF6D83D"
}

The Alibaba Cloud DNS console shows that the idc1 record has been updated to 2.2.2.2.

Configuration is now complete, and the aliyun CLI can query and modify records normally. The script logic can be written next.

Script content

#!/bin/bash
######################################
# Script name: switch_idc.sh
# Script version: v1.0
# Function description: Global traffic scheduling (stop loss, stress testing)
# Parameter description: sh switch_idc.sh {1to2|1to1|2to1|2to2}
# 1to2: Schedule the traffic of idc1 to idc2
# 1to1: Schedule the traffic of idc1 back to idc1
# 2to1: Schedule the traffic of idc2 to idc1
# 2to2: Schedule the traffic of idc2 back to idc2
# Core logic: Use the aliyun CLI tool to call the DNS interface to modify the record value
# Dependent tools: jq - [Linux json processing tool]
# Script author: shiyang.zhu
# Contact email: [email protected]
# Creation time: 2023-03-06
######################################


######################################
#Global variables:
# PARAM positional parameter
# SCRIPT_DIR directory where the script is located
# SCRIPT_NAME script name
# LOG_DIR log directory
# LOG_FILE log file
# LOCK_FILE lock file
# DOMAIN_NAME domain name
# IDC1_RECORD_VALUE idc1 record value
# IDC2_RECORD_VALUE idc2 record value
# ALIYUN_CLI aliyun command line tool
# REGION Alibaba Cloud Region
# PARAM_LIST parameter list
######################################
PARAM="$1"
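SCRIPT_DIR=$(dirname "$(readlink -f "$0")")
# Use only the file name so that LOG_DIR stays valid when the script is called by path
SCRIPT_NAME=$(basename "$0")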
LOG_DIR=/var/log/shell/${SCRIPT_NAME}
LOG_FILE=${LOG_DIR}/$(date +%Y-%m-%d).log
LOCK_FILE=/tmp/
DOMAIN_NAME="ops.com"
IDC1_RECORD_VALUE="1.1.1.1"
IDC2_RECORD_VALUE="2.2.2.2"
ALIYUN_CLI="/usr/local/bin/aliyun"
REGION="cn-hangzhou"
PARAM_LIST=(1to1 1to2 2to1 2to2)


[ ! -d "${LOG_DIR}" ] && mkdir -p "${LOG_DIR}"

# Error log function
function err_log() {
    echo "[$(date +'%Y-%m-%dT%H:%M:%S%z')]: [ERROR] $*" |tee -a ${LOG_FILE}
    exit 1
}

# Normal log function
function log(){
    echo "[$(date +'%Y-%m-%dT%H:%M:%S%z')]: [INFO] $*" |tee -a ${LOG_FILE}
}

# Script prompt function
function usage(){
    echo "Usage: ${SCRIPT_NAME} {1to1|1to2|2to1|2to2}"
    exit 1
}

# Check the number of script arguments
if [ "$#" != "1" ];then
    usage
fi


######################################
# Dependency command check function
# Local variables:
# CMDS: Dependency command list, separated by spaces
######################################
function check_cmd(){
    local CMDS="jq"
    for CMD in ${CMDS};do
        # Check that the command is on PATH, however it was installed
        if ! command -v "${CMD}" >/dev/null 2>&1;then
            err_log "${CMD} command does not exist, please install it manually: yum install -y ${CMD}!"
        fi
    done
}

check_cmd

######################################
# Query the RecordId and current value of a record
# Local variables:
# IDC: Source IDC
# Output: RecordId and current value separated by a space, such as 816320087127489536 2.2.2.2
######################################
function queryRecordId(){
    local IDC="$1"
    ${ALIYUN_CLI} alidns DescribeDomainRecords \
        --region ${REGION} \
        --DomainName ${DOMAIN_NAME} \
        --RRKeyWord ${IDC} \
        --TypeKeyWord A |jq -r '.DomainRecords.Record[]|.RecordId,.Value'|xargs -n2
}

######################################
# Modify the record value. If the current value already equals the
# target value, nothing is modified and a message is logged instead.
# Local variables:
# RECORD_ID: ID of the record to be modified
# IDC: Source IDC
# RECORD_VALUE: Target record value
# SRC_VALUE: Current record value, set by the caller (main) and visible here through bash dynamic scoping
######################################
function modifyRecordValue(){
    local RECORD_ID="$1"
    local IDC="$2"
    local RECORD_VALUE="$3"
    if [ "${SRC_VALUE}"x = "${RECORD_VALUE}"x ];then
        log "The target value is the same as the current record value and does not need to be modified"
    else
        ${ALIYUN_CLI} alidns UpdateDomainRecord \
            --region ${REGION} \
            --RecordId ${RECORD_ID} \
            --RR ${IDC} \
            --Type A \
            --Value ${RECORD_VALUE}
    fi
}


# Main function, redundant, not optimized
main(){
    case ${PARAM} in
        "1to2")
               local QUERY_RESULT=$(queryRecordId "idc1")
               local RECORD_ID=$(echo ${QUERY_RESULT}|awk '{print $1}')
               local SRC_VALUE=$(echo ${QUERY_RESULT}|awk '{print $2}')
               modifyRecordValue ${RECORD_ID} "idc1" "${IDC2_RECORD_VALUE}"
               log "idc1 traffic has been scheduled to idc2"
               ;;
        "1to1")
               local QUERY_RESULT=$(queryRecordId "idc1")
               local RECORD_ID=$(echo ${QUERY_RESULT}|awk '{print $1}')
               local SRC_VALUE=$(echo ${QUERY_RESULT}|awk '{print $2}')
               modifyRecordValue ${RECORD_ID} "idc1" "${IDC1_RECORD_VALUE}"
               log "idc1 traffic has been scheduled back to idc1"
               ;;
        "2to1")
               local QUERY_RESULT=$(queryRecordId "idc2")
               local RECORD_ID=$(echo ${QUERY_RESULT}|awk '{print $1}')
               local SRC_VALUE=$(echo ${QUERY_RESULT}|awk '{print $2}')
               modifyRecordValue ${RECORD_ID} "idc2" "${IDC1_RECORD_VALUE}"
               log "idc2 traffic has been scheduled to idc1"
               ;;
        "2to2")
               local QUERY_RESULT=$(queryRecordId "idc2")
               local RECORD_ID=$(echo ${QUERY_RESULT}|awk '{print $1}')
               local SRC_VALUE=$(echo ${QUERY_RESULT}|awk '{print $2}')
               modifyRecordValue ${RECORD_ID} "idc2" "${IDC2_RECORD_VALUE}"
               log "idc2 traffic has been scheduled back to idc2"
               ;;
        *)
               usage
               ;;
    esac
}

# Call the main function
main ${PARAM}
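
The comment on main describes it as redundant. One way it might be condensed, sketched here under the assumption that the argument always has the form `<source>to<target>`, is to derive the record name and the target value from the argument itself. `plan_switch` is a hypothetical helper and is not part of the original script.

```shell
IDC1_RECORD_VALUE="1.1.1.1"
IDC2_RECORD_VALUE="2.2.2.2"

# plan_switch ACTION: print "<record to edit> <value to set>" for an action
# such as 1to2, or return 1 if the action is not recognised.
plan_switch() {
  local src dst
  case "$1" in
    [12]to[12]) ;;
    *) return 1 ;;
  esac
  src="${1%to*}"   # digit before "to": the entrance record to edit
  dst="${1#*to}"   # digit after "to": the data center that receives the traffic
  case "$dst" in
    1) echo "idc${src} ${IDC1_RECORD_VALUE}" ;;
    2) echo "idc${src} ${IDC2_RECORD_VALUE}" ;;
  esac
}

plan_switch 1to2   # idc1 2.2.2.2
plan_switch 2to2   # idc2 2.2.2.2
```

With a helper like this, main could call `plan_switch "${PARAM}"`, split the output into the record name and the target value, and run the query and update steps once instead of repeating them in four case branches.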

Expected results

Simulate idc1 failure and execute sh switch_idc.sh 1to2

root@ops-mgr-backup:~# sh switch_idc.sh 1to2
{
        "RecordId": "816320087127489536",
        "RequestId": "25068F7A-53E3-5493-B115-1D51E543BABE"
}
[2023-03-07T16:16:23+0800]: [INFO] idc1 traffic has been scheduled to idc2

Simulate idc1 recovery and execute sh switch_idc.sh 1to1

root@ops-mgr-backup:~# sh switch_idc.sh 1to1
{
        "RecordId": "816320087127489536",
        "RequestId": "8CAA0E4C-A12B-5A2B-8543-63B4F6BC7C50"
}
[2023-03-07T16:16:25+0800]: [INFO] idc1 traffic has been scheduled back to idc1

Simulate idc2 failure and execute sh switch_idc.sh 2to1

root@ops-mgr-backup:~# sh switch_idc.sh 2to1
{
        "RecordId": "816320103958200320",
        "RequestId": "FFF101DE-C96D-5CD5-A50C-FEB1B4E3926A"
}
[2023-03-07T16:16:29+0800]: [INFO] idc2 traffic has been scheduled to idc1

Simulate idc2 recovery and execute sh switch_idc.sh 2to2

root@ops-mgr-backup:~# sh switch_idc.sh 2to2
{
        "RecordId": "816320103958200320",
        "RequestId": "4D948874-E66F-5A57-B84F-6DEBA95BA889"
}
[2023-03-07T16:16:31+0800]: [INFO] idc2 traffic has been scheduled back to idc2
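
A successful API response only means the authoritative record has changed. Recursive resolvers may keep serving the cached address until the TTL (600 seconds in the DescribeDomainRecords output earlier) runs out. A rough sketch of polling with `dig` until the local resolver returns the expected address follows. The function name, the retry defaults, and the use of `dig +short` are assumptions for this example; the original walkthrough does not include a verification step.

```shell
# wait_for_a_record NAME EXPECTED_IP [TRIES] [INTERVAL_SECONDS]
# Poll the system resolver until NAME resolves to EXPECTED_IP or tries run out.
wait_for_a_record() {
  local name="$1" expected="$2" tries="${3:-10}" interval="${4:-60}" i=1
  while [ "$i" -le "$tries" ]; do
    # grep -qxF: succeed only on a whole-line, literal match of the address.
    if dig +short "$name" A | grep -qxF "$expected"; then
      echo "$name resolves to $expected"
      return 0
    fi
    i=$((i + 1))
    # Sleep between attempts, but not after the last one.
    [ "$i" -le "$tries" ] && sleep "$interval"
  done
  echo "$name did not resolve to $expected after $tries tries" >&2
  return 1
}

# Example: wait_for_a_record idc1.ops.com 2.2.2.2
```

The defaults of 10 tries at 60-second intervals are chosen only to cover one 600-second TTL window; a lower TTL on the entrance records would shorten the wait.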
