HDFS cluster NameNode high availability transformation

Article directory

    • Background
    • High availability transformation
      • Cluster planning
      • Plan implementation
        • Environment preparation
        • Configuration file modification
        • Application configuration
        • Cluster status verification
        • High availability verification

Background

Assume there are currently three ZooKeeper servers, zk-01/02/03, and several DataNode servers.

The HDFS cluster's NameNode currently has no high availability configuration; the NameNode and the Secondary NameNode are both located on zk-03.

The Secondary NameNode's role is to assist NameNode recovery by merging the edit log into the fsimage; it is not a high-availability standby for the NameNode.
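
A quick way to confirm the current layout (a minimal check, assuming the non-HA cluster is still running) is to run jps on zk-03; both roles appear on that single host:

    $ jps
    PID1 NameNode
    PID2 SecondaryNameNode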

High availability transformation

Cluster planning

zk-01                    zk-02                    zk-03
Active NameNode          Standby NameNode         -
JournalNode              JournalNode              JournalNode
ZK Failover Controller   ZK Failover Controller   -

Hadoop versions before 3.0 support only a single Standby NameNode; Hadoop 3.0 and later supports multiple Standby NameNodes.

Plan implementation

Environment preparation
  1. Turn off the firewall
  2. Configure passwordless SSH login among zk-01/02/03 (see the sketch after this list)
  3. Configure JDK environment variables
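
A minimal sketch of step 2 (assuming a hadoop user and the default key path; adjust to your environment): generate a key pair on each of zk-01/02/03, copy the public key to the other two hosts, and verify that no password prompt appears.

    # run as the hadoop user on each host; shown here for zk-01
    $ ssh-keygen -t rsa -N "" -f ~/.ssh/id_rsa    # skip if the key already exists
    $ ssh-copy-id hadoop@zk-02
    $ ssh-copy-id hadoop@zk-03
    # verify passwordless login
    $ ssh zk-02 hostname
    $ ssh zk-03 hostname
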
Configuration file modification
  1. hadoop-2.7.3/etc/hadoop/core-site.xml

    <configuration>
        <property>
            <name>fs.defaultFS</name>
            <!-- delete next line -->
            <value>hdfs://zk-03:9000</value> <!-- delete -->
            <!-- delete done -->
            <!-- add next line -->
            <value>hdfs://hacluster</value>
            <!-- add done -->
        </property>
        <property>
            <name>io.file.buffer.size</name>
            <value>131072</value>
        </property>
        <property>
            <name>hadoop.tmp.dir</name>
            <value>/data/0/hadoop/hadoop/tmp</value>
        </property>
        <!-- add next 5 lines -->
        <property>
            <name>ha.zookeeper.quorum</name>
            <value>zk-01:2181,zk-02:2181,zk-03:2181</value>
            <description>Specify zookeeper address</description>
        </property>
        <!-- add done -->
    </configuration>
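
    With fs.defaultFS pointing at the logical nameservice, clients no longer address a fixed NameNode host. For example (a sketch assuming the configuration above; run it after the HA cluster has been brought up), both commands below resolve to whichever NameNode is currently active:

    $ hdfs/bin/hdfs dfs -ls /                     # uses fs.defaultFS (hdfs://hacluster)
    $ hdfs/bin/hdfs dfs -ls hdfs://hacluster/     # explicit logical URI, no host name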
    
  2. hadoop-2.7.3/etc/hadoop/hdfs-site.xml

    <configuration>
        <!-- add the following lines -->
        <property>
            <name>dfs.nameservices</name>
            <value>hacluster</value>
            <description>Specify the nameservice of HDFS as hacluster; it must match fs.defaultFS in core-site.xml</description>
        </property>
        <property>
            <name>dfs.ha.namenodes.hacluster</name>
            <value>namenode1,namenode2</value>
            <description>There are two NameNodes under hacluster</description>
        </property>
    
        <property>
            <name>dfs.namenode.rpc-address.hacluster.namenode1</name>
            <value>zk-01:9000</value>
        </property>
        <property>
            <name>dfs.namenode.http-address.hacluster.namenode1</name>
            <value>zk-01:50070</value>
        </property>
        <property>
            <name>dfs.namenode.rpc-address.hacluster.namenode2</name>
            <value>zk-02:9000</value>
        </property>
        <property>
            <name>dfs.namenode.http-address.hacluster.namenode2</name>
            <value>zk-02:50070</value>
        </property>
    
        <property>
            <name>dfs.ha.fencing.methods</name>
            <value>sshfence</value>
            <description>Configure the fencing (isolation) mechanism so that only one NameNode serves clients at a time</description>
        </property>
        <property>
            <name>dfs.ha.fencing.ssh.private-key-files</name>
            <value>/home/hadoop/.ssh/id_rsa</value>
            <description>The private key used for passwordless SSH login by the sshfence mechanism</description>
        </property>

        <property>
            <name>dfs.ha.automatic-failover.enabled</name>
            <value>true</value>
            <description>Enable automatic failover when the active NameNode fails</description>
        </property>
        <property>
            <name>dfs.client.failover.proxy.provider.hacluster</name>
            <value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
            <description>The proxy provider class that clients use to find the currently active NameNode</description>
        </property>

        <property>
            <name>dfs.namenode.shared.edits.dir</name>
            <value>qjournal://zk-01:8485;zk-02:8485;zk-03:8485/hacluster</value>
            <description>Specify where the NameNodes' shared edit log is stored on the JournalNodes</description>
        </property>
        <property>
            <name>dfs.journalnode.edits.dir</name>
            <value>/data/0/hadoop/hadoop/journal</value>
            <description>Specify where each JournalNode stores its data on the local disk</description>
        </property>
        <!-- add done -->
        <property>
            <name>dfs.replication</name>
            <value>2</value>
        </property>
        <property>
            <name>dfs.namenode.name.dir</name>
            <value>/data/0/hadoop/hadoop/name</value>
        </property>
        <property>
            <name>dfs.blocksize</name>
            <value>268435456</value>
        </property>
        <property>
            <name>dfs.namenode.handler.count</name>
            <value>100</value>
        </property>
        <property>
            <name>dfs.datanode.data.dir</name>
            <value>/data/0/hadoop/hadoop/data,/data/1/hadoop/hadoop/data,/data/2/hadoop/hadoop/data,/data/3/hadoop/hadoop/data,/data/4/hadoop/hadoop/data,/data/5/hadoop/hadoop/data,/data/6/hadoop/hadoop/data,/data/7/hadoop/hadoop/data,/data/8/hadoop/hadoop/data,/data/9/hadoop/hadoop/data,/data/10/hadoop/hadoop/data,/data/11/hadoop/hadoop/data</value>
        </property>
    </configuration>
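
    The sshfence method only works if the user running the NameNode can log in over SSH from either NameNode host to the other using the private key configured above. A quick pre-check (assuming the hadoop user and the key path from dfs.ha.fencing.ssh.private-key-files):

    # on zk-01; repeat in the other direction on zk-02
    $ ssh -i /home/hadoop/.ssh/id_rsa hadoop@zk-02 hostname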
    
Application configuration
  1. Log in to zk-03
$ scp -r /home/hadoop/hadoop-2.7.3 zk-02:/home/hadoop/
$ scp -r /home/hadoop/hadoop-2.7.3 zk-01:/home/hadoop/
# Stop the existing HDFS cluster
$ hdfs/sbin/stop-dfs.sh
  2. zk-01/02/03: Start all JournalNodes
$ hdfs/sbin/hadoop-daemon.sh start journalnode
  3. zk-01: Initialize and start namenode1, zkfc
# Initialize and start namenode1
$ hdfs/bin/hdfs namenode -format
$ hdfs/bin/hdfs namenode -initializeSharedEdits
$ hdfs/sbin/hadoop-daemon.sh start namenode
# Initialize ha cluster information in ZK
$ hdfs/bin/hdfs zkfc -formatZK
  4. zk-02: Start namenode2, zkfc
# Synchronize the metadata of namenode1 from zk-01, then start namenode2
$ hdfs/bin/hdfs namenode -bootstrapStandby
$ hdfs/sbin/hadoop-daemon.sh start namenode
# Synchronize the HA cluster information in ZK from zk-02
$ hdfs/bin/hdfs zkfc -formatZK
  5. zk-01: Start the remaining services in the cluster, including the DataNodes (a quick check follows below)
$ hdfs/sbin/start-dfs.sh
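Optionally, run a quick smoke test at this point (assuming the cluster came up cleanly): ask the NameNode for a cluster report; every DataNode should appear as a live node.
$ hdfs/bin/hdfs dfsadmin -report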
Cluster status verification
  1. Log in to zk-01/02 and execute jps respectively. The result should contain:
    • PID1 JournalNode
    • PID2 NameNode
    • PID3 DFSZKFailoverController
  • If the JournalNode process does not exist, execute:
    sbin/hadoop-daemon.sh start journalnode
  • If the DFSZKFailoverController process does not exist, execute:
    sbin/hadoop-daemon.sh start zkfc
  • If the NameNode process does not exist, execute:
    sbin/hadoop-daemon.sh start namenode
  2. Log in to zk-03 and execute jps. The result should contain:
    • PID JournalNode
  • If the JournalNode process does not exist, execute:

    sbin/hadoop-daemon.sh start journalnode

  3. On any DataNode server, execute jps, and the result should contain:
    • PID1 DataNode
  • If there is no DataNode process, execute:

    sbin/hadoop-daemon.sh start datanode
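
Instead of logging in to each server one by one, the same jps checks can be scripted from zk-01 (a sketch assuming passwordless SSH to every host; extend the list with your DataNode servers):

    # note: jps must be on the PATH of the remote non-interactive shell
    $ for h in zk-01 zk-02 zk-03; do echo "== $h =="; ssh $h jps; done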

High availability verification
  1. Log in to zk-01 and check the status of namenode1:

    bin/hdfs haadmin -getServiceState namenode1; the output should be active.

    If the result above is standby, you can run the following command to switch the active NameNode to namenode1:

    bin/hdfs haadmin -transitionToActive --forcemanual namenode1

    Execute the command again to view the status of namenode1 and namenode2:

    bin/hdfs haadmin -getServiceState namenode1, the output should be active;

    bin/hdfs haadmin -getServiceState namenode2, the output should be standby.

  2. Log in to zk-01 and stop namenode1: sbin/hadoop-daemon.sh stop namenode
    The zkfc process should stop automatically as well. Execute jps; NameNode and DFSZKFailoverController should no longer appear in the output.
    Check the status of namenode2:
    bin/hdfs haadmin -getServiceState namenode2, the result should be active.

  3. Restart namenode1:
    sbin/hadoop-daemon.sh start namenode
    Check the status of namenode1:

    bin/hdfs haadmin -getServiceState namenode1, the result should be standby.

    At this point, you can use the failover command from step 1 to switch the active NameNode back to namenode1.
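
    Optionally (assuming the default znode layout used by the failover controller), you can also check in ZooKeeper which NameNode currently holds the active lock; the output is partly binary, but the host name of the active NameNode is readable in it:

    # zkCli.sh ships with ZooKeeper; adjust the path to your installation
    $ zkCli.sh -server zk-01:2181 get /hadoop-ha/hacluster/ActiveStandbyElectorLock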