HDFS cluster NameNode high availability transformation

Article directory

    • Background
    • High availability transformation
      • Cluster planning
      • Plan implementation
        • Environment preparation
        • Configuration file modification
        • Application configuration
        • Cluster status verification
        • High availability verification

Background

Assume there are currently three ZooKeeper servers, zk-01/02/03, and several DataNode servers.

The HDFS cluster's NameNode currently has no high availability configuration; the NameNode and the Secondary NameNode are both located on zk-03.

The Secondary NameNode's role is to assist NameNode recovery by merging the edit log into the fsimage; it is not a high-availability standby for the NameNode.
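
A quick way to confirm the current layout (a minimal check, assuming the non-HA cluster is still running) is to run jps on zk-03; both roles appear on that single host:

    $ jps
    PID1 NameNode
    PID2 SecondaryNameNode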

High availability transformation

Cluster planning

zk-01                    zk-02                    zk-03
Active NameNode          Standby NameNode         -
JournalNode              JournalNode              JournalNode
ZK Failover Controller   ZK Failover Controller   -

Hadoop versions before 3.0 support only a single Standby NameNode; Hadoop 3.0 and later supports multiple Standby NameNodes.

Plan implementation

Environment preparation
  1. Turn off the firewall
  2. Configure passwordless SSH login among zk-01/02/03 (see the sketch after this list)
  3. Configure JDK environment variables
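
A minimal sketch of step 2 (assuming a hadoop user and the default key path; adjust to your environment): generate a key pair on each of zk-01/02/03, copy the public key to the other two hosts, and verify that no password prompt appears.

    # run as the hadoop user on each host; shown here for zk-01
    $ ssh-keygen -t rsa -N "" -f ~/.ssh/id_rsa    # skip if the key already exists
    $ ssh-copy-id hadoop@zk-02
    $ ssh-copy-id hadoop@zk-03
    # verify passwordless login
    $ ssh zk-02 hostname
    $ ssh zk-03 hostname
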
Configuration file modification
  1. hadoop-2.7.3/etc/hadoop/core-site.xml

    <configuration>
        <property>
            <name>fs.defaultFS</name>
            <!-- delete next line -->
            <value>hdfs://zk-03:9000</value> <!-- delete -->
            <!-- delete done -->
            <!-- add next line -->
            <value>hdfs://hacluster</value>
            <!-- add done -->
        </property>
        <property>
            <name>io.file.buffer.size</name>
            <value>131072</value>
        </property>
        <property>
            <name>hadoop.tmp.dir</name>
            <value>/data/0/hadoop/hadoop/tmp</value>
        </property>
        <!-- add next 5 lines -->
        <property>
            <name>ha.zookeeper.quorum</name>
            <value>zk-01:2181,zk-02:2181,zk-03:2181</value>
            <description>Specify zookeeper address</description>
        </property>
        <!-- add done -->
    </configuration>
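
    With fs.defaultFS pointing at the logical nameservice, clients no longer address a fixed NameNode host. For example (a sketch assuming the configuration above; run it after the HA cluster has been brought up), both commands below resolve to whichever NameNode is currently active:

    $ hdfs/bin/hdfs dfs -ls /                     # uses fs.defaultFS (hdfs://hacluster)
    $ hdfs/bin/hdfs dfs -ls hdfs://hacluster/     # explicit logical URI, no host name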
    
  2. hadoop-2.7.3/etc/hadoop/hdfs-site.xml

    <configuration>
        <!-- add the following lines -->
        <property>
            <name>dfs.nameservices</name>
            <value>hacluster</value>
            <description>Specify the nameservice of HDFS as hacluster; it must match fs.defaultFS in core-site.xml</description>
        </property>
        <property>
            <name>dfs.ha.namenodes.hacluster</name>
            <value>namenode1,namenode2</value>
            <description>There are two NameNodes under hacluster</description>
        </property>
    
        <property>
            <name>dfs.namenode.rpc-address.hacluster.namenode1</name>
            <value>zk-01:9000</value>
        </property>
        <property>
            <name>dfs.namenode.http-address.hacluster.namenode1</name>
            <value>zk-01:50070</value>
        </property>
        <property>
            <name>dfs.namenode.rpc-address.hacluster.namenode2</name>
            <value>zk-02:9000</value>
        </property>
        <property>
            <name>dfs.namenode.http-address.hacluster.namenode2</name>
            <value>zk-02:50070</value>
        </property>
    
        <property>
            <name>dfs.ha.fencing.methods</name>
            <value>sshfence</value>
            <description>Configure the fencing (isolation) mechanism so that only one NameNode serves clients at a time</description>
        </property>
        <property>
            <name>dfs.ha.fencing.ssh.private-key-files</name>
            <value>/home/hadoop/.ssh/id_rsa</value>
            <description>The private key used for passwordless SSH login by the sshfence mechanism</description>
        </property>

        <property>
            <name>dfs.ha.automatic-failover.enabled</name>
            <value>true</value>
            <description>Enable automatic failover when the active NameNode fails</description>
        </property>
        <property>
            <name>dfs.client.failover.proxy.provider.hacluster</name>
            <value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
            <description>The proxy provider class that clients use to find the currently active NameNode</description>
        </property>

        <property>
            <name>dfs.namenode.shared.edits.dir</name>
            <value>qjournal://zk-01:8485;zk-02:8485;zk-03:8485/hacluster</value>
            <description>Specify where the NameNodes' shared edit log is stored on the JournalNodes</description>
        </property>
        <property>
            <name>dfs.journalnode.edits.dir</name>
            <value>/data/0/hadoop/hadoop/journal</value>
            <description>Specify where each JournalNode stores its data on the local disk</description>
        </property>
        <!-- add done -->
        <property>
            <name>dfs.replication</name>
            <value>2</value>
        </property>
        <property>
            <name>dfs.namenode.name.dir</name>
            <value>/data/0/hadoop/hadoop/name</value>
        </property>
        <property>
            <name>dfs.blocksize</name>
            <value>268435456</value>
        </property>
        <property>
            <name>dfs.namenode.handler.count</name>
            <value>100</value>
        </property>
        <property>
            <name>dfs.datanode.data.dir</name>
            <value>/data/0/hadoop/hadoop/data,/data/1/hadoop/hadoop/data,/data/2/hadoop/hadoop/data,/data/3/hadoop/hadoop/data,/data/4/hadoop/hadoop/data,/data/5/hadoop/hadoop/data,/data/6/hadoop/hadoop/data,/data/7/hadoop/hadoop/data,/data/8/hadoop/hadoop/data,/data/9/hadoop/hadoop/data,/data/10/hadoop/hadoop/data,/data/11/hadoop/hadoop/data</value>
        </property>
    </configuration>
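
    The sshfence method only works if the user running the NameNode can log in over SSH from either NameNode host to the other using the private key configured above. A quick pre-check (assuming the hadoop user and the key path from dfs.ha.fencing.ssh.private-key-files):

    # on zk-01; repeat in the other direction on zk-02
    $ ssh -i /home/hadoop/.ssh/id_rsa hadoop@zk-02 hostname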
    
Application configuration
  1. Log in to zk-03
$ scp -r /home/hadoop/hadoop-2.7.3 zk-02:/home/hadoop/
$ scp -r /home/hadoop/hadoop-2.7.3 zk-01:/home/hadoop/
# Stop the existing HDFS cluster
$ hdfs/sbin/stop-dfs.sh
  2. zk-01/02/03: Start all JournalNodes
$ hdfs/sbin/hadoop-daemon.sh start journalnode
  3. zk-01: Initialize and start namenode1, zkfc
# Initialize and start namenode1
$ hdfs/bin/hdfs namenode -format
$ hdfs/bin/hdfs namenode -initializeSharedEdits
$ hdfs/sbin/hadoop-daemon.sh start namenode
# Initialize ha cluster information in ZK
$ hdfs/bin/hdfs zkfc -formatZK
  4. zk-02: Start namenode2, zkfc
# Synchronize the metadata of namenode1 from zk-01, then start namenode2
$ hdfs/bin/hdfs namenode -bootstrapStandby
$ hdfs/sbin/hadoop-daemon.sh start namenode
# Synchronize the HA cluster information in ZK from zk-02
$ hdfs/bin/hdfs zkfc -formatZK
  5. zk-01: Start the remaining services in the cluster, including the DataNodes (a quick check follows below)
$ hdfs/sbin/start-dfs.sh
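Optionally, run a quick smoke test at this point (assuming the cluster came up cleanly): ask the NameNode for a cluster report; every DataNode should appear as a live node.
$ hdfs/bin/hdfs dfsadmin -report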
Cluster status verification
  1. Log in to zk-01/02 and execute jps respectively. The result should contain:
    • PID1 JournalNode
    • PID2 NameNode
    • PID3 DFSZKFailoverController
  • If the JournalNode process does not exist, execute:
    sbin/hadoop-daemon.sh start journalnode
  • If the DFSZKFailoverController process does not exist, execute:
    sbin/hadoop-daemon.sh start zkfc
  • If the NameNode process does not exist, execute:
    sbin/hadoop-daemon.sh start namenode
  2. Log in to zk-03 and execute jps. The result should contain:
    • PID JournalNode
  • If the JournalNode process does not exist, execute:

    sbin/hadoop-daemon.sh start journalnode

  3. On any DataNode server, execute jps, and the result should contain:
    • PID1 DataNode
  • If there is no DataNode process, execute:

    sbin/hadoop-daemon.sh start datanode
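
Instead of logging in to each server one by one, the same jps checks can be scripted from zk-01 (a sketch assuming passwordless SSH to every host; extend the list with your DataNode servers):

    # note: jps must be on the PATH of the remote non-interactive shell
    $ for h in zk-01 zk-02 zk-03; do echo "== $h =="; ssh $h jps; done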

High availability verification
  1. Log in to zk-01 and check the status of namenode1:

    bin/hdfs haadmin -getServiceState namenode1; the output should be active.

    If the result above is standby, you can run the following command to switch the active NameNode to namenode1:

    bin/hdfs haadmin -transitionToActive --forcemanual namenode1

    Execute the command again to view the status of namenode1 and namenode2:

    bin/hdfs haadmin -getServiceState namenode1, the output should be active;

    bin/hdfs haadmin -getServiceState namenode2, the output should be standby.

  2. Log in to zk-01 and stop namenode1: sbin/hadoop-daemon.sh stop namenode
    The zkfc process should stop automatically as well. Execute jps; NameNode and DFSZKFailoverController should no longer appear in the output.
    Check the status of namenode2:
    bin/hdfs haadmin -getServiceState namenode2, the result should be active.

  3. Restart namenode1:
    sbin/hadoop-daemon.sh start namenode
    Check the status of namenode1:

    bin/hdfs haadmin -getServiceState namenode1, the result should be standby.

    At this point, you can use the failover command from step 1 to switch the active NameNode back to namenode1.
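
    Optionally (assuming the default znode layout used by the failover controller), you can also check in ZooKeeper which NameNode currently holds the active lock; the output is partly binary, but the host name of the active NameNode is readable in it:

    # zkCli.sh ships with ZooKeeper; adjust the path to your installation
    $ zkCli.sh -server zk-01:2181 get /hadoop-ha/hacluster/ActiveStandbyElectorLock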