Table of Contents
Preparations required for construction:
Building process:
1. Install the virtual machine
2. Configure the network
3. Modify the host name
4. Bind the host name and IP
5. Configure password-free login
6. Use remote connection tools to upload jdk and hadoop
7. Install jdk and Hadoop
1. Unzip jdk and hadoop
2. Configure jdk and hadoop environment variables
3. Add the JDK environment
4. core-site.xml
5. hdfs-site.xml
6. yarn-site.xml
7. mapred-site.xml
8. Configure the workers file
9. Copy to the slave nodes
10. Format the NameNode file system
11. Start the Hadoop cluster
12. Test whether you can connect to the Hadoop platform
Preparations required for construction:
- Virtual machine: VMware Workstation Pro 17
- ISO image file: CentOS-6.5-x86_64-bin-DVD1.iso
- JDK version: jdk-8u171-linux-x64.tar.gz
- Hadoop version: hadoop-3.3.0.tar.gz
- Remote connection tool: MobaXterm
Building process:
1. Install the virtual machines
One virtual machine is required as the master node; we call it HadoopMaster (the host name can be anything you like).
Two or more virtual machines are required as slave nodes (here we name them HadoopSlave1 and HadoopSlave2).
2. Configure the network
- On the Windows host, open a command prompt and run ipconfig to view the IP address and gateway.
- We use bridged networking (automatic). After switching WiFi networks, make sure the network segment stays the same; otherwise the virtual machines will not be able to ping the gateway.
- Here our network segment is 43, the WiFi IP address is 192.168.43.146, and the gateway is 192.168.43.1.
- When configuring the virtual machine's network card, make sure its IP address does not conflict with the WiFi address above, but the gateway must match.
Open the virtual machine and edit the file ifcfg-eth0 as the root user to configure the network; an ordinary user does not have sufficient permissions.
Switch to the root user: su -
Edit ifcfg-eth0 to configure the network: vim /etc/sysconfig/network-scripts/ifcfg-eth0
- Keep DEVICE=eth0, ONBOOT, and BOOTPROTO
- Comment out the remaining lines with #
- Start the network card at boot: change ONBOOT=no to ONBOOT=yes
- Use a static IP: change BOOTPROTO=dhcp to BOOTPROTO=static
Note: Except for the different IP addresses, the other configurations of HadoopMaster, HadoopSlave1, and HadoopSlave2 are the same.
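With those edits, the resulting ifcfg-eth0 on HadoopMaster might look like the sketch below. The IP address follows this guide's 192.168.43.x segment; the NETMASK and DNS1 lines are typical additions not taken from this guide, so substitute values for your own network.

```shell
# Example /etc/sysconfig/network-scripts/ifcfg-eth0 for HadoopMaster.
# IPADDR follows this guide's 192.168.43.x segment; NETMASK and DNS1 are
# typical assumed values -- adjust all of them for your own network.
DEVICE=eth0
ONBOOT=yes
BOOTPROTO=static
IPADDR=192.168.43.110
NETMASK=255.255.255.0
GATEWAY=192.168.43.1
DNS1=192.168.43.1
```

HadoopSlave1 and HadoopSlave2 use the same file with IPADDR changed to 192.168.43.111 and 192.168.43.112.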
After ifcfg-eth0 configuration is completed:
- Turn off the firewall: chkconfig iptables off
- Refresh the network card: service network restart
Note: All three virtual machines need to execute these commands.
Ensure that the IP and gateway of each virtual machine can be pinged.
3. Modify the host name
Needs to be modified in the configuration file.
Use command: vim /etc/sysconfig/network
- Change HOSTNAME to your own hostname
Note: All three virtual machines must execute this step.
4. Bind the host name and IP
Use command: vim /etc/hosts
The IP address of each machine corresponds to its own host name.
- 192.168.43.110 HadoopMaster
- 192.168.43.111 HadoopSlave1
- 192.168.43.112 HadoopSlave2
Note: Each virtual machine must execute this step.
Make sure each host name can be pinged.
5. Configure password-free login
- Generate key pair:
ssh-keygen -t rsa
- Keep pressing Enter
- Copy the public key into the key file:
cp ~/.ssh/id_rsa.pub ~/.ssh/authorized_keys
Note: This needs to be executed on each virtual machine.
- Remotely copy the key file to the slave node
scp ~/.ssh/authorized_keys zxa@HadoopSlave1:~/.ssh/
scp ~/.ssh/authorized_keys zxa@HadoopSlave2:~/.ssh/
From the master node, log in to each slave node with ssh to verify that no password is required:
6. Use the remote connection tool to upload jdk and hadoop
You can use Xshell, MobaXterm, or WinSCP to upload files. MobaXterm is used here.
Create an ordinary user zxa, create a folder software in the home directory, and then upload the compressed packages of Hadoop and jdk to the folder software.
- useradd zxa
- su - zxa
- mkdir software
- Upload jdk and hadoop compressed packages to software
Note: Creating the ordinary user zxa and directory software needs to be done for each virtual machine.
You can connect to each virtual machine in turn and upload the packages to software, or upload them to the master node HadoopMaster and then copy them remotely to the slave nodes HadoopSlave1 and HadoopSlave2:
scp /home/zxa/software/hadoop-3.3.0.tar.gz zxa@HadoopSlave1:/home/zxa/software/
scp /home/zxa/software/hadoop-3.3.0.tar.gz zxa@HadoopSlave2:/home/zxa/software/
scp /home/zxa/software/jdk-8u171-linux-x64.tar.gz zxa@HadoopSlave1:/home/zxa/software/
scp /home/zxa/software/jdk-8u171-linux-x64.tar.gz zxa@HadoopSlave2:/home/zxa/software/
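Instead of typing the four scp commands one by one, they can be generated by a small loop. This is only a sketch using the hostnames, user, and paths from this guide; it collects the commands into a variable first so you can review them before running them.

```shell
# Sketch: build the list of scp commands for every slave/package pair.
# Hostnames, user, and paths are the ones used in this guide.
cmds=""
for host in HadoopSlave1 HadoopSlave2; do
  for pkg in hadoop-3.3.0.tar.gz jdk-8u171-linux-x64.tar.gz; do
    cmds="$cmds
scp /home/zxa/software/$pkg zxa@$host:/home/zxa/software/"
  done
done
# Print the commands for review; they are not executed yet.
echo "$cmds"
```

Once password-free login is in place, `echo "$cmds" | sh` executes the transfers.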
7. Install jdk and Hadoop
1. Decompress jdk and hadoop
- Decompress the jdk compressed package: tar -zxvf jdk-8u171-linux-x64.tar.gz
- Decompress the hadoop compressed package: tar -zxvf hadoop-3.3.0.tar.gz
- Create the directory hadooptmp under software: mkdir hadooptmp
Note: Each virtual machine needs to execute
2. Configure jdk and hadoop environment variables
- vim /home/zxa/.bash_profile
#jdk
export JAVA_HOME=/home/zxa/software/jdk1.8.0_171
export PATH=$JAVA_HOME/bin:$PATH
#hadoop
export HADOOP_HOME=/home/zxa/software/hadoop-3.3.0
export PATH=$HADOOP_HOME/bin:$HADOOP_HOME/sbin:$PATH
- source /home/zxa/.bash_profile
- java -version (check whether jdk configuration is successful)
Note: You must run source, otherwise the configuration will not take effect. Each virtual machine needs to be configured.
3. Add the JDK environment
Switch to the Hadoop configuration directory: /home/zxa/software/hadoop-3.3.0/etc/hadoop/
Add the following line to both hadoop-env.sh and yarn-env.sh:
export JAVA_HOME=/home/zxa/software/jdk1.8.0_171
4. core-site.xml
Content:
<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://HadoopMaster:9000</value>
  </property>
  <property>
    <name>hadoop.tmp.dir</name>
    <value>/home/zxa/software/hadooptmp</value>
  </property>
</configuration>
Note: You only need to do this on the master node HadoopMaster.
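The hand edits can also be scripted. The sketch below writes core-site.xml from a heredoc into a demo directory; on HadoopMaster the real target directory would be /home/zxa/software/hadoop-3.3.0/etc/hadoop, and the same pattern works for hdfs-site.xml, yarn-site.xml, and mapred-site.xml below.

```shell
# Sketch: generate core-site.xml from a heredoc instead of editing by hand.
# conf_dir is a demo directory here; on HadoopMaster it would be
# /home/zxa/software/hadoop-3.3.0/etc/hadoop.
conf_dir=./demo-conf
mkdir -p "$conf_dir"
cat > "$conf_dir/core-site.xml" <<'EOF'
<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://HadoopMaster:9000</value>
  </property>
  <property>
    <name>hadoop.tmp.dir</name>
    <value>/home/zxa/software/hadooptmp</value>
  </property>
</configuration>
EOF
# Show the generated file.
cat "$conf_dir/core-site.xml"
```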
5. hdfs-site.xml
Content:
<configuration>
  <property>
    <name>dfs.replication</name>
    <value>3</value>
  </property>
</configuration>
Note: You only need to do this on the master node HadoopMaster.
6. yarn-site.xml
Content:
<configuration>
  <property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle</value>
  </property>
  <property>
    <name>yarn.resourcemanager.address</name>
    <value>HadoopMaster:18040</value>
  </property>
  <property>
    <name>yarn.resourcemanager.scheduler.address</name>
    <value>HadoopMaster:18030</value>
  </property>
  <property>
    <name>yarn.resourcemanager.resource-tracker.address</name>
    <value>HadoopMaster:18025</value>
  </property>
  <property>
    <name>yarn.resourcemanager.admin.address</name>
    <value>HadoopMaster:18141</value>
  </property>
  <property>
    <name>yarn.resourcemanager.webapp.address</name>
    <value>HadoopMaster:8088</value>
  </property>
</configuration>
Note: You only need to do this on the master node HadoopMaster.
7. mapred-site.xml
Content:
<configuration>
  <property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
  </property>
</configuration>
Note: You only need to do this on the master node HadoopMaster.
8. Configure the workers file
vim /home/zxa/software/hadoop-3.3.0/etc/hadoop/workers
Replace the contents of the workers file with:
HadoopSlave1
HadoopSlave2
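The same one-step approach works here; this sketch writes the two slave host names to a demo path standing in for the real workers file.

```shell
# Sketch: write the workers file in one step. The demo path stands in for
# /home/zxa/software/hadoop-3.3.0/etc/hadoop/workers on HadoopMaster.
workers_file=./demo-workers
printf '%s\n' HadoopSlave1 HadoopSlave2 > "$workers_file"
cat "$workers_file"
```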
9. Copy to the slave nodes
Use the following commands to copy the configured Hadoop directory to the slave nodes:
scp -r hadoop-3.3.0 zxa@HadoopSlave1:~/software/
scp -r hadoop-3.3.0 zxa@HadoopSlave2:~/software/
Note: Only execute this on the master node HadoopMaster.
10. Format the NameNode file system
The formatting command is as follows. This operation only needs to be executed on the HadoopMaster node:
hdfs namenode -format
11. Start the Hadoop cluster
Start command: start-all.sh
View started processes: jps
Note: The startup command only needs to be entered on the master node HadoopMaster. jps should show four processes on the master node (typically NameNode, SecondaryNameNode, ResourceManager, and Jps itself) and three on each slave node (typically DataNode, NodeManager, and Jps).
12. Test whether you can connect to the Hadoop platform
In a browser, enter the IP address of the master node HadoopMaster, a colon, and the port number:
192.168.43.110:9870 (HDFS NameNode web UI)
192.168.43.110:8088 (YARN ResourceManager web UI)
At this point, the Hadoop fully distributed cluster has been successfully built!