Host Settings
Disable the firewall on all three hosts
Disable SELinux on all three hosts
Set the hostname on each host
Configure address mappings on all three hosts
Set up passwordless login
Install the JDK
Extract the installation package
Configure JDK environment variables
Verify the installation
Distribute the JDK
Install and Configure Hadoop
Extract the installation package
Hadoop environment configuration file – hadoop-env.sh
Modify the Hadoop configuration file core-site.xml
HDFS configuration file hdfs-site.xml
YARN configuration file yarn-site.xml
MapReduce configuration file mapred-site.xml
Configure workers
Initialize
Host Settings

Before making changes, back up the hosts file, then open it for editing:

sudo cp /etc/hosts /etc/hosts.bak
sudo nano /etc/hosts
Disable the firewall on all three hosts

# Stop the service
systemctl stop firewalld
# Disable start on boot
systemctl disable firewalld
Disable SELinux on all three hosts
vim /etc/sysconfig/selinux
Change SELINUX=enforcing to SELINUX=disabled (takes effect after a reboot).
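If a reboot is inconvenient, SELinux can also be switched to permissive mode for the current session with `setenforce 0`, and the config edit above can be scripted with `sed`. A minimal sketch, run here against a throwaway copy of the file so it is safe to try:

```shell
# Work on a temporary copy instead of the real /etc/sysconfig/selinux
tmp=$(mktemp)
printf 'SELINUX=enforcing\nSELINUXTYPE=targeted\n' > "$tmp"

# The same edit the text describes: enforcing -> disabled
sed -i 's/^SELINUX=enforcing/SELINUX=disabled/' "$tmp"

result=$(grep '^SELINUX=' "$tmp")
echo "$result"   # SELINUX=disabled
rm -f "$tmp"
```

On the real hosts, point `sed -i` at /etc/sysconfig/selinux instead of the temporary copy.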
Set the hostname on each host

Set them to master, slave1, and slave2 respectively:

hostnamectl set-hostname <hostname>
Configure address mappings on all three hosts
vim /etc/hosts
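The mapping adds one line per node, IP address first and hostname second. The addresses below are placeholders for illustration; substitute your cluster's real IPs:

```
192.168.1.10 master
192.168.1.11 slave1
192.168.1.12 slave2
```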
Set up passwordless login

cd /root/.ssh

The master must be able to log in to master, slave1, and slave2 without a password.
Generate a key pair:

ssh-keygen -t rsa

Press Enter three times to accept the defaults (default key path, empty passphrase).
Copy the public key to each of the three hosts:

ssh-copy-id root@<hostname>
Passwordless login

# Log in
ssh <hostname>
# Log out
exit
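A quick way to confirm that every hop works is to script the check. This sketch only prints the command to run for each node (hostnames assumed to be the three set above):

```shell
# BatchMode makes ssh fail instead of prompting, so a missing key
# surfaces as an error rather than a password prompt.
for h in master slave1 slave2; do
  cmd="ssh -o BatchMode=yes root@$h hostname"
  echo "$cmd"   # running this on master should print the remote hostname
done
```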
Install the JDK

Extract the installation package

Extract the archive to the /opt directory:
tar -zxvf jdk-8u162-linux-x64.tar.gz -C /opt/
Rename the directory:

cd /opt
mv jdk1.8.0_162 jdk1.8
Configure JDK environment variables
vim /etc/profile
export JAVA_HOME=/opt/jdk1.8
export PATH=$PATH:$JAVA_HOME/bin
Apply the configuration:
source /etc/profile
Verify the installation:
java -version
Distribute the JDK

scp -r $JAVA_HOME root@slave1:/opt
scp -r $JAVA_HOME root@slave2:/opt
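Copying the JDK binaries alone is not enough: the JAVA_HOME and PATH exports from /etc/profile must also exist on each slave. One option is to copy the profile over and re-source it remotely. A sketch that only prints the follow-up command for each node (slave hostnames assumed as above):

```shell
# Print the profile-distribution step for each slave node
for h in slave1 slave2; do
  step="scp /etc/profile root@$h:/etc/profile && ssh root@$h 'source /etc/profile && java -version'"
  echo "$step"
done
```

Alternatively, append the two export lines to each slave's /etc/profile by hand.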
Install and configure Hadoop
Official Hadoop download: Apache Hadoop
Extract the installation package

Extract the archive to the /opt directory:
tar -zxvf hadoop-3.1.3.tar.gz -C /opt
Add Hadoop environment variables
vim /etc/profile
export HADOOP_HOME=/opt/hadoop-3.1.3
export PATH=$PATH:$HADOOP_HOME/bin:$HADOOP_HOME/sbin
export HDFS_NAMENODE_USER=root
export HDFS_DATANODE_USER=root
export HDFS_SECONDARYNAMENODE_USER=root
Note: The other two hosts also need to set the corresponding environment variables
Apply the configuration:
source /etc/profile
Verify the installation:
hadoop version
Hadoop environment configuration file – hadoop-env.sh
cd $HADOOP_HOME/etc/hadoop
vim hadoop-env.sh
export JAVA_HOME=/opt/jdk1.8
export HADOOP_HOME=/opt/hadoop-3.1.3
export HADOOP_CONF_DIR=$HADOOP_HOME/etc/hadoop
export YARN_RESOURCEMANAGER_USER=root
export YARN_NODEMANAGER_USER=root
Modify the Hadoop configuration file core-site.xml
cd $HADOOP_HOME/etc/hadoop
vim core-site.xml
Write the following content
<configuration>
  <!-- The HDFS NameNode (default filesystem) address -->
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://master:9000</value>
  </property>
  <!-- Base directory for files Hadoop generates at runtime -->
  <property>
    <name>hadoop.tmp.dir</name>
    <value>/opt/hadoop-3.1.3/tmp</value>
  </property>
  <property>
    <name>hadoop.proxyuser.root.hosts</name>
    <value>*</value>
  </property>
  <property>
    <name>hadoop.proxyuser.root.groups</name>
    <value>*</value>
  </property>
</configuration>
HDFS configuration file hdfs-site.xml
vim hdfs-site.xml
<configuration>
  <!-- NameNode storage directory -->
  <property>
    <name>dfs.namenode.name.dir</name>
    <value>/opt/hadoop-3.1.3/tmp/namenode</value>
  </property>
  <!-- DataNode storage directory -->
  <property>
    <name>dfs.datanode.data.dir</name>
    <value>/opt/hadoop-3.1.3/tmp/datanode</value>
  </property>
  <!-- SecondaryNameNode address -->
  <property>
    <name>dfs.namenode.secondary.http-address</name>
    <value>master:50090</value>
  </property>
  <!-- NameNode web UI; note that Hadoop 2 defaults to port 50070 -->
  <property>
    <name>dfs.namenode.http-address</name>
    <value>0.0.0.0:9870</value>
  </property>
  <!-- Disable HDFS permission checking (false = off) -->
  <property>
    <name>dfs.permissions.enabled</name>
    <value>false</value>
  </property>
  <property>
    <name>dfs.webhdfs.enabled</name>
    <value>true</value>
  </property>
</configuration>
YARN configuration file yarn-site.xml
vim yarn-site.xml
<configuration>
  <!-- ResourceManager host: the cluster master -->
  <property>
    <name>yarn.resourcemanager.hostname</name>
    <value>master</value>
  </property>
  <!-- Auxiliary service the NodeManager runs -->
  <property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle</value>
  </property>
  <!-- Disable virtual-memory checking; without this, jobs in a VM environment may fail -->
  <property>
    <name>yarn.nodemanager.vmem-check-enabled</name>
    <value>false</value>
  </property>
</configuration>
MapReduce configuration file mapred-site.xml
vim mapred-site.xml
<configuration>
  <!-- Run MapReduce on the YARN resource-scheduling framework -->
  <property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
  </property>
  <property>
    <name>yarn.app.mapreduce.am.env</name>
    <value>HADOOP_MAPRED_HOME=${HADOOP_HOME}</value>
  </property>
  <property>
    <name>mapreduce.map.env</name>
    <value>HADOOP_MAPRED_HOME=${HADOOP_HOME}</value>
  </property>
  <property>
    <name>mapreduce.reduce.env</name>
    <value>HADOOP_MAPRED_HOME=${HADOOP_HOME}</value>
  </property>
</configuration>
Configure workers
vim workers
master
slave1
slave2
Distribute the configured Hadoop
scp -r $HADOOP_HOME root@slave1:/opt
scp -r $HADOOP_HOME root@slave2:/opt
Initialize

Format the NameNode (run on master, and only once):

hdfs namenode -format
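Formatting writes a new clusterID into the dfs.namenode.name.dir directory. If the NameNode is ever reformatted, DataNodes still holding the old clusterID will refuse to start, so the storage directories on every node must be cleared first. A sketch of the cleanup, demonstrated on throwaway directories rather than the real /opt/hadoop-3.1.3/tmp:

```shell
# Simulate the layout from hdfs-site.xml under a temp root
root=$(mktemp -d)
mkdir -p "$root/tmp/namenode" "$root/tmp/datanode"

# Before a re-format, wipe the storage dirs on EVERY node
rm -rf "$root/tmp/namenode" "$root/tmp/datanode"

remaining=$(ls "$root/tmp" | wc -l)
echo "$remaining"   # 0
rm -rf "$root"
```

On the real cluster, substitute /opt/hadoop-3.1.3/tmp for the temporary root and run the removal on master and both slaves before reformatting.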
Start Hadoop with the start script:
start-all.sh
Check the running processes on each node with jps. Because master is also listed in workers, it should show roughly: NameNode, SecondaryNameNode, ResourceManager, DataNode, NodeManager, and Jps.

Each slave node should show: DataNode, NodeManager, and Jps.