1. Install Ubuntu in a VMware virtual machine and prepare some settings
When creating the virtual machine, name it master.
When installing Ubuntu, use master for both the name and the computer name, and hadoop for the user name.
After installation, check whether you can drag files directly into the virtual machine. If not, install VMware Tools and restart, then copy jdk-8u162-linux-x64.tar.gz and hadoop-3.1.3.tar.gz to the home directory.
Set the screen not to turn off automatically
2. Configure a static IP address
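The original does not show the commands for this step. Here is a minimal sketch using netplan, the standard mechanism on Ubuntu 18.04 and later; the file name, the interface name ens33, and the gateway 192.168.33.2 are all assumptions to adjust for your system (check the interface name with ip addr):

sudo vim /etc/netplan/01-netcfg.yaml

network:
  version: 2
  ethernets:
    ens33:                            # assumed interface name
      dhcp4: no
      addresses: [192.168.33.10/24]   # master's address, matching the hosts file in step 6
      gateway4: 192.168.33.2          # assumed VMware NAT gateway
      nameservers:
        addresses: [192.168.33.2]

sudo netplan apply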
Verify with ip addr.
3. Install openssh-server and vim (a personal habit)
sudo apt-get update
sudo apt-get install openssh-server
sudo apt-get install vim
4. Install Java and Hadoop
cd ~
sudo tar -zxvf jdk-8u162-linux-x64.tar.gz -C /usr/local/
sudo tar -zxvf hadoop-3.1.3.tar.gz -C /usr/local
Change the ownership of both directories to the hadoop user:
cd /usr/local
sudo chown -R hadoop hadoop-3.1.3/
sudo chown -R hadoop jdk1.8.0_162
View the result with ll (Ubuntu's default alias for ls -alF); both directories should now be owned by hadoop.
Configure environment variables
cd ~
vim .bashrc

Add the following at the bottom of the .bashrc file:
# java environment variables
export JAVA_HOME=/usr/local/jdk1.8.0_162/
export CLASSPATH=.:$JAVA_HOME/jre/lib/rt.jar:$JAVA_HOME/lib/dt.jar:$JAVA_HOME/lib/tools.jar
export PATH=$PATH:$JAVA_HOME/bin
# hadoop variables
export HADOOP_HOME=/usr/local/hadoop-3.1.3
export PATH=$PATH:$HADOOP_HOME/bin
export PATH=$PATH:$HADOOP_HOME/sbin
To apply the changes:

source .bashrc
Check that Java and Hadoop are now on the PATH.
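The guide does not spell out the check commands; the usual ones are:

java -version
hadoop version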
5. Configure Hadoop
cd /usr/local/hadoop-3.1.3/etc/hadoop/
vim core-site.xml

Add inside <configuration></configuration>:

<!-- Specify the NameNode address -->
<property>
    <name>fs.defaultFS</name>
    <value>hdfs://master:8020</value>
</property>
<!-- Specify the Hadoop data storage directory -->
<property>
    <name>hadoop.tmp.dir</name>
    <value>/usr/local/hadoop-3.1.3/data</value>
</property>
vim hdfs-site.xml

<!-- NameNode web address -->
<property>
    <name>dfs.namenode.http-address</name>
    <value>master:9870</value>
</property>
<!-- SecondaryNameNode web address -->
<property>
    <name>dfs.namenode.secondary.http-address</name>
    <value>slave2:9868</value>
</property>
vim mapred-site.xml

<!-- Run MapReduce on YARN -->
<property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
</property>
vim yarn-site.xml

<!-- Specify the MR shuffle service -->
<property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle</value>
</property>
<!-- ResourceManager address -->
<property>
    <name>yarn.resourcemanager.hostname</name>
    <value>slave1</value>
</property>
<!-- Environment variables that containers may inherit -->
<property>
    <name>yarn.nodemanager.env-whitelist</name>
    <value>JAVA_HOME,HADOOP_COMMON_HOME,HADOOP_HDFS_HOME,HADOOP_CONF_DIR,CLASSPATH_PREPEND_DISTCACHE,HADOOP_YARN_HOME,HADOOP_MAPRED_HOME</value>
</property>
vim workers

Replace the default localhost entry with the three hostnames:

master
slave1
slave2
vim hadoop-env.sh

The original set all of these to root, but this guide installs and starts everything as the hadoop user, so the daemon user variables should match (the secure-datanode and shell-execname overrides are not needed here):

export JAVA_HOME=/usr/local/jdk1.8.0_162/
export HDFS_NAMENODE_USER=hadoop
export HDFS_SECONDARYNAMENODE_USER=hadoop
export HDFS_DATANODE_USER=hadoop
export HDFS_ZKFC_USER=hadoop
export HDFS_JOURNALNODE_USER=hadoop
export YARN_RESOURCEMANAGER_USER=hadoop
export YARN_NODEMANAGER_USER=hadoop
6. Modify the hosts file and disable the firewall
sudo vim /etc/hosts
192.168.33.10 master
192.168.33.11 slave1
192.168.33.12 slave2
# Turn off the firewall and keep it off across reboots.
# Ubuntu ships ufw rather than firewalld (the original's
# systemctl stop/disable firewalld.service applies to CentOS):
sudo ufw disable
7. Clone the virtual machine
Shut down the virtual machine first, then right-click it > Manage > Clone, step through the wizard, choose "Create a full clone", name it slave1, and pick whatever location suits you.
Clone slave2 in the same way
After cloning, start all three virtual machines.
8. Modify the IP address and hostname of slave1 and slave2
# on slave1
sudo hostnamectl set-hostname slave1
# log in again as hadoop so the new hostname takes effect in the session
sudo login

# on slave2
sudo hostnamectl set-hostname slave2
sudo login

Change the IP addresses the same way as in step 2: slave1 to 192.168.33.11 and slave2 to 192.168.33.12, matching /etc/hosts.
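Not in the original, but a quick sanity check that the addresses and /etc/hosts entries line up (run from any of the three nodes):

ping -c 3 master
ping -c 3 slave1
ping -c 3 slave2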
9. Configure ssh password-free login
Generate an SSH key pair on each of the three machines:
ssh-keygen -t rsa
# on master
cd ~/.ssh
touch authorized_keys
cat id_rsa.pub >> authorized_keys

# on slave1
scp ~/.ssh/id_rsa.pub hadoop@master:~/

# on master
cd ~
cat id_rsa.pub >> .ssh/authorized_keys

# on slave2
scp ~/.ssh/id_rsa.pub hadoop@master:~/

# on master
cd ~
cat id_rsa.pub >> .ssh/authorized_keys

# finally, confirm that all three public keys were appended; on master:
cat .ssh/authorized_keys

# then use scp to copy the authorized_keys file from the master node
# to the .ssh/ directory of slave1 and slave2; on master:
scp /home/hadoop/.ssh/authorized_keys hadoop@slave1:~/.ssh/
scp /home/hadoop/.ssh/authorized_keys hadoop@slave2:~/.ssh/

# verify password-free login by logging in to slave1 remotely;
# to reach another node, replace slave1 with its hostname:
ssh slave1
# and exit the remote session:
exit
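As an aside (not part of the original guide), the scp-and-append steps can usually be collapsed into ssh-copy-id, which appends the local public key to the remote authorized_keys in one go, for example on slave1:

ssh-copy-id hadoop@master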
10. Start the cluster
On the first start only, format the NameNode (on master):
hdfs namenode -format
Start HDFS (on master)
start-dfs.sh
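The guide shows only the HDFS start, but yarn-site.xml places the ResourceManager on slave1 and stop-all.sh below stops YARN as well, so presumably YARN is started on slave1 too:

# on slave1
start-yarn.sh

jps lists the Java daemons running on a node and is a quick way to confirm that everything came up (e.g. NameNode and DataNode on master):

jps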
Access the NameNode web UI in a browser:

http://192.168.33.10:9870/
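If YARN is running, its web UI should be reachable on the ResourceManager's default port 8088, i.e. 192.168.33.11:8088/ in this setup (an inference from yarn-site.xml, not stated in the original). A simple smoke test is the bundled pi example:

hadoop jar /usr/local/hadoop-3.1.3/share/hadoop/mapreduce/hadoop-mapreduce-examples-3.1.3.jar pi 2 10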
Stop the cluster (on the master)
stop-all.sh
References:

hadoop-3.1.3 fully distributed cluster construction, Zhihu (zhihu.com)
Building a Hadoop fully distributed cluster on Ubuntu (detailed), Ordinary Netizen's Blog, CSDN