Configuring Hadoop on OpenCloudOS Stream23

1. Install OpenCloudOS stream on the virtual machine

Image download link: ISO – Official Website of OpenCloudOS Community

Choose an image according to your needs; I use the DVD image here.

For the installation process, refer to OC V9 download and installation – OpenCloudOS Documentation [4. Installation Guide]

2. Hadoop installation process

(1) Preliminary preparation

1. Turn off the firewall and disable SELinux

Run vi /etc/selinux/config, change the value of SELINUX to disabled, then save and exit.

Enter the reboot command to restart the system so that the change takes effect.
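The heading above covers the firewall as well as SELinux; here is a minimal sketch of both steps, run as root, assuming firewalld is the active firewall service (the default on OpenCloudOS):

```shell
# Stop the firewall now and keep it off across reboots
# (assumes firewalld is the firewall service).
systemctl stop firewalld
systemctl disable firewalld

# Same effect as editing /etc/selinux/config by hand:
sed -i 's/^SELINUX=enforcing/SELINUX=disabled/' /etc/selinux/config
```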

2. Configure the hostname-to-IP mapping

Use ifconfig (or ip addr) to find the machine's IP address.

Enter vi /etc/hosts and add 192.168.158.128 localhost as the last line (my hostname here is localhost).

3. Create the hadoop user and user group, and set a password for the new user (command: passwd)

Enter useradd -m hadoop to create the hadoop user.
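The two commands for this step, run as root (passwd prompts interactively for the new password):

```shell
useradd -m hadoop   # create the hadoop user with a home directory
passwd hadoop       # set the new user's password (interactive prompt)
```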

4. Install JDK

(1) Install Xshell and use Xshell to connect to the virtual machine

(2) Run yum -y install lrzsz to install the lrzsz package, which provides the rz upload command

(3) Use mkdir to create a new directory named app in the hadoop user's home directory

(4) Use the rz command to upload the installation packages, placing hadoop-2.6.0.tar.gz and jdk-7u79-linux-x64.tar.gz in the app directory

(5) Unpack the JDK archive:

tar -xvf jdk-7u79-linux-x64.tar.gz

(6) Configure the JDK environment variables:

(1) Create a soft link:

ln -s jdk1.7.0_79 jdk

(2) Enter the vi ~/.bashrc command and append the following to the file:

JAVA_HOME=/home/hadoop/app/jdk
CLASSPATH=.:$JAVA_HOME/lib/dt.jar:$JAVA_HOME/lib/tools.jar
PATH=$PATH:$JAVA_HOME/bin
export JAVA_HOME CLASSPATH PATH

(3) Run source ~/.bashrc to make the environment variables take effect

(4) Run java -version to check the JDK version and confirm the installation succeeded

(2) Hadoop pseudo-distributed cluster installation and configuration

1. Unpack the Hadoop installation package:

tar -zxvf hadoop-2.6.0.tar.gz

2. Enter the Hadoop configuration directory (for Hadoop 2.x this is etc/hadoop under the unpacked tree, e.g. cd /home/hadoop/app/hadoop-2.6.0/etc/hadoop)

3. Use the command vi core-site.xml to modify the core-site.xml configuration file

<property>
        <name>fs.defaultFS</name>
        <value>hdfs://localhost:9000</value>
</property>
<property>
        <name>hadoop.tmp.dir</name>
        <value>file:/home/hadoop/data/tmp</value>
</property>
<property>
        <name>hadoop.proxyuser.hadoop.hosts</name>
        <value>*</value>
</property>
<property>
        <name>hadoop.proxyuser.hadoop.groups</name>
        <value>*</value>
</property>

4. Modify the hdfs-site.xml configuration file

<property>
        <name>dfs.namenode.name.dir</name>
        <value>/home/hadoop/data/dfs/name</value>
        <final>true</final>
</property>
<property>
        <name>dfs.datanode.data.dir</name>
        <value>/home/hadoop/data/dfs/data</value>
        <final>true</final>
</property>
<property>
        <name>dfs.replication</name>
        <value>1</value>
</property>
<property>
        <name>dfs.permissions</name>
        <value>false</value>
</property>

5. Modify the hadoop-env.sh configuration file to set JAVA_HOME explicitly
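Hadoop's startup scripts do not reliably inherit JAVA_HOME from the login shell, so set it inside hadoop-env.sh; the path below is the soft link created earlier in this tutorial:

```shell
# Line added to etc/hadoop/hadoop-env.sh
export JAVA_HOME=/home/hadoop/app/jdk
```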

6. Modify the mapred-site.xml file (in Hadoop 2.6.0, first create it by copying mapred-site.xml.template in the configuration directory)

<configuration>
                <property>
                        <name>mapreduce.framework.name</name>
                        <value>yarn</value>
                </property>
</configuration>

7. Modify the yarn-site.xml file

<property>
        <name>yarn.nodemanager.aux-services</name>
        <value>mapreduce_shuffle</value>
</property>

8. Modify the slaves configuration file; for a pseudo-distributed setup it contains a single line: localhost

9. Create the hadoop-2.6.0 soft link: ln -s hadoop-2.6.0 hadoop

10. Configure hadoop environment variables and make them take effect (some environment variables have been configured before)

JAVA_HOME=/home/hadoop/app/jdk
HADOOP_HOME=/home/hadoop/app/hadoop
CLASSPATH=.:$JAVA_HOME/lib/dt.jar:$JAVA_HOME/lib/tools.jar
PATH=$PATH:$JAVA_HOME/bin:$HADOOP_HOME/bin
export JAVA_HOME HADOOP_HOME CLASSPATH PATH

11. Create the Hadoop data directories referenced by the configuration files
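A sketch of the directories implied by the core-site.xml and hdfs-site.xml values above (hadoop.tmp.dir plus the NameNode and DataNode directories), run as the hadoop user so that $HOME expands to /home/hadoop:

```shell
# Create hadoop.tmp.dir, the NameNode dir, and the DataNode dir in one go.
mkdir -p "$HOME/data/tmp" "$HOME/data/dfs/name" "$HOME/data/dfs/data"
```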

12. Enter the hadoop/bin directory and run ./hadoop namenode -format to format the NameNode

13. Start the Hadoop pseudo-distributed cluster with sbin/start-all.sh. Check the running Hadoop processes with the jps command. To shut the cluster down, run sbin/stop-all.sh.
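The start/verify/stop cycle from this step, run from the Hadoop home directory. The process names in the comment are what a pseudo-distributed start typically shows; the exact jps listing may vary on your machine:

```shell
cd /home/hadoop/app/hadoop
sbin/start-all.sh     # starts the HDFS and YARN daemons

# jps should list processes such as NameNode, DataNode,
# SecondaryNameNode, ResourceManager and NodeManager.
jps

sbin/stop-all.sh      # shut the cluster down again
```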

14. This completes the Hadoop configuration.