Configure $HADOOP_HOME/etc/hadoop

cd $HADOOP_HOME/etc/hadoop
Set up the slaves file, which lists the hostnames of the slave nodes. In this case the slave runs on the same machine, so the entries below refer to the local host.
$ vi slaves
localhost
master.hadoop.lan
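As a side note, the slaves file is just a plain list of hostnames, one per line. A minimal Python sketch of how such a file is interpreted (the function name read_slaves is my own, for illustration only):

```python
def read_slaves(text):
    """Parse slaves-file content: one hostname per line, blank lines ignored."""
    return [line.strip() for line in text.splitlines() if line.strip()]

# Content matching the slaves file configured above
slaves = "localhost\nmaster.hadoop.lan\n"
print(read_slaves(slaves))
# prints ['localhost', 'master.hadoop.lan']
```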
The first file to edit is core-site.xml. It contains information about the port number used by the Hadoop instance, the memory allocated for the file system, the memory limit for the data store, and the size of the read/write buffers.
$ vi etc/hadoop/core-site.xml
Add the following properties between the <configuration> ... </configuration> tags. Use localhost or your machine's FQDN, such as master.hadoop.lan, for the Hadoop instance.
<property>
    <name>fs.defaultFS</name>
    <value>hdfs://master.hadoop.lan:9000/</value>
</property>
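All Hadoop *-site.xml files share the same <configuration>/<property>/<name>/<value> layout, so a property can be read back programmatically. A minimal Python sketch (the helper name get_hadoop_property is my own, not part of any Hadoop API):

```python
import xml.etree.ElementTree as ET

def get_hadoop_property(xml_text, prop_name):
    """Return the <value> of the named <property> in a Hadoop *-site.xml, or None."""
    root = ET.fromstring(xml_text)
    for prop in root.iter("property"):
        if prop.findtext("name") == prop_name:
            return prop.findtext("value")
    return None

# Sample matching the core-site.xml configured above
core_site = """<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://master.hadoop.lan:9000/</value>
  </property>
</configuration>"""

print(get_hadoop_property(core_site, "fs.defaultFS"))
# prints hdfs://master.hadoop.lan:9000/
```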
Next, open and edit the hdfs-site.xml file. It contains the data replication value and the local file-system paths for the namenode and datanode.
$ vi etc/hadoop/hdfs-site.xml
Here, add the following properties between the <configuration> ... </configuration> tags. In this guide we'll use the /mnt/common/hdfs/ directory to store our Hadoop file system; replace the dfs.data.dir and dfs.name.dir values accordingly. (In Hadoop 2.x these two properties are deprecated aliases for dfs.datanode.data.dir and dfs.namenode.name.dir; the old names still work.)
<property>
    <name>dfs.data.dir</name>
    <value>file:///mnt/common/hdfs/datanode</value>
</property>
<property>
    <name>dfs.name.dir</name>
    <value>file:///mnt/common/hdfs/namenode</value>
</property>
Because we've specified /mnt/common/hdfs/ as our Hadoop file-system storage, we need to create the two directories (datanode and namenode) from the root account and grant full permissions to the hadoop account — or whichever user installed Hadoop — by executing the commands below.
su - root
mkdir -p /mnt/common/hdfs/namenode
mkdir -p /mnt/common/hdfs/datanode
In my case, the user that installed Hadoop is bigdata2:
chown -R bigdata2:bigdata2 /mnt/common/hdfs/
ls -al /mnt/common/hdfs/ # Verify permissions
exit
Exit the root account to return to the bigdata2 user.
Please replace bigdata2 with your own user name; do NOT copy the bigdata2 user name from this guide.
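The mkdir -p steps above can also be sketched in Python. The snippet below demonstrates the idea on a temporary directory, since creating /mnt/common/hdfs and running chown both require root; the function name create_hdfs_dirs is my own:

```python
import os
import tempfile

def create_hdfs_dirs(base):
    """Create the namenode and datanode directories under the given base path."""
    paths = [os.path.join(base, d) for d in ("namenode", "datanode")]
    for p in paths:
        os.makedirs(p, exist_ok=True)  # equivalent of mkdir -p
    return paths

# Demonstrated on a temp dir; on the real node the base would be
# /mnt/common/hdfs, followed by a chown to the hadoop user.
base = tempfile.mkdtemp()
for p in create_hdfs_dirs(base):
    print(p, os.path.isdir(p))
```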
Next, create the mapred-site.xml file to specify that we are using the YARN MapReduce framework.
$ vi etc/hadoop/mapred-site.xml
Add the following excerpt to the mapred-site.xml file:
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
    <property>
        <name>mapreduce.framework.name</name>
        <value>yarn</value>
    </property>
</configuration>
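Such a file can also be generated programmatically, which is handy when scripting the setup of several nodes. A minimal Python sketch using the standard library (the helper name make_site_xml is my own):

```python
import xml.etree.ElementTree as ET

def make_site_xml(props):
    """Build a Hadoop <configuration> document from a dict of property names/values."""
    conf = ET.Element("configuration")
    for name, value in props.items():
        prop = ET.SubElement(conf, "property")
        ET.SubElement(prop, "name").text = name
        ET.SubElement(prop, "value").text = value
    return ET.tostring(conf, encoding="unicode")

# The single property mapred-site.xml needs for YARN
xml_text = make_site_xml({"mapreduce.framework.name": "yarn"})
print(xml_text)
```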
Now, edit the yarn-site.xml file and add the statements below between the <configuration> ... </configuration> tags:
$ vi etc/hadoop/yarn-site.xml
Add the following excerpt to the yarn-site.xml file:
<property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle</value>
</property>
You should see something like below:
$ cat etc/hadoop/yarn-site.xml
<?xml version="1.0"?>
<!--
  Licensed under the Apache License, Version 2.0 (the "License");
  you may not use this file except in compliance with the License.
  You may obtain a copy of the License at

    http://www.apache.org/licenses/LICENSE-2.0

  Unless required by applicable law or agreed to in writing, software
  distributed under the License is distributed on an "AS IS" BASIS,
  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
  See the License for the specific language governing permissions and
  limitations under the License. See accompanying LICENSE file.
-->
<configuration>
<!-- Site specific YARN configuration properties -->
    <property>
        <name>yarn.nodemanager.aux-services</name>
        <value>mapreduce_shuffle</value>
    </property>
</configuration>
Finally, set the JAVA_HOME variable for the Hadoop environment by editing the line below in the hadoop-env.sh file.
$ vi $HADOOP_HOME/etc/hadoop/hadoop-env.sh
Edit the following line to point to your Java system path.
export JAVA_HOME=/home/bigdata2/java/jdk1.8.0_202
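A quick way to sanity-check the path before editing hadoop-env.sh is to verify it actually contains a bin/java executable. A small Python sketch, demonstrated on a mock directory layout so it runs anywhere (the function name looks_like_jdk is my own):

```python
import os
import tempfile

def looks_like_jdk(java_home):
    """True if the path contains bin/java, as hadoop-env.sh expects of JAVA_HOME."""
    return os.path.isfile(os.path.join(java_home, "bin", "java"))

# Mock JDK layout for demonstration; on a real node you would pass the
# actual install path, e.g. /home/bigdata2/java/jdk1.8.0_202.
mock = tempfile.mkdtemp()
os.makedirs(os.path.join(mock, "bin"))
open(os.path.join(mock, "bin", "java"), "w").close()

print(looks_like_jdk(mock))            # prints True
print(looks_like_jdk("/nonexistent"))  # prints False
```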