Hadoop Configuration using Ansible

Today we're going to configure Hadoop on target nodes using Ansible. In the last blog, I configured an httpd Docker image using Ansible and also discussed a few things to remember about Ansible, so I won't repeat those points here. Assuming you've gone through that article, let's continue and configure Hadoop using Ansible:-

First, create the inventory with two groups:-

  • Master-node group (containing the IP address of the Master node)
  • Slave-node group (containing the IP addresses of the Slave nodes)
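A minimal inventory might look like the following sketch. The group names and IP addresses here are illustrative placeholders, not values from this setup:

```ini
# inventory -- group names and IPs are placeholders for illustration
[master]
192.168.1.10 ansible_user=root

[slaves]
192.168.1.11 ansible_user=root
192.168.1.12 ansible_user=root
```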

After creating the inventory, set its path in the ansible.cfg (Ansible configuration) file.
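In ansible.cfg this could look roughly like the snippet below; the inventory path is an assumption, so use wherever you saved your inventory file:

```ini
# ansible.cfg -- the inventory path below is illustrative
[defaults]
inventory = /root/Ansible/inventory
host_key_checking = False
```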

Now let's create an ansible-playbook :-

Ansible-playbook for configuring Master node:-

( Here I'm configuring localhost as the Master-node )

---
- hosts: localhost

  vars:
        - ip: 0.0.0.0
        - port: 9459        
        - dir: masternode
        - node: name

  tasks:
        - name: Downloading Hadoop
          command: wget https://archive.apache.org/dist/hadoop/core/hadoop-1.2.1/hadoop-1.2.1-1.x86_64.rpm

        - name: Installing Hadoop 
          command: rpm -i hadoop-1.2.1-1.x86_64.rpm --force

        - name: Downloading Java
          command: wget http://35.244.242.82/yum/java/el7/x86_64/jdk-8u171-linux-x64.rpm

        - name: Installing Java
          command: rpm -i jdk-8u171-linux-x64.rpm

        - name: Creating Directory
          file:
                name: /{{ dir }}
                state: directory

        - name: Configuring hdfs-site.xml
          template:
                src: /root/Ansible/hdfs-site.xml
                dest: /etc/hadoop/hdfs-site.xml

        - name: Configuring core-site.xml
          template:
                src: /root/Ansible/core-site.xml
                dest: /etc/hadoop/core-site.xml

        - name: Formatting {{ node }}node
          command: hadoop {{ node }}node -format

        - name: Starting the {{ node }}node Server
          command: hadoop-daemon.sh start {{ node }}node

        - name: Checking Status
          command: jps
          register: status

        - debug:
                var: status

        - name: Checking Report
          command: hadoop dfsadmin -report
          register: admin_report

        - debug:
                var: admin_report
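A note on the approach above: the command module re-runs wget and rpm on every play, so repeated runs download and install again. As a sketch, the same download/install steps could be made idempotent with Ansible's get_url and yum modules (the URLs are the same ones used in the playbook; this variant is an alternative, not what the article ran):

```yaml
# Hypothetical idempotent variant of the download/install tasks
- name: Downloading Hadoop
  get_url:
    url: https://archive.apache.org/dist/hadoop/core/hadoop-1.2.1/hadoop-1.2.1-1.x86_64.rpm
    dest: /root/hadoop-1.2.1-1.x86_64.rpm

- name: Installing Hadoop
  yum:
    name: /root/hadoop-1.2.1-1.x86_64.rpm
    state: present
```

With get_url, the file is only fetched when it is missing from dest, and yum skips the install when the package is already present.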

hdfs-site.xml in controller node:-

<?xml version="1.0"?><?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!-- Put site-specific property overrides in this file. -->
<configuration>
<property>
<name>dfs.{{ node }}.dir</name>
<value>/{{ dir }}</value>
</property>
</configuration>

core-site.xml in controller node:-

<?xml version="1.0"?><?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!-- Put site-specific property overrides in this file. -->
<configuration>
<property>
<name>fs.default.name</name>
<value>hdfs://{{ ip }}:{{ port }}</value>
</property>
</configuration>
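For reference, with the master's vars (ip: 0.0.0.0, port: 9459) the template module would render the property block of core-site.xml on the target roughly as:

```xml
<configuration>
<property>
<name>fs.default.name</name>
<value>hdfs://0.0.0.0:9459</value>
</property>
</configuration>
```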

[Screenshot: Master node playbook run completed]

So, we can see that the Master node is successfully configured. Let's move to the Master node and check the status.

hdfs-site.xml created in target node:-

[Screenshot: hdfs-site.xml on the target node]

core-site.xml created in target node:-

[Screenshot: core-site.xml on the target node]

Let's check whether the Hadoop services are running:-

[Screenshot: Hadoop services listed by jps]

We can see the Namenode is successfully configured!

Ansible-playbook for configuring Slave node:-

( Here I'm configuring localhost as the Slave-node )

---
- hosts: localhost

  vars:
        - ip: 13.126.156.47
        - port: 9459        
        - dir: slavenode
        - node: data

  tasks:
        - name: Downloading Hadoop
          command: wget https://archive.apache.org/dist/hadoop/core/hadoop-1.2.1/hadoop-1.2.1-1.x86_64.rpm

        - name: Installing Hadoop 
          command: rpm -i hadoop-1.2.1-1.x86_64.rpm --force

        - name: Downloading Java
          command: wget http://35.244.242.82/yum/java/el7/x86_64/jdk-8u171-linux-x64.rpm

        - name: Installing Java
          command: rpm -i jdk-8u171-linux-x64.rpm

        - name: Creating Directory
          file:
                name: /{{ dir }}
                state: directory

        - name: Configuring hdfs-site.xml
          template:
                src: /root/Ansible/hdfs-site.xml
                dest: /etc/hadoop/hdfs-site.xml

        - name: Configuring core-site.xml
          template:
                src: /root/Ansible/core-site.xml
                dest: /etc/hadoop/core-site.xml

        - name: Starting the {{ node }}node Server
          command: hadoop-daemon.sh start {{ node }}node

        - name: Checking Status
          command: jps
          register: status

        - debug:
                var: status

hdfs-site.xml in controller node:-

<?xml version="1.0"?><?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!-- Put site-specific property overrides in this file. -->
<configuration>
<property>
<name>dfs.{{ node }}.dir</name>
<value>/{{ dir }}</value>
</property>
</configuration>

core-site.xml in controller node:-

<?xml version="1.0"?><?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!-- Put site-specific property overrides in this file. -->
<configuration>
<property>
<name>fs.default.name</name>
<value>hdfs://{{ ip }}:{{ port }}</value>
</property>
</configuration>
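Note that on the slave, the ip variable (13.126.156.47 here) must point at the Master node's address, since the datanode uses fs.default.name to reach the namenode. With the slave's vars, the rendered core-site.xml would look roughly like:

```xml
<configuration>
<property>
<name>fs.default.name</name>
<value>hdfs://13.126.156.47:9459</value>
</property>
</configuration>
```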

[Screenshot: Datanode playbook run completed]

So, we can see that the Slave node is successfully configured. Let's move to the Slave node and check the status.

hdfs-site.xml created in target node:-

[Screenshot: hdfs-site.xml on the slave node]

core-site.xml created in target node:-

[Screenshot: core-site.xml on the slave node]

Let's check whether the Hadoop services are running:-

[Screenshot: jps output on the slave node]

So, we can see that both the Master node and the Slave node are successfully configured, which means our task is finished. All thanks to Mr Vimal Daga sir for providing such knowledge.