Tuesday, 28 March 2017

Hadoop Distributed File System (HDFS)

The default big data storage layer for Apache Hadoop is HDFS and it is designed to run on the commodity machines which are of low cost hardware servers. The distributed data is stored in the HDFS file system. HDFS is highly fault tolerant system and provides high throughput access to the applications that are required big data.

HDFS comprises of 3 important components actually, NameNode, DataNode and Secondary NameNode. As already mentioned, HDFS operates on a Master-Slave architecture model where the master being the namenode and slaves are the datanodes. The namenode controls the access to the data by its clients and the datanodes manage to store the data on the nodes that are running on. As same as the other filesystems Hadoop also splits the file into one or more blocks and these blocks are stored in the datanodes and each of those data block is replicated to 3 different datanodes to provide high availability to the hadoop eco system. This is how Hadoop being fault tolerant in nature. Also, the block replication factor is configurable. HDFS component creates several copies of the data block to be distributed across different nodes in the cluster for reliable and quick data access.

Demons of HDFS:


NameNode


Namenode is the main base of the Hadoop distributed file system. The Namenode keeps track of the metadata information and location of the data blocks on the data node. This metadata information is stored permanently on to local hard disk of the namenode in the form of namespace image known as FS image and edit log file.

FS image is snap shot of the file system and current changes like creation, deletion, updating will be in the edit logs. Often these edit logs will be converted as a snapshot that is the fsimage.  However, the namenode does not store this information persistently. The namenode creates the block to datanode mapping when it is restarted. So, If the namenode crashes, then the entire hadoop system goes down.

[root@sandbox ~]# ls -lrt /hadoop/hdfs/namenode/current/
-rw-r--r-- 1 hdfs hadoop 21505030 Mar 28 10:39 edits_0000000000020911480-0000000000021041539
-rw-r--r-- 1 hdfs hadoop  7340032 Mar 28 11:26 edits_0000000000021041540-0000000000021078684
-rw-r--r-- 1 hdfs hadoop  5639982 Mar 29 04:52 fsimage_0000000000021078684
-rw-r--r-- 1 hdfs hadoop       62 Mar 29 04:52 fsimage_0000000000021078684.md5
-rw-r--r-- 1 hdfs hadoop      203 Mar 29 04:52 VERSION
-rw-r--r-- 1 hdfs hadoop    73087 Mar 29 04:59 edits_0000000000021078685-0000000000021079182
-rw-r--r-- 1 hdfs hadoop        9 Mar 29 04:59 seen_txid
-rw-r--r-- 1 hdfs hadoop  5603567 Mar 29 05:00 fsimage_0000000000021079182
-rw-r--r-- 1 hdfs hadoop       62 Mar 29 05:00 fsimage_0000000000021079182.md5
-rw-r--r-- 1 hdfs hadoop  5242880 Mar 29 05:21 edits_inprogress_0000000000021079183
[root@sandbox ~]# 

Secondary Namenode


The responsibility of secondary name node is to periodically copy and merge the namespace image and edit log from the namenode. Actually, the secondary namenode help the namenode to create the fsimage from editlogs in regular interval also in case if the name node crashes, then the namespace image stored in secondary namenode can be used to restart the namenode. Generally the Namenode services and the secondary namenode services has to be in different nodes.

[root@sandbox ~]# ls -lrt /hadoop/hdfs/namesecondary/current/
-rw-r--r-- 1 hdfs hadoop    31794 Mar 28 04:39 edits_0000000000020911255-0000000000020911479
-rw-r--r-- 1 hdfs hadoop 21505030 Mar 28 10:39 edits_0000000000020911480-0000000000021041539
-rw-r--r-- 1 hdfs hadoop  5639982 Mar 29 05:00 fsimage_0000000000021078684
-rw-r--r-- 1 hdfs hadoop       62 Mar 29 05:00 fsimage_0000000000021078684.md5
-rw-r--r-- 1 hdfs hadoop    73087 Mar 29 05:00 edits_0000000000021078685-0000000000021079182
-rw-r--r-- 1 hdfs hadoop  5603567 Mar 29 05:00 fsimage_0000000000021079182
-rw-r--r-- 1 hdfs hadoop       62 Mar 29 05:00 fsimage_0000000000021079182.md5
-rw-r--r-- 1 hdfs hadoop      203 Mar 29 05:00 VERSION
[root@sandbox ~]#

DataNode


Data Node is where the actual data has been stored as chunks or the blocks. While storing the data in Hadoop, the data will be split by the split size mentioned and store in the cluster. The default split size is 64MB we can change as per the need.

Let’s say that we are going to store 128MB of data in Hadoop. Since, the default split size is 64MB, so the 124MB data will be split into 2 blocks and store across the cluster. Lets go for another example to storing 65MB of data. so the default split size is 64MB, so the 65MB of file will be split into 2 blocks and store in the cluster. Will be discussing more about the data nodes and the replication factors on another blog.


Journal Node:


Well, the name node is single point of failure, some High Availably(HA) has been introduced that is called as a Journal node. The JournalNode will not have fsimage files or seen_txid. In addition, it contains several other files relevant to the HA implementation. These files help prevent a split-brain scenario, in which multiple NameNodes could think they are active and all try to write edits.

No comments:

Post a Comment