Hadoop Distributed File System for Big data analysis

Hatim Talal Almansouri, Youssef Masmoudi

2019 4th World Conference on Complex Systems (WCCS) (2019)

Abstract
Hadoop is a framework for processing large volumes of data that cannot be processed by conventional systems. Hadoop has a file management system called the Hadoop Distributed File System (HDFS), which consists of a NameNode and DataNodes; the data is divided into blocks based on the total size of the dataset. In addition, Hadoop has MapReduce, in which the dataset is processed in a mapping phase and then a reducing phase. Using Hadoop for big data analysis has revealed important information that can be used for analytical purposes and for enabling new products. Big data can be found in many different sources, such as social networks, web server logs, broadcast audio streams, and banking transactions. In this paper, we illustrate the main steps to set up Hadoop and MapReduce. The version illustrated in this work is the latest release, Hadoop 3.1.1, for big data analysis. Simplified pseudocode is provided to show the functionality of the Map class and the Reduce class. The developed steps are applied to a given example that could be generalized to larger data.
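
The map-then-reduce flow described in the abstract can be sketched in a few lines of plain Python. This is not the paper's pseudocode, which is not reproduced on this page; it is a minimal single-process illustration that assumes a word-count task, and the function names (`map_phase`, `shuffle`, `reduce_phase`, `word_count`) are hypothetical. It does not use Hadoop itself; on a real cluster the framework performs the grouping step between the two phases and distributes the work across nodes.

```python
from collections import defaultdict


def map_phase(line):
    """Map step: emit a (word, 1) pair for every word in one input line."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle(pairs):
    """Group values by key, standing in for the framework's shuffle step."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Reduce step: sum the counts collected for one word."""
    return key, sum(values)


def word_count(lines):
    """Run map, shuffle, and reduce in sequence over an iterable of lines."""
    pairs = [pair for line in lines for pair in map_phase(line)]
    grouped = shuffle(pairs)
    return dict(reduce_phase(key, values) for key, values in grouped.items())
```

Calling `word_count` on a list of text lines returns a dictionary that maps each lowercased word to its number of occurrences. The same three-stage shape would apply to a larger dataset, with each HDFS block feeding its own set of map calls.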
Key words
Hadoop, MapReduce, HDFS, DataNode, NameNode, Big Data Analysis