Apache Hadoop’s Second Generation: Performance and Resource Management through YARN

Vanisha Mavi, Nidhi Tyagi

Keywords:

Big Data, Hadoop, MapReduce, YARN (Yet Another Resource Negotiator).

Abstract

The rapid expansion of digital technologies has resulted in an enormous growth of data generated by individuals, organizations, and technological systems. This large and complex collection of data is commonly referred to as Big Data. Big Data typically involves datasets that are too large, fast-growing, or complex to be processed efficiently using traditional data management systems and conventional analytical tools. As a result, organizations increasingly rely on advanced technologies capable of storing, managing, and analysing such massive volumes of information.

One of the most widely used technologies designed to address Big Data challenges is Hadoop. Hadoop is an open-source framework that enables the distributed storage and processing of large datasets across clusters of computers. Its architecture allows data to be stored and processed in parallel, thereby improving scalability, reliability, and efficiency. A key component of Hadoop is the MapReduce programming model, which divides large data processing tasks into smaller subtasks that can be executed simultaneously across multiple computing nodes. This distributed approach significantly enhances the speed and efficiency of data processing.
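The MapReduce model described above can be illustrated with a minimal word-count sketch. It is not Hadoop code: it runs sequentially in a single Python process, and the function names (`map_phase`, `shuffle`, `reduce_phase`, `word_count`) and the sample input are assumptions made for illustration. On a real cluster, each input split would be handled by a separate map task, possibly on a different node.

```python
from collections import defaultdict
from itertools import chain


def map_phase(split):
    """Map: emit a (word, 1) pair for every word in one input split."""
    return [(word.lower(), 1) for word in split.split()]


def shuffle(pairs):
    """Shuffle: group all emitted values by their key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reduce: combine all values for one key into a single count."""
    return key, sum(values)


def word_count(splits):
    # Each split is mapped independently, which is what makes the
    # map step parallelisable across nodes in a real deployment.
    mapped = chain.from_iterable(map_phase(s) for s in splits)
    grouped = shuffle(mapped)
    return dict(reduce_phase(k, v) for k, v in grouped.items())


# Hypothetical input: three small "splits" of a larger dataset.
splits = ["big data needs hadoop", "hadoop runs mapreduce", "big clusters"]
counts = word_count(splits)
print(counts)
```

Because no map call depends on another, the subtasks can be distributed freely; only the shuffle step needs to bring values with the same key together before reduction.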

Furthermore, the Hadoop ecosystem has evolved with the introduction of YARN (Yet Another Resource Negotiator), which serves as the next-generation resource management platform for Hadoop. YARN separates resource management from data processing, enabling multiple data processing frameworks to run on the same cluster. This paper provides an overview of Big Data, the Hadoop architecture, the MapReduce model, and the role of YARN in modern data processing systems.
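The separation of resource management from data processing can be sketched with a toy allocator. This is a conceptual illustration only and does not use or mirror the YARN API: the classes `Node` and `ToyResourceManager`, the method `request_container`, the first-fit placement policy, and the job names and memory sizes are all hypothetical. The point it shows is that the allocator knows nothing about what each application computes, so jobs from different frameworks can share the same nodes.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    name: str
    free_memory_mb: int


@dataclass
class ToyResourceManager:
    """Hypothetical stand-in for a cluster-wide resource manager."""
    nodes: list
    allocations: list = field(default_factory=list)

    def request_container(self, app, memory_mb):
        """Grant memory on the first node that can fit the request.

        Returns the node name, or None if no node has enough free memory.
        """
        for node in self.nodes:
            if node.free_memory_mb >= memory_mb:
                node.free_memory_mb -= memory_mb
                self.allocations.append((app, node.name, memory_mb))
                return node.name
        return None


# Two applications from different (hypothetical) frameworks share one cluster.
rm = ToyResourceManager([Node("node-1", 4096), Node("node-2", 2048)])
placements = {
    "mapreduce-job": rm.request_container("mapreduce-job", 3072),
    "other-framework-job": rm.request_container("other-framework-job", 2048),
    "oversized-job": rm.request_container("oversized-job", 8192),
}
print(placements)
```

A production scheduler would also account for CPU, queues, priorities, and failures; the sketch keeps only the idea that allocation decisions are made independently of the processing framework.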

