Hadoop MapReduce scheduling paradigms

Author(s)

Publication date

2017

Publisher

Institute of Electrical and Electronics Engineers

Document type

Abstract

Apache Hadoop is one of the earliest and most prominent technologies for handling big data. Various scheduling algorithms have been developed within the Apache Hadoop framework over the last decade. In this paper, we attempt to provide a comprehensive overview of the different paradigms for scheduling in Apache Hadoop. The surveyed approaches fall into several categories: deadline prioritization, resource prioritization, job size prioritization, hybrid approaches, and recent trends in improving upon the default schedulers.

Version

acceptedVersion

Permanent URL (for citation purposes)

  • https://hdl.handle.net/10642/6436