On Improving Fault Tolerance for Heterogeneous Hadoop MapReduce Clusters
Academic year 102
Semester 1
Publication date 2013-12-16
Title On Improving Fault Tolerance for Heterogeneous Hadoop MapReduce Clusters
Title (other language)
Authors Lin, Chi-Yi; Chen, Ting-Hau; Cheng, Yi-No
Affiliation Department of Computer Science and Information Engineering, Tamkang University
Publisher Institute of Electrical and Electronics Engineers (IEEE)
Conference name 2013 International Conference on Cloud Computing and Big Data (CloudCom-Asia 2013)
Conference location Fuzhou, China
Abstract The MapReduce computing paradigm has become extremely popular for large-scale data-intensive applications in recent years. Hadoop, an open-source implementation of MapReduce, can be set up easily and rapidly on commodity hardware to form a massive computing cluster. In such a cluster, task failures and node failures are common, and they can substantially degrade Hadoop’s performance. Although Hadoop can restart failed tasks automatically and compensate for slow tasks through speculative execution, many researchers have identified shortcomings in Hadoop’s fault tolerance. In this research, we try to address these shortcomings by designing a simple checkpointing mechanism for Map tasks and by using a revised criterion for identifying slow tasks. Specifically, our checkpointing mechanism saves the partial output produced by the Mappers, and our criterion for identifying slow tasks accounts for tasks with variable progress rates. Although preliminary simulations show only marginal performance improvement over native Hadoop and the LATE scheduler, we believe that our approaches have the potential to offer greater performance gains on real workloads.
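Keywords MapReduce; heterogeneous environments; intermediate data; checkpointing; speculative execution
Language en
Indexed in
Conference type International
On-campus conference venue
Conference dates 2013-12-16 to 2013-12-19
Corresponding author Lin, Chi-Yi (林其誼)
Country CHN
Open call for papers Y
Publication format Electronic
Source 2013 International Conference on Cloud Computing and Big Data (CloudCom-Asia 2013)
Related links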

Institutional repository link ( http://tkuir.lib.tku.edu.tw:8080/dspace/handle/987654321/92701 )
