Conference paper
Academic year | 102 |
---|---|
Semester | 1 |
Publication date | 2013-12-16 |
Title | On Improving Fault Tolerance for Heterogeneous Hadoop MapReduce Clusters |
Title (other language) | |
Authors | Lin, Chi-Yi; Chen, Ting-Hau; Cheng, Yi-No |
Affiliation | Department of Computer Science and Information Engineering, Tamkang University |
Publisher | Institute of Electrical and Electronics Engineers (IEEE) |
Conference name | 2013 International Conference on Cloud Computing and Big Data (CloudCom-Asia 2013) |
Conference location | Fuzhou, China |
Abstract | The MapReduce computing paradigm has become extremely popular for large-scale data-intensive applications in recent years. Hadoop, an open-source implementation of MapReduce, can be set up easily and rapidly on commodity hardware to form a massive computing cluster. In such a cluster, task failures and node failures are not anomalies, and they can substantially degrade Hadoop’s performance. Although Hadoop can restart failed tasks automatically and compensate for slow tasks by enabling speculative execution, many researchers have identified shortcomings in Hadoop’s fault tolerance. In this research, we try to address these shortcomings by designing a simple checkpointing mechanism for Map tasks and by using a revised criterion for identifying slow tasks. Specifically, our checkpointing mechanism saves the partial output produced by the Mappers, and our criterion for identifying slow tasks accounts for tasks with variable progress rates. Although preliminary simulations show only marginal performance improvement over native Hadoop and the LATE scheduler, we believe that our approaches have the potential to offer greater performance gains on real workloads. |
Keywords | MapReduce;heterogeneous environments;intermediate data;checkpointing;speculative execution |
Language | en |
Indexed in | |
Conference type | International |
On-campus seminar location | |
Conference dates | 20131216~20131219 |
Corresponding author | Lin, Chi-Yi (林其誼) |
Country | CHN |
Open call for papers | Y |
Publication format | Electronic |
Source | 2013 International Conference on Cloud Computing and Big Data (CloudCom-Asia 2013) |
Related links |
Institutional repository link ( http://tkuir.lib.tku.edu.tw:8080/dspace/handle/987654321/92701 ) |