Research Output

Near-Data Prediction Based Speculative Optimization in a Distribution Environment

  Apache Hadoop is an open source software framework that supports
data-intensive distributed applications and is distributed under the Apache 2.0 licensing agreement, where consumers will no longer deal with complex configuration of software and hardware but only pay for cloud services on demand. So how to make the performance of the cloud platform become more important in a consumer-centric environment. There exists imbalance between in some distribution of slow tasks, which results in straggling tasks will have a great influence on the Hadoop framework. By monitoring those tasks in real-time progress and copying the potential Stragglers to a different node, the speculative execution (SE) realizes to improve the probability of finishing those backup tasks before the original ones. The Speculative execution (SE) applies this principle and thus proposed a solution to handle the Straggling tasks. At present, the performance of the Hadoop system is unsatisfying because of the erroneous judgement and inappropriate selection for the backup nodes in the current SE policy. This paper proposes an SE optimized strategy which can be used in prediction of near data. In this strategy, the first step is gathering the real-time task execution information and the remaining runtime required for the task is predicted by a local prediction method. Then it chooses a proper backup node according to the near data and actual demand in the second step. On the other side, this model also includes a cost-effective model in order to make the performance of SE to the peak. The results show that using this strategy in Hadoopeffectively improves the accuracy of alternative tasks and effects better in heterogeneous Hadoop environments in various situations, which is beneficial to consumers and cloud platform.

  • Date:

    04 December 2019

  • Publication Status:


  • Publisher


  • Funders:

    RSE Royal Society of Edinburgh


Sun, M., Wu, X., Jin, D., Xu, X., Liu, Q., & Liu, X. (2019). Near-Data Prediction Based Speculative Optimization in a Distribution Environment


Monthly Views:

Linked Projects

Available Documents