A comparison between MapReduce and Tez for image segmentation in cloud computing environments

Driven mainly by recent advances in Earth Observation technology, the growing volume of remote sensing data represents a new challenge. Currently available image processing solutions fail to deliver the performance and scalability required to handle this volume of data. To address this problem, the authors proposed, in a recent work, a distributed strategy for region growing segmentation of arbitrarily large images. The strategy can run in cloud computing environments and on most distributed architectures. The original implementation is based on the MapReduce model, which offers a highly scalable and reliable framework for storing and processing massive data in cloud computing environments. However, MapReduce has been losing popularity and is slowly being replaced by newer engines. Since the distributed image segmentation method is independent of its implementation, this paper aims to compare the original MapReduce implementation with a new implementation built on a different distributed framework, Apache Tez. Tez enhances the MapReduce paradigm by improving its speed while maintaining MapReduce's ability to scale to petabytes of data. Experiments carried out on a virtual cluster in a commercial cloud computing infrastructure demonstrated that both implementations are potentially scalable and efficient solutions, with Tez achieving better performance.