Using a virtual cluster with Hadoop as a tool for big data exploration in digital image processing

The amount of available remote sensing (RS) data is increasing at an extremely rapid pace due to recent advances in Earth observation technologies. This growth creates new challenges for the computational techniques and resources needed to handle huge volumes of data. In this sense, RS data processing can be considered a big data problem. Cloud computing is a natural fit in this context: it offers a powerful infrastructure for large-scale computing, is usually available in a pay-as-you-go model, and relieves users of the need to acquire and maintain a complex computing infrastructure. Although the prices currently charged by cloud infrastructure providers are reasonably low, developing and testing cloud-based platforms is a lengthy process, and the total cost involved may make it unfeasible. This work describes a solution to the problem of the costs involved in developing methods based on cloud computing, in particular RS data processing tools based on the Hadoop framework. The solution consists of creating a configurable virtual cluster on a single physical machine, installed with the software components required to run a distributed application. The resulting virtual infrastructure was used to develop and test extensions of a recently proposed architecture for the distributed classification of RS data. To validate the extensions, classification experiments were carried out on hyperspectral images acquired with the ROSIS sensor, covering the University of Pavia in Italy.