SELECTION OF DATABASES TO STORE GEOSPATIAL-TEMPORAL DATA
DOI:
https://doi.org/10.31891/2219-9365-2023-76-1
Keywords:
IoT, geospatial-temporal data, Cassandra, Accumulo, GeoMesa
Abstract
The proliferation of geospatial-temporal data, driven by the widespread adoption of sensor platforms and the Internet of Things, has increased the demand for effective data management solutions. GeoMesa, an open-source toolkit for geospatial querying and analytics on distributed computing systems, addresses this demand. It adds geospatial-temporal indexing on top of databases such as Accumulo, HBase, Google Bigtable, and Cassandra, enabling them to store and manage large geospatial datasets. This article benchmarks and compares the performance of Accumulo and Cassandra as underlying data stores for GeoMesa. The performance tests are intended to show the relative strengths and weaknesses of the two database systems and so help decision-makers select the most suitable one for their application requirements. The evaluation analyzes performance metrics such as throughput and latency, and takes into account system parameters, query density, and data access distribution. The tests showed that Accumulo outperforms Cassandra in almost all areas: read latency and resource usage under heavy load, and write latency under any load. Cassandra, in turn, has lower read latency under low load and lower CPU usage under heavy load.