An evaluation of relational and NoSQL distributed databases on a low-power cluster.

Lucas Ferreira da Silva, João V F Lima
Published in: The Journal of Supercomputing (2023)
The constant growth of social media, unconventional web technologies, mobile applications, and Internet of Things (IoT) devices creates challenges for cloud data systems, which must support huge datasets and very high request rates. NoSQL databases, such as Cassandra and HBase, and relational SQL databases with replication, such as Citus/PostgreSQL, have been used to improve the horizontal scalability and availability of data storage systems. In this paper, we evaluated three distributed databases on a low-power, low-cost cluster of commodity Single-Board Computers (SBCs): the relational database Citus/PostgreSQL and the NoSQL databases Cassandra and HBase. The cluster comprises 15 Raspberry Pi 3 nodes and uses the Docker Swarm orchestration tool for service deployment and ingress load balancing across the SBCs. We believe that a low-cost SBC cluster can support cloud serving goals such as scale-out, elasticity, and high availability. Experimental results clearly demonstrated a trade-off between performance and replication, which provides availability and partition tolerance. Both properties are essential in the context of distributed systems with low-power boards. Cassandra attained better results, with its consistency levels specified by the client. Both Citus and HBase provide consistency, but this penalizes performance as the number of replicas increases.
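The client-specified consistency levels mentioned above can be illustrated with a little arithmetic. The following is a minimal sketch, not code from the paper: the function names are hypothetical, and the replication factor of 3 is an illustrative assumption rather than a setting reported by the authors. It relies only on Cassandra's documented rules that QUORUM requires floor(RF / 2) + 1 replicas and that reads are guaranteed to see the latest write when the read and write replica counts together exceed the replication factor.

```python
def quorum(replication_factor: int) -> int:
    """Replicas needed for Cassandra's QUORUM level: floor(RF / 2) + 1."""
    return replication_factor // 2 + 1


def required_acks(level: str, replication_factor: int) -> int:
    """Replicas that must respond before a request at this level succeeds."""
    if level == "ONE":
        return 1
    if level == "QUORUM":
        return quorum(replication_factor)
    if level == "ALL":
        return replication_factor
    raise ValueError(f"unsupported consistency level: {level}")


def is_strongly_consistent(read_level: str, write_level: str,
                           replication_factor: int) -> bool:
    """True when every read set overlaps every write set (R + W > RF)."""
    reads = required_acks(read_level, replication_factor)
    writes = required_acks(write_level, replication_factor)
    return reads + writes > replication_factor


# Illustrative replication factor (an assumption, not taken from the paper).
rf = 3
for read_level, write_level in [("ONE", "ONE"), ("QUORUM", "QUORUM"), ("ONE", "ALL")]:
    strong = is_strongly_consistent(read_level, write_level, rf)
    print(f"read={read_level} write={write_level} strong={strong}")
```

Lower levels such as ONE wait for fewer replicas per request, which is one way a client can trade consistency for latency as the number of replicas grows.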
Keyphrases
  • low cost
  • big data
  • social media
  • health information
  • electronic health record
  • healthcare
  • mental health
  • lymph node
  • radiation therapy
  • rna seq
  • data analysis
  • single cell
  • neoadjuvant chemotherapy
  • locally advanced