A Survey of Biological Data in a Big Data Perspective.

Gabriel Dall'Alba, Pedro Lenz Casa, Fernanda Pessi de Abreu, Daniel Luis Notari, Scheila de Avila E Silva
Published in: Big data (2022)
The amount of available data is growing continuously, a phenomenon that has given rise to the concept of big data. The key technologies associated with big data are cloud computing (infrastructure) and Not Only SQL (NoSQL; data storage). For data analysis, machine learning algorithms such as decision trees, support vector machines, artificial neural networks, and clustering techniques show promising results. In a biological context, big data has many applications owing to the large number of biological databases available. Some limitations of biological big data stem from inherent features of the data, such as high complexity and heterogeneity, since biological systems provide information ranging from the atomic level to interactions among organisms or with their environment. These characteristics make most bioinformatics applications difficult to build, configure, and maintain. Although the rise of big data is relatively recent, it has contributed to a better understanding of the underlying mechanisms of life. The main goal of this article is to provide a concise and reliable survey of the application of big data-related technologies in biology. To that end, some fundamental concepts of information technology, including storage resources, analysis, and data sharing, are described along with their relation to biological data.
Keyphrases
  • big data
  • machine learning
  • artificial intelligence
  • data analysis
  • deep learning
  • health information
  • single cell
  • neural network
  • electronic health record