Genomic benchmarks: a collection of datasets for genomic sequence classification.

Katarína Grešová, Vlastimil Martinek, David Čechák, Petr Šimeček, Panagiotis Alexiou
Published in: BMC Genomic Data (2023)
Deep learning techniques have revolutionized many biological fields, largely thanks to carefully curated benchmarks. For the field of genomics, we propose a collection of benchmark datasets for the classification of genomic sequences, together with an interface for the most commonly used deep learning libraries, an implementation of a simple neural network, and a training framework that can be used as a starting point for future research. The main aim of this effort is to create a repository of shared datasets that will make machine learning for genomics more comparable and reproducible while reducing the overhead for researchers who want to enter the field, leading to healthy competition and new discoveries.
Keyphrases
  • deep learning
  • machine learning
  • neural network
  • artificial intelligence
  • rna seq
  • single cell
  • convolutional neural network
  • copy number
  • primary care
  • healthcare
  • big data
  • amino acid