Challenges and best practices in omics benchmarking.
Thomas G Brooks, Nicholas F Lahens, Antonijo Mrčela, Gregory R Grant
Published in: Nature reviews. Genetics (2024)
Technological advances enabling massively parallel measurement of biological features - such as microarrays, high-throughput sequencing and mass spectrometry - have ushered in the omics era, now in its third decade. The resulting complex landscape of analytical methods has naturally fostered the growth of an omics benchmarking industry. Benchmarking refers to the process of objectively comparing and evaluating the performance of different computational or analytical techniques when processing and analysing large-scale biological data sets, such as transcriptomics, proteomics and metabolomics. With thousands of omics benchmarking studies published over the past 25 years, the field has matured to the point where the foundations of benchmarking have been established and well described. However, generating meaningful benchmarking data and properly evaluating performance in this complex domain remains challenging. In this Review, we highlight some common oversights and pitfalls in omics benchmarking. We also establish a methodology to bring the issues that can be addressed into focus and to be transparent about those that cannot: this takes the form of a spreadsheet template of guidelines for comprehensive reporting, intended to accompany publications. In addition, a survey of recent developments in benchmarking is provided as well as specific guidance for commonly encountered difficulties.
Keyphrases
- single cell
- mass spectrometry
- liquid chromatography
- healthcare
- primary care
- physical activity
- emergency department
- systematic review
- randomized controlled trial
- big data
- machine learning
- high resolution
- artificial intelligence
- gas chromatography
- high performance liquid chromatography
- deep learning
- clinical practice
- tandem mass spectrometry
- meta analyses