Excalibur: A new ensemble method based on an optimal combination of aggregation tests for rare-variant association testing for sequencing data.

Simon Boutry, Raphaël Helaers, Tom Lenaerts, Miikka Vikkula
Published in: PLoS computational biology (2023)
The development of high-throughput next-generation sequencing technologies and large-scale genetic association studies has produced numerous advances in the biostatistics field. Various aggregation tests, i.e. statistical methods that analyze associations of a trait with multiple markers within a genomic region, have led to a variety of novel discoveries. Notwithstanding their usefulness, no single test fits all needs, and each suffers from specific drawbacks. Selecting the right aggregation test when the underlying genetic model of the disease is unknown remains an important challenge. Here we propose a new ensemble method, called Excalibur, based on an optimal combination of 36 aggregation tests, created after an in-depth study of the limitations of each test and their impact on the quality of the results. Our findings demonstrate that the method controls type I error and illustrate that it offers the best average power across all scenarios. The proposed method enables novel advances in whole-exome and whole-genome sequencing association studies: it can handle a wide range of association models and provides researchers with an optimal aggregation analysis for the genetic regions of interest.
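To make the notion of an aggregation test concrete, the following is a minimal sketch. It is not the Excalibur method, whose combination rule is not described in this abstract. It assumes a simple collapsing test: an individual counts as a carrier if any rare variant in the region is present, and carrier counts in cases and controls are compared with a one-sided Fisher exact test. The function names, the genotype encoding (rows of 0/1/2 allele counts), and the Bonferroni-corrected minimum p-value used to combine several tests are all illustrative assumptions, chosen because that correction remains valid under any dependence between tests.

```python
from math import comb


def carrier_counts(genotypes, is_case):
    """Collapse a region: an individual is a carrier if any rare variant is present.

    genotypes: one row per individual, one allele count (0/1/2) per variant.
    is_case:   one boolean per individual (True = case, False = control).
    """
    case_carriers = control_carriers = n_cases = n_controls = 0
    for row, case in zip(genotypes, is_case):
        carrier = any(g > 0 for g in row)
        if case:
            n_cases += 1
            case_carriers += carrier
        else:
            n_controls += 1
            control_carriers += carrier
    return case_carriers, n_cases, control_carriers, n_controls


def fisher_one_sided(case_carriers, n_cases, control_carriers, n_controls):
    """One-sided Fisher exact p-value for an excess of carriers among cases."""
    total = n_cases + n_controls
    carriers = case_carriers + control_carriers
    upper = min(carriers, n_cases)
    # Hypergeometric tail: probability of seeing at least `case_carriers`
    # carriers among the cases when carrier status is unrelated to the trait.
    tail = sum(
        comb(n_cases, k) * comb(n_controls, carriers - k)
        for k in range(case_carriers, upper + 1)
    )
    return tail / comb(total, carriers)


def min_p_bonferroni(p_values):
    """Combine p-values from several tests: smallest p times the number of tests.

    Illustrative only; conservative, but valid under arbitrary dependence.
    """
    return min(1.0, min(p_values) * len(p_values))
```

For instance, `fisher_one_sided(*carrier_counts(genotypes, is_case))` gives the region-level p-value of this one test, and `min_p_bonferroni` could then be applied to the p-values returned by several different aggregation tests run on the same region. The conservativeness of such a correction costs power, which is one reason more refined combination schemes are of interest.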
Keyphrases
  • copy number
  • genome wide
  • high throughput
  • single cell
  • gene expression
  • climate change
  • machine learning
  • dna methylation
  • deep learning