Overlooked poor-quality patient samples in sequencing data impair reproducibility of published clinically relevant datasets.
Maximilian Sprang, Jannik Möllmann, Miguel A Andrade-Navarro, Jean Fred Fontaine
Published in: Genome Biology (2024)
Thanks to a stringent selection of well-designed datasets, we demonstrate that quality imbalance between groups of samples can significantly reduce the relevance of differential genes, consequently reducing reproducibility between studies. Appropriate experimental design and analysis methods can substantially reduce the problem.