Correcting gradient-based interpretations of deep neural networks for genomics.

Antonio Majdandzic, Chandana Rajesh, Peter K Koo

Published in: Genome Biology (2023)

Post hoc attribution methods can provide insights into the patterns learned by deep neural networks (DNNs) trained on high-throughput functional genomics data. In practice, however, the resulting attribution maps can be challenging to interpret because they assign spurious importance scores to seemingly arbitrary nucleotides. Here, we identify a previously overlooked source of attribution noise that arises from how DNNs handle one-hot encoded DNA. We demonstrate that this noise is pervasive across various genomic DNNs and introduce a statistical correction that effectively reduces it, leading to more reliable attribution maps. Our approach represents a promising step towards gaining meaningful insights from DNNs in regulatory genomics.
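
The abstract does not state the form of the statistical correction. As a minimal sketch only, assume the correction subtracts, at each sequence position, the mean gradient across the four nucleotide channels. This removes the gradient component that points away from the set of valid one-hot rows, where the four entries sum to 1. The function name, array shapes, and random inputs below are illustrative assumptions and are not taken from the paper's code.

```python
import numpy as np


def correct_gradients(grads):
    """Subtract the per-position mean across nucleotide channels.

    grads: array of shape (..., L, 4) holding the gradient of a model
    output with respect to a one-hot encoded DNA sequence (assumed layout:
    last axis indexes A, C, G, T).
    """
    grads = np.asarray(grads, dtype=float)
    return grads - grads.mean(axis=-1, keepdims=True)


# Illustrative input: random numbers stand in for real model gradients.
rng = np.random.default_rng(0)
seq_len = 8
onehot = np.eye(4)[rng.integers(0, 4, size=seq_len)]   # shape (8, 4)
raw_grads = rng.normal(size=(seq_len, 4))              # shape (8, 4)

corrected = correct_gradients(raw_grads)

# A common way to summarize an attribution map: keep the score of the
# nucleotide actually observed at each position (gradient * input).
raw_scores = (raw_grads * onehot).sum(axis=-1)
corrected_scores = (corrected * onehot).sum(axis=-1)
```

Under this assumption, the corrected gradients sum to zero across the four channels at every position, and applying the function a second time changes nothing.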

Keyphrases

neural network
single cell
high throughput
air pollution
primary care
circulating tumor
electronic health record
resistance training
machine learning
DNA methylation
single molecule
deep learning