Handling hybrid and missing data in constraint-based causal discovery to study the etiology of ADHD.

Elena Sokolova, Daniel von Rhein, Jilly Naaijen, Perry Groot, Tom Claassen, Jan Buitelaar, Tom Heskes
Published in: International Journal of Data Science and Analytics (2016)
Causal discovery is an increasingly important method for data analysis in medical research. In this paper, we consider two challenges in causal discovery that frequently arise with medical data: a mixture of discrete and continuous variables, and a substantial amount of missing values. To the best of our knowledge, no existing method handles both challenges at the same time. We develop a new method that handles both, based on the assumptions that data are missing at random and that continuous variables obey a non-paranormal distribution. We demonstrate the validity of our approach for causal discovery on simulated data as well as on two real-world data sets, one from a monetary incentive delay task and one from a reversal learning task. Our results help in understanding the etiology of attention-deficit/hyperactivity disorder (ADHD).
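To make the two assumptions in the abstract concrete, the following is a minimal sketch of one common way to estimate a correlation under a non-paranormal (Gaussian copula) model when values are missing: compute Spearman's rank correlation on pairwise-complete observations, then map it to a Gaussian correlation with 2·sin(π·ρ/6). This is an illustration only and not necessarily the estimator used in the paper; the function names `average_ranks`, `pearson`, and `nonparanormal_corr`, and the use of `None` to mark a missing value, are assumptions made for this example.

```python
import math


def average_ranks(values):
    """Return 1-based ranks; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson(x, y):
    """Plain Pearson correlation; fails if either input is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def nonparanormal_corr(x, y):
    """Rank-based correlation estimate under a Gaussian copula model.

    Rows where either value is None are dropped (pairwise deletion,
    which is only reasonable if data are missing at random or better).
    """
    pairs = [(a, b) for a, b in zip(x, y) if a is not None and b is not None]
    xs, ys = zip(*pairs)
    rho = pearson(average_ranks(xs), average_ranks(ys))  # Spearman's rho
    return 2.0 * math.sin(math.pi * rho / 6.0)


# Hypothetical toy data: None marks a missing measurement.
x = [1.0, 2.0, None, 4.0, 5.0]
y = [2.0, 4.0, 6.0, None, 10.0]
r = nonparanormal_corr(x, y)
```

Because the estimate depends only on ranks, it is unchanged by monotone transformations of a continuous variable, and ordinal discrete variables can be passed through the same function. A correlation matrix built this way could then feed the conditional independence tests of a constraint-based algorithm.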
Keyphrases
  • attention deficit hyperactivity disorder
  • data analysis
  • electronic health record
  • small molecule
  • autism spectrum disorder
  • big data
  • healthcare
  • high throughput
  • working memory
  • single cell