Sequential knockoffs for continuous and categorical predictors: With application to a large psoriatic arthritis clinical trial pool.
Matthías Kormáksson, Luke J Kelly, Xuan Zhu, Sibylle Haemmerle, Luminita Pricop, David Ohlssen
Published in: Statistics in medicine (2021)
Knockoffs provide a general framework for controlling the false discovery rate when performing variable selection. Much of the knockoffs literature focuses on theoretical challenges, and we recognize a need for bringing some of the current ideas into practice. In this paper we propose a sequential algorithm for generating knockoffs when the underlying data consist of both continuous and categorical (factor) variables. Further, we present a heuristic multiple knockoffs approach that offers a practical assessment of how robust the knockoff selection process is for a given dataset. We conduct extensive simulations to validate the performance of the proposed methodology. Finally, we demonstrate the utility of the methods on a large clinical data pool of more than 2000 patients with psoriatic arthritis evaluated in four clinical trials with an IL-17A inhibitor, secukinumab (Cosentyx), where we determine prognostic factors of a well-established clinical outcome. The analyses presented in this paper could be applied to a wide range of commonly encountered datasets in medical practice and other fields where variable selection is of particular interest.
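To make the false discovery rate control mentioned above concrete, the following is a minimal sketch of the standard knockoff+ selection rule from the general knockoffs framework. It is not the sequential knockoff generation algorithm proposed in the paper. It assumes that a feature statistic W_j has already been computed for each variable (large positive values favour the original variable over its knockoff). The function names and the example values in `w` are hypothetical and chosen only for illustration.

```python
def knockoff_plus_threshold(w, q):
    """Return the knockoff+ threshold for feature statistics w at target FDR q.

    The threshold is the smallest t > 0 among the observed |W_j| such that
    (1 + #{j: W_j <= -t}) / max(1, #{j: W_j >= t}) <= q.
    Returns infinity when no candidate satisfies the bound.
    """
    candidates = sorted({abs(x) for x in w if x != 0})
    for t in candidates:
        negatives = sum(1 for x in w if x <= -t)
        positives = sum(1 for x in w if x >= t)
        if (1 + negatives) / max(1, positives) <= q:
            return t
    return float("inf")


def select(w, q):
    """Return the indices of variables whose statistic reaches the threshold."""
    t = knockoff_plus_threshold(w, q)
    return [j for j, x in enumerate(w) if x >= t]


# Hypothetical feature statistics for ten variables (illustrative values only).
w = [5.0, 4.2, 3.9, 3.1, 2.5, -0.4, 0.3, -0.2, 0.1, 1.8]
selected = select(w, q=0.2)
print(selected)
```

With these illustrative values, a target rate of q = 0.2 yields a threshold of 1.8, whereas q = 0.1 is too strict for ten variables and selects nothing. In the spirit of the multiple knockoffs heuristic described in the abstract, one could rerun such a selection over several independently generated knockoff copies and record how often each variable is selected.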
Keyphrases
- clinical trial
- prognostic factors
- healthcare
- primary care
- electronic health record
- big data
- machine learning
- systematic review
- phase ii
- small molecule
- high throughput
- quality improvement
- open label
- double blind
- ankylosing spondylitis
- molecular dynamics
- rheumatoid arthritis
- randomized controlled trial
- artificial intelligence