Machine learning approaches to predict age from accelerometer records of physical activity at biobank scale.
Alan Le Goallec, Sasha Collin, M'Hamed Jabri, Samuel Diai, Théo Vincent, Chirag J Patel
Published in: PLOS digital health (2023)
Physical activity improves quality of life and protects against age-related diseases. With age, physical activity tends to decrease, increasing vulnerability to disease in the elderly. We trained a neural network to predict age from 115,456 one-week-long 100 Hz wrist accelerometer recordings from the UK Biobank (mean absolute error = 3.7±0.2 years), using a variety of data structures to capture the complexity of real-world activity. We achieved this performance by preprocessing the raw frequency data into 2,271 scalar features, 113 time series, and four images. We defined accelerated aging for a participant as being predicted older than one's actual age and identified both genetic and environmental exposure factors associated with this new phenotype. We performed a genome-wide association study on the accelerated aging phenotype to estimate its heritability (h_g^2 = 12.3±0.9%) and identified ten single nucleotide polymorphisms in close proximity to genes in a histone and olfactory cluster on chromosome 6 (e.g., HIST1H1C, OR5V1). Similarly, we identified biomarkers (e.g., blood pressure), clinical phenotypes (e.g., chest pain), diseases (e.g., hypertension), environmental (e.g., smoking), and socioeconomic (e.g., income and education) variables associated with accelerated aging. Physical activity-derived biological age is a complex phenotype associated with both genetic and non-genetic factors.
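The two quantities the abstract relies on, the mean absolute error of the age predictor and the accelerated aging phenotype (predicted age minus actual age), can be illustrated with a minimal sketch. The ages below are hypothetical values chosen for illustration, not UK Biobank data, and the function names are assumptions. The sketch uses the plain difference described in the abstract; any bias correction the authors may apply is not shown.

```python
from statistics import mean


def mean_absolute_error(actual, predicted):
    """Average absolute gap between predicted and chronological age, in years."""
    return mean(abs(p - a) for a, p in zip(actual, predicted))


def age_gaps(actual, predicted):
    """Predicted age minus chronological age; a positive value means 'predicted older'."""
    return [p - a for a, p in zip(actual, predicted)]


# Hypothetical participants (illustrative values only).
chronological = [52.0, 61.0, 68.0, 47.0]
predicted = [55.5, 58.0, 70.0, 47.0]

mae = mean_absolute_error(chronological, predicted)
gaps = age_gaps(chronological, predicted)

# A participant counts as an accelerated ager when predicted older than their actual age.
accelerated = [gap > 0 for gap in gaps]

print(f"MAE = {mae:.3f} years")
print("Age gaps:", gaps)
print("Accelerated agers:", accelerated)
```

In this toy example, the first and third participants would be labeled accelerated agers, and the per-participant gap is the quantity that would serve as the phenotype in downstream association analyses.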
Keyphrases
- physical activity
- blood pressure
- genome wide
- machine learning
- body mass index
- healthcare
- genome wide association
- neural network
- type diabetes
- big data
- climate change
- randomized controlled trial
- dna methylation
- metabolic syndrome
- heart rate
- mental health
- middle aged
- optical coherence tomography
- hypertensive patients
- community dwelling
- risk assessment
- skeletal muscle
- insulin resistance
- data analysis
- transcription factor
- high intensity