
A Paradigm For Calling Sequence In Families: The Long Life Family Study.

E Warwick Daw, Jason A Anema, Karen Schwander, Shiow Jiuan Lin, Lihua Wang, Mary Wojczynski, Bharat Thyagarajan, Nathan O Stitziel, Michael A Province
Published in: bioRxiv : the preprint server for biology (2024)
Over several years, we have developed a system for assuring the quality of whole genome sequence (WGS) data in the Long Life Family Study (LLFS) families. We have focused on providing data to identify germline genetic variants, with the aim of releasing as many variants on as many individuals as possible while assuring the quality of the individual calls. The availability of family data has enabled us to use and validate some filters not commonly used in population-based studies. We developed slightly different procedures for the autosomal, X, Y, and mitochondrial (MT) chromosomes. Some of these filters are specific to family data, but others can be used with any WGS data set. We also describe the procedure we use to construct linkage markers from the SNP sequence data and how we compute identity-by-descent (IBD) values for use in linkage analysis.
Keyphrases
  • electronic health record
  • big data
  • data analysis
  • gene expression
  • machine learning
  • dna methylation
  • hepatitis c virus
  • dna damage
  • quality improvement
  • human immunodeficiency virus
  • hiv testing
  • case control