Fast and accurate HLA typing from short-read next-generation sequence data with xHLA.
Chao Xie, Zhen Xuan Yeo, Marie Wong, Jason Piper, Tao Long, Ewen F Kirkness, William H Biggs, Ken Bloom, Stephen Spellman, Cynthia Vierra-Green, Colleen Brady, Richard H Scheuermann, Amalio Telenti, Sally Howard, Suzanne Brewerton, Yaron Turpaz, J Craig Venter
Published in: Proceedings of the National Academy of Sciences of the United States of America (2017)
The HLA gene complex on human chromosome 6 is one of the most polymorphic regions in the human genome and contributes in large part to the diversity of the immune system. Accurate typing of HLA genes with short-read sequencing data has historically been difficult due to the sequence similarity between the polymorphic alleles. Here, we introduce an algorithm, xHLA, that iteratively refines the mapping results at the amino acid level to achieve 99-100% four-digit typing accuracy for both class I and II HLA genes, taking only ~3 min to process a 30× whole-genome BAM file on a desktop computer.
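To make the "four-digit typing accuracy" figure concrete, the following is a minimal Python sketch of how such a score could be computed. It is not part of xHLA: the function names `to_four_digit` and `typing_accuracy`, the input layout (a dict from gene name to its two alleles), and the example alleles are all assumptions chosen for illustration. Four-digit resolution is taken here to mean the first two colon-separated fields of an allele name; expression suffixes such as `N` are not handled.

```python
def to_four_digit(allele: str) -> str:
    """Truncate an HLA allele name to its first two fields (four-digit resolution)."""
    name = allele.strip()
    if name.upper().startswith("HLA-"):
        name = name[4:]  # drop the optional "HLA-" prefix
    gene, _, fields = name.partition("*")
    two_fields = fields.split(":")[:2]
    return f"{gene}*{':'.join(two_fields)}"


def typing_accuracy(predicted, truth):
    """Fraction of reference alleles matched at four-digit resolution.

    Both arguments map a gene name to its pair of alleles. Pairs are compared
    as unordered multisets, since the two chromosomes have no defined order.
    """
    correct = total = 0
    for gene, true_pair in truth.items():
        remaining = [to_four_digit(a) for a in predicted.get(gene, [])]
        for allele in true_pair:
            total += 1
            key = to_four_digit(allele)
            if key in remaining:
                remaining.remove(key)  # each predicted allele can match only once
                correct += 1
    return correct / total if total else 0.0


# Hypothetical example data (not taken from the paper).
truth = {
    "A": ["A*02:01:01:01", "A*24:02:01"],
    "DRB1": ["DRB1*15:01:01", "DRB1*04:01:01"],
}
predicted = {
    "A": ["HLA-A*24:02", "HLA-A*02:01"],
    "DRB1": ["DRB1*15:01", "DRB1*04:05"],
}
accuracy = typing_accuracy(predicted, truth)
```

In this made-up example three of the four reference alleles are matched, so the score is 0.75. The unordered comparison means a swapped allele pair is not counted as an error.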
Keyphrases
- genome wide
- endothelial cells
- amino acid
- high resolution
- genome wide identification
- electronic health record
- induced pluripotent stem cells
- copy number
- deep learning
- pluripotent stem cells
- single molecule
- machine learning
- big data
- dna methylation
- single cell
- bioinformatics analysis
- smoking cessation
- mass spectrometry
- preterm infants
- high density
- preterm birth
- low birth weight