haploMAGIC: Accurate phasing and detection of recombination in multiparental populations despite genotyping errors.
Jose A Montero-Tena, Nayyer Abdollahi Sisi, Tobias Kox, Amine Abbadi, Rod J Snowdon, Agnieszka A Golicz
Published in: G3 (Bethesda, Md.) (2024)
Recombination is a key mechanism in breeding for promoting genetic variability. Multiparental populations constitute an excellent platform for precise genotype phasing, identification of genome-wide crossovers, estimation of recombination frequencies and construction of recombination maps. Here, we introduce haploMAGIC, a pipeline that detects crossovers in multiparental populations from single-nucleotide polymorphism (SNP) data by exploiting pedigree relationships for accurate genotype phasing and inference of grandparental haplotypes. To prevent false-positive crossovers caused by genotyping errors, a common problem in high-throughput SNP analysis of complex plant genomes, haploMAGIC applies a filter that discards haploblocks not reaching a specified minimum number of informative alleles. A performance analysis using populations simulated with AlphaSimR revealed that haploMAGIC improves upon existing methods of crossover detection in terms of recall and precision, most notably when genotyping error rates are high. Furthermore, we constructed recombination maps using haploMAGIC with high-resolution genotype data from two large multiparental populations of winter rapeseed (Brassica napus). The results demonstrate the applicability of the pipeline in real-world scenarios and show good correlations in recombination frequency with alternative software. Therefore, we propose haploMAGIC as an accurate tool for crossover detection in multiparental populations that is robust against genotyping errors.
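The haploblock filter described in the abstract can be illustrated with a minimal sketch. This is not haploMAGIC's actual code: the function names, the `min_informative` parameter, and the representation of one chromosome as a sequence of grandparental-origin labels at informative markers are all assumptions made for illustration. The idea is that a single miscalled allele creates a very short haploblock flanked by two spurious crossovers, so dropping blocks supported by too few informative alleles removes them.

```python
from itertools import groupby


def haploblocks(origins):
    """Run-length encode grandparental-origin labels into (label, n_alleles) blocks."""
    return [(label, sum(1 for _ in run)) for label, run in groupby(origins)]


def filter_blocks(blocks, min_informative):
    """Drop blocks with fewer than min_informative alleles, then merge
    neighbouring blocks that now share the same grandparental origin."""
    merged = []
    for label, n in blocks:
        if n < min_informative:
            continue  # treated as a likely genotyping error (assumption)
        if merged and merged[-1][0] == label:
            merged[-1] = (label, merged[-1][1] + n)
        else:
            merged.append((label, n))
    return merged


def count_crossovers(origins, min_informative=1):
    """Count origin switches between consecutive haploblocks after filtering."""
    blocks = filter_blocks(haploblocks(origins), min_informative)
    return max(len(blocks) - 1, 0)


# Hypothetical chromosome: one true crossover plus a single erroneous 'B' call.
origins = "AAAAABAAAABBBBB"
unfiltered = count_crossovers(origins, min_informative=1)
filtered = count_crossovers(origins, min_informative=2)
```

In this toy input the unfiltered count includes two false crossovers around the isolated `B`, while requiring at least two informative alleles per block leaves only the real switch from `A` to `B`. A higher threshold trades recall for precision, since genuinely short haploblocks are discarded as well.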
Keyphrases
- genome wide
- genetic diversity
- high throughput
- dna repair
- high resolution
- dna damage
- dna methylation
- single cell
- copy number
- loop mediated isothermal amplification
- label free
- gene expression
- clinical trial
- climate change
- emergency department
- real time pcr
- big data
- machine learning
- mass spectrometry
- double blind
- high density
- sensitive detection
- transcription factor
- bioinformatics analysis
- tandem mass spectrometry
- genome wide identification