Improving fragment-based ab initio protein structure assembly using low-accuracy contact-map predictions.
S M Mortuza, Wei Zheng, Chengxin Zhang, Yang Li, Robin Pearce, Yang Zhang
Published in: Nature Communications (2021)
Sequence-based contact prediction has shown considerable promise in assisting non-homologous structure modeling, but it often requires many homologous sequences and a sufficient number of correct contacts to achieve correct folds. Here, we developed a method, C-QUARK, that integrates multiple deep-learning and coevolution-based contact-maps to guide the replica-exchange Monte Carlo fragment assembly simulations. The method was tested on 247 non-redundant proteins, where C-QUARK could fold 75% of the cases with TM-scores (template-modeling scores) ≥0.5, which was 2.6 times more than that achieved by QUARK. For the 59 cases that had either low contact accuracy or few homologous sequences, C-QUARK correctly folded 6 times more proteins than other contact-based folding methods. C-QUARK was also tested on 64 free-modeling targets from the 13th CASP (critical assessment of protein structure prediction) experiment and had an average GDT_TS (global distance test) score that was 5% higher than the best CASP predictors. These data demonstrate, in a robust manner, the progress in modeling non-homologous protein structures using low-accuracy and sparse contact-map predictions.
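To make the TM-score ≥0.5 threshold above concrete, here is a minimal sketch of how a TM-score is computed from per-residue distances, using the standard published definition of the score rather than anything taken from the C-QUARK code. It assumes the model has already been superposed onto the native structure; the full TM-score additionally maximizes over superpositions, which this sketch omits. The function names and the 0.5 Å floor on d0 for short chains are illustrative assumptions.

```python
def tm_score_d0(target_length: int) -> float:
    """Length-dependent distance scale d0 (angstroms) of the standard TM-score."""
    # Assumption: clamp d0 to 0.5 for short chains, where the formula
    # would otherwise go negative or become undefined.
    if target_length <= 15:
        return 0.5
    return max(0.5, 1.24 * (target_length - 15) ** (1.0 / 3.0) - 1.8)


def tm_score_fixed_superposition(distances, target_length: int) -> float:
    """TM-score for ONE given superposition (no search over superpositions).

    distances: distance in angstroms between each aligned residue pair
               (model vs. native) after superposition.
    target_length: number of residues in the native (target) structure.
    """
    d0 = tm_score_d0(target_length)
    # Each aligned pair contributes a value in (0, 1]; unaligned residues
    # contribute nothing, so the sum is normalized by the target length.
    total = sum(1.0 / (1.0 + (d / d0) ** 2) for d in distances)
    return total / target_length


if __name__ == "__main__":
    # Hypothetical 100-residue target: 70 residues placed within 1 angstrom,
    # 30 residues badly placed at 10 angstroms.
    example = [1.0] * 70 + [10.0] * 30
    score = tm_score_fixed_superposition(example, 100)
    print(f"TM-score (fixed superposition): {score:.3f}")
    print("counts as correctly folded" if score >= 0.5 else "below the 0.5 threshold")
```

Because the normalization uses the target length and d0 grows with chain length, the score is comparable across proteins of different sizes, which is why a single cutoff of 0.5 can be applied to all 247 test proteins.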
Keyphrases
- dna repair
- dna damage
- monte carlo
- deep learning
- molecular dynamics
- protein protein
- amino acid
- molecular dynamics simulations
- big data
- machine learning
- high resolution
- oxidative stress
- electronic health record
- binding protein
- artificial intelligence
- single molecule
- high density
- data analysis
- simultaneous determination