An analytical theory to describe sequence-specific inter-residue distance profiles for polyampholytes and intrinsically disordered proteins.
Jonathan Huihui, Kingshuk Ghosh. Published in: The Journal of Chemical Physics (2020)
Intrinsically disordered proteins (IDPs), unlike folded proteins, lack a unique folded structure and rapidly interconvert among ensembles of disordered states. However, they have specific conformational properties when averaged over these ensembles. It is critical to develop a theoretical formalism to predict these ensemble-averaged conformational properties, which are encoded in the IDP sequence (the specific order in which amino acids/residues are linked). We present a general heteropolymer theory that analytically computes the ensemble-averaged distance profiles (⟨R_ij^2⟩) between any two monomers i and j (amino acids for IDPs) as a function of the sequence. Information-rich distance profiles provide a detailed description of the IDP, in contrast to typical metrics such as scaling exponents, radius of gyration, or end-to-end distance. This generalized formalism supersedes homopolymer-like models and models that are built only on the composition of amino acids but ignore sequence details. The prediction of these distance profiles for highly charged polyampholytes and naturally occurring IDPs unmasks salient features that are hidden in the sequence. Moreover, the model reveals strategies to modulate the entire distance map to achieve local or global swelling/compaction by subtle changes or modifications (such as phosphorylation, a biologically relevant process) in specific hotspots in the sequence. Sequence-specific distance profiles and their modulation have been benchmarked against all-atom simulations. Our new formalism also predicts residue-pair-specific coil-globule transitions. The analytical nature of the theory will facilitate the design of new sequences to achieve specific target distance profiles, with broad applications in synthetic biology and polymer science.
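
To make the contrast with homopolymer-like models concrete, the sketch below builds the kind of distance map such a model would give. It is not the heteropolymer theory of the paper: it assumes the standard scaling form ⟨R_ij^2⟩ = b^2 |i - j|^(2ν), and the function name `homopolymer_distance_map`, the bond length b = 0.38 nm, the exponent ν = 0.5, and the two example sequences are illustrative assumptions, not values from the abstract.

```python
def homopolymer_distance_map(n_residues, nu=0.5, bond_length=0.38):
    """Return <R_ij^2> (nm^2) as a nested list for a homopolymer-like chain.

    Assumed scaling form: <R_ij^2> = b^2 * |i - j|^(2 * nu), with nu > 0.
    The map depends only on the separation |i - j|, never on which
    residues sit at positions i and j.
    """
    return [
        [bond_length**2 * abs(i - j) ** (2.0 * nu) for j in range(n_residues)]
        for i in range(n_residues)
    ]


# Two hypothetical polyampholytes with the same composition (4 E, 4 K)
# but different charge patterning: alternating versus block.
seq_a = "EKEKEKEK"
seq_b = "EEEEKKKK"

map_a = homopolymer_distance_map(len(seq_a))
map_b = homopolymer_distance_map(len(seq_b))

# A homopolymer-like model only sees the chain length, so the two
# sequences get exactly the same distance map.
print(map_a == map_b)  # True
```

Because this baseline takes only the chain length as input, it cannot distinguish the two sequences; a sequence-specific theory of the kind described above is what allows the maps to differ.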