Alignment of major-groove hydrogen bond arrays uncovers shared information between different DNA sequences that bind the same protein.

Jacklin Sedhom, Jason Kinser, Lee A Solomon
Published in: NAR genomics and bioinformatics (2022)
Protein-DNA binding is of great interest because of its importance in many biological processes. Previous studies have identified many factors responsible for recognition and specificity, but the minimal informational requirements for proteins that bind to multiple DNA sites remain an understudied area of bioinformatics. Here we focus on the hydrogen bonds displayed by the target DNA in the major groove that take part in protein binding. We show that analyses focused on base pair identity may overlook key hydrogen bonds. We have developed an algorithm that converts a nucleotide sequence into an array of hydrogen bond donors, hydrogen bond acceptors and methyl groups. It then aligns these non-covalent interaction arrays to identify what information is maintained among multiple DNA sequences. For three different DNA-binding proteins, Lactose repressor, controller protein and λ-CI repressor, we uncovered the minimal pattern of hydrogen bonds that is common to all the binding sequences. Notably, in all three proteins, key interacting hydrogen bonds are maintained despite nucleobase mutations in the corresponding binding sites. We believe this work will be useful for developing new DNA-binding proteins and will shed new light on evolutionary relationships.
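The idea of converting a sequence into an interaction array and comparing arrays can be illustrated with a minimal Python sketch. It is not the authors' algorithm: it assumes the standard textbook encoding of the major-groove edge of each Watson-Crick base pair (four positions read from the given strand's base across to its partner, with A for acceptor, D for donor, M for methyl and H for nonpolar hydrogen), and it replaces the alignment step with a simple column-wise comparison of equal-length sequences that are already in register. The names `MAJOR_GROOVE`, `encode` and `conserved_pattern` are hypothetical.

```python
# Assumed textbook encoding of the major-groove edge of each base pair,
# keyed by the base on the given strand and read across to its partner.
# A = H-bond acceptor, D = H-bond donor, M = methyl, H = nonpolar hydrogen.
MAJOR_GROOVE = {
    "A": "ADAM",  # A:T  -> N7, N6 amino, O4, thymine methyl
    "T": "MADA",  # T:A  -> mirror image of A:T
    "G": "AADH",  # G:C  -> N7, O6, N4 amino, cytosine H5
    "C": "HDAA",  # C:G  -> mirror image of G:C
}


def encode(sequence):
    """Convert a nucleotide sequence into a list of four-letter group patterns."""
    return [MAJOR_GROOVE[base] for base in sequence.upper()]


def conserved_pattern(sequences):
    """Return the groups shared by every sequence at every position.

    Simplification: sequences must have equal length and already be in
    register; no gapped alignment is attempted. Positions where the
    sequences disagree are reported as '-'.
    """
    if len({len(s) for s in sequences}) != 1:
        raise ValueError("sequences must be non-empty and of equal length")
    encoded = [encode(s) for s in sequences]
    pattern = []
    for column in zip(*encoded):  # one base-pair position at a time
        pattern.append("".join(
            groups[0] if len(set(groups)) == 1 else "-"
            for groups in zip(*column)  # one group position at a time
        ))
    return pattern


if __name__ == "__main__":
    # Made-up sequences for illustration, not binding sites from the paper.
    print(conserved_pattern(["GAT", "GCT"]))
```

In this toy case the middle base differs (A versus C), yet the central donor and acceptor of that base pair are shared, which is the kind of information a comparison based only on base identity would miss.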
Keyphrases