Machine Learning-Guided Systematic Search of DNA Sequences for Sorting Carbon Nanotubes.
Zhiwei Lin, Yoona Yang, Anand Jagota, Ming Zheng. Published in: ACS Nano (2022)
A prerequisite for utilizing DNA in sequence-dependent applications is to search for specific sequences. Developing a strategy for efficient DNA sequence screening represents a grand challenge because of the countless possible sequence combinations. Herein, relying on sequence-dependent recognition between DNA and single-wall carbon nanotubes (SWCNTs), we demonstrate a method for the systematic search of DNA sequences for sorting single-chirality SWCNTs. Unlike previously documented empirical searches, which have low efficiency and accuracy, our approach combines machine learning and experimental investigation. The number of resolving sequences and the success rate of finding them are improved from ∼10² to ∼10³ and from ∼10% to >90%, respectively. Moreover, the resolving sequence patterns determined from short 5-mer and 6-mer sequences can be extended to sequence searches in longer DNA subspaces.
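To make the scale of the search problem concrete, the following minimal Python sketch enumerates every sequence of a given length over the four DNA bases and reports how fast the space grows. It is an illustration only, not the authors' pipeline: the function names `enumerate_kmers` and `space_size` are hypothetical, and the length 12 is an arbitrary example of a "longer" subspace that the abstract does not specify.

```python
from itertools import product

BASES = "ACGT"  # the four DNA nucleotides


def enumerate_kmers(k):
    """Yield every DNA sequence of length k, in lexicographic order."""
    for combo in product(BASES, repeat=k):
        yield "".join(combo)


def space_size(k):
    """Number of distinct DNA sequences of length k (4 to the power k)."""
    return len(BASES) ** k


if __name__ == "__main__":
    # 5 and 6 are the short lengths named in the abstract;
    # 12 is an assumed example of a longer subspace.
    for k in (5, 6, 12):
        print(f"{k}-mer space: {space_size(k)} sequences")
```

The 5-mer and 6-mer spaces (1,024 and 4,096 sequences) are small enough to enumerate in full, whereas each additional base multiplies the space by four. This is one way to see why patterns learned on short sequences are useful for narrowing the search in longer subspaces.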