Fibertools: fast and accurate DNA-m6A calling using single-molecule long-read sequencing.
Anupama Jha, Stephanie C Bohaczuk, Yizi Mao, Jane Ranchalis, Benjamin J Mallory, Alan T Min, Morgan O Hamm, Elliott G Swanson, Connor Finkbeiner, Tony Li, Dale Whittington, William Stafford Noble, Andrew Ben Stergachis, Mitchell R Vollger
Published in: bioRxiv : the preprint server for biology (2023)
Single-molecule chromatin fiber sequencing is based on the single-nucleotide resolution identification of DNA N6-methyladenine (m6A) along individual sequencing reads. We present fibertools, a semi-supervised convolutional neural network that permits the fast and accurate identification of both endogenous and exogenous m6A-marked bases using single-molecule long-read sequencing. Fibertools enables highly accurate (>90% precision and recall) m6A identification along multi-kilobase DNA molecules with a ∼1,000-fold improvement in speed and the capacity to generalize to new sequencing chemistries.
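
To make the accuracy figures concrete, the following minimal sketch shows how precision and recall could be computed for per-base m6A calls on individual reads. It is not part of fibertools and does not use its API. The function name `precision_recall`, the `(read_id, position)` representation of a call, and the example data are all hypothetical and chosen only for illustration.

```python
def precision_recall(predicted, truth):
    """Compute precision and recall for per-base m6A calls.

    Both arguments are collections of (read_id, position) tuples.
    This representation is an assumption made for this sketch.
    """
    predicted = set(predicted)
    truth = set(truth)
    true_positives = len(predicted & truth)
    # Precision: fraction of called bases that are truly m6A.
    precision = true_positives / len(predicted) if predicted else 0.0
    # Recall: fraction of true m6A bases that were called.
    recall = true_positives / len(truth) if truth else 0.0
    return precision, recall


# Hypothetical toy data: four calls, four true sites, three shared.
called = {("read1", 10), ("read1", 25), ("read1", 40), ("read2", 7)}
labeled = {("read1", 10), ("read1", 25), ("read2", 7), ("read2", 90)}

precision, recall = precision_recall(called, labeled)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

Under this definition, ">90% precision and recall" would mean that more than 90% of called bases are true m6A sites and more than 90% of true m6A sites are called.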