autoBioSeqpy: A Deep Learning Tool for the Classification of Biological Sequences.
Runyu Jing, Yizhou Li, Li Xue, Fengjuan Liu, Menglong Li, Jiesi Luo
Published in: Journal of Chemical Information and Modeling (2020)
Deep learning has proven to be a powerful method with applications in various fields, including image, language, and biomedical data. Thanks to libraries and toolkits such as TensorFlow, PyTorch, and Keras, researchers can combine different deep learning architectures and data sets for rapid modeling. However, the available implementations of neural networks built with these toolkits are usually designed for a specific research problem and are difficult to transfer to other work. Here, we present autoBioSeqpy, a tool that uses deep learning for biological sequence classification. The advantage of this tool is its simplicity: users only need to prepare the input data set and then use a command line interface. autoBioSeqpy then automatically executes a series of customizable steps, including text reading, parameter initialization, sequence encoding, model loading, training, and evaluation. In addition, the tool provides various ready-to-apply, adaptable model templates to improve the usability of these networks. We demonstrate autoBioSeqpy on three biological sequence problems: the prediction of type III secreted proteins, protein subcellular localization, and CRISPR/Cas9 sgRNA activity. autoBioSeqpy is freely available with examples at https://github.com/jingry/autoBioSeqpy.
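To make the "text reading" and "sequence encoding" steps more concrete, the following is a minimal sketch of how FASTA-style text could be parsed and one-hot encoded before being passed to a network. It is an illustration only and does not reproduce autoBioSeqpy's own code or command line options: the function names `read_fasta` and `one_hot_encode`, the four-letter DNA alphabet, and the zero-row handling of unknown symbols and padding are all assumptions made for this example.

```python
# Hypothetical illustration of "text reading" and "sequence encoding".
# These helpers are NOT part of autoBioSeqpy; names and behavior are assumed.

DNA_ALPHABET = "ACGT"  # assumed alphabet; a protein task would use 20 amino acids


def read_fasta(text):
    """Parse FASTA-formatted text into a list of (header, sequence) pairs."""
    records = []
    header, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if header is not None:
                records.append((header, "".join(chunks)))
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


def one_hot_encode(sequence, alphabet=DNA_ALPHABET, length=None):
    """Return one one-hot row per residue.

    Symbols outside the alphabet (e.g. 'N') become all-zero rows. If `length`
    is given, the sequence is truncated or right-padded with all-zero rows so
    that every encoded sequence has the same shape.
    """
    index = {symbol: i for i, symbol in enumerate(alphabet)}
    if length is None:
        length = len(sequence)
    rows = []
    for symbol in sequence.upper()[:length]:
        row = [0] * len(alphabet)
        if symbol in index:
            row[index[symbol]] = 1
        rows.append(row)
    # Pad short sequences with all-zero rows.
    rows.extend([0] * len(alphabet) for _ in range(length - len(rows)))
    return rows


if __name__ == "__main__":
    example = ">seq1\nACGT\nNA\n>seq2\nGG\n"
    for name, seq in read_fasta(example):
        print(name, one_hot_encode(seq, length=6))
```

Padding every sequence to a fixed length is one common way to give a neural network inputs of uniform shape; other encodings (for example integer indices fed to an embedding layer) would serve the same role.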
Keyphrases
- deep learning
- artificial intelligence
- convolutional neural network
- big data
- CRISPR-Cas
- machine learning
- type III
- neural network
- amino acid
- genome editing
- data analysis
- protein protein
- genetic diversity