Harmonizing immune cell sequences for computational analysis with large language models.
Areej Alsaafin, Hamid R Tizhoosh
Published in: Biology methods & protocols (2024)
We present SEQuence Weighted Alignment for Sorting and Harmonization (Seqwash), an algorithm designed to process sequencing profiles for use with large language models (LLMs). Seqwash harmonizes immune cell sequences into a unified representation, empowering LLMs to embed meaningful patterns while eliminating irrelevant information. Evaluations using immune cell sequencing data showcase Seqwash's efficacy in standardizing profiles, leading to improved feature quality and enhanced performance in both supervised and unsupervised downstream tasks for sequencing data.
Keyphrases
- machine learning
- single cell
- big data
- electronic health record
- autism spectrum disorder
- deep learning
- artificial intelligence
- magnetic resonance
- neural network
- magnetic resonance imaging
- healthcare
- data analysis
- computed tomography
- quality improvement
- health information
- contrast enhanced
- network analysis
- genetic diversity