Improved functions for non-linear sequence comparison using SEEKR.

Shuang Li, Quinn Eberhard, Luke Ni, J Mauro Calabrese
Published in: bioRxiv : the preprint server for biology (2024)
SEquence Evaluation through k-mer Representation (SEEKR) is a method of sequence comparison that utilizes sequence substrings called k-mers to quantify non-linear similarity between nucleic acid species. We describe the development of new functions within SEEKR that enable end-users to estimate p-values that ascribe statistical significance to SEEKR-derived similarities as well as visualize different aspects of k-mer similarity. We apply the new functions to identify chromatin-enriched long noncoding RNAs (lncRNAs) that harbor XIST-like sequence fragments and show that several of these fragments are bound by XIST-associated proteins. We also highlight the best practice of using RNA-Seq data to evaluate support for lncRNA annotations prior to their in-depth study in cell types of interest.
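The idea of k-mer based, non-linear comparison with an attached p-value can be illustrated with a minimal Python sketch. It is not the SEEKR package or its API: the function names `kmer_profile`, `pearson`, and `empirical_p` are hypothetical, the similarity measure is assumed here to be a Pearson correlation between k-mer frequency profiles, and the p-value is assumed to be an empirical one taken from a user-supplied list of background similarities. The actual SEEKR package may apply further normalization that is not shown.

```python
from itertools import product
from math import sqrt


def kmer_profile(seq, k=2, alphabet="ACGT"):
    """Return overlapping k-mer frequencies in a fixed (lexicographic) order."""
    kmers = ["".join(p) for p in product(alphabet, repeat=k)]
    counts = dict.fromkeys(kmers, 0)
    seq = seq.upper().replace("U", "T")  # treat RNA and DNA alike
    for i in range(len(seq) - k + 1):
        word = seq[i:i + k]
        if word in counts:  # skip k-mers containing N or other symbols
            counts[word] += 1
    total = sum(counts.values()) or 1  # avoid division by zero
    return [counts[w] / total for w in kmers]


def pearson(x, y):
    """Pearson correlation of two equal-length vectors (0.0 if either is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    if sx == 0 or sy == 0:
        return 0.0
    return cov / (sx * sy)


def empirical_p(observed, background):
    """Share of background similarities >= observed, with a +1 pseudocount."""
    hits = sum(1 for b in background if b >= observed)
    return (hits + 1) / (len(background) + 1)


# Hypothetical toy sequences: same k-mer content, different linear order.
a = kmer_profile("AAACCC", k=1)
b = kmer_profile("CCCAAA", k=1)
similarity = pearson(a, b)
```

Because the profiles ignore where each k-mer occurs, the two toy sequences score as highly similar even though they would not align end to end, which is the sense in which the comparison is non-linear.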
Keyphrases
  • rna seq
  • single cell
  • nucleic acid
  • amino acid
  • healthcare
  • primary care
  • dna damage
  • sars cov
  • gene expression
  • machine learning
  • mesenchymal stem cells
  • oxidative stress
  • optical coherence tomography
  • deep learning