SeqCAT: Sequence Conversion and Analysis Toolbox.
Kevin Kornrumpf, Nadine S Kurz, Klara Drofenik, Lukas Krauß, Carolin Schneider, Raphael Koch, Tim Beißbarth, Jürgen Dönitz
Published in: Nucleic acids research (2024)
Dealing with sequence coordinates in different formats and reference genomes is challenging in genetic research. This complexity arises from the need to convert and harmonize datasets from different sources that use differing nomenclatures. Since manual processing is time-consuming and requires specialized knowledge, the Sequence Conversion and Analysis Toolbox (SeqCAT) was developed for daily work with genetic datasets. Our tool provides a range of functions designed to standardize and convert gene variant coordinates based on various sequence types. Its web interface provides access to all functionalities, while the Application Programming Interface (API) enables automation within pipelines. SeqCAT provides access to human genomic, protein and transcript data, utilizing various data resources and packages and extending them with its own features. The platform offers 14 applications and 3 info points, covering the search for transcript and gene information, transition between reference genomes, variant mapping, and genetic event review. Notable examples are 'Convert Protein to DNA Position', which translates amino acid changes into genomic single nucleotide variants, and 'Fusion Check', which determines frameshifts in gene fusions. SeqCAT converts sequence coordinate data into the required formats and is available at: https://mtb.bioinf.med.uni-goettingen.de/SeqCAT/.
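To make the two highlighted applications more concrete, the following is a minimal Python sketch of the underlying ideas, not SeqCAT's implementation or API. It assumes the standard genetic code and coding-strand codons; the names `snv_candidates` and `fusion_keeps_frame` are hypothetical and chosen only for illustration. The first function lists the single nucleotide substitutions within one codon that yield a given amino acid; the second checks whether a fusion junction preserves the reading frame of the downstream partner. Mapping codon positions to genomic coordinates, which depends on transcript and reference genome, is deliberately left out.

```python
from itertools import product

# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ... ('*' = stop).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def snv_candidates(ref_codon, target_aa):
    """Return single-base changes in ref_codon that encode target_aa.

    Each hit is (position_in_codon, ref_base, alt_base, new_codon),
    with position_in_codon counted from 1. Hypothetical helper.
    """
    ref_codon = ref_codon.upper()
    if ref_codon not in CODON_TABLE:
        raise ValueError(f"not a valid DNA codon: {ref_codon!r}")
    hits = []
    for i, ref_base in enumerate(ref_codon):
        for alt_base in BASES:
            if alt_base == ref_base:
                continue
            new_codon = ref_codon[:i] + alt_base + ref_codon[i + 1:]
            if CODON_TABLE[new_codon] == target_aa:
                hits.append((i + 1, ref_base, alt_base, new_codon))
    return hits


def fusion_keeps_frame(upstream_cds_bases, downstream_cds_offset):
    """Check whether a fusion keeps the downstream partner in frame.

    upstream_cds_bases: coding bases retained from the 5' partner,
        counted from its start codon.
    downstream_cds_offset: coding bases of the 3' partner that lie
        before the breakpoint and are therefore lost.
    The frame is kept when both counts have the same remainder modulo 3.
    """
    return upstream_cds_bases % 3 == downstream_cds_offset % 3


if __name__ == "__main__":
    # Valine codon GTG to glutamate: one substitution in the middle base.
    print(snv_candidates("GTG", "E"))
    # 300 retained bases joined to a partner missing 150 bases: in frame.
    print(fusion_keeps_frame(300, 150))
```

A codon-level search like this can return several candidates or none, since some amino acid changes require more than one nucleotide substitution.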
Keyphrases
- copy number
- amino acid
- genome wide
- electronic health record
- dna methylation
- rna seq
- big data
- healthcare
- endothelial cells
- mycobacterium tuberculosis
- palliative care
- drinking water
- machine learning
- high resolution
- protein protein
- high throughput
- genome wide identification
- binding protein
- single cell
- health information
- molecularly imprinted