Improving the Gene Ontology Resource to Facilitate More Informative Analysis and Interpretation of Alzheimer's Disease Data.
Barbara Kramarz, Paola Roncaglia, Birgit H M Meldal, Rachael P Huntley, Maria J Martin, Sandra Orchard, Helen Parkinson, David Brough, Rina Bandopadhyay, Nigel M Hooper, Ruth C Lovering
Published in: Genes (2018)
The analysis and interpretation of high-throughput datasets rely on access to high-quality bioinformatics resources, as well as processing pipelines and analysis tools. Gene Ontology (GO, geneontology.org) is a major resource for gene enrichment analysis. The aim of this project, funded by the Alzheimer's Research United Kingdom (ARUK) foundation and led by the University College London (UCL) biocuration team, was to enhance the GO resource by developing new neurological GO terms and to use GO terms to annotate gene products associated with dementia. Specifically, proteins and protein complexes relevant to processes involving amyloid-beta and tau have been annotated, and the resulting annotations are denoted in GO databases as 'ARUK-UCL'. Biological knowledge presented in the scientific literature was captured by associating GO terms with dementia-relevant protein records; GO itself was revised, and new GO terms were added. This literature biocuration increased the number of Alzheimer's-relevant gene products associated with neurological GO terms, such as 'amyloid-beta clearance' or 'learning or memory', as well as with neuronal structures and their compartments. Of the 2055 annotations that we contributed for the prioritised gene products, 526 associate proteins and complexes with neurological GO terms. To ensure that these descriptive annotations could be provided for Alzheimer's-relevant gene products, over 70 new GO terms were created. Here, we describe how the improvements in ontology development and biocuration resulting from this initiative can benefit the scientific community and enhance the interpretation of dementia data.
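Because the project's annotations are labelled 'ARUK-UCL' in GO databases, a reader could in principle pull them out of a GO annotation file by that label. The following is a minimal sketch, not the authors' pipeline. It assumes the annotations are supplied as tab-separated GAF 2.x lines, in which the fifteenth column holds the assigning group. The function name `filter_by_source`, the helper `make_row`, and all sample gene symbols, accessions, and GO identifiers are hypothetical placeholders for illustration only.

```python
# Zero-based column positions in a GAF 2.x line (assumed input format).
SYMBOL_COL = 2        # column 3: DB Object Symbol
GO_ID_COL = 4         # column 5: GO ID
ASSIGNED_BY_COL = 14  # column 15: Assigned By


def filter_by_source(gaf_lines, source="ARUK-UCL"):
    """Yield (symbol, go_id) pairs for annotations assigned by `source`."""
    for line in gaf_lines:
        # Skip blank lines and header/comment lines, which start with "!".
        if not line.strip() or line.startswith("!"):
            continue
        cols = line.rstrip("\n").split("\t")
        # Ignore malformed rows that are too short to carry the source column.
        if len(cols) <= ASSIGNED_BY_COL:
            continue
        if cols[ASSIGNED_BY_COL] == source:
            yield cols[SYMBOL_COL], cols[GO_ID_COL]


def make_row(symbol, go_id, assigned_by):
    """Build a placeholder 17-column GAF-style row (illustrative data only)."""
    cols = [""] * 17
    cols[0] = "UniProtKB"
    cols[1] = "P00000"  # placeholder accession, not a real record
    cols[SYMBOL_COL] = symbol
    cols[GO_ID_COL] = go_id
    cols[ASSIGNED_BY_COL] = assigned_by
    return "\t".join(cols) + "\n"


# Placeholder rows: symbols and GO IDs are invented and carry no biological meaning.
sample = [
    "!gaf-version: 2.2\n",
    make_row("GENE_A", "GO:0000001", "ARUK-UCL"),
    make_row("GENE_B", "GO:0000002", "OtherGroup"),
    make_row("GENE_A", "GO:0000003", "ARUK-UCL"),
]

hits = list(filter_by_source(sample))
print(hits)
```

With a real annotation file, the same function could be given an open file handle instead of the `sample` list; the resulting pairs could then be counted or compared against a list of neurological GO terms of interest.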
Keyphrases
- copy number
- genome wide
- healthcare
- high throughput
- genome wide identification
- mild cognitive impairment
- systematic review
- quality improvement
- mental health
- dna methylation
- big data
- cross sectional
- machine learning
- artificial intelligence
- small molecule
- working memory
- protein protein
- cerebrospinal fluid
- deep learning
- genome wide analysis