Exploring automatic inconsistency detection for literature-based gene ontology annotation.

Jiyu Chen, Benjamin Goudey, Justin Zobel, Nicholas Geard, Karin M Verspoor
Published in: Bioinformatics (Oxford, England) (2022)
We have created a reliable synthetic dataset that simulates four realistic types of gene ontology annotation (GOA) inconsistency in biological databases. We propose three automatic approaches, which perform reasonably well at distinguishing the four types of inconsistency and are directly applicable to detecting inconsistencies in real-world GOA database records. We also report the major challenges that such inconsistencies pose in several specific application settings. This is the first study to introduce automatic approaches designed to address the challenges in current GOA quality assurance workflows. The data underlying this article are available on GitHub at https://github.com/jiyuc/AutoGOAConsistency.
Keyphrases
  • deep learning
  • machine learning
  • big data
  • systematic review
  • neural network
  • electronic health record
  • copy number
  • dna methylation
  • adverse drug
  • transcription factor
  • single cell