OrthoList 2: A New Comparative Genomic Analysis of Human and Caenorhabditis elegans Genes.
Woojin Kim, Ryan S Underwood, Iva Greenwald, Daniel D Shaye
Published in: Genetics (2018)
OrthoList, a compendium of Caenorhabditis elegans genes with human orthologs compiled in 2011 by a meta-analysis of four orthology-prediction methods, has been a popular tool for identifying conserved genes for research into biological and disease mechanisms. However, the efficacy of orthology prediction depends on the accuracy of gene-model predictions, which are continually revised, and the orthology-prediction algorithms themselves have also been updated over time. Here we present OrthoList 2 (OL2), a new comparative genomic analysis between C. elegans and humans, and the first assessment of how changes over time affect the landscape of predicted orthologs between two species. Although we find that updates to the orthology-prediction methods significantly changed the landscape of C. elegans-human orthologs predicted by individual programs and, unexpectedly, reduced agreement among them, we also show that our meta-analysis approach "buffered" against changes in gene content. We show that adding results from more programs did not lead to many additions to the list. We also discuss reasons to avoid assigning "scores" based on support by individual orthology-prediction programs, the treatment of "legacy" genes no longer predicted by these programs, and the practical difficulties of updating caused by deprecated, changed, or retired gene identifiers. In addition, we consider what other criteria may support claims of orthology and alternative approaches to find potential orthologs that elude identification by these programs. Finally, we created a new web-based tool that allows for rapid searches of OL2 by gene identifiers, protein domains [InterPro and SMART (Simple Modular Architecture Research Tool)], or human disease associations [OMIM (Online Mendelian Inheritance in Man)], and that also includes available RNA-interference resources to facilitate potential translational cross-species studies.
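The "buffering" effect of a meta-analysis can be illustrated with a minimal sketch. It assumes, purely for illustration, that the compendium keeps any worm-human pair reported by at least one program; the abstract does not state the inclusion rule. The program names, gene identifiers, and the `min_programs` parameter are hypothetical placeholders, not taken from OL2.

```python
from typing import Dict, Set, Tuple

Pair = Tuple[str, str]  # (worm gene ID, human gene ID); IDs below are made up


def merge_predictions(by_program: Dict[str, Set[Pair]]) -> Dict[Pair, Set[str]]:
    """Map each predicted ortholog pair to the set of programs reporting it."""
    support: Dict[Pair, Set[str]] = {}
    for program, pairs in by_program.items():
        for pair in pairs:
            support.setdefault(pair, set()).add(program)
    return support


def compendium(by_program: Dict[str, Set[Pair]], min_programs: int = 1) -> Set[Pair]:
    """Keep pairs supported by at least `min_programs` programs (assumed rule)."""
    merged = merge_predictions(by_program)
    return {pair for pair, programs in merged.items() if len(programs) >= min_programs}


# Toy data: two hypothetical programs, before and after an update.
old_release = {
    "program_1": {("worm_A", "human_1"), ("worm_B", "human_2")},
    "program_2": {("worm_B", "human_2"), ("worm_C", "human_3")},
}
new_release = {
    "program_1": {("worm_A", "human_1")},  # program_1 no longer reports worm_B
    "program_2": {("worm_B", "human_2"), ("worm_C", "human_3")},
}

old_list = compendium(old_release)
new_list = compendium(new_release)
strict_new_list = compendium(new_release, min_programs=2)
```

In this toy case the output of `program_1` changes and the two programs no longer agree on any pair, yet the merged list is identical across releases. Requiring agreement between programs (`min_programs=2`) would instead empty the list, which is one way to see why reduced agreement among programs matters less under an inclusive rule.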