Challenges and recommendations to improve the installability and archival stability of omics computational tools.
Serghei Mangul, Thiago Mosqueiro, Richard J Abdill, Dat Duong, Keith Mitchell, Varuni Sarwal, Brian L Hill, Jaqueline Brito, Russell Jared Littman, Benjamin Statz, Angela Ka-Mei Lam, Gargi Dayama, Sarah A Brownlee-Bouboulis, Lana S Martin, Jonathan Flint, Eleazar Eskin, Ran Blekhman
Published in: PLoS Biology (2019)
Developing new software tools for analysis of large-scale biological data is a key component of advancing modern biomedical research. Scientific reproduction of published findings requires running computational tools on data generated by such studies, yet little attention is currently paid to the installability and archival stability of computational software tools. Scientific journals require data and code sharing, but none currently require authors to guarantee the continuing functionality of newly published tools. We estimated the archival stability of computational biology software tools by performing an empirical analysis of the internet presence of 36,702 omics software resources published from 2005 to 2017. We found that almost 28% of all resources are currently not accessible through the uniform resource locators (URLs) published in the papers in which they first appeared. Among the 98 software tools selected for our installability test, 51% were deemed "easy to install," and 28% could not be installed at all because of problems in their implementation. Moreover, for papers introducing new software, we found that the number of citations increased significantly when authors provided an easy installation process. We propose several practical solutions, for incorporation into journal policy, to increase the widespread installability and archival stability of published bioinformatics software.