Ensembl Genomes 2016: more genomes, more complexity.

Paul Julian Kersey, James E Allen, Irina Armean, Sanjay Boddu, Bruce J Bolt, Denise Carvalho-Silva, Mikkel Christensen, Paul Davis, Lee J Falin, Christoph Grabmueller, Jay Humphrey, Arnaud Kerhornou, Julia Khobova, Naveen K Aranganathan, Nicholas Langridge, Ernesto Lowy, Mark D McDowall, Martin Urban, Michael Nuhn, Chuang Kee Ong, Bert Overduin, Michael Paulini, Helder Pedro, Emily Perry, Giulietta Spudich, Electra Tapanari, Brandon Walts, Gareth Williams, Marcela Tello-Ruiz, Joshua Stein, Sharon Wei, Doreen Ware, Daniel M Bolser, Kevin L Howe, Eugene Kulesha, Daniel Lawson, Gareth Maslen, Daniel M Staines

Published in: Nucleic acids research (2015)

Ensembl Genomes (http://www.ensemblgenomes.org) is an integrating resource for genome-scale data from non-vertebrate species, complementing the resources for vertebrate genomics developed in the context of the Ensembl project (http://www.ensembl.org). Together, the two resources provide a consistent set of programmatic and interactive interfaces to a rich range of data including reference sequence, gene models, transcriptional data, genetic variation and comparative analysis. This paper updates the previous publications about the resource, with a focus on recent developments. These include new analyses and views to represent polyploid genomes (of which bread wheat is the primary exemplar) and the continued up-scaling of the resource, which now includes over 23 000 bacterial genomes, 400 fungal genomes and 100 protist genomes, in addition to 55 genomes from invertebrate metazoa and 39 genomes from plants. This dramatic increase in the number of included genomes is one part of a broader effort to automate the integration of archival data (genome sequence, but also associated RNA sequence data and variant calls) within the context of reference genomes and to make it available through the Ensembl user interfaces.
