Determining Significant Correlation Between Pairs of Extant Characters in a Small Parsimony Framework.
Kaustubh Khandai, Cristian Navarro-Martinez, Brendan Smith, Rebecca Buonopane, Soyong Ashley Byun, Murray D Patterson
Published in: Journal of computational biology : a journal of computational molecular cell biology (2022)
When studying the evolutionary relationships among a set of species, the principle of parsimony states that a relationship involving the fewest evolutionary events is likely the correct one. Due to its simplicity, this principle was formalized in the context of computational evolutionary biology decades ago by, for example, Fitch and Sankoff. Because the parsimony framework does not require a model of evolution, unlike maximum likelihood or Bayesian approaches, it is often a good starting point when no reasonable estimate of such a model is available. In this work, we devise a method for determining whether pairs of discrete characters are significantly correlated across all most parsimonious reconstructions, given a set of species, the states of these characters in each species, and an evolutionary tree. The first step of this method is to use Sankoff's algorithm to compute all most parsimonious assignments of ancestral states (of each character) to the internal nodes of the phylogeny. Correlation between a pair of evolutionary events (e.g., absent to present) for a pair of characters is then determined by the (co-)occurrence patterns between the sets of their respective ancestral assignments. The probability of obtaining a correlation at least this extreme under a null hypothesis where the events happen randomly on the evolutionary tree is then used to assess the significance of this correlation. We implement this method as parcours (PARsimonious CO-occURrenceS) and use it to identify significantly correlated evolution among vocalizations and morphological characters in the Felidae family.
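The first step described above can be illustrated with a minimal Python sketch of Sankoff's algorithm for a single character, followed by a backtrace that enumerates every most parsimonious assignment of states to the internal nodes. This is not the parcours implementation. The function names (`sankoff`, `all_optimal`), the toy tree with nodes `root`, `x`, `A`, `B`, `C`, the 0/1 coding for absent/present, and the unit-cost function are all assumptions made for illustration.

```python
from itertools import product

INF = float("inf")


def sankoff(tree, root, leaf_states, states, cost):
    """Bottom-up pass: table[node][s] is the minimum cost of the subtree
    rooted at node, given that node is assigned state s."""
    table = {}

    def visit(node):
        children = tree.get(node, [])
        if not children:
            # A leaf has cost 0 for its observed state, infinity otherwise.
            table[node] = {s: (0 if s == leaf_states[node] else INF) for s in states}
            return
        for child in children:
            visit(child)
        table[node] = {
            s: sum(min(cost(s, t) + table[c][t] for t in states) for c in children)
            for s in states
        }

    visit(root)
    return table, min(table[root].values())


def all_optimal(tree, root, states, cost, table):
    """Top-down pass: list every minimum-cost assignment of states to
    internal nodes as a dict {internal_node: state}."""

    def expand(node, s):
        children = tree.get(node, [])
        if not children:
            return [{}]  # leaves are observed, nothing to assign
        per_child = []
        for c in children:
            best = min(cost(s, t) + table[c][t] for t in states)
            options = []
            for t in states:
                if cost(s, t) + table[c][t] == best:
                    options.extend(expand(c, t))
            per_child.append(options)
        results = []
        # Combine one optimal sub-assignment per child subtree.
        for combo in product(*per_child):
            merged = {node: s}
            for part in combo:
                merged.update(part)
            results.append(merged)
        return results

    best = min(table[root].values())
    assignments = []
    for s in states:
        if table[root][s] == best:
            assignments.extend(expand(root, s))
    return assignments


# Hypothetical toy phylogeny: ((A, B)x, C)root, character coded 0 = absent, 1 = present.
tree = {"root": ["x", "C"], "x": ["A", "B"]}
leaf_states = {"A": 1, "B": 1, "C": 0}
states = (0, 1)


def unit_cost(a, b):
    # Assumed cost model: every change of state costs 1.
    return 0 if a == b else 1


table, best = sankoff(tree, "root", leaf_states, states, unit_cost)
assignments = all_optimal(tree, "root", states, unit_cost, table)
print(best, assignments)
```

Each returned assignment fixes a state at every internal node, so the events on the branches (for instance absent to present on the branch from `root` to `x`) can be read off by comparing parent and child states. Collecting these events per character over all optimal assignments yields the kind of sets whose co-occurrence the abstract refers to; how those sets are compared and tested for significance is not specified here and is left out of this sketch.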