An empirical test of the relationship between the bootstrap and likelihood ratio support in maximum likelihood phylogenetic analysis.
Denis Jacob Machado, Fernando Portella de Luna Marques, Larry Jiménez-Ferbans, Taran Grant
Published in: Cladistics: the international journal of the Willi Hennig Society (2021)
In maximum likelihood (ML), the support for a clade can be calculated directly as the likelihood ratio (LR) or log-likelihood difference (S, LLD) of the best trees with and without the clade of interest. However, bootstrap (BS) clade frequencies are more pervasive in ML phylogenetics and are almost universally interpreted as measuring support. In addition to theoretical arguments against that interpretation, BS has several undesirable attributes for a support measure. For example, it does not vary in proportion to optimality or identify clades that are rejected by the evidence, and it can be overestimated due to missing data. Nevertheless, if BS is a reliable predictor of S, then it might be an efficient indirect method of measuring support, an attractive possibility given the speed of many BS implementations. To assess the relationship between S and BS, we analyzed 106 empirical datasets retrieved from TreeBASE. Also, to evaluate the degree to which S and BS are affected by the number of replicates during suboptimal tree searches for S and pseudoreplicates during BS estimation, we randomly selected 5 of the 106 datasets and analyzed them using variable numbers of replicates and pseudoreplicates, respectively. The correlation between S and BS was extremely weak in the datasets we analyzed. Increasing the number of replicates during tree search decreased the estimated values of S for most clades, but the magnitude of change was small. In contrast, although increasing pseudoreplicates affected BS values for only approximately 40% of clades, values both increased and decreased, and they did so at much greater magnitudes. Increasing replicates/pseudoreplicates affected the rank order of clades in each tree for both S and BS. Our findings show decisively that BS is not an efficient indirect method of measuring support and suggest that even quite superficial searches to calculate S provide better estimates of support.