Mathematical Constraints on FST: Biallelic Markers in Arbitrarily Many Populations.
Nicolas Alcala, Noah A Rosenberg. Published in: Genetics (2017)
$F_{ST}$ is one of the most widely used statistics in population genetics. Recent mathematical studies have identified constraints that challenge interpretations of $F_{ST}$ as a measure with potential to range from 0 for genetically similar populations to 1 for divergent populations. We generalize results obtained for population pairs to arbitrarily many populations, characterizing the mathematical relationship between $F_{ST}$, the frequency $M$ of the more frequent allele at a polymorphic biallelic marker, and the number of subpopulations $K$. We show that for fixed $K$, $F_{ST}$ has a peculiar constraint as a function of $M$, with a maximum of 1 only if $M = i/K$ for integers $i$ with $\lceil K/2 \rceil \le i \le K-1$. For fixed $M$, as $K$ grows large, the range of $F_{ST}$ becomes the closed or half-open unit interval. For fixed $K$, however, some $M < (K-1)/K$ always exists at which the upper bound on $F_{ST}$ lies below $2\sqrt{2} - 2 \approx 0.8284$. We use coalescent simulations to show that under weak migration, $F_{ST}$ depends strongly on $M$ when $K$ is small, but not when $K$ is large. Finally, examining data on human genetic variation, we use our results to explain the generally smaller $F_{ST}$ values between pairs of continents relative to global $F_{ST}$ values. We discuss implications for the interpretation and use of $F_{ST}$.
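The constraint described above can be explored numerically. The following is a minimal sketch, not the authors' code. It assumes the heterozygosity-based definition $F_{ST} = (H_T - H_S)/H_T$ for a single biallelic locus with $K$ equally weighted subpopulations, and it uses an upper bound derived here under that assumption: for a fixed mean frequency $M$, the within-population heterozygosity is smallest when as many subpopulations as possible are fixed for one allele and at most one is polymorphic. The function names `fst` and `fst_upper_bound` are hypothetical.

```python
import math


def fst(freqs):
    """F_ST = (H_T - H_S) / H_T for one biallelic locus.

    `freqs` holds the frequency of one allele in each subpopulation;
    subpopulations are weighted equally (an assumption of this sketch).
    """
    k = len(freqs)
    p_bar = sum(freqs) / k
    h_t = 2 * p_bar * (1 - p_bar)  # total heterozygosity
    if h_t == 0:
        raise ValueError("locus is monomorphic; F_ST is undefined")
    h_s = sum(2 * p * (1 - p) for p in freqs) / k  # mean within-population heterozygosity
    return (h_t - h_s) / h_t


def fst_upper_bound(m, k):
    """Largest F_ST attainable for mean allele frequency m and k subpopulations.

    With sigma = k * m, the maximizing configuration fixes floor(sigma)
    subpopulations for the allele, gives one subpopulation the fractional
    remainder x, and sets the rest to frequency 0.
    """
    if not (0 < m < 1):
        raise ValueError("m must lie strictly between 0 and 1")
    sigma = k * m
    x = sigma - math.floor(sigma)  # fractional part of k * m
    return 1 - k * x * (1 - x) / (sigma * (k - sigma))
```

For example, `fst_upper_bound(0.5, 2)` reaches 1 because $M = 1/2$ has the form $i/K$, whereas `fst_upper_bound(0.5, 3)` is capped at 2/3, a value attained by the configuration `fst([1.0, 0.5, 0.0])`. Evaluating the bound over a grid of `m` values for several `k` gives a quick view of how the peaks at $M = i/K$ and the dips between them change with the number of subpopulations.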