Genetic Differentiation is Constrained to Chromosomal Inversions and Putative Centromeres in Locally Adapted Populations With Higher Gene Flow

Abstract

The impact of genome structure on adaptation is a growing focus in evolutionary biology, revealing an important role for structural variation and recombination landscapes in shaping genetic diversity across genomes and among populations. This is particularly relevant when local adaptation occurs despite gene flow, where clustering of differentiated loci can maintain locally adapted variants by reducing recombination between them. However, the limited genomic resources for nonmodel species, including reference genomes and recombination maps, have constrained our understanding of these patterns. In this study, we leverage the Atlantic silverside—a nonmodel fish with extensive local adaptation across a steep latitudinal gradient—as an ideal system to explore how genome structure influences adaptation under varying levels of gene flow, using a newly available reference genome and multiple recombination maps. Analyzing 168 genomes from four populations, we found a continuum of genome-wide differentiation increasing from south to north, reflecting higher connectivity among southern populations and reduced gene flow at northern latitudes. With increasing gene flow, the number and clustering of F_ST outlier loci also increased, with differentiated loci found exclusively within large haploblocks harboring inversions and smaller peaks overlapping putative centromeric regions. Notably, sequence divergence was only evident in inversions, supporting their role in adaptive divergence with gene flow, whereas centromeric regions appeared differentiated because of low recombination and diversity, with no indication of elevated divergence. Our results support the hypothesis that clustered genomic architectures evolve with high gene flow and enhance our understanding of how inversions and centromeres are linked to different evolutionary processes.

Keywords: local adaptation, genome divergence, structural variation, recombination

Introduction

Understanding the complex interplay between natural selection and gene flow is crucial for discerning how adaptations evolve and are maintained. Populations can adapt to local environmental conditions in response to selection, but this process may be hindered if populations under divergent selection are still connected by gene flow (Slatkin 1987; García-Ramos and Kirkpatrick 1997). Because gene flow can introduce maladaptive alleles and homogenize populations, it could swamp adaptation, depending on the strength of selection and the degree of connectivity (Felsenstein 1976; Lenormand 2002; Gavrilets 2004). Developments in genomic techniques and evolutionary simulations in recent decades have led to a growing appreciation for the role of genomic architecture underlying adaptive traits—including structural variation, chromosomal organization, and recombination landscapes—in local adaptation with gene flow (Tigano and Friesen 2016; Mérot et al. 2020). While the role of structural variation in adaptation is increasingly recognized (Lucek et al. 2019; Weissensteiner et al. 2020; Hämälä et al. 2021), the broader influence of genome structure on the distribution of genetic variation across genomes and among populations, and its interaction with various evolutionary processes, is still poorly understood.

The recombination landscape, i.e. the variation in recombination rates across the genome, emerges as a key player in the dynamic interplay between selection and gene flow (Ortiz-Barrientos and James 2017; Samuk et al. 2017; Martin et al. 2019). Regions of relatively low recombination may offer protection from the deleterious effects of maladaptive gene flow and allow populations to maintain clusters of locally adapted alleles (Kirkpatrick and Barton 2006; Hoffmann and Rieseberg 2008; Yeaman 2013). For instance, recombination modifiers such as chromosomal inversions—mutations that change the orientation of a DNA segment within a chromosome—are increasingly implicated in studies of adaptation and divergence (reviewed in Wellenreuther and Bernatchez 2018). When individuals inherit both arrangements of a chromosomal inversion, they often produce nonviable gametes if recombination occurs between these inverted regions of the genome (Sturtevant and Beadle 1936; Navarro et al. 1997). Consequently, recombination in these regions is heavily reduced between alternate arrangements at the population level, often resulting in strongly linked alleles within an inversion that can act similarly to a single large-effect locus. This in turn allows populations, particularly in the face of maladaptive gene flow, to maintain sets of locally adapted alleles at high frequency (Rieseberg 2001).

Areas of suppressed recombination that do not coincide with chromosomal inversions may also play a role in adaptation with gene flow, but have received far less attention. Centromeres, which are essential for chromosome segregation during cell division, often exhibit reduced recombination rates due to high heterochromatin density (Dapper and Payseur 2017; Stapley et al. 2017), impacting the distribution of genetic diversity along chromosomes. For instance, sequence divergence between cryptic species of alpine bumblebees is elevated in regions of low recombination and near centromeres via genetic hitchhiking, with accentuated divergence in these regions in the presence of gene flow (Christmas et al. 2021). However, few studies have examined the role of centromeres in adaptive divergence, and in most cases, the positions of centromeres are unknown, in part due to the difficulty of sequencing and assembling regions of highly repetitive DNA.

Importantly, whereas chromosomal inversions only suppress recombination in a heterozygous state (Sturtevant and Beadle 1936), recombination in putative centromeric regions is consistently low (Fig. 1a). As a result, inversions may be more likely to facilitate adaptive divergence in the face of gene flow, compared with centromeric loci where consistently low recombination decreases the likelihood of incorporating adaptive variants and purging deleterious ones. Despite fundamental differences in how centromeres (consistently low-recombining) and inversions (conditionally low-recombining) modify the recombination landscape, studies distinguishing their impacts on the genomic patterns of adaptive divergence are lacking.

Fig. 1.

a) Recombination rates within and between GA (red) and NY (blue) populations and their interpopulation hybrids (gold) across representative chromosomes from Akopyan et al. (2022). Horizontal bars below plots indicate inversions (gray) and putative centromeres (black). Recombination rate drops to near zero in all three maps in the putative centromere regions, while the recombination rate in inversion regions stays at intermediate levels within the pure-population maps (GA and NY) and only drops to near-zero in the map for the interpopulation hybrid. b) Sampling localities of silverside genomes obtained from Wilder et al. (2020). Collection sites include Jekyll Island, Georgia (GA), Patchogue, New York (NY), Minas Basin, Nova Scotia (NS), and Magdalen Island, Quebec (QU). c) Principal components 1 and 2 of genome-wide variation among individuals, colored by sampling locations on the map in b).

Variation in recombination rates may also be associated with various features of the genome, for instance, at broad scales with respect to genome and chromosome size (Haenel et al. 2018; Smith and Nambiar 2020), and at finer scales with respect to nucleotide, repeat, and genic content (Xu and Du 2014). Within chromosomes, recombination events are likely influenced by the tendency for double-strand breaks to occur near the telomeres where chromatin is open and accessible. In contrast, as mentioned above, recombination tends to be inhibited within and around the centromeres (Dapper and Payseur 2017; Stapley et al. 2017). Additionally, a strong correlation between GC content and recombination rate has been shown in fish (Roesti et al. 2013), birds (Smeds et al. 2016), and mammals (Galtier et al. 2001; Rousselle et al. 2018), a pattern resulting from GC-biased gene conversion, by which fixation of alleles containing G and C bases is accelerated relative to A and T bases due to their stronger molecular bonds (Mugal et al. 2015). Furthermore, recombination occurs more frequently in gene-dense regions in mice (Paigen et al. 2008) and plants (Gaut et al. 2007; Tiley et al. 2015), and in the vicinity of, but not within, coding and regulatory regions in insects (Wallberg et al. 2015; Jones et al. 2019; Torres et al. 2023), and humans (McVean et al. 2004). Despite recombination landscapes having been characterized in multiple taxa, detailed recombination maps are still primarily available for model organisms and domesticated species, making it difficult to assess the extent to which regional variation in recombination rate is associated with genomic features, the efficiency of selection, and the levels of genetic diversity in natural populations across species.

Commonly used population genetics statistics, including differentiation (F_ST) and nucleotide diversity (π), are heavily influenced by the variation in recombination rate across the genome (Noor and Bennett 2009; Cruickshank and Hahn 2014; Lotterhos 2019; Booker et al. 2020). These relationships make it hard to distinguish whether blocks of elevated differentiation identified between populations are a result of divergence with gene flow, and in fact underpin local adaptation, or result from other selective pressures interacting with the landscape of recombination. Because regions with low recombination rates experience stronger effects of linked selection, resulting in selection impacting a larger contiguous region of the genome, both purifying selection (i.e. background selection) and positive selection in isolation can emulate the peaks of differentiation caused by divergence with gene flow. However, sequence divergence (d_XY), an absolute measure of differentiation that is not dependent on within-population variation, is expected to differ in these alternate scenarios. Under the divergence with gene flow model, the regions that contribute to divergence between populations are expected to show elevated d_XY compared with background levels of absolute differentiation, whereas in isolation, selection should reduce or have no effect on d_XY compared with background levels (Cruickshank and Hahn 2014). While some studies, primarily focused on birds, have examined the signatures resulting from these different models of selection (Burri et al. 2015; Irwin et al. 2016; Delmore et al. 2018; Hejase et al. 2020; DeRaad et al. 2022), empirical studies explicitly investigating differences among recombination-modifying regions of the genome such as inversions and centromeres are still lacking, and most of what we know about the role of recombination in the evolution of clustered architectures of adaptation and divergence is based on simulations (Yeaman 2013; Flaxman et al. 2014; Gutiérrez-Valencia et al. 2021; Schaal et al. 2022). The need for high-quality genomic resources, including chromosome-level reference genomes and recombination maps, combined with the need for a so-called natural laboratory, i.e. an example of adaptive divergence in multiple traits that have evolved in populations with and without gene flow, has likely contributed to the delayed development of this research area.
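The contrast between relative differentiation (F_ST) and absolute divergence (d_XY) described above can be illustrated with a minimal Python sketch. It assumes biallelic sites with known population allele frequencies, no sample-size correction, and a Hudson-style ratio F_ST = (d_XY − mean π) / d_XY; the estimators actually applied to low-coverage data in this study may differ, and all window values below are hypothetical.

```python
def pi_within(p):
    """Expected heterozygosity 2p(1-p) at one biallelic site (uncorrected)."""
    return 2.0 * p * (1.0 - p)


def dxy_site(p1, p2):
    """Probability that alleles drawn from two populations differ at one site."""
    return p1 * (1.0 - p2) + p2 * (1.0 - p1)


def window_stats(freqs1, freqs2, n_sites):
    """Return (pi1, pi2, d_XY, F_ST) for one window.

    freqs1, freqs2: allele frequencies at the variable sites of the window.
    n_sites: all callable sites in the window (variant and invariant).
    """
    pi1 = sum(pi_within(p) for p in freqs1) / n_sites
    pi2 = sum(pi_within(p) for p in freqs2) / n_sites
    d = sum(dxy_site(a, b) for a, b in zip(freqs1, freqs2)) / n_sites
    mean_pi = (pi1 + pi2) / 2.0
    fst = (d - mean_pi) / d if d > 0 else 0.0
    return pi1, pi2, d, fst


# Hypothetical 1,000-site windows (assumed values, not data from the study).
# Background: 10 shared polymorphisms, no frequency difference.
background = window_stats([0.5] * 10, [0.5] * 10, 1000)
# Low-diversity window: 2 fixed differences and no within-population variation.
low_div = window_stats([1.0, 1.0], [0.0, 0.0], 1000)
# Divergent window: 20 fixed differences plus 10 shared polymorphisms.
divergent = window_stats([1.0] * 20 + [0.5] * 10, [0.0] * 20 + [0.5] * 10, 1000)

for name, stats in [("background", background), ("low_div", low_div),
                    ("divergent", divergent)]:
    print(name, "d_XY=%.4f" % stats[2], "F_ST=%.2f" % stats[3])
```

In this toy example both non-background windows have high F_ST, but only the divergent window has d_XY above the background level. The low-diversity window reaches high F_ST with d_XY below background, which is the pattern expected when reduced diversity rather than reduced gene flow drives differentiation.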

The Atlantic silverside, Menidia menidia, a small fish distributed across the steep latitudinal climate gradient of the North American Atlantic coast, is an excellent system to examine the role of recombination modifiers in adaptive divergence with gene flow. Extensive research on this species has demonstrated a remarkable degree of local adaptation across its clinal range in multiple complex traits (reviewed in Conover et al. 2005, 2009; Hice et al. 2012), likely underpinned by many contributing genes. Atlantic silversides exhibit countergradient variation in growth rate (Conover and Present 1990), with populations at northern latitudes growing faster to compensate for shorter growing seasons, and southern populations growing slower due to tradeoffs with predator avoidance (Billerbeck et al. 2001; Munch and Conover 2003; Arnott et al. 2006). Silverside populations also exhibit clinal genetically based variation in other complex traits, including vertebral number, swimming performance, temperature-dependent sex determination, lipid storage, spawning temperature and duration, and offspring size at hatch (reviewed in Conover et al. 2005). Interestingly, adaptive differences are maintained within populations despite evidence of high dispersal abilities (Clarke et al. 2009). Additionally, genetic data support the presence of three regional population groups, with higher levels of connectivity among southern populations and reduced connectivity at northern latitudes (Lou et al. 2018), providing an opportunity to study patterns of adaptive divergence across varying levels of gene flow.

Recent exome-based population genomics work in Atlantic silversides suggested that geographic differentiation was primarily concentrated in large blocks on multiple chromosomes, but also revealed more scattered signatures of differentiation across the genome (Wilder et al. 2020). However, the lack of a species reference genome limited inference of exact genomic positions of outlier regions and of the association between elevated differentiation and the underlying features of the genome. Leveraging a high-quality chromosome-level reference genome for the Atlantic silverside (Tigano et al. 2021) and linkage maps (Akopyan et al. 2022), we revisit a large low-coverage whole-genome resequencing dataset (Wilder et al. 2020) to characterize patterns of population genetic diversity and differentiation in light of the underlying recombination landscape and assess potential associations between the recombination landscape and nucleotide composition and genomic features. We also calculate an absolute measure of sequence divergence between populations to distinguish the genomic regions with evidence of reduced gene flow between populations, as such regions likely contain loci that contribute to adaptive divergence with gene flow (Cruickshank and Hahn 2014). We test the hypothesis that higher gene flow will favor concentrated architectures of differentiation, predicting that the degree of clustering of differentiated alleles in low recombining regions increases with increasing levels of gene flow between populations. Further, we examine whether patterns of diversity, differentiation, and divergence differ between conditionally low-recombining inversions and consistently low-recombining putative centromeric regions. Our study is one of the first to empirically test these fundamental predictions about the relative roles of inversions and putative centromeric regions in adaptation with gene flow.

Results

Levels of Background Differentiation (F_ST) Decrease From North to South

We analyzed whole-genome sequence data from 42 fish from each of 4 populations (168 individuals total) along the North American Atlantic coast to assess population structure and differentiation. A total of 413,215,831 sites in the reference genome (including both variant and invariant sites) passed our quality filters, representing 89% of the genome that is assembled into chromosomes, and we identified 20,421,651 single nucleotide polymorphisms (SNPs) across all 168 individuals with an average depth per individual of 1.4×. The four populations formed distinct clusters along principal components of genome-wide variation, mirroring the geographic distribution of populations, with the exception of one individual from the north clustering with southern populations (Fig. 1b and c). The northernmost population (QU) showed the most separation in the principal component analysis (PCA) relative to the rest of the populations, reflecting the history of independent colonization of this region from the southern coastline and low ongoing gene flow (Lou et al. 2018). Pairwise comparisons of neighboring populations revealed the lowest levels of average genome-wide differentiation between GA and NY populations (mean F_ST = 0.023, median F_ST = 0.01), increasing approximately 2-fold between NY and NS (mean F_ST = 0.040, median F_ST = 0.023), and approximately 3-fold, relative to the GA and NY comparison, between NS and QU (mean F_ST = 0.068, median F_ST = 0.031). Pairwise allele frequency difference (AFD), a metric similar to F_ST that is more linear and sensitive to weak population differentiation (Berner 2019), also suggests overall high gene flow in the south and a continuum of connectivity among populations: median AFD increases from 0.043 to 0.066 to 0.084 when comparing neighboring populations from the south to the north.
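To make the difference between AFD and F_ST concrete, the following Python sketch compares the two at a single biallelic site. It assumes that, for biallelic SNPs, AFD reduces to the absolute difference in allele frequency between two populations, and it uses an uncorrected Hudson-style F_ST for comparison; the function names and frequencies are illustrative only.

```python
from statistics import median


def afd(p1, p2):
    """Allele frequency difference at a biallelic site (assumed form: |p1 - p2|)."""
    return abs(p1 - p2)


def fst_site(p1, p2):
    """Uncorrected Hudson-style F_ST at one biallelic site, for comparison."""
    d = p1 * (1.0 - p2) + p2 * (1.0 - p1)
    mean_pi = (2.0 * p1 * (1.0 - p1) + 2.0 * p2 * (1.0 - p2)) / 2.0
    return (d - mean_pi) / d if d > 0 else 0.0


def median_afd(freqs1, freqs2):
    """Median AFD across sites for one pairwise population comparison."""
    return median(afd(a, b) for a, b in zip(freqs1, freqs2))


# Doubling a small frequency difference doubles AFD but roughly quadruples F_ST.
weak = (0.55, 0.45)
stronger = (0.60, 0.40)
afd_ratio = afd(*stronger) / afd(*weak)
fst_ratio = fst_site(*stronger) / fst_site(*weak)
print("AFD ratio: %.2f, F_ST ratio: %.2f" % (afd_ratio, fst_ratio))
```

Because F_ST grows approximately with the square of small frequency differences, it compresses weak differentiation toward zero, whereas AFD scales linearly. This is one way to read the statement that AFD is more sensitive when populations are only weakly differentiated.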

Populations With Higher Gene Flow Exhibit Stronger Clustering of F_ST Outliers

We examined F_ST outlier regions across chromosomes to investigate differentiation patterns between neighboring populations. Between the two southernmost populations, clusters of F_ST peaks are prominent in contrast to low levels of genome-wide differentiation (Fig. 2c). These results corroborate previous exome-based work, which revealed that differentiation was primarily concentrated in large blocks on multiple chromosomes (supplementary fig. S1, Supplementary Material online, Wilder et al. 2020). We found this pattern to be even more pronounced with low-coverage whole-genome data mapped to the species-specific reference genome. Many of the peaks that appeared to be scattered across the reference transcriptome anchored to the medaka genome now fit into larger blocks when mapped to the silverside genome (Fig. 2, Wilder et al. 2020).

Fig. 2.

Genome-wide distribution of F_ST, averaged in 50 kb windows, across the 24 chromosomes of the Atlantic silverside genome, for pairwise comparisons of neighboring populations as indicated in Fig. 1b. Comparisons are shown between a) Nova Scotia (NS) and Quebec (QU), where gene flow is limited, b) New York (NY) and NS, and c) Georgia (GA) and NY, where gene flow is high and F_ST peaks cluster against low levels of genome-wide differentiation.

In the southern population comparisons where we see lower background differentiation, likely because of higher gene flow, we identified more outlier regions (50 kb windows) with elevated differentiation on fewer chromosomes compared with the northernmost populations. Comparing populations from QU and NS (northern), we identified 150 outlier F_ST regions distributed across 21 chromosomes. Between NS and NY (mid-latitude), we identified 215 outlier F_ST regions distributed across 20 chromosomes, and between NY and GA (southern), we identified 360 outlier F_ST regions distributed across three chromosomes: 8, 18, and 24. In the NY and GA comparison, we detected outlier windows only on three chromosomes because these regions of exceptionally high F_ST elevate the genome-wide average, preventing other regions with relatively elevated differentiation (discussed below) from meeting the threshold [mean + 3 standard deviations (SD)]. To assess whether outliers were nonrandomly clustered in the genome, we performed a permutation test by redistributing outliers across chromosomes (weighted by size) 10,000 times and compared the observed variance to a null distribution. In all three comparisons, F_ST outlier windows were unevenly distributed across chromosomes, with the observed concentrations deviating significantly from expected random distributions (P < 0.0001), indicating a strong clustering of outliers on certain chromosomes, most notably in the NY versus GA comparison. The observed variances were ∼4, 33, and 203 times larger than simulated means for the comparisons between QU and NS, NS and NY, and NY and GA, respectively, highlighting that as gene flow increases, F_ST outliers become intensely clustered on specific chromosomes, with the most extreme clustering seen in the NY versus GA comparison where gene flow is highest (Fig. 2c).
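The outlier threshold and permutation test described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions: the threshold uses the population SD, the test statistic is taken to be the variance of outlier counts per chromosome, and the chromosome sizes and the even split of 360 outliers over three chromosomes are hypothetical placeholders rather than values from the study.

```python
import random
from statistics import mean, pstdev, pvariance


def outlier_windows(fst_values, n_sd=3.0):
    """Indices of windows whose F_ST exceeds mean + n_sd * SD."""
    values = list(fst_values)
    threshold = mean(values) + n_sd * pstdev(values)
    return [i for i, v in enumerate(values) if v > threshold]


def clustering_test(observed_counts, chrom_sizes, n_perm=10000, seed=1):
    """Permutation test for clustering of outliers on chromosomes.

    Outliers are reassigned to chromosomes at random, weighted by chromosome
    size, and the variance of per-chromosome counts is compared with the
    observed variance. Returns (observed variance, mean null variance, P).
    """
    rng = random.Random(seed)
    chroms = list(range(len(chrom_sizes)))
    n_outliers = sum(observed_counts)
    obs_var = float(pvariance(observed_counts))
    null_vars = []
    for _ in range(n_perm):
        counts = [0] * len(chrom_sizes)
        for c in rng.choices(chroms, weights=chrom_sizes, k=n_outliers):
            counts[c] += 1
        null_vars.append(float(pvariance(counts)))
    # One-sided P value with the usual +1 correction for permutation tests.
    p_value = (1 + sum(v >= obs_var for v in null_vars)) / (n_perm + 1)
    return obs_var, mean(null_vars), p_value


# Hypothetical input: 24 equally sized chromosomes, 360 outliers on three of them.
sizes = [18_000_000] * 24
counts = [0] * 24
for chrom_index in (7, 17, 23):  # chromosomes 8, 18, and 24 (0-based indices)
    counts[chrom_index] = 120
obs_var, null_mean, p_value = clustering_test(counts, sizes, n_perm=1000)
print("observed/null variance ratio: %.1f, P = %.4f" % (obs_var / null_mean, p_value))
```

With permutations, the smallest attainable P value is 1 / (n_perm + 1), so a reported P < 0.0001 requires at least 10,000 permutations, as used in the analysis above. Note also that a single extreme window can only exceed a mean + 3 SD threshold when the number of windows is sufficiently large, which is easily met by genome-wide 50 kb windows.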

Massive Chromosomal Inversions Harbor Haploblocks of Differentiation

We identified the same four major haploblocks of highly elevated differentiation described previously (Wilder et al. 2020, supplementary fig. S1, Supplementary Material online): a region on chromosome 11 in the pairwise comparison of NY and NS, and regions on chromosomes 8, 18, and 24 in the comparison of NY and GA (Fig. 2). The largest haploblock with elevated differentiation was observed on chromosome 8 and covered 12.5 Mb, which represents 72% of the assembled chromosome length. The haploblocks on chromosomes 18 and 24 spanned 9.6 and 9.4 Mb, representing 73% and 54% of the respective assembled chromosomes. The haploblock between NY and NS on chromosome 11 was 8.6 Mb long and spanned 47% of the chromosome. Between NY and GA, a narrow F_ST peak was observed on chromosome 11 coinciding with one of the endpoints of the haploblock present in the north (Fig. 2). The number of nearly fixed variants (F_ST > 0.95) revealed a striking contrast in magnitude across population comparisons, with a single variant identified between NS and QU and between NS and NY, located on chromosomes 5 and 11, respectively. In contrast, GA versus NY shows a total of 55,806 nearly fixed variants, with dramatic peaks on chromosomes 18 (25,454 variants) and 24 (30,009 variants), as well as a notable cluster on chromosome 8 (336 variants), underscoring significant differentiation in these regions.

The four large haploblocks on chromosomes 8, 11, 18, and 24 coincide with known massive chromosomal inversions where linkage mapping has shown that recombination is suppressed between alternate arrangements (Akopyan et al. 2022). For chromosome 11, Akopyan et al. (2022) did not detect the inversion when comparing Georgia and New York population-specific maps, but it was apparent in the hybrid map because the New York parent in that cross carried alternate arrangements. For chromosome 18, they reported a complex, nested inversion structure, yet recombination was still broadly suppressed across the region. Peaks of differentiation on chromosomes 4, 7, and 19 between southern populations also overlap known segregating inversions (Fig. 3). The genomic positions of these inversions, as reported in Akopyan et al. (2022), are provided in supplementary table S1, Supplementary Material online. Consistent with the role of suppressed recombination in maintaining differentiation, average F_ST per 50 kb window significantly decreased with increasing recombination rate (supplementary fig. S2, Supplementary Material online), especially recombination rate between lab-reared interpopulation crosses (τ = −0.21, P < 0.0001). Low recombination between hybrids was a hallmark of the highly differentiated peaks and haploblocks between GA and NY (Fig. 3), further demonstrating the link between suppressed recombination and elevated differentiation. Additionally, PCA of each haploblock revealed a clustering pattern typical of inversions, with three main clusters corresponding to the two inversion homozygotes and the heterozygotes, consistent with the suppression of recombination within these regions (supplementary figs. S3 and S4, Supplementary Material online).
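As an illustration of the rank correlation reported above, the sketch below computes Kendall's τ between per-window recombination rate and F_ST. It assumes that τ refers to Kendall's rank correlation and implements the tie-corrected τ-b variant with a simple pairwise loop; the five windows are hypothetical values chosen only to show a negative association.

```python
from math import sqrt


def kendall_tau_b(x, y):
    """Kendall's tau-b rank correlation via an O(n^2) pairwise comparison."""
    concordant = discordant = tied_x_only = tied_y_only = 0
    n = len(x)
    for i in range(n):
        for j in range(i + 1, n):
            dx = x[i] - x[j]
            dy = y[i] - y[j]
            if dx == 0 and dy == 0:
                continue  # pairs tied in both variables carry no information
            if dx == 0:
                tied_x_only += 1
            elif dy == 0:
                tied_y_only += 1
            elif (dx > 0) == (dy > 0):
                concordant += 1
            else:
                discordant += 1
    denom = sqrt((concordant + discordant + tied_x_only)
                 * (concordant + discordant + tied_y_only))
    return (concordant - discordant) / denom if denom > 0 else 0.0


# Hypothetical windows: recombination rate (cM/Mb) and F_ST.
recombination = [0.0, 0.5, 1.0, 2.0, 3.0]
fst = [0.90, 0.30, 0.60, 0.10, 0.05]
tau = kendall_tau_b(recombination, fst)
print("tau = %.2f" % tau)
```

A rank-based measure is a natural choice here because both recombination rate and F_ST are strongly skewed across windows, so a linear correlation would be dominated by a small number of extreme windows.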

Fig. 3.

Inversions and putative centromeres coincide with patterns of genomic differentiation and recombination suppression between populations. Black points show F_ST in 50 kb windows between the GA and NY populations (left y-axis), and gray points show the recombination rate in their interpopulation hybrids (right y-axis) for each chromosome. Putative centromere and telomere positions are indicated by blue and red horizontal bars, respectively, and inversions (from Akopyan et al. 2022) are indicated by gold bars.

Localized F_ST Peaks Align With Putative Centromeric Regions

The pairwise comparison of the two southernmost populations NY and GA showed that—in addition to the large inversion-associated haploblocks on specific chromosomes—most chromosomes have one, sometimes two, prominent narrow peaks in differentiation (Fig. 2). In contrast to the abrupt jump in differentiation observed at the ends of the inversion-associated haploblocks, these peaks had a more mountain-like shape, characterized by a rapid but continuous increase of differentiation toward their centers. They were evident on all chromosomes except those with large haploblocks (chromosomes 8, 18, and 24). In contrast, chromosomes 4, 7, and 19 each had two peaks—one overlapping an inversion and the other a putative centromere (Fig. 3, supplementary table S1, Supplementary Material online). Pairwise comparisons involving the northern populations also revealed similar peaks of differentiation, but the pattern was less prominent against higher levels of background differentiation and less consistent across chromosomes. The peaks that do appear, however, were in the same chromosomal positions across population comparisons (Fig. 2). For instance, the peaks on chromosomes 17 and 23 appear in all comparisons, the peak on chromosome 21 appears in the pairwise comparisons including GA, NY, and NS, and one of the peaks on chromosome 7 appears only in comparisons involving the southernmost or northernmost neighboring populations.

Strikingly, the majority of the narrow peaks coincided with putative centromeric regions (Fig. 3). We identified putative centromeric regions for 18 out of the 24 chromosomes by analyzing three recombination maps (Akopyan et al. 2022) to detect heterochromatin boundaries typical of centromeres. This approach examined patterns of marker density and distribution, along with local changes in recombination rates (Mansour et al. 2021). It is important to note that our approach identifies regions of suppressed recombination characteristic of centromeres, and while these putative centromeric regions likely contain the true centromeres, the precise boundaries may extend beyond the actual centromeres due to limitations in resolving collapsed repetitive sequences in the genome assembly. In all 18 putative centromeric regions, a discernible peak in differentiation was seen between the southern populations where genome-wide differentiation was lowest (Fig. 3). With increasing levels of background differentiation between NY and NS and even more so between NS and QU, only 12 and 7 of the 18 centromeric regions had an F_ST peak that was detectable by eye, respectively (Fig. 2). In all pairwise comparisons of neighboring populations, however, F_ST was significantly higher at putative centromeres compared with the rest of the genome (t = 20.54, P < 0.0001; Fig. 4a). Mean F_ST in centromeric regions was 3.9 times higher between GA and NY, 1.7 times higher between NY and NS, and 1.3 times higher between NS and QU compared with the median background differentiation in each respective pairwise comparison (Wilcoxon’s test, P < 0.001). 
Telomeres, on the other hand, had significantly lower F_ST compared with the rest of the genome between NS and QU (t = 9.56, P < 0.0001) and between NY and NS (t = 7.85, P < 0.0001), while between GA and NY, telomeres showed significantly higher F_ST (t = −2.68, P = 0.007); however, these differences were relatively small and not clearly distinguishable when plotted (Fig. 4a). For both putative centromere and telomere regions, F_ST varied not only among population comparisons, but also among chromosomes, with notably more variation across chromosomes in centromeric regions compared with telomeric regions (supplementary fig. S3, Supplementary Material online). In addition, PCAs of centromeric regions showed similar clustering to inversions, but with notable differences in pattern. While inversion regions exhibited three tight clusters corresponding to alternate homozygotes and heterozygotes due to suppressed recombination, centromeric regions showed greater spread of individuals, reflecting higher variation and genome-wide patterns of population structure (supplementary fig. S3, Supplementary Material online). This suggests that while both centromeric regions and inversions experience recombination suppression, centromeric regions retain greater haplotype diversity, whereas inversions exhibit just two distinct major haplotypes associated with alternate orientations.
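The two kinds of comparison used in this section (a fold difference of regional mean F_ST over the median background, and a t statistic contrasting a region with the rest of the genome) can be sketched as below. The exact form of the t test is not stated in the text, so Welch's unequal-variance statistic is used here as an assumption, and the window values are hypothetical.

```python
from math import sqrt
from statistics import mean, median, variance


def fold_over_background(region_fst, background_fst):
    """Mean F_ST of a region divided by the median background F_ST."""
    return mean(region_fst) / median(background_fst)


def welch_t(a, b):
    """Welch's t statistic for two samples with unequal variances."""
    standard_error = sqrt(variance(a) / len(a) + variance(b) / len(b))
    return (mean(a) - mean(b)) / standard_error


# Hypothetical 50 kb window values (assumed, not data from the study).
centromere_fst = [0.08, 0.10, 0.12]
background_fst = [0.01, 0.02, 0.03, 0.02, 0.02]

fold = fold_over_background(centromere_fst, background_fst)
t_stat = welch_t(centromere_fst, background_fst)
print("fold = %.1f, t = %.2f" % (fold, t_stat))
```

Using the median for the background makes the fold difference robust to the few windows with extreme F_ST inside inversions, which would otherwise inflate the genome-wide mean. The sign of the t statistic depends only on which group is listed first.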

Fig. 4.

Comparison of differentiation and divergence (F_ST and d_XY) across genomic regions between neighboring populations. Violin plots display average values in 50 kb windows for centromeres (blue), inversions (yellow), telomeres (red), and the rest of the genome (dark gray) for a) F_ST and b) d_XY. c) Average d_XY in each of the four major inversion haploblocks. d) Relationships between F_ST and d_XY across 50 kb windows are shown by genomic region. The four major inversions are denoted with varying point shapes, as indicated in the legend.

Sequence Divergence (d_XY) is Elevated in Large Chromosomal Inversions

Patterns of genome-wide d_XY showed dramatic oscillations within chromosomes, dipping low in putative centromeres and peaking at putative telomeres for the majority of chromosomes, i.e. the opposite of the F_ST pattern described above (supplementary fig. S5, Supplementary Material online). In all pairwise comparisons of neighboring populations, d_XY was significantly higher at telomeres (t = 31.37, P < 0.0001) and significantly lower at putative centromeres (t = −52.38, P < 0.0001) compared with the rest of the genome (Fig. 4b). For most regions of the genome, including centromere and telomere regions, d_XY was not significantly different among any of the population comparisons (F = 0.886, P = 0.412). Patterns of sequence divergence varied among population comparisons only in haploblocks corresponding to massive chromosomal inversions (F = 152.2, P < 0.0001).

In the pairwise comparison of the southernmost populations, the extent of sequence divergence within the haploblocks on chromosomes 18 and 24 (mean d_XY = 0.031 and 0.026, respectively) was nearly double the genome-wide average (mean d_XY = 0.016; F = 850.7, P < 0.0001), with sequence divergence in the chromosome 18 haploblock significantly higher than in the chromosome 24 haploblock (Tukey’s honestly significant difference, HSD, P < 0.0001). In contrast, in the comparisons between northern populations, sequence divergence was lower in the haploblocks on chromosomes 18 and 24 (mean d_XY = 0.0078 and 0.0073, respectively) compared with the rest of the genome (mean d_XY = 0.016; F = 339.1, P < 0.0001; Fig. 4), and the chromosome 18 and 24 haploblocks did not differ significantly from each other (Tukey’s HSD, P = 0.70). Furthermore, the inversion haploblock on chromosome 8 showed slightly elevated sequence divergence in the comparison of the southernmost populations (mean d_XY = 0.015) compared with the two other population comparisons (mean d_XY = 0.013 for both; F = 14.09, P < 0.0001), but overall did not show elevated divergence relative to the genome-wide average (t = −1.1, P = 0.27). The F_ST haploblock on chromosome 11 that appeared between NY and NS showed elevated sequence divergence between those populations (mean d_XY = 0.017) compared with both northern (mean d_XY = 0.012) and southern population comparisons (mean d_XY = 0.014; F = 40.55, P < 0.0001; Fig. 4c).

Contrasting Patterns of Divergence and Differentiation in Inversions and Putative Centromeric Regions

Correlations between differentiation (F_ST) and divergence (d_XY) revealed distinct patterns across the genome depending on the structural features and levels of gene flow between populations. In genomic windows associated with large chromosomal inversions, high F_ST and d_XY were observed, particularly between GA and NY where gene flow is extensive (Fig. 4d, gold points). These regions with high differentiation and divergence correspond to inversions on chromosomes 18 and 24 in the southernmost population comparison. Elevated differentiation and divergence were also apparent for the inversion on chromosome 11 between NY and NS. Thus, chromosomal inversions, which suppress recombination in a heterozygous state, maintain high levels of both differentiation and divergence between populations under conditions of high gene flow. Conversely, between NS and QU, where gene flow is limited, chromosomal inversions did not show elevated differentiation or sequence divergence. In addition, regions of the genome without inversions exhibited lower correlations between differentiation and divergence, indicating less pronounced divergent selection in those areas in all population comparisons. Notably, in putative centromeric regions, which are characterized by consistently low recombination, the pattern differed: high F_ST was accompanied by low d_XY (Fig. 4d, blue points), suggesting that these regions are not associated with divergence with gene flow. Instead, around centromeres, where recombination is consistently low, stronger effects of linked selection reduce within-population diversity and drive differentiation between populations, without increasing sequence divergence, indicating that centromeric regions are unlikely to play a role in preserving locally adapted differences in the face of gene flow.

Diversity Estimates Reflect Divergent Selection in Inversions but not Centromeric Regions

Patterns of genome-wide π resembled patterns of d_XY, with similar dramatic oscillations within chromosomes, and dips and peaks corresponding to putative centromeres and telomeres, respectively (supplementary fig. S6, Supplementary Material online). Mean genome-wide π varied slightly among populations (F = 421.8, P < 0.0001), with higher levels in southern populations (GA mean π = 0.15, NY mean π = 0.16) compared with populations in the north (NS mean π = 0.14, QU mean π = 0.13). In centromeric regions, π was consistently lower in all populations (Fig. 5a). Within chromosomal inversions, π was higher in NY compared with GA, but the same in all other regions of the genome. The inversion on chromosome 11 showed decreased π in the northernmost population, whereas the inversions on chromosomes 8 and 18 showed decreased π in the southernmost population, and for the inversion on chromosome 24, π was lower in NS (Fig. 5b). This pattern aligns with inversion frequencies among populations (Fig. 5c). The chromosome 8 inversion is fixed for the northern arrangement in NY, NS, and QU but segregates in GA, where the southern arrangement predominates. The chromosome 11 inversion is fixed for the southern arrangement in GA, nearly fixed for the northern arrangement in QU, and segregates in NY and NS. The chromosome 18 inversion is fixed for the northern arrangement in NY, NS, and QU, while the southern arrangement is nearly fixed in GA. For the chromosome 24 inversion, the northern arrangement is fixed in NY and NS, nearly fixed in QU, and the southern arrangement is nearly fixed in GA.

Fig. 5.

Comparison of diversity across genomic regions within the four populations. a) Violin plots display average π values in 50 kb windows for centromeres (blue), inversions (yellow), telomeres (red), and the rest of the genome (dark gray). b) Average π in each of the four major inversions. c) Heatmap of inversion frequencies per population for northern (NN), southern (SS), and heterozygote (NS) genotypes.

Furthermore, levels of π within populations were positively correlated with levels of d_XY and negatively correlated with levels of F_ST between populations, except at inversions, where differentiation and divergence far exceeded expectations from their levels of diversity (supplementary fig. S7, Supplementary Material online). Tajima's D estimates further supported these patterns. Strongly negative values of Tajima's D, indicating an excess of rare polymorphisms and possible signals of population expansion or positive selection, were observed in southern populations (GA = −1.2, NY = −1.1). In putative centromeric regions, Tajima's D was lower than in other regions in all populations. Northern populations showed negative values in these regions despite overall positive Tajima's D across the genome (NS = 0.2, QU = 0.6), the latter suggesting balancing selection or population contraction (supplementary fig. S8, Supplementary Material online).

Recombination Landscapes Correlate With Diversity, Differentiation, and Various Genome Features

We analyzed the relationship between recombination rates and genomic features (supplementary fig. S9, Supplementary Material online), focusing on the GA recombination map because the reference genome was assembled using an individual from this population. Our analysis examined genetic diversity within GA and sequence divergence between GA and NY. We observed a positive correlation between recombination rates and nucleotide diversity (τ = 0.25, P < 0.0001) as well as sequence divergence (τ = 0.24, P < 0.0001), indicating that regions with higher recombination harbor more genetic variation and greater divergence between these populations. In contrast, recombination rates were negatively correlated with genetic differentiation (τ = −0.10, P < 0.0001), suggesting that low-recombination regions are more differentiated. Furthermore, we observed several significant correlations between recombination rates and various genomic features. Recombination rates showed a weak positive correlation with GC content (τ = 0.04, P < 0.0001) and a similarly weak negative correlation with exon density (τ = −0.05, P < 0.0001), indicating that regions with higher recombination tend to have slightly more GC-rich sequences but slightly fewer exons. Additionally, there was a notable positive correlation between recombination rates and tandem repeat content (τ = 0.14, P < 0.0001), suggesting that regions with more tandem repeats experience higher recombination rates.

Discussion

Theory predicts that higher levels of gene flow will result in more clustered genetic architectures, with the spatial arrangement of loci underlying adaptive divergence occurring in closer proximity in the genome (Yeaman and Whitlock 2011; Via 2012; Yeaman 2013). By examining 168 whole genomes from 4 populations that span a gradient of gene flow and differentiation, we provide insights into how the interaction between gene flow and selection can drastically shape genome evolution within a species. Our findings indicate that the landscape of adaptive divergence is correlated with patterns of gene flow, with clustering of differentiated regions in the genome intensifying with increasing gene flow between populations, and provide an elegant example supporting theoretical expectations about divergence with gene flow.

The Atlantic silverside is known for its prominent pattern of local adaptation across the latitudinal gradient of its range, which is partitioned into three regional subdivisions with varying levels of gene flow (reviewed in Conover et al. 2005; Lou et al. 2018). Initial genomic work in this species revealed a dramatic clustering pattern for differentiated loci in the genome in highly connected populations, suggesting that these populations may represent an extreme case of heterogeneity in levels of differentiation across the genome (Wilder et al. 2020). With access to more accurate information about the local genomic landscape, we find that the pattern of genome-wide differentiation between the two southernmost populations is even more striking than it initially appeared. With very high levels of gene flow, genomic differentiation between populations is exclusively constrained to regions of low recombination, resulting in peaks and blocks of differentiation that protrude from an otherwise homogeneous genomic background. This suggests that low-recombination regions are not just favored, but may be essential for divergent selection to persist in the face of high gene flow.

Wide blocks of elevated differentiation coincide with chromosomal inversions, which experience suppressed recombination only in a heterozygous state, whereas narrow peaks overlap putative centromeres, where recombination is consistently suppressed. Although the differences between centromeres and inversions as recombination modifiers are well known, studies comparing their relative roles in shaping patterns of genomic differentiation are limited. Our results contribute to the understanding of genome evolution by distinguishing patterns of differentiation and divergence between inversions and putative centromeric regions, which, despite their fundamental differences, are rarely distinguished in this way. We discuss the patterns and the likely contributing evolutionary processes first for inversions and then for centromeric regions.

Inversion Haploblocks Show Evidence of Adaptive Divergence With Gene Flow

Analyzing genome-wide patterns in light of a high-quality reference genome, we confirmed that the massive haploblocks, initially evidenced in transcriptome-level data (Therkildsen et al. 2019; Wilder et al. 2020), coincide with segregating chromosomal inversions. As hypothesized based on their level and extent of differentiation, these inversion haploblocks also show elevated sequence divergence, supporting their role in facilitating adaptation with gene flow. The suppression of recombination between alternate arrangements of these inversions explains how isolated blocks of high differentiation are maintained in otherwise largely undifferentiated genomes in the face of gene flow, especially south of NY.

Evolutionary models predict that when an inversion occurs, it leads to a marked reduction in diversity within the two arrangements, resembling a selective sweep or bottleneck, especially if the inversion polymorphism is balanced in the population (Navarro et al. 2000). Over time, this suppression of recombination allows the two arrangements to build up sequence divergence and reintroduce variability through gene flow and new mutations, particularly at increasing distances from the inversion breakpoints (Navarro et al. 2000; Andolfatto et al. 2001). Inversions that have persisted for a longer period are expected to show high d_XY and a significant number of fixed differences (high F_ST) between the haplotypes, along with reduced π, and low Tajima's D values, reflecting the accumulation of genetic differences over time. In contrast, more recent inversions typically display lower d_XY and F_ST, with slight reductions in π and near-neutral Tajima's D values, indicating that there has been less time for divergence.

Patterns of diversity and differentiation across inversion haploblocks suggest different evolutionary histories. Haploblocks 18 and 24 share characteristics of high sequence divergence, tens of thousands of fixed differences, and low π and Tajima's D, suggesting that they are relatively old. Haploblock 24 exhibits a U-shaped pattern of differentiation, with higher F_ST near the breakpoints and lower F_ST toward the center, a signature typical of large old inversions. In contrast, haploblock 18 does not follow this pattern due to its complex, nested structure, where F_ST drops between breakpoints of adjacent inversions and rises at the start of each breakpoint. These characteristics suggest that both inversion haploblocks may be ancient balanced polymorphisms, possibly predating speciation. Ancestral polymorphisms maintained through balancing selection may have contributed to the elevated divergence observed today, with differentiation beginning prior to lineage splitting and further reinforced by suppressed recombination within inversions. In contrast, haploblock 8, the largest, appears more recent, with fewer fixed differences and slight reductions in π. Haploblock 11, with a dramatic reduction in π at one location, QU, shows recent differentiation, with Tajima's D suggesting a selective sweep or a recent inversion. These patterns align with the inversion frequencies, where northern arrangements are fixed in northern populations (NS and QU), while southern arrangements segregate or predominate in GA and NY. The geographic distribution of inversion arrangements provides further evidence for their role in shaping patterns of diversity and differentiation. Without information that explicitly links adaptive phenotypic variation to patterns of genetic variation in this species, it is not yet possible to discern the relative roles of these multiple inversions in adaptation. 
The evolutionary history of the Atlantic silverside, shaped by Pleistocene glacial cycles, includes two waves of postglacial colonization from the south, around 16,000 and 8,000 years ago, which formed the northern NS and QU populations (Lou et al. 2018). The inversions likely evolved south of the glacial front, conferring an adaptive advantage in colder climates and enabling the fish to track receding cold waters as they expanded northward. Future research could further clarify the role of inversions in facilitating adaptation.

Centromere Peaks do not Show Evidence of Adaptive Divergence With Gene Flow

In addition to the large haploblocks discussed above, we identified mountain-like peaks of differentiation (F_ST) in the south, one for each chromosome, clearly associated with dips in π and d_XY, and harbored within putative centromeric regions. In these areas, π and d_XY are lower in all populations, but the corresponding F_ST peaks are evident only in the south, where genome-wide differentiation between populations is low. High F_ST but low d_XY in centromeric regions relative to the genomic background suggests that in these consistently low-recombination regions, stronger effects of linked selection reduce within-population diversity and promote differentiation between populations, with low sequence divergence indicating an unlikely role for centromeres in preserving locally adapted differences despite gene flow. These patterns offer a clear example of how reduced diversity resulting from areas of low recombination such as centromeres can be mistakenly associated with adaptive divergence in the face of gene flow, especially when a high-quality reference genome and recombination maps are not available.

While centromeres are challenging to identify due to their repetitive and rapidly evolving nature, our recombination-based approach minimizes reliance on complete genome assemblies in these regions by focusing on recombination suppression and transitions in marker density, providing an effective method for identifying putative centromeres (Mansour et al. 2021). It is important to recognize that if centromeric sequences are partially collapsed in the silverside genome assembly, as is common in many genome assemblies, our assessment of genetic diversity in these regions may not fully capture their true characteristics. Additionally, the suppressed recombination we use to identify these regions may extend beyond the actual centromeres, meaning our “putative centromeric regions” represent our best approximation of these challenging genomic features rather than precise centromere boundaries. Although the estimates of the recombination landscape and putative centromeric positions are based on reduced-representation sequence data (Akopyan et al. 2022) and do not provide the same resolution as the whole-genome data, analyzing these data together revealed important insights into the genomic features underlying patterns of differentiation across the genome. Furthermore, consistently suppressed recombination and low marker densities strongly support these regions’ designation as centromeric regions. While lower SNP density in centromeric regions (supplementary fig. S10, Supplementary Material online) may be partly due to challenges in assembling and calling variants in highly repetitive regions, it also reflects an inherent biological feature of centromeres. Suppressed recombination increases the impact of background selection, reducing overall genetic diversity and limiting the accumulation of neutral variation (Charlesworth et al. 1993). 
Despite this, the available SNPs—ranging between 18,458 and 24,233 across all centromeres per population—are still sufficient to characterize population genetic patterns, even if diversity in these regions is inherently lower than in other genomic regions. Notably, as shown in supplementary fig. S10, Supplementary Material online, SNP density varies across putative centromeric regions, with some windows showing particularly low densities (minima ranging from 35 to 61 SNPs across populations), which may indicate the location of the active centromere within these broader regions.

Genomic islands of differentiation, often underpinned by regions of low recombination, have been described in many taxa (e.g. Tine et al. 2014; Bay and Ruegg 2017; Samuk et al. 2017; Zhang et al. 2017; Shi et al. 2024), but most studies do not differentiate between conditional low-recombination regions, like inversions, and consistent ones, such as centromeric regions. The centromeric patterns we observed here are more dramatic and consistent across the genome than what has been shown in other species, particularly given our focus on intraspecific populations. While the patterns may be due to extremely high levels of gene flow and/or strong selection acting on these populations, they could also be due to highly polygenic trait architectures. Determining whether these centromeric regions harbor loci contributing to adaptation is an important but challenging next step, and may be better characterized in the future with more long-read sequencing and fine-scale trait mapping. Additionally, centromere drive, a form of meiotic drive that occurs during female meiosis, may be a possible explanation for the consistent peaks in F_ST observed on every chromosome. According to the centromere drive hypothesis, a centromere can be retained in a female gamete (i.e. in the oocyte rather than the polar body) more often during meiosis and can therefore act like a selfish genetic element driving non-Mendelian segregation (reviewed in Henikoff et al. 2001; Lampson and Black 2017). This usually results in fitness costs and genetic conflict in the genome that imposes strong selective pressures on centromeric DNA. In populations that become isolated, the competition between centromere sequences can quickly drive differentiation at these regions. For instance, in medaka (Ichikawa et al. 2017) and pink salmon (Christensen et al. 2021), centromeric differences are thought to play a role in speciation. 
Further studies examining segregation distortion in crosses are needed to test for the potential role of centromere drive in shaping genome evolution in Atlantic silversides.

Conclusion

Our study provides a comprehensive analysis of how gene flow, selection, and recombination interact to shape patterns of genomic differentiation within the Atlantic silverside. By distinguishing between the contributions of inversions and putative centromeric regions, we have uncovered the complex genomic landscape that underlies adaptive divergence in this species. The clustering of differentiated regions in response to high gene flow underscores the critical role of genomic architecture in facilitating adaptation. Our findings not only confirm theoretical predictions about divergence with gene flow but also offer new insights into the relative influence of inversions and centromeric regions in maintaining genetic differentiation. As we continue to unravel the genetic basis of adaptation, further investigation into the functional roles of these genomic regions will be crucial for understanding the mechanisms driving speciation and adaptation in high gene flow environments.

Materials and Methods

Whole-Genome Resequencing and Variant Calling

To optimally explore the role of genome structure in the distribution and levels of diversity and differentiation, we used the Atlantic silverside reference genome v2 (Jacobs et al. 2024), which was improved by anchoring the first version of the reference genome (Tigano et al. 2021) to a species-specific linkage map (Akopyan et al. 2022). We then re-examined low-coverage whole-genome resequencing data (Wilder et al. 2020) for 42 to 50 wild-caught Atlantic silverside individuals from four locations: Jekyll Island, Georgia (GA), Patchogue, New York (NY), Minas Basin, Nova Scotia (NS), and Magdalen Island, Quebec (QU) (Fig. 1). To avoid potential bias due to variation in sample sizes, we only included 42 individuals from each population, removing individuals with the least amount of data.

Adapters were trimmed from sequence reads using Trimmomatic v.0.36 with seed mismatches = 2, palindrome clip threshold = 30, simple clip threshold = 10, and minAdapterLength = 4 (Bolger et al. 2014). Paired, adapter-clipped reads were then mapped to the reference genome using Bowtie2 v.2.2.9 (Langmead and Salzberg 2012) with the --very-sensitive preset option. Reads with mapping qualities below 20 were filtered out and the remaining reads were sorted using SAMtools v.1.9 (Li et al. 2009). Alignment BAM files from each lane were then merged for each individual. We then removed duplicated reads using MarkDuplicates v.2.9 from Picard tools (broadinstitute.github.io/picard) and realigned reads around indels using IndelRealigner from GATK (McKenna et al. 2010).

To account for the uncertainty about individual genotypes associated with low-coverage data, we used ANGSD v.0.931 (Korneliussen et al. 2014) and conducted our population genomics analyses within a probabilistic framework based on genotype likelihoods. We first examined the sequencing depth distribution across all individuals with the options -doCounts and -doDepth in ANGSD, using the mode ±2 SDs to establish minimum and maximum depth filters for calling SNPs. We used all individuals to call SNPs globally (P-value = 10⁻⁵), considering only sites with a minimum combined sequencing depth of 120, a maximum combined sequencing depth of 428, and mapped reads from at least half of the individuals (n = 84), and removing sites with a global minor allele frequency below 0.01. Then, we supplied the list of global SNPs using the -sites option to estimate allele frequencies in each of the four populations separately, excluding sites in the tail ends of the depth distribution for each population: sites with read depth <20 were excluded from each population, sites with read depth >150 were excluded for NY, and sites with read depth >120 were excluded for all other populations.
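The depth thresholds described above (mode ±2 SDs) can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the pipeline used in the study: the function name and input are hypothetical, the SD is taken as the population SD of the per-site depths, and the lower bound is floored at zero.

```python
from collections import Counter
from statistics import pstdev


def depth_bounds(depths, n_sd=2.0):
    """Return (min_depth, max_depth) as the mode -/+ n_sd standard deviations.

    `depths` is a list of combined per-site sequencing depths (hypothetical input).
    """
    mode = Counter(depths).most_common(1)[0][0]  # most frequent depth value
    sd = pstdev(depths)                          # assumption: population SD
    lower = max(0, round(mode - n_sd * sd))      # depth cannot be negative
    upper = round(mode + n_sd * sd)
    return lower, upper


# Toy depth values for illustration only.
example_depths = [10, 12, 12, 12, 14, 16]
min_depth, max_depth = depth_bounds(example_depths)
print(min_depth, max_depth)
```

Centering the filter on the mode rather than the mean makes the bounds less sensitive to the long right tail that collapsed repeats tend to produce in depth distributions.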

Estimating Differentiation, Diversity, and Linkage Disequilibrium

To investigate population structure, we first conducted a PCA by computing eigenvectors in R from the covariance matrix between individuals estimated in PCAngsd (Meisner and Albrechtsen 2018). We estimated population genetic parameters in nonoverlapping 50 kb windows. To ensure that we compared the same windows across analyses, we calculated the window intervals with the command makewindows from BEDTools v.2.29.2 (Quinlan and Hall 2010) based on the chromosome lengths of the reference genome. To calculate pairwise genetic differentiation (F_ST) between populations, we generated the joint site frequency spectrum (2dSFS) for each pair of neighboring populations from their respective site frequency spectra (SFS) using realSFS fst stats. We computed weighted Weir and Cockerham F_ST averages (Bhatia et al. 2013) as the ratio of the sum of alpha (between-population variance) to the sum of alpha plus beta (within-population variance) across all sites in each 50 kb window. We also calculated pairwise absolute allele frequency differences (AFDs) between neighboring populations from the minor allele frequency estimates for each population, as an alternative to F_ST that is more sensitive to weak population differentiation (Berner 2019).
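The windowed "ratio of sums" described above can be sketched as follows. This is an illustrative Python outline, not the code used in the study: the function names and the input tuple layout are hypothetical, and the per-site alpha and beta values are assumed to have already been estimated elsewhere.

```python
from collections import defaultdict

WINDOW = 50_000  # nonoverlapping 50 kb windows


def windowed_fst(sites, window=WINDOW):
    """Weighted F_ST per window as sum(alpha) / (sum(alpha) + sum(beta)).

    `sites` is an iterable of (chrom, pos, alpha, beta) with 1-based positions
    (hypothetical layout). Returns {(chrom, window_start): F_ST}.
    """
    sums = defaultdict(lambda: [0.0, 0.0])
    for chrom, pos, alpha, beta in sites:
        start = (pos - 1) // window * window  # 0-based start of the window
        acc = sums[(chrom, start)]
        acc[0] += alpha
        acc[1] += beta
    return {key: a / (a + b) for key, (a, b) in sums.items() if a + b > 0}


def afd(freq_pop1, freq_pop2):
    """Absolute allele frequency difference at one site."""
    return abs(freq_pop1 - freq_pop2)


# Toy input for illustration only.
toy_sites = [
    ("chr1", 1, 0.1, 0.9),
    ("chr1", 50_000, 0.3, 0.7),
    ("chr1", 50_001, 0.0, 1.0),
]
fst_windows = windowed_fst(toy_sites)
print(fst_windows)
```

Summing the variance components before dividing weights each site by its information content, so low-variance sites do not dominate the window average as they would in a mean of per-site ratios.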

To calculate nucleotide diversity (π) and Tajima's D within populations and pairwise sequence divergence (d_XY) between populations, we used the SFS based on all sites (i.e. variant and invariant sites, so no SNP calling, only the depth filter) as a prior. We estimated per-site thetas (population-scaled mutation rates) for each population with -doThetas and then used thetaStat to calculate π and Tajima's D averages per 50 kb window, filtering out windows with fewer than 100 sites. To calculate d_XY, we used a custom Python script, dxy_wsfs.py (Marques et al. 2018), after estimating the 2dSFS in ANGSD from all sites in each window for each pair of neighboring populations.
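To show how d_XY can be derived from a joint site frequency spectrum, the sketch below weights each 2dSFS cell by the probability that two alleles, one drawn from each population, differ. It is a hedged illustration and not the cited dxy_wsfs.py script: the function name is hypothetical and the 2dSFS is assumed to be a matrix of expected site counts indexed by derived-allele counts in each population.

```python
def dxy_from_2dsfs(sfs):
    """Average pairwise divergence per site from a 2D SFS.

    sfs[i][j] is the (expected) number of sites with i derived alleles among
    n1 chromosomes in population 1 and j among n2 in population 2, where
    n1 = len(sfs) - 1 and n2 = len(sfs[0]) - 1. Invariant sites are included.
    """
    n1 = len(sfs) - 1
    n2 = len(sfs[0]) - 1
    total_sites = 0.0
    total_diff = 0.0
    for i, row in enumerate(sfs):
        p1 = i / n1
        for j, count in enumerate(row):
            p2 = j / n2
            # Probability that one allele from each population differs.
            total_diff += count * (p1 * (1 - p2) + p2 * (1 - p1))
            total_sites += count
    return total_diff / total_sites


# Toy window: 98 invariant sites and 2 fixed differences (2 chromosomes per population).
toy_sfs = [
    [98, 0, 0],
    [0, 0, 0],
    [2, 0, 0],
]
toy_dxy = dxy_from_2dsfs(toy_sfs)
print(toy_dxy)
```

Including invariant sites in the denominator is what makes the result a per-site divergence rather than a per-SNP one.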

Characterizing Genome Features

We obtained pedigree-based recombination rates in cM/Mb from Akopyan et al. (2022), comprising three recombination maps: one each for NY, GA, and an interpopulation (hybrid) cross. Due to drastic differences in recombination between sexes (i.e. heterochiasmy) observed in this species, with male recombination restricted to the terminal ends of chromosomes (Akopyan et al. 2022), we focused our analysis on female recombination rates, which we averaged into the same 50 kb windows described above. Because the pedigree-based recombination information was based on reduced-representation sequencing data, the proportion of windows with missing data was relatively high, with 46%, 52%, and 35% of 50 kb windows lacking recombination rate estimates for the GA, NY, and hybrid maps, respectively.
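Averaging sparse marker-based rates into fixed windows, and tracking how many windows end up without data, can be sketched as below. The function name and input layout are hypothetical, and the example values are invented for illustration.

```python
def window_means(markers, chrom_len, window=50_000):
    """Mean recombination rate (cM/Mb) per nonoverlapping window.

    `markers` is a list of (pos, rate) pairs with 1-based positions on one
    chromosome (hypothetical layout). Windows without markers yield None.
    """
    n_windows = -(-chrom_len // window)  # ceiling division
    sums = [0.0] * n_windows
    counts = [0] * n_windows
    for pos, rate in markers:
        idx = (pos - 1) // window
        sums[idx] += rate
        counts[idx] += 1
    return [s / c if c else None for s, c in zip(sums, counts)]


# Toy chromosome of 120 kb with three markers (illustrative values only).
toy_rates = window_means([(10, 2.0), (40_000, 4.0), (110_000, 1.0)], chrom_len=120_000)
missing_fraction = toy_rates.count(None) / len(toy_rates)
print(toy_rates, missing_fraction)
```

Keeping empty windows as explicit missing values, rather than zeros, avoids treating a lack of markers as evidence of suppressed recombination.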

To evaluate genome-wide associations between recombination rates and genomic features that are known to correlate with recombination landscapes in other species, we characterized gene density and GC content, as recombination events tend to localize in GC-rich and/or gene-dense regions in other vertebrate species (Auton et al. 2013; Singhal et al. 2015; Shanfelter et al. 2019). To identify coding regions in the silverside genome, we obtained annotation coordinates (Jacobs et al. 2024) and calculated the total number and average proportion of exons and coding sequences (including exons and the 5′ and 3′ UTRs) for each 50 kb window with BEDTools intersect. We calculated GC content by obtaining the base composition of each window of the reference genome using BEDTools nuc. Data summaries were performed using the tidyverse package in R v. 4.0.0 (R Core Team 2020).
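As a rough illustration of per-window base composition, the following Python sketch computes GC content in fixed windows along a sequence string. It is not a reimplementation of BEDTools nuc: the function name is hypothetical, and the choice to exclude ambiguous bases (e.g. N) from the denominator is an assumption made for this example.

```python
def gc_content_by_window(seq, window=50_000):
    """Return (start, end, gc_fraction) per nonoverlapping window.

    Assumption: GC fraction is computed over unambiguous A/C/G/T bases only;
    windows with no unambiguous bases yield None.
    """
    results = []
    for start in range(0, len(seq), window):
        chunk = seq[start:start + window].upper()
        acgt = sum(chunk.count(base) for base in "ACGT")
        gc = chunk.count("G") + chunk.count("C")
        end = min(start + window, len(seq))
        results.append((start, end, gc / acgt if acgt else None))
    return results


# Toy sequence with a tiny window size, for illustration only.
toy_gc = gc_content_by_window("GGCCAATTNN", window=5)
print(toy_gc)
```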

We also identified the location of known segregating inversions and putative centromeres and telomeres, which are typically associated with recombination cold- and hot-spots (Krimbas and Powell 1992; Petes 2001). We obtained and lifted over inversion positions (supplementary table S1, Supplementary Material online) from Akopyan et al. (2022). To calculate inversion frequencies within each population, we performed PCA on SNPs within inversion regions (supplementary fig. S4, Supplementary Material online). For each population, we identified three clusters along PC1, corresponding to the two alternate homozygous genotypes and the heterozygous genotype. The proportion of individuals (out of the 42 sampled per population) within each cluster was then calculated. Clusters representing homozygous genotypes were categorized as either northern or southern based on the predominant presence of individuals from Quebec or Georgia, respectively.
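The step from PC1 clusters to genotype frequencies can be sketched with a simple one-dimensional k-means. This is an assumption-laden illustration: the clustering method used in the study is not specified here, the function names are hypothetical, and the PC1 scores below are invented.

```python
def kmeans_1d(values, k=3, n_iter=50):
    """Deterministic 1D k-means; centers start evenly spaced between min and max."""
    lo, hi = min(values), max(values)
    centers = [lo + (hi - lo) * i / (k - 1) for i in range(k)]
    labels = [0] * len(values)
    for _ in range(n_iter):
        # Assign each value to its nearest center.
        labels = [min(range(k), key=lambda c: abs(v - centers[c])) for v in values]
        new_centers = []
        for c in range(k):
            members = [v for v, lab in zip(values, labels) if lab == c]
            new_centers.append(sum(members) / len(members) if members else centers[c])
        if new_centers == centers:
            break
        centers = new_centers
    return labels


def genotype_frequencies(pc1_scores):
    """Proportion of individuals in each of three clusters ordered along PC1."""
    labels = kmeans_1d(pc1_scores, k=3)
    n = len(labels)
    return [labels.count(c) / n for c in range(3)]


# Invented PC1 scores: two homozygote clusters at the extremes, heterozygotes between.
toy_pc1 = [-1.0, -0.9, -1.1, 0.0, 0.1, 1.0, 0.95, 1.05, 1.1, 0.9]
toy_freqs = genotype_frequencies(toy_pc1)
print(toy_freqs)
```

Which extreme cluster corresponds to the northern or southern arrangement still has to be decided from the geographic origin of its members, as described above; the clustering itself is agnostic to that labeling.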

We estimated the putative locations of centromeres and telomeres using a combination of approaches. First, we used the three recombination maps to identify heterochromatin boundaries typical of centromeres and telomeres based on patterns of marker density and distribution, and local changes in recombination rates with BREC (Mansour et al. 2021). Second, we used Tandem Repeats Finder v.4.09 (Benson 1999) to identify repeats in the reference genome with pattern size <500 bp that are generally associated with telomeres in eukaryotes. We used trfparser v.1 (trfparser.sourceforge.io/) to parse the output and then filtered for repeats based on the telomere repeat motif (TTAGGG) that is conserved among vertebrates (Meyne et al. 1989) to refine our estimates of telomere positions. We used BEDTools intersect to categorize each 50 kb window as an inversion, a telomere, a centromere, or none of those. Windows falling into more than one category were assigned to the category with more support, prioritizing inversions, followed by telomeres, and then centromeres.
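The priority rule for windows that overlap more than one feature type (inversions first, then telomeres, then centromeres) can be expressed compactly. The sketch below uses hypothetical names and coordinates, with half-open intervals as in BED files.

```python
# Priority order taken from the text: inversions, then telomeres, then centromeres.
PRIORITY = ("inversion", "telomere", "centromere")


def classify_window(start, end, features):
    """Assign a window [start, end) to one category.

    `features` is a list of (category, feat_start, feat_end) half-open
    intervals on the same chromosome (hypothetical layout).
    """
    hits = {cat for cat, f_start, f_end in features if f_start < end and start < f_end}
    for category in PRIORITY:
        if category in hits:
            return category
    return "none"


# Invented coordinates for illustration only.
toy_features = [
    ("inversion", 100_000, 400_000),
    ("centromere", 350_000, 500_000),
    ("telomere", 0, 50_000),
]
labels = [classify_window(s, s + 50_000, toy_features) for s in range(0, 600_000, 50_000)]
print(labels)
```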

Evaluating Landscapes of Diversity, Differentiation, and Recombination Across Genomic Features

We plotted all estimates of population statistics and genome parameters in 50 kb windows to visualize genome-wide patterns using Manhattan plots. We analyzed the genomic distribution of F_ST outlier loci across chromosomes to assess clustering patterns. For each of the three pairwise comparisons of neighboring populations, we identified outliers as loci with F_ST values exceeding 3 SDs above the mean. Using a permutation test, we randomly redistributed the outliers across chromosomes, weighted by chromosome size, to generate a null distribution of expected variance in outlier counts. This process was repeated 10,000 times to establish a baseline for random clustering. We compared the observed variance in outlier distribution against this null distribution to determine the nonrandomness of the clustering in each comparison.
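The outlier definition and permutation test described above can be sketched in Python as follows. This is a minimal outline under stated assumptions rather than the analysis code: the function names are hypothetical, the test statistic is the variance of per-chromosome outlier counts, and the P-value uses a conventional +1 correction that is not specified in the text.

```python
import random
from statistics import mean, pstdev


def fst_outliers(fst_values, n_sd=3.0):
    """Indices of windows or loci with F_ST more than n_sd SDs above the mean."""
    threshold = mean(fst_values) + n_sd * pstdev(fst_values)
    return [i for i, value in enumerate(fst_values) if value > threshold]


def count_variance(counts):
    """Population variance of per-chromosome outlier counts."""
    m = sum(counts) / len(counts)
    return sum((c - m) ** 2 for c in counts) / len(counts)


def clustering_p_value(observed_counts, chrom_sizes, n_perm=10_000, seed=1):
    """Permutation P-value for clustering of outliers across chromosomes.

    Outliers are redistributed at random with probability proportional to
    chromosome size; the P-value is the share of permutations whose variance
    in counts is at least as large as observed (with a +1 correction).
    """
    rng = random.Random(seed)
    chroms = list(range(len(chrom_sizes)))
    n_outliers = sum(observed_counts)
    observed = count_variance(observed_counts)
    exceed = 0
    for _ in range(n_perm):
        counts = [0] * len(chroms)
        for c in rng.choices(chroms, weights=chrom_sizes, k=n_outliers):
            counts[c] += 1
        if count_variance(counts) >= observed:
            exceed += 1
    return (exceed + 1) / (n_perm + 1)


# Toy example: all 30 outliers on one of four equally sized chromosomes.
p_clustered = clustering_p_value([30, 0, 0, 0], [1_000_000] * 4, n_perm=200)
print(p_clustered)
```

A high variance in counts means outliers pile up on a few chromosomes, so a small P-value indicates stronger clustering than expected from chromosome size alone.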

To compare how population genetic estimates varied among chromosomal regions and among pairwise comparisons of populations, we used linear mixed models with population comparison and chromosomal region as random effects, then used least-square means for post hoc pairwise comparisons. We also evaluated consistency between relative and absolute measures of differentiation by correlating windowed estimates of F_ST and d_XY across pairs using a Kendall's rank correlation test. The statistical analyses were conducted in R v. 4.0.0 (R Core Team 2020).
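For readers unfamiliar with the τ statistic, a plain-Python version of Kendall's tau-b is sketched below. It is meant only to show what the coefficient measures; the analyses in this study were run in R, and this quadratic-time function (a hypothetical helper) would be too slow for genome-wide window counts, where a library implementation is preferable.

```python
from itertools import combinations
from math import sqrt


def kendall_tau_b(x, y):
    """Kendall's tau-b rank correlation with tie correction (O(n^2))."""
    concordant = discordant = ties_x = ties_y = 0
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        dx, dy = x1 - x2, y1 - y2
        if dx == 0 and dy == 0:
            continue            # tied in both variables: ignored
        elif dx == 0:
            ties_x += 1         # tied only in x
        elif dy == 0:
            ties_y += 1         # tied only in y
        elif dx * dy > 0:
            concordant += 1
        else:
            discordant += 1
    denom = sqrt((concordant + discordant + ties_x) * (concordant + discordant + ties_y))
    return (concordant - discordant) / denom if denom else float("nan")


# Toy windowed values (illustrative only).
tau_example = kendall_tau_b([1, 2, 3], [1, 3, 2])
print(tau_example)
```

Because the statistic depends only on the ordering of window pairs, it is robust to the strongly skewed distributions typical of windowed F_ST and d_XY.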

Supplementary Material

Supplementary material is available at Molecular Biology and Evolution online.

Acknowledgments

The authors would like to thank David Conover for providing access to the silverside samples analyzed in this study and Harmony Borchardt-Wier and Cornell's Biotechnology Resource Center for help with library preparation for low-coverage whole-genome sequencing. This study was funded through a National Science Foundation grant to N.O.T. (OCE-1756316).

Author Contributions

M.A., A.T., A.J., A.P.W., and N.O.T. designed the study. A.J. mapped the population data to the reference genome. M.A. conducted the data analysis and drafted the manuscript with critical input from all authors.

Data Availability

The genomic sequence data described in this article have been archived and are publicly available in the NCBI Short Read Archive under Bioproject ID PRJNA376564. The linkage-map-anchored reference genome is available on GenBank under accession GCA_965154125.1. Scripts for data analysis are in a GitHub repository https://github.com/makopyan/silverside-4pop.

References

Akopyan, Tigano, Jacobs, Wilder, Baumann, Therkildsen. Comparative linkage mapping uncovers recombination suppression across massive chromosomal inversions associated with local adaptation in Atlantic silversides. Mol Ecol. 2022:3323–3341.

Andolfatto, Depaulis, Navarro. Inversion polymorphisms and nucleotide variability in Drosophila. Genet Res. 2001. https://doi.org/10.1017/S0016672301004955

Arnott, Chiba, Conover. Evolution of intrinsic growth rate: metabolic costs drive trade-offs between growth and swimming performance in Menidia menidia. Evolution. 2006:1269–1278. https://doi.org/10.1111/j.0014-3820.2006.tb01204.x

Auton, Kidd, Oliveira, Nadel, Holloway, Hayward, Cohen, Greally, Wang, et al. Genetic recombination is targeted towards gene promoter regions in dogs. PLoS Genet. 2013:e1003984. https://doi.org/10.1371/journal.pgen.1003984

Bay, Ruegg. Genomic islands of divergence or opportunities for introgression? Proc R Soc B Biol Sci. 2017:284(1850):20162414. https://doi.org/10.1098/rspb.2016.2414

Benson. Tandem repeats finder: a program to analyze DNA sequences. Nucleic Acids Res. 1999:573–580.

Berner. Allele frequency difference AFD–an intuitive alternative to F_ST for quantifying genetic population differentiation. Genes (Basel). 2019:308. https://doi.org/10.3390/genes10040308

Bhatia, Patterson, Sankararaman, Price. Estimating and interpreting F_ST: the impact of rare variants. Genome Res. 2013:1514–1521. https://doi.org/10.1101/gr.154831.113

Billerbeck, Lankford, Conover. Evolution of intrinsic growth and energy acquisition rates. I. Trade-offs with swimming performance in Menidia menidia. Evolution. 2001:1863–1872. https://doi.org/10.1111/j.0014-3820.2001.tb00835.x

Bolger, Lohse, Usadel. Trimmomatic: a flexible trimmer for Illumina sequence data. Bioinformatics. 2014:2114–2120. https://doi.org/10.1093/bioinformatics/btu170

Booker, Yeaman, Whitlock. Variation in recombination rate affects detection of outliers in genome scans under neutrality. Mol Ecol. 2020:4274–4279.

Burri, Nater, Kawakami, Mugal, Olason, Smeds, Suh, Dutoit, Bureš, Garamszegi, et al. Linked selection and recombination rate variation drive the evolution of the genomic landscape of differentiation across the speciation continuum of Ficedula flycatchers. Genome Res. 2015:1656–1665. https://doi.org/10.1101/gr.196485.115

Charlesworth, Morgan, Charlesworth. The effect of deleterious mutations on neutral molecular variation. Genetics. 1993:134:1289–1303. https://doi.org/10.1093/genetics/134.4.1289

Christensen, Rondeau, Sakhrani, Biagi, Johnson, Joshi, Flores A-M, Leelakumari, Moore, Pandoh, et al. The pink salmon genome: uncovering the genomic consequences of a two-year life cycle. PLoS One. 2021:e0255752. https://doi.org/10.1371/journal.pone.0255752

Christmas, Jones, Olsson, Wallerman, Bunikis, Kierczak, Peona, Whitley, Larva, Suh, et al. Genetic barriers to historical gene flow between cryptic species of alpine bumblebees revealed by comparative population genomics. Mol Biol Evol. 2021:3126–3143. https://doi.org/10.1093/molbev/msab086

Clarke, Walther, Munch, Thorrold, Conover. Chemical signatures in the otoliths of a coastal marine fish, Menidia menidia, from the northeastern United States: spatial and temporal differences. Mar Ecol Prog Ser. 2009:384:261–271.

Conover, Arnott, Walsh, Munch. Darwinian fishery science: lessons from the Atlantic silverside (Menidia menidia). Can J Fish Aquat Sci. 2005:730–737.

Conover, Duffy, Hice. The covariance between genetic and environmental influences across ecological gradients: reassessing the evolutionary significance of countergradient and cogradient variation. Ann N Y Acad Sci. 2009:1168:100–129. https://doi.org/10.1111/j.1749-6632.2009.04575.x

Conover, Present TMC. Countergradient variation in growth rate: compensation for length of the growing season among Atlantic silversides from different latitudes. Oecologia. 1990:316–324.

Cruickshank, Hahn. Reanalysis suggests that genomic islands of speciation are due to reduced diversity, not reduced gene flow. Mol Ecol. 2014:3133–3157.

Dapper, Payseur. Connecting theory and data to understand recombination rate evolution. Philos Trans R Soc B. 2017:372(1736):20160469. https://doi.org/10.1098/rstb.2016.0469

Delmore

Ramos

JSL

Doren

Lundberg

Bensch

Irwin

Liedvogel

Comparative analysis examining patterns of genomic differentiation across multiple episodes of population divergence in birds

Evol Lett.

2018

(

–

DeRaad

McCormack

Chen

Peterson

Moyle

Combining species delimitation, species trees, and tests for gene flow clarifies complex speciation in scrub-jays

Syst Biol.

2022

(

1453

–

1470

10.1093/sysbio/syac034

Felsenstein

The theoretical population genetics of variable selection and migration

Annu Rev Genet.

1976

(

253

–

280

10.1146/annurev.ge.10.120176.001345

Flaxman

Wacholder

Feder

Nosil

Theoretical models of the influence of genomic architecture on the dynamics of speciation

Mol Ecol.

2014

(

4074

–

4088

Galtier

Piganeau

Mouchiroud

Duret

GC-content evolution in mammalian genomes: the biased gene conversion hypothesis

Genetics

2001

159

(

907

–

911

10.1093/genetics/159.2.907

García-Ramos

Kirkpatrick

Genetic models of adaptation and gene flow in peripheral populations

Evolution

1997

(

–

Gaut

Wright

Rizzon

Dvorak

Anderson

Recombination: an underappreciated factor in the evolution of plant genomes

Nat Rev Genet.

2007

(

–

Gavrilets

Fitness landscapes and the origin of species (MPB-41)

Princeton, New Jersey

Princeton University Press

;

2004

Gutiérrez-Valencia

Hughes

Berdan

Slotte

The genomic architecture and evolutionary fates of supergenes

Genome Biol Evol.

2021

(

evab057

Haenel

Laurentino

Roesti

Berner

Meta-analysis of chromosome-scale crossover rate variation in eukaryotes and its significance to evolutionary genomics

Mol Ecol.

2018

(

2477

–

2497

Hämälä

Wafula

Guiltinan

Ralph

dePamphilis

Tiffin

Genomic structural variants constrain and facilitate adaptation in natural populations of Theobroma cacao, the chocolate tree

Proc Natl Acad Sci U S A.

2021

118

(

e2102914118

10.1073/pnas.2102914118

Hejase

Salman-Minkov

Campagna

Hubisz

Lovette

Gronau

Siepel

Genomic islands of differentiation in a rapid avian radiation have been driven by recent selective sweeps

Proc Natl Acad Sci U S A.

2020

117

(

30554

–

30565

10.1073/pnas.2015987117

Henikoff

Ahmad

Malik

The centromere paradox: stable inheritance with rapidly evolving DNA

Science

2001

293

(

5532

1098

–

1102

10.1126/science.1062939

Hice

Duffy

Munch

Conover

Spatial scale and divergent patterns of variation in adapted traits in the ocean

Ecol Lett.

2012

(

568

–

575

10.1111/j.1461-0248.2012.01769.x

Hoffmann

Rieseberg

Revisiting the impact of inversions in evolution: from population genetic markers to drivers of adaptive shifts and speciation?

Annu Rev Ecol Evol Syst.

2008

(

–

10.1146/annurev.ecolsys.39.110707.173532

Ichikawa

Tomioka

Suzuki

Nakamura

Doi

Yoshimura

Kumagai

Inoue

Uchida

Irie

, et al.

Centromere evolution and CpG methylation during vertebrate speciation

Nat Commun.

2017

(

1833

10.1038/s41467-017-01982-7

Irwin

Alcaide

Delmore

Irwin

Owens

Recurrent selection explains parallel evolution of genomic regions of high relative but low absolute differentiation in a ring species

Mol Ecol.

2016

(

4488

–

4507

Jacobs

Velotta

Tigano

Wilder

Baumann

Therkildsen

Temperature-dependent gene regulatory divergence underlies local adaptation with gene flow in the Atlantic silverside

Evolution

2024

(

1133

–

1149

10.1093/evolut/qpae049

Jones

Wallberg

Christmas

Kapheim

Webster

Extreme differences in recombination rate between the genomes of a solitary and a social bee

Mol Biol Evol.

2019

(

2277

–

2291

10.1093/molbev/msz130

Kirkpatrick

Barton

Chromosome inversions, local adaptation and speciation

Genetics

2006

173

(

419

–

434

10.1534/genetics.105.047985

Korneliussen

Albrechtsen

Nielsen

ANGSD: analysis of next generation sequencing data

BMC Bioinformatics

2014

(

356

10.1186/s12859-014-0356-4

Krimbas

Powell

Drosophila inversion polymorphism

Boca Raton, Florida

CRC Press

;

1992

Google Scholar

Google Preview

OpenURL Placeholder Text

WorldCat

Lampson

Black

Cellular and molecular mechanisms of centromere drive

Cold Spring Harb Symp Quant Biol.

2017

249

–

257

10.1101/sqb.2017.82.034298

Langmead

Salzberg

Fast gapped-read alignment with Bowtie 2

Nat Methods

2012

(

357

–

359

Lenormand

Gene flow and the limits to natural selection

Trends Ecol Evol

2002

(

183

–

189

10.1016/S0169-5347(02)02497-7

Google Scholar

Crossref

WorldCat

Handsaker

Wysoker

Fennell

Ruan

Homer

Marth

Abecasis

Durbin

The sequence alignment/map format and SAMtools

Bioinformatics

2009

(

2078

–

2079

10.1093/bioinformatics/btp352

Lotterhos

The effect of neutral recombination variation on genome scans for selection

G3 (Bethesda)

2019

(

1851

–

1867

10.1534/g3.119.400088

Lou

Fletcher

Wilder

Conover

Therkildsen

Searle

Full mitochondrial genome sequences reveal new insights about post-glacial expansion and regional phylogeographic structure in the Atlantic silverside (Menidia menidia)

Mar Biol.

2018

165

(

124

10.1007/s00227-018-3380-5

Google Scholar

Crossref

WorldCat

Lucek

Gompert

Nosil

2019

The role of structural genomic variants in population differentiation and ecotype formation in Timema cristinae walking sticks

Mol Ecol.

(

1224

–

1237

Mansour

Chateau

Fiston-Lavier

A-S

BREC: an R package/Shiny app for automatically identifying heterochromatin boundaries and estimating local recombination rates along chromosomes

BMC Bioinformatics

2021

(

396

10.1186/s12859-021-04233-1

Marques

Jones

Palma

Kingsley

Reimchen

Experimental evidence for rapid genomic adaptation to a new niche in an adaptive radiation

Nat Ecol Evol.

2018

(

1128

–

1138

10.1038/s41559-018-0581-8

Martin

Davey

Salazar

Jiggins

Recombination rate variation shapes barriers to introgression across butterfly genomes

PLoS Biol

2019

(

e2006288

10.1371/journal.pbio.2006288

McKenna

Hanna

Banks

Sivachenko

Cibulskis

Kernytsky

Garimella

Altshuler

Gabriel

Daly

, et al.

The genome analysis toolkit: a MapReduce framework for analyzing next-generation DNA sequencing data

Genome Res

2010

(

1297

–

1303

10.1101/gr.107524.110

McVean

GAT

Myers

Hunt

Deloukas

Bentley

Donnelly

The fine-scale structure of recombination rate variation in the human genome

Science

2004

304

(

5670

581

–

584

10.1126/science.1092500

Meisner

Albrechtsen

Inferring population structure and admixture proportions in low-depth NGS data

Genetics

2018

210

(

719

–

731

10.1534/genetics.118.301336

Mérot

Oomen

Tigano

Wellenreuther

A roadmap for understanding the evolutionary significance of structural genomic variation

Trends Ecol Evol.

2020

(

561

–

572

10.1016/j.tree.2020.03.002

Meyne

Ratliff

Moyzis

Conservation of the human telomere sequence (TTAGGG)n among vertebrates

Proc Natl Acad Sci U S A.

1989

(

7049

–

7053

10.1073/pnas.86.18.7049

Mugal

Weber

Ellegren

GC-biased gene conversion links the recombination landscape and demography to genomic base composition

BioEssays

2015

(

1317

–

1326

10.1002/bies.201500058

Munch

Conover

Rapid growth results in increased susceptibility to predation in Menidia menidia

Evolution

2003

(

2119

–

2127

10.1111/j.0014-3820.2003.tb00389.x

Navarro

Barbadilla

Ruiz

Effect of inversion polymorphism on the neutral nucleotide variability of linked chromosomal regions in Drosophila

Genetics

2000

155

(

685

–

698

10.1093/genetics/155.2.685

Navarro

Betrán

Barbadilla

Ruiz

Recombination and gene Flux caused by gene conversion and crossing over in inversion heterokaryotypes

Genetics

1997

146

(

695

–

709

10.1093/genetics/146.2.695

Noor

MAF

Bennett

Islands of speciation or mirages in the desert? Examining the role of restricted recombination in maintaining species

Heredity (Edinb).

2009

103

(

439

–

444

Ortiz-Barrientos

James

Evolution of recombination rates and the genomic landscape of speciation

J Evol Biol.

2017

(

1519

–

1521

Paigen

Szatkiewicz

Sawyer

Leahy

Parvanov

SHS

Graber

Broman

Petkov

The recombinational anatomy of a mouse chromosome

PLoS Genet

2008

(

e1000119

10.1371/journal.pgen.1000119

Petes

Meiotic recombination hot spots and cold spots

Nat Rev Genet.

2001

(

360

–

369

Quinlan

Hall

BEDTools: a flexible suite of utilities for comparing genomic features

Bioinformatics

2010

(

841

–

842

10.1093/bioinformatics/btq033

R Core Team

R: 2019. A language and environment for statistical computing

Version 3

Vienna (Austria)

R Foundation for Statistical Computing

;

2020

Google Scholar

Google Preview

OpenURL Placeholder Text

WorldCat

Rieseberg

Chromosomal rearrangements and speciation

Trends Ecol Evol.

2001

(

351

–

358

10.1016/S0169-5347(01)02187-5

Roesti

Moser

Berner

Recombination in the threespine stickleback genome—patterns and consequences

Mol Ecol.

2013

(

3014

–

3027

Rousselle

Laverré

Figuet

Nabholz

Galtier

Influence of recombination and GC-biased gene conversion on the adaptive and non-adaptive substitution rate in mammals vs. birds

Mol Biol Evol.

2018

msy243

10.1093/molbev/msy243

Google Scholar

Crossref

WorldCat

Samuk

Owens

Delmore

Miller

Rennison

Schluter

Gene flow and selection interact to promote adaptive divergence in regions of low recombination

Mol Ecol.

2017

(

4378

–

4390

Schaal

Haller

Lotterhos

Inversion invasions: when the genetic basis of local adaptation is concentrated within inversions in the face of gene flow

Philos Trans R Soc B

2022

377

(

1856

20210200

10.1098/rstb.2021.0200

Google Scholar

Crossref

WorldCat

Shanfelter

Archambeault

White

Divergent fine-scale recombination landscapes between a freshwater and marine population of threespine stickleback fish

Genome Biol Evol.

2019

(

1573

–

1585

Shi

Zhou

Liang

Wang

Linked selection and recombination rate generate both shared and lineage-specific genomic islands of divergence in two independent Quercus species pairs

J Syst Evol.

2024

(

505

–

519

Singhal

Leffler

Sannareddy

Turner

Venn

Hooper

Strand

Raney

Balakrishnan

, et al.

Stable recombination hotspots in birds

Science

2015

350

(

6263

928

–

932

10.1126/science.aad0843

Slatkin

Gene flow and the geographic structure of natural populations

Science

1987

236

(

4803

787

–

792

10.1126/science.3576198

Smeds

Mugal

Qvarnström

Ellegren

High-resolution mapping of crossover and non-crossover recombination events by whole-genome re-sequencing of an avian pedigree

PLoS Genet

2016

(

e1006044

10.1371/journal.pgen.1006044

Smith

Nambiar

New solutions to old problems: molecular mechanisms of meiotic crossover control

Trends Genet

2020

(

337

–

346

10.1016/j.tig.2020.02.002

Stapley

Feulner

PGD

Johnston

Santure

Smadja

Variation in recombination frequency and distribution across eukaryotes: patterns and processes

Philos Trans R Soc B Biol Sci

2017

372

(

1736

20160455

10.1098/rstb.2016.0455

Google Scholar

Crossref

WorldCat

Sturtevant

Beadle

The relations of inversions in the X chromosome of Drosophila melanogaster to crossing over and disjunction

Genetics

1936

(

554

–

604

10.1093/genetics/21.5.554

Therkildsen

Wilder

Conover

Munch

Baumann

Palumbi

Contrasting genomic shifts underlie parallel phenotypic evolution in response to fishing

Science

2019

365

(

6452

487

–

490

10.1126/science.aaw7271

Tigano

Friesen

Genomics of local adaptation with gene flow

Mol Ecol.

2016

(

2144

–

2164

Tigano

Jacobs

Wilder

Nand

Zhan

Dekker

Therkildsen

Chromosome-level assembly of the Atlantic silverside genome reveals extreme levels of sequence diversity and structural genetic variation

Genome Biol Evol.

2021

(

evab098

Tiley

Burleigh

The relationship of recombination rate, genome structure, and patterns of molecular evolution across angiosperms

BMC Evol Biol.

2015

(

194

10.1186/s12862-015-0473-3

Tine

Kuhl

Gagnaire

P-A

Louro

Desmarais

Martins

RST

Hecht

Knaust

Belkhir

Klages

, et al.

European sea bass genome and its variation provide insights into adaptation to euryhalinity and speciation

Nat Commun.

2014

(

5770

Torres

API

Höök

Näsvall

Shipilina

Wiklund

Vila

Pruisscher

Backström

The fine-scale recombination rate variation and associations with genomic features in a butterfly

Genome Res

2023

(

810

–

823

10.1101/gr.277414.122

Via

Divergence hitchhiking and the spread of genomic isolation during ecological speciation-with-gene-flow

Philos Trans R Soc B Biol Sci

2012

367

(

1587

451

–

460

10.1098/rstb.2011.0260

Google Scholar

Crossref

WorldCat

Wallberg

Glémin

Webster

Extreme recombination frequencies shape genome variation and evolution in the honeybee, Apis mellifera

PLoS Genet

2015

(

e1005189

10.1371/journal.pgen.1005189

Weissensteiner

Bunikis

Catalán

Francoijs

K-J

Knief

Heim

Peona

Pophaly

Sedlazeck

Suh

, et al.

Discovery and population genomics of structural variation in a songbird genus

Nat Commun.

2020

(

3403

10.1038/s41467-020-17195-4

Wellenreuther

Bernatchez

Eco-evolutionary genomics of chromosomal inversions

Trends Ecol Evol.

2018

(

427

–

440

10.1016/j.tree.2018.04.002

Wilder

Palumbi

Conover

Therkildsen

Footprints of local adaptation span hundreds of linked genes in the Atlantic silverside genome

Evol Lett.

2020

(

430

–

443

Young but not relatively old retrotransposons are preferentially located in gene-rich euchromatic regions in tomato (Solanum lycopersicum) plants

Plant J

2014

(

582

–

591

Yeaman

Genomic rearrangements and the evolution of clusters of locally adaptive loci

Proc Natl Acad Sci U S A.

2013

110

(

E1743

–

E1751

10.1073/pnas.1219381110

Yeaman

Whitlock

The genetic architecture of adaptation under migration-selection balance

Evolution

2011

(

1897

–

1911

10.1111/j.1558-5646.2011.01269.x

Zhang

Song

Gao

Cheng

Shao

Alström

Lei

Genomic differentiation and patterns of gene flow between two long-tailed tit species (Aegithalos)

Mol Ecol.

2017

(

6654

–

6665

. doi:

Author notes

Conflict of Interest: The authors declare no competing interests.

This is an Open Access article distributed under the terms of the Creative Commons Attribution-NonCommercial License (https://creativecommons.org/licenses/by-nc/4.0/), which permits non-commercial re-use, distribution, and reproduction in any medium, provided the original work is properly cited. For commercial re-use, please contact [email protected] for reprints and translation rights for reprints. All other permissions can be obtained through our RightsLink service via the Permissions link on the article page on our site—for further information please contact [email protected].

