Textural Features on Computed Tomography Scans Predict Overall Survival in Patients with Esophageal Cancer

Purpose: To predict overall survival (OS) in non-metastatic esophageal cancer using texture analysis of pretherapy computed tomography (CT) images.
Materials and Methods: Records from 762 non-metastatic esophageal cancer patients with non-contrast CT scans (obtained from 1998-2011) before receiving chemoradiation were retrospectively reviewed. 328 quantitative image features were extracted from the esophageal gross tumor volume (GTV). A random survival forest model compared how well five of these features (entropy, histogram 10th percentile, volume, volume-to-area ratio, fraction GTV pruned after thresholding) predicted OS versus all 328 features. Cox proportional hazards modeling was used to derive scores, based on these five features, which could stratify patients by survival in a training set consisting of 50% of the 762 cases, chosen randomly from the data. This model was then tested in a validation set (remaining 50% of cases). Multivariate analysis was done with the image-derived score and other prognostic variables.
Results: CT texture analysis based on the five image-derived features yielded a similar concordance rate for predicting OS (56%) as did all 328 features (56%), and in fact showed higher concordance for predicting OS than disease stage alone (44%). This image-derived score was also able to significantly stratify OS (P<0.05) in both the training and validation set, as well as independently predict OS in multivariate analysis (HR 1.61, 95% CI 1.13-2.29, P=0.009), along with stage, treatment with surgery, tumor grade, and radiation modality.
Conclusions: Texture features from pretreatment CT images can independently predict OS in patients with non-metastatic esophageal carcinoma.


Advances in Knowledge
• In a cohort of 762 patients with non-metastatic esophageal cancer, computed tomography (CT) texture analysis (based on five imaging features: entropy, histogram 10th percentile, volume, volume-to-area ratio, and fraction GTV pruned after thresholding) was found to independently predict overall survival (OS) in multivariate analysis (HR 1.61, 95% CI 1.13-2.29, P=0.009), as did clinical stage, treatment with surgery after chemoradiation, tumor grade, and radiation modality.
• Using a random survival forest model where survival concordance was assessed by computing an estimate of the cumulative hazard function, this CT-based textural analysis was found to produce higher concordance for OS (56%) than did disease stage alone (44%), and concordance similar to that of baseline standardized uptake values on positron emission tomography (55%).
• Using log-rank tests, CT-based texture analysis was found to stratify OS with statistical significance (P=0.0119) in a randomly chosen validation set (N=381, random 50% of the 762 patients), and was also seen to dichotomize survival (P=0.0086) in esophageal cancer patients who received definitive chemoradiation without surgery (N=383).

Introduction
The prognosis for patients with esophageal cancer is dismal. The current standard of care for locally advanced esophageal cancer is neoadjuvant chemoradiation followed by surgery, which has led to improved survival compared to surgery alone [1]. However, the benefit of surgery after chemoradiation is controversial [2,3], as is the potential benefit of induction chemotherapy before chemoradiation [4]. Identifying novel predictors of outcome for patients with esophageal cancer may allow better risk stratification for guiding more optimized management strategies.
Tumor heterogeneity is a well-known indicator of adverse prognosis in esophageal cancer. Genomic and phenotypic heterogeneity in esophageal cancer can negatively affect treatment response to a variety of cytotoxic agents [5,6]. Additionally, heterogeneity in tumor vascularity can result in hypoxic areas within the tumor, which can drive genomic instability [7], promote tumor survival, and result in treatment failure [8]. Noninvasive ways of assessing biological tumor heterogeneity may be useful for predicting treatment response and survival.
One noninvasive approach to assessing tumor heterogeneity is to analyze textural qualities on images of tumors, which would provide surrogate information on the tumor microenvironment. Both positron emission tomography (PET) and computed tomography (CT) have been used to derive textural information on tumors, which is then analyzed by using structural, model-based, or statistical methods [9,10]. Statistical methods, used widely for oncologic texture analysis, are based on representations of texture from the distribution and relationship of pixel gray-level values in an image. The value of statistic-based CT texture analysis as a prognostic marker in cancer has been promising in evaluations of several types of cancer, including non-small cell lung cancer [11], liver cancer [12], colorectal cancer [13], and renal cell cancer [14]. However, few studies (with limited sample sizes) have investigated the potential of CT texture analysis for predicting prognosis in esophageal cancer [15][16][17]. The goal of our current study was to assess whether texture analysis of pre-therapy non-contrast CT images can be used as a prognostic marker for OS in a large cohort of patients with non-metastatic esophageal cancer.

Patients
This study was approved by the appropriate institutional review board. We identified 762 consecutive patients with biopsy-confirmed non-metastatic esophageal cancer who had undergone non-contrast CT scans before receiving chemoradiation ± surgery between 1998 and 2011. All patients were required to have pre-therapy CT scans with consistent imaging parameters (please see "CT image acquisition" below); patients with CT scans of differing tube voltage (N=5) and slice thickness (N=45) were excluded from analysis. Beam hardening or metal artifacts were not examined as an exclusion parameter. Disease stage was determined according to the 6th (2002) edition of the American Joint Committee on Cancer staging manual.

Treatment
Chemotherapy consisting of a fluoropyrimidine (IV or oral) and either a platinum compound or a taxane was given concurrently with radiation therapy to a median dose of 50.4 Gy delivered in daily 1.8-Gy fractions. The GTV was defined as all known gross disease based on the non-contrast radiation planning CT and all available clinical information (including baseline PET/CT, diagnostic CT with contrast, and endoscopy/endoscopic ultrasound results). Each GTV for all 762 patients was manually contoured by a radiation oncologist with expertise in the treatment of thoracic/esophageal tumors.

CT image acquisition
All pre-therapy CT images were acquired by using imaging systems manufactured by either GE Medical Systems or Philips. Images were obtained across various machine models, but care was taken to ensure that all image sets had consistent imaging parameters, including: 1) tube voltage of 120 kVp for all patients, 2) consistent slice thickness with a modal value of 2.5 mm (range: 2.5-3 mm), 3) comparable in-slice pixel dimension of 0.98 mm (range 0.94-0.98 mm allowed), and 4) use of a convolution kernel with body filter for all patients.

CT texture analysis and statistical methods
Non-contrast CT image sets obtained from all 762 patients before chemoradiation were available in institutional archives for analysis. To analyze the textural features on CT images of esophageal tumors, 328 distinct quantitative image features (based on tumor geometry, intensity histogram, absolute gradient image [IGR], co-occurrence matrix [COM], and run-length matrix [RLM]) were extracted from the physician-delineated GTV on each image set. Histograms were calculated from pixel intensities, without consideration of spatial relations between pixels. IGR derives features from the gradient magnitude map of the image. COM is a second-order histogram, computed from the intensities of pairs of pixels, and RLM holds counts of pixel runs with a specified gray-scale level and length. Details of these features are described elsewhere [18].
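The feature families described above can be illustrated with a minimal Python sketch. It is not the extraction pipeline of reference [18]; the bin count, the percentile interpolation rule, the pixel offset, and all function names are assumptions made for illustration only.

```python
import math
from collections import Counter

def histogram_entropy(pixels, n_bins=8):
    """Shannon entropy (bits) of a binned intensity histogram.
    n_bins is an illustrative choice, not a value from the study."""
    lo, hi = min(pixels), max(pixels)
    if hi == lo:
        return 0.0  # a constant region carries no histogram entropy
    width = (hi - lo) / n_bins
    counts = Counter(min(int((p - lo) / width), n_bins - 1) for p in pixels)
    n = len(pixels)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())

def percentile(pixels, q):
    """q-th percentile with linear interpolation between order statistics."""
    s = sorted(pixels)
    pos = (len(s) - 1) * q / 100.0
    i = int(math.floor(pos))
    j = min(i + 1, len(s) - 1)
    return s[i] + (s[j] - s[i]) * (pos - i)

def cooccurrence(image, dx=1, dy=0):
    """Second-order histogram: counts of gray-level pairs (a, b) for
    pixels separated by the offset (dx, dy)."""
    counts = Counter()
    rows, cols = len(image), len(image[0])
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dy, c + dx
            if 0 <= r2 < rows and 0 <= c2 < cols:
                counts[(image[r][c], image[r2][c2])] += 1
    return counts

def run_lengths(row):
    """(gray level, run length) pairs along one row of pixels."""
    runs = []
    for v in row:
        if runs and runs[-1][0] == v:
            runs[-1][1] += 1
        else:
            runs.append([v, 1])
    return [tuple(r) for r in runs]
```

The histogram functions ignore pixel positions, whereas `cooccurrence` and `run_lengths` depend on spatial arrangement, which is the distinction drawn in the text between first-order and higher-order texture features.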
For the current study, instead of using all 328 image features, five representative image-texture features (entropy, histogram 10th percentile, tumor volume, volume-to-area ratio, and fraction GTV pruned after thresholding) were ultimately chosen for the final texture analysis. These five features were chosen based on a previously published study which sought to identify CT image features that were reproducible (small variation between sessions or between CT scanners), non-redundant (not highly correlated with other features), and informative (features that vary between patients) [18]. For these features, the noise (change in the value of the feature if the patient is imaged twice) is small compared with the variability in the value of the feature between patients. Thus, based on both this non-small cell lung cancer study [18] and another yet-to-be-published study examining a smaller range of useful imaging features in esophageal cancers, these five features were felt to be the most relevant for assessing CT texture in esophageal tumors. To determine whether these five features were indeed accurate representations of the initial 328 quantitative image features, a random survival forest model [19] was used to compare the model error of these five image features with that of all 328 image features for predicting OS. The error of the random forest model was estimated by computing an estimate of the cumulative hazard function. Each node split of a tree in this ensemble was obtained via a log-rank test. The model's performance was assessed through the out-of-bag (OOB) error rate, estimated as 100 * (1-C), with C being a concordance index [19] that measures how well the random forest correctly ranks survival of any two individuals in the data. In addition to providing an estimate of error rate, the model also outputs a measure of variable importance, i.e. a ranked list of predictor variables that are important to the model [19]. The importance measure of a variable is the prediction error of the entire model (i.e. all the variables) subtracted from the error of a model containing randomized values for that variable. The relative importance value is the importance measure divided by the maximum possible importance value across all the variables.
Coefficients from a Cox proportional hazards model were used to produce an image-derived score for each patient (i.e. coefficient * tumor volume + coefficient * volume-to-area ratio + coefficient * fraction GTV pruned after thresholding + coefficient * histogram 10th percentile + coefficient * entropy = image-derived score, with score ranging from 2.852 to 6.650). Using a k-adaptive partitioning algorithm, we then sought a cutpoint on the score continuum that could stratify survival among the training set consisting of a random 50% of the 762 cases. This identified cutpoint (3.811) was then tested within a validation cohort (consisting of the remaining 50% of the cases not present in the training cohort). The training set and validation set were balanced by vital status; although we did not balance by demographics, baseline covariates were controlled for in the multivariate model described below. As another validation method, we also used the k-adaptive partitioning algorithm to derive a cutpoint along the continuum of the image-derived score for the set of patients who received preoperative chemoradiation followed by surgery. Using a log-rank test, we then assessed whether this cutpoint could also induce a statistically significant survival difference in the set of patients who received definitive chemoradiation alone.
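The error-rate formula 100 * (1-C) can be made concrete with a short sketch of a pairwise concordance index for right-censored data. This is a generic Harrell-style estimate written for illustration; the random survival forest software of reference [19] may handle ties and censored pairs differently, and the function names are assumptions.

```python
def concordance_index(times, events, risks):
    """Fraction of comparable patient pairs ranked correctly.

    times  : follow-up time for each patient
    events : 1 if death was observed, 0 if censored
    risks  : predicted risk (higher should mean shorter survival)
    A pair is comparable when the patient with the shorter time had an
    observed death. Tied risks count as half-concordant.
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        for j in range(n):
            if events[i] and times[i] < times[j]:
                comparable += 1
                if risks[i] > risks[j]:
                    concordant += 1.0
                elif risks[i] == risks[j]:
                    concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable

def error_rate(c_index):
    """Error rate in percent, as defined in the text: 100 * (1 - C)."""
    return 100.0 * (1.0 - c_index)
```

Under this definition, a model that ranks pairs no better than chance has C near 0.5 and an error rate near 50%, so error rates below 50% indicate some discriminative ability.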
Next, to assess whether this image-derived score would remain an independent predictor of survival outcomes, a new multivariate Cox proportional hazards model was constructed using the image-derived score and other potential prognostic variables including age at diagnosis, Karnofsky performance status (KPS), baseline PET standardized uptake value (SUV) both as a continuous and a dichotomized variable (using a clinical and median cut-off of SUV ≤ 2 vs. >2 and SUV ≤ 10 vs. >10, respectively), disease stage, tumor histology, radiation modality, tumor grade and length, pathologic response to chemoradiation, receipt of induction chemotherapy, and treatment with surgery.
Dates of death were determined by reviewing clinical follow-up information in the patients' medical records and the Social Security Death Index. OS was calculated from date of diagnosis to date of death or last follow-up.
Table 1 summarizes the patient-, disease-, and treatment-related characteristics of the study group. The median age at diagnosis was 64 years; 84% (643/762) were men with moderate-to-poorly differentiated esophageal adenocarcinomas. While 49.7% (379/762) of the patients received trimodality therapy with neoadjuvant chemoradiation followed by surgery, 50.3% (383/762) received definitive chemoradiation alone.
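The OS definition given above (date of diagnosis to date of death or last follow-up) maps onto the usual time-plus-event-indicator encoding for survival analysis. The sketch below is illustrative only; the function name and the conversion of days to months (30.4375 days per month) are assumptions, not details taken from the study.

```python
from datetime import date

DAYS_PER_MONTH = 30.4375  # assumed average month length (365.25 / 12)

def overall_survival(diagnosis, death=None, last_follow_up=None):
    """Return (months, event) for one patient.

    event is True when a death date is known; otherwise the patient is
    censored at the date of last follow-up.
    """
    end = death if death is not None else last_follow_up
    if end is None:
        raise ValueError("need a death date or a last follow-up date")
    if end < diagnosis:
        raise ValueError("end date precedes diagnosis")
    months = (end - diagnosis).days / DAYS_PER_MONTH
    return months, death is not None
```

For example, `overall_survival(date(2005, 1, 1), last_follow_up=date(2007, 1, 1))` returns a censored observation, which contributes follow-up time to the analysis without counting as a death.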

Quantitative image features
As previously noted, a total of 328 quantitative image features based on tumor geometry, intensity histogram, IGR, COM, and RLM were extracted from the esophageal cancer GTV for each image set. A random survival forest model was used to compare the concordance of five of these features (entropy, histogram 10th percentile, tumor volume, volume-to-area ratio, and fraction GTV pruned after thresholding) versus that of all 328 quantitative image features for predicting OS. The estimated error rates for predicting OS were 43.6% for all 328 variables vs. 43.9% for the five image variables. Because the five image features yielded a comparable error rate (44%) for predicting OS, only those five were used to create an image-derived score for subsequent analysis. Notably, the error rate of those five image variables was lower than that of overall clinical stage (56%) and similar to that of baseline PET SUV (45%) for predicting OS. A detailed list and the relative importance of each of the five image texture features are displayed in Table 2.

Prediction of survival based on image-derived score
Next, using Cox proportional hazards modeling, we constructed an image-derived score to model the relationship between OS and the five imaging variables within the training set (random sample of 50% of the study group) (Table 3). A cutpoint along this image-derived score (derived from the training set) was able to stratify OS with statistical significance (P=0.0119) in the validation set (Figure 1A). Similarly, a cutpoint was derived along the continuum of the image-derived score for the set of patients treated with preoperative chemoradiation followed by surgery. This cutpoint along the image-derived score was also seen to dichotomize survival (P=0.0086) in the remaining patients who received definitive chemoradiation alone (Figure 1B).
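The score-then-stratify procedure described above can be sketched as a linear score, a split at a cutpoint, and a two-group log-rank test. This is a generic illustration rather than the authors' code: the coefficient values passed to `image_score` are placeholders supplied by the caller (the fitted coefficients belong to Table 3), assigning scores equal to the cutpoint to the low group is an assumption, and the k-adaptive partitioning step that selects the cutpoint is not reproduced.

```python
import math

def image_score(features, coefficients):
    """Linear score: sum of coefficient * feature over the five features.
    Both arguments are dicts keyed by (hypothetical) feature names."""
    return sum(coefficients[name] * value for name, value in features.items())

def split_by_cutpoint(scores, cutpoint):
    """Indices of patients at or below the cutpoint, and above it."""
    low = [i for i, s in enumerate(scores) if s <= cutpoint]
    high = [i for i, s in enumerate(scores) if s > cutpoint]
    return low, high

def logrank_test(times_a, events_a, times_b, events_b):
    """Two-group log-rank test. Returns (chi-square, p-value), 1 df."""
    data = [(t, e, 0) for t, e in zip(times_a, events_a)]
    data += [(t, e, 1) for t, e in zip(times_b, events_b)]
    obs_minus_exp = 0.0
    variance = 0.0
    for t in sorted({u for u, e, _ in data if e}):
        n_a = sum(1 for u, _, g in data if u >= t and g == 0)  # at risk, A
        n_b = sum(1 for u, _, g in data if u >= t and g == 1)  # at risk, B
        d_a = sum(1 for u, e, g in data if u == t and e and g == 0)
        d_b = sum(1 for u, e, g in data if u == t and e and g == 1)
        n, d = n_a + n_b, d_a + d_b
        obs_minus_exp += d_a - d * n_a / n
        if n > 1:
            variance += d * (n_a / n) * (n_b / n) * (n - d) / (n - 1)
    if variance == 0.0:
        return 0.0, 1.0
    chi2 = obs_minus_exp ** 2 / variance
    # Survival function of a chi-square variable with 1 degree of freedom
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value
```

In a train/validate design of the kind described, the cutpoint is fixed on the training half and `logrank_test` is then applied only to the two groups it induces in the held-out half, so the reported P value is not inflated by the cutpoint search.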

Multivariate analysis of survival outcomes
Other known prognostic patient and disease factors, along with the image-derived score, were then included in a multivariate analysis of predictors of OS. The five-feature image-derived score remained an independent predictor of OS on multivariate analysis (P=0.009), as did treatment with surgery (P<0.0001), overall clinical stage (P=0.0003), tumor grade (P=0.03), and radiation modality (P=0.03) (Table 4).
Table 4: Multivariate analysis of potential predictors of overall survival outcomes. Abbreviations: CI, confidence interval; PET, positron emission tomography; SUV, standardized uptake value; IMRT, intensity-modulated radiation therapy; 3D-CRT, three-dimensional conformal radiation therapy. * Note: baseline PET SUV was also analyzed as a dichotomized variable (using a clinical cutoff of SUV ≤ 2 vs. >2, and a median cutoff of SUV ≤ 10 vs. >10) in our multivariate model, and its P value for predicting OS remained >0.05.
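As a reading aid for the reported hazard ratio (HR 1.61, 95% CI 1.13-2.29, P=0.009), the sketch below shows how a hazard ratio, its confidence interval, and a Wald P value relate to one another on the log scale. It assumes a symmetric normal-approximation interval for the log hazard ratio, which is the conventional construction but is not stated in the text; the function name is illustrative.

```python
import math

def wald_from_hr_ci(hr, ci_low, ci_high, z_crit=1.959964):
    """Recover the standard error, z statistic, and two-sided P value
    implied by a hazard ratio and its 95% confidence interval."""
    log_hr = math.log(hr)
    # CI half-width on the log scale equals z_crit * standard error
    se = (math.log(ci_high) - math.log(ci_low)) / (2.0 * z_crit)
    z = log_hr / se
    # Two-sided tail probability of a standard normal variable
    p_value = math.erfc(abs(z) / math.sqrt(2.0))
    return se, z, p_value

se, z, p_value = wald_from_hr_ci(1.61, 1.13, 2.29)
```

With the values reported for the image-derived score, this approximation gives a P value slightly below 0.01, in line with the published P=0.009; small differences are expected because the published figures are rounded.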

Discussion
Our findings suggest that textural features extracted from pretreatment CT images may serve as an independent predictor of OS in patients with non-metastatic esophageal carcinoma, even after adjusting for other known prognostic covariates. While textural features have been shown to provide predictive data on outcomes in other cancers, establishing their ability to predict survival outcomes in esophageal cancer specifically not only further validates texture analysis as a legitimate prognostic biomarker that warrants prospective evaluation, but also highlights CT texture analysis as a novel method that can be used to risk-stratify patients and guide management decisions in a cancer with a dismal prognosis. Patients with the worst prognoses, for example, could be guided towards randomized trials evaluating targeted agents beyond standard chemoradiation. Similarly, the need for induction chemotherapy before chemoradiation, or surgery after chemoradiation, could be better considered in light of the risk group in which patients are placed based on this prognostic imaging-biomarker score.
One possible explanation for the link between textural appearance of tumors on CT and patient survival is the relationship between tumor heterogeneity and hypoxic voids/areas of necrosis (as related to tumor vascularity), which may be seen as differences in pixel intensity/attenuation on CT [11]. Hypoxia can then result in oxidative stress, promotion of survival factors, increased tumor aggression, and treatment resistance [7]. Tumor textural features have in fact been linked with tumor hypoxia on histologic examinations of non-small cell lung cancer (NSCLC) patients [20]. Because hypoxia is a recognized marker of poor outcome, a relationship between tumor hypoxia, tumor heterogeneity, CT textural heterogeneity, and survival outcomes would make sense from a biologic standpoint.
Previous studies have indeed shown CT texture analysis to have promise as a prognostic and predictive marker in a variety of types of cancer. In NSCLC, tumor heterogeneity as assessed by CT textural analysis and disease stage seemed to independently predict survival and was more predictive than tumor uptake of fluorodeoxyglucose [11,21]. In colorectal cancer, liver texture on portal phase CT images was superior to CT perfusion images at predicting survival [13]. However, only two studies have been done to date assessing CT texture features as a potential prognostic biomarker in esophageal cancer. Ganeshan et al. [16] found associations between CT textural features, high tumor metabolism, and advanced disease stage in 21 patients with esophageal cancer; interestingly, CT textural features independently predicted survival but SUV and disease stage did not. Another study by Yip et al. [15] confirmed the association between CT textural features and survival time in 36 patients with esophageal tumors undergoing contrast-enhanced CT before and after chemoradiation. These preliminary studies highlighted the potential of using CT textural analysis as a prognostic marker in esophageal cancer; however, drawing conclusions is difficult given the relatively small numbers of patients analyzed.
The study reported here further validates these results by examining the utility of baseline CT texture analysis in a larger and relatively more uniform cohort of 762 esophageal cancer patients, all treated with chemoradiation. Although a recent meta-analysis could not confirm SUV to be independent of other predictive factors such as stage [22], we found the CT imaging-derived score to independently predict OS on multivariate analysis, despite adjustments for disease stage, baseline SUV, and other known prognostic covariates. In fact, similar to findings in previous studies, baseline SUV (whether continuous or dichotomized) also did not independently predict OS in our study. This suggests that although the CT texture score was comparable to SUV in predicting OS in our random survival forest model, it may provide additional prognostic information beyond SUV alone.
Other imaging-related characteristics being evaluated for prognostic value in esophageal cancer include post-treatment SUVmax, SUVmean, SUVpeak [23,24], diffusion-weighted magnetic resonance imaging of tumors [25], dynamic contrast-enhanced CT measurements of tumor perfusion [26], and use of other metabolic tracers with PET [16]. Future comparative studies of CT textural analysis against these other potential imaging biomarkers are needed. Another step would be to combine CT texture analysis with other established prognostic clinical variables to derive an even more sophisticated model for predicting outcomes, but this would require extensive validation and more detailed exploration beyond the scope of this paper.
Limitations of our study include its retrospective nature. Because our image-derived score was based on a random sample of patients from one institution, the robustness of the cutpoint for that score is unclear and our results may not be generalizable to other centers. Nevertheless, our image-derived score was able to stratify patients into distinct risk groups, and remained an independent predictor of OS even as a continuous variable. Another limitation is the potential for gas to have been present within the delineated tumors, which could influence the values of the extracted quantitative features; we minimized this confounding by using thresholding techniques. Finally, textural analyses were done on CT scans obtained without intravenous contrast enhancement. The inclusion of contrast material could improve the ability of CT textural analysis to pick up subtleties in vascular heterogeneity; however, not every patient will be able to tolerate contrast.
Despite these limitations, this study is still the largest to date to investigate the potential of using CT textural analysis as an imaging-based biomarker of prognosis in esophageal cancer. Our results suggest that features extracted from pretreatment CT images can independently predict OS in patients with non-metastatic esophageal carcinoma and warrant further investigation.