Abstract
BACKGROUND AND PURPOSE: Two-thirds of lymphatic malformations in children are found in the head and neck. Although conventionally managed through surgical resection, percutaneous sclerotherapy has gained popularity. No reproducible grading system has been designed to compare sclerotherapy outcomes on the basis of radiologic findings. We propose an MR imaging–based grading scale to assess the response to sclerotherapy and present an evaluation of its interrater reliability.
MATERIALS AND METHODS: A grading system was developed to stratify treatment outcomes on the basis of interval changes observed on MR imaging. By means of this system, 56 consecutive cases from our institution with formally diagnosed head and neck lymphatic malformations treated by sclerotherapy were retrospectively graded. Each patient underwent pre- and posttreatment MR imaging. Each study was evaluated by 3 experienced neuroradiologists. Interrater reliability was assessed using the Krippendorff α statistic, intraclass coefficient, and 2-way Spearman ρ correlation.
RESULTS: The overall Krippendorff α statistic was 0.93 (95% CI, 0.89–0.95), denoting excellent agreement among raters. Intraclass coefficients with respect to consistency and absolute agreements were both 0.97 (95% CI, 0.96–0.98), illustrating low variability. Every combination of individual rater pairs demonstrated statistically significant (P < .01) Spearman ρ rank correlations, with values ranging from 0.90 to 0.95.
CONCLUSIONS: The proposed radiographic grading scale demonstrates excellent interrater reliability. Adoption of this new scale can standardize reported outcomes following sclerotherapy for head and neck lymphatic malformation and may aid in the investigation of future questions regarding optimal management of these lesions.
ABBREVIATIONS:
- BDL: Berenstein-De Leacy
- LM: lymphatic malformation
- LVM: lymphatic-venous malformation
Lymphatic malformations (LMs) and lymphatic-venous malformations (LVMs) are low-flow vascular malformations that arise as a result of erroneous vascular development during embryogenesis.1,2 Both malformations are characterized by distended lymphatic channels, with additional anomalous venous channels present in LVMs. LMs and LVMs reflect 2 distinct classifications; however, clinical differentiation of these subtypes is lacking in much of the literature.
LMs most commonly present in childhood, with an incidence of 1/20,000 children admitted to the hospital compared with 1/100,000 adults admitted.3 They are unlikely to regress spontaneously and demonstrate growth proportionate to body size.4 Most of these lesions are diagnosed before 2 years of age, at which point 90% come to attention due to symptoms including cosmetic disfigurement, recurrent infection, bleeding, or compression of adjacent structures.3,5 They represent approximately 5% of benign tumors in infants and children and are located in the head and neck in 66% of cases.3,6
Despite nuanced etiologic and pathologic distinctions, treatment approaches to LMs and LVMs are currently similar; thus, the 2 will hereafter be considered as a grouped category (LM-LVMs). Surgical excision of LM-LVMs in the head and neck has proved challenging due to the close proximity of the lesions to vital structures, which often leads to subtotal resection and future recurrence.7 Particularly in the case of lesions located around the face and upper airway, total resection has the potential to cause pronounced deformity and/or functional impairment (respiratory, digestive, and neurologic).8 As a result, percutaneous sclerotherapy has emerged as a common treatment technique for both macro- and microcystic LM-LVMs in these regions.
Despite the widespread treatment of head and neck LM-LVMs, a standardized grading scale for the assessment of the results is lacking. The radiographic criteria for the evaluation of outcomes have not yet been developed. In 1995, a preoperative staging scale for LMs was proposed by de Serres et al9 to predict the prognosis and outcome of surgical intervention on the basis of lesion location. More recently, Balakrishnan et al10 developed a consensus statement recommending standardized clinical outcome measures for studies evaluating the treatment of head and neck LMs. Only one of these measures, lesion volume, is based on radiographic evaluation.
A reliable and reproducible grading system for radiographic treatment outcomes of head and neck LM-LVMs offers the ability to both refine reporting standards and clarify communication between treating physicians. We propose 1 such system based on contrast-enhanced MR imaging and evaluate its interrater reliability in a cohort of patients treated for head and neck LM-LVM with percutaneous sclerotherapy. The soft-tissue detail, absence of ionizing radiation, safety profile, and ubiquity of MR imaging make it an ideal technique on which to develop imaging-based criteria.
MATERIALS AND METHODS
Patient Selection and Treatment
This study was approved by an institutional review board (Mount Sinai Hospital, New York). We retrospectively reviewed 56 cases of fluoroscopically guided percutaneous sclerotherapy for head and neck LM-LVMs from 2005 to 2019. Lesions were initially diagnosed using MR imaging and/or sonography in conjunction with the clinical examination. LMs and LVMs were differentiated by the presence of enhancement on postcontrast T1 sequences. The de Serres stage and lesion architecture (macrocystic, microcystic, or mixed) were determined by preprocedural MR imaging. All patients underwent pre- and posttreatment MR imaging, and patients of all ages were included. Sclerosants used during therapy included bleomycin and doxycycline. To increase the sample of patients with findings representative of lesion progression, we included 5 patients who did not undergo sclerotherapy between MR imaging time points. Raters were blinded to the treatment status of all patients.
The LM-LVM response to therapy was evaluated via the comparison of pre- and postprocedural MR imaging. Change in lesion size was evaluated on axial T2-weighted fat-saturated sequences and in orthogonal planes when available. In the case of LVMs and for the evaluation of granulation tissue response, contrast-enhanced fat-saturated T1 sequences were used.
Imaging Protocol
All patients underwent MR imaging on a 3T Magnetom Skyra MR imaging system (Siemens). A combination of a 12-element head and neck coil was used for radiofrequency signal reception. MR imaging protocol included T1 (TR/TE = 530/17 ms, flip angle = 150°, voxel = 0.6 × 0.6 × 5 [section] mm, axial plane), T2-fat saturation (TR/TE = 3600/90 ms, flip angle = 154°, voxel = 0.6 × 0.6 × 5 [section] mm, axial and coronal planes), and T1 postcontrast with fat saturation (TR/TE = 560/11 ms, flip angle = 180°, voxel = 0.6 × 0.6 × 5 [section] mm, axial and coronal planes). To improve the homogeneity of fat-suppression in the neck, we used the Dixon fat-suppression technique as reported in a prior study.11 A total of 0.1 mmol/kg of gadolinium was administered for postcontrast imaging.
Grading Scale and Statistical Analysis
The proposed grading system—the Berenstein-De Leacy (BDL) system—is summarized in Table 1. It categorizes treatment responses into 7 distinct grades and includes the descriptive modifier “B,” which can be added to any grade to signify granulation tissue formation in the treatment bed. Radiographic improvement is stratified across 4 grades (1–4). The remaining grades (5–7), respectively, signify no change, regression with extension into an untreated area, and gross progression. The grading system is discussed in detail below.
The BDL system was used by 3 neuroradiologists (R.D.L., A.D., K.N.) with expertise in head and neck imaging to grade interval changes in lesion volume in 56 patients with head and neck LM-LVMs. Fifty-one patients underwent fluoroscopically guided percutaneous sclerotherapy within the imaging interval, and 5 patients had no treatment. Grading was conducted in an independent and blinded fashion. Estimation of relative residual lesion volume was based on visual assessment without the aid of automated tools for volumetric analysis. All differences in scoring were resolved by discussion and consensus.
To assess the interrater reliability of the proposed scale, we conducted 3 statistical analyses. First, we calculated the Krippendorff α statistic, a commonly used metric of interrater reliability, treating the grading criteria as an ordinal scale. Next, intraclass coefficients for consistency and absolute agreement were calculated to evaluate the variability of a single outcome grade with respect to the variation across all cases. Finally, the 2-way Spearman ρ correlation was calculated to assess rank (monotonic) correlations between all pairs of raters in a nonparametric fashion. A P value < .05 was set to demarcate statistical significance.
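As a rough illustration of the first analysis, the ordinal Krippendorff α can be computed from a coincidence matrix of paired grades. The sketch below is a minimal pure-Python version and is not the SPSS procedure used in the study; the function name `krippendorff_alpha_ordinal` and the toy ratings are hypothetical.

```python
from itertools import permutations


def krippendorff_alpha_ordinal(ratings, categories):
    """Ordinal Krippendorff alpha (illustrative sketch, hypothetical helper).

    ratings: list of cases, each a list of grades (one per rater, None if missing).
    categories: all possible grades in rank order, e.g. range(1, 8).
    """
    index = {c: i for i, c in enumerate(categories)}
    k = len(categories)

    # Coincidence matrix: every ordered pair of grades within a case,
    # weighted by 1 / (number of grades in that case - 1).
    o = [[0.0] * k for _ in range(k)]
    for case in ratings:
        vals = [index[v] for v in case if v is not None]
        m = len(vals)
        if m < 2:
            continue  # a case graded by fewer than 2 raters carries no pairs
        for a, b in permutations(range(m), 2):
            o[vals[a]][vals[b]] += 1.0 / (m - 1)

    n_c = [sum(row) for row in o]  # marginal total per grade
    n = sum(n_c)

    def delta2(c, d):
        # Ordinal distance: squared count of values lying between the two ranks.
        lo, hi = min(c, d), max(c, d)
        return (sum(n_c[lo:hi + 1]) - (n_c[c] + n_c[d]) / 2.0) ** 2

    pairs = [(c, d) for c in range(k) for d in range(k)]
    d_obs = sum(o[c][d] * delta2(c, d) for c, d in pairs) / n
    d_exp = sum(n_c[c] * n_c[d] * delta2(c, d) for c, d in pairs) / (n * (n - 1))
    if d_exp == 0:
        raise ValueError("alpha is undefined when only one grade is ever used")
    return 1.0 - d_obs / d_exp


# Toy data (not study data): 4 cases, 3 raters, grades on a 1-7 scale.
toy = [[1, 1, 2], [3, 3, 3], [5, 5, 5], [7, 7, 7]]
print(krippendorff_alpha_ordinal(toy, range(1, 8)))
```

Because the distance term grows with the number of ranks separating two grades, a disagreement between adjacent grades lowers α far less than a disagreement between grades at opposite ends of the scale.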
For all of the above tests, we determined the strength of interrater reliability according to the criteria proposed by Cicchetti and Sparrow12: <0.40, poor; 0.40–0.59, fair; 0.60–0.74, good; ≥0.75, excellent.13 A value of 1.00 indicates perfect agreement, 0 indicates no better than chance, and negative values indicate worse than chance. Statistical analysis was conducted using SPSS, Version 22.0 (IBM).
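The interpretive bands above can be expressed as a small lookup. This is a minimal sketch with a hypothetical function name; because the bands are stated to 2 decimal places, it assumes that values falling between bands (for example, 0.595) belong to the lower band.

```python
def interpret_reliability(value):
    """Map a reliability coefficient to the bands quoted in the text.

    Assumption: cut points are applied at 0.40, 0.60, and 0.75, so a value
    such as 0.595 is labeled "fair".
    """
    if value < 0.40:
        return "poor"
    if value < 0.60:
        return "fair"
    if value < 0.75:
        return "good"
    return "excellent"


for coefficient in (0.93, 0.97, 0.90):
    print(coefficient, interpret_reliability(coefficient))
```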
RESULTS
Pre- and posttreatment MRIs of 56 patients were presented to 3 raters in a random order. The median age of patients on final imaging was 6.38 years (range, 0.3–73.9 years), and 28 patients (50.0%) were female. The median imaging interval was 32.5 months (range, 1–131), and the median number of sclerotherapy treatments received was 1.5 (range, 0–13); 66.1% of lesions were classified as pure LMs. LM-LVMs were localized to the right side of the head and neck in 18 cases (32.1%) and the left in 16 cases (28.6%) and were bilateral in 22 cases (39.3%). Nine LM-LVMs were classified as macrocystic (16.1%), 19 were microcystic (33.9%), and 28 were mixed (50.0%). Anatomic locations representing all preoperative stages of the de Serres criteria were present in the validation cohort. Patient demographics, imaging interval, number of treatments, and LVM characteristics are summarized in Tables 2 and 3.
There was unanimous agreement among raters in 39 cases (69.6%), agreement between 2 of the 3 raters in 16 cases (28.6%), and no agreement on initial grading in 1 case (1.8%). All discrepancies were resolved via discussion. All grades of the BDL scale, including the modifier for granulation tissue, were represented in the validation cohort. The prevalence of each grade is presented in Table 4. Stratification of grade prevalence by lesion architecture is available in the Online Supplemental Data.
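The agreement breakdown reported here amounts to a simple tally over the per-case grades. The following sketch shows one way such a tally could be produced; the function names and the toy grades are hypothetical and are not the study data.

```python
from collections import Counter


def agreement_pattern(grades):
    """Classify one case by how many raters share the most common grade."""
    top = Counter(grades).most_common(1)[0][1]
    if top == len(grades):
        return "unanimous"
    if top > 1:
        return "partial"  # with 3 raters, this means 2 of 3 agree
    return "none"


def tally_agreement(cases):
    """Return {pattern: (count, percentage of all cases)}."""
    counts = Counter(agreement_pattern(g) for g in cases)
    total = len(cases)
    return {
        p: (counts[p], round(100.0 * counts[p] / total, 1))
        for p in ("unanimous", "partial", "none")
    }


# Toy grades from 3 raters for 4 cases (illustrative only).
toy_cases = [(1, 1, 1), (2, 2, 3), (4, 5, 6), (7, 7, 7)]
print(tally_agreement(toy_cases))
```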
The overall Krippendorff α statistic was 0.93 (95% CI, 0.89–0.95), denoting excellent agreement. Both intraclass coefficients with respect to consistency and absolute agreements were 0.97 (95% CI, 0.96–0.98), denoting excellent consistency. A 2-way Spearman ρ correlation was calculated between every pair of raters: K.N. and A.D. demonstrated a ρ of 0.90 (P < .001), K.N. and R.D.L. demonstrated a ρ of 0.93 (P < .001), and A.D. and R.D.L. demonstrated a ρ of 0.95 (P < .001). These values denote a strong and significant monotonic correlation between individual rater pairs. Measurements of interrater reliability are summarized in Table 5.
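To make the distinction between consistency and absolute agreement concrete, the sketch below computes two-way intraclass coefficients from ANOVA mean squares, together with pairwise Spearman ρ values. It is an illustration only: the text does not state whether single or average measures were reported, so single-measure forms are assumed here, and all function names and toy grades are hypothetical.

```python
from itertools import combinations


def icc_two_way(table):
    """Single-measure ICCs for a cases-by-raters table (assumed form).

    Returns (consistency, absolute_agreement).
    """
    n, k = len(table), len(table[0])
    grand = sum(map(sum, table)) / (n * k)
    row_means = [sum(r) / k for r in table]
    col_means = [sum(r[j] for r in table) / n for j in range(k)]
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((v - grand) ** 2 for r in table for v in r)
    ms_r = ss_rows / (n - 1)                                   # between cases
    ms_c = ss_cols / (k - 1)                                   # between raters
    ms_e = (ss_total - ss_rows - ss_cols) / ((n - 1) * (k - 1))  # residual
    consistency = (ms_r - ms_e) / (ms_r + (k - 1) * ms_e)
    absolute = (ms_r - ms_e) / (ms_r + (k - 1) * ms_e + k * (ms_c - ms_e) / n)
    return consistency, absolute


def average_ranks(values):
    """Rank from 1..n, giving tied values the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for t in range(i, j + 1):
            ranks[order[t]] = (i + j) / 2.0 + 1.0
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Pearson correlation of the tie-averaged ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    sxy = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sxx = sum((a - mx) ** 2 for a in rx)
    syy = sum((b - my) ** 2 for b in ry)
    return sxy / (sxx * syy) ** 0.5


def pairwise_spearman(grades_by_rater):
    """Spearman rho for every unordered pair of raters."""
    return {
        (a, b): spearman_rho(grades_by_rater[a], grades_by_rater[b])
        for a, b in combinations(sorted(grades_by_rater), 2)
    }


# Toy grades (not study data); rater labels are placeholders.
toy = {"rater_1": [1, 3, 5, 7, 2], "rater_2": [1, 3, 4, 7, 2], "rater_3": [2, 3, 5, 6, 2]}
print(pairwise_spearman(toy))
print(icc_two_way([list(case) for case in zip(*toy.values())]))
```

A rater who grades every case exactly one step higher than a colleague would leave the consistency coefficient at 1 while lowering the absolute-agreement coefficient, which is why near-identical values for the two forms suggest little systematic offset between raters.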
DISCUSSION
To the best of our knowledge, the BDL system described and evaluated above is the only grading system to be put forward for assessing the radiologic response to therapy in patients with head and neck LM-LVMs treated by either surgical excision or percutaneous sclerotherapy. Among 3 separate raters evaluating 56 different cases of LM-LVM treatment, it has demonstrated excellent consistency and rates of agreement, suggesting that it may be useful as a tool for both clinical communication and radiographic outcomes reporting.
The recent proliferation of studies evaluating techniques and agents for the treatment of LM-LVMs14–19 necessitates a standardized method for radiographic outcome reporting that can provide a common language for accurate comparison among independent trials. The utility of such a system is well-exemplified by the radiographic Response Evaluation Criteria in Solid Tumors for response to treatment in solid tumors, which, since its introduction in 2000, has allowed reliable comparison of different treatment trials during the past 2 decades.20 Similar benefits extend across a broad range of disease interventions, including intracranial aneurysm embolization (Modified Raymond-Roy Classification), stroke intervention (Modified Treatment in Cerebral Ischemia score), and others.21,22
Important steps toward improved standardization have already been taken with regard to preoperative staging and clinical outcome reporting, and this scale is intended to supplement this effort by providing a standard, simple, and reproducible measure of radiographic treatment results.
The 7-grade scale proposed and evaluated here stratifies radiographic improvement across 4 levels, illustrated by case examples in Figs 1–4. These grades include the following: 1) complete regression of the lesion, 2) near-complete regression with trace residual lesion, 3) partial regression with <50% residual lesion, and 4) partial regression with >50% residual lesion. Grade 5 indicates minimal or no interval change in the treated LM-LVM (Fig 5). Regression of the LM-LVM in 1 area with expansion into a previously uninvolved area is denoted by grade 6 (Fig 6). Finally, gross interval progression is denoted by grade 7 (Fig 7). The formation of granulation tissue within the treatment bed, which may occur in conjunction with any radiographic grade, is indicated by the modifier B as illustrated in Fig 3. This modifier was omitted from analysis to preserve the statistical power of interrater testing; thus, its interrater reliability has not been evaluated. Nonetheless, it is included as a possible addition to each grade due to the clinical importance of granuloma formation, which is pathologically distinct from the LVM itself and may impede the improvement of cosmetic deficits or mass-related symptoms.
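For groups recording BDL grades in a database or analysis script, the scale described above could be encoded as a small enumeration. The sketch below is one possible representation; the identifier names and the "3B" label format are assumptions made for illustration, and grades remain a matter of visual assessment by the reader rather than automated assignment.

```python
from enum import IntEnum


class BDLGrade(IntEnum):
    """The 7 grades as described in the text (member names are hypothetical)."""
    COMPLETE_REGRESSION = 1
    NEAR_COMPLETE_REGRESSION = 2        # trace residual lesion
    PARTIAL_REGRESSION_UNDER_50 = 3     # <50% residual lesion
    PARTIAL_REGRESSION_OVER_50 = 4      # >50% residual lesion
    MINIMAL_OR_NO_CHANGE = 5
    REGRESSION_WITH_NEW_EXTENSION = 6   # regression in 1 area, expansion into another
    GROSS_PROGRESSION = 7


def format_grade(grade, granulation_tissue=False):
    """Return a label such as "3" or "3B" (label format is an assumption).

    The B modifier marks granulation tissue in the treatment bed and may
    accompany any grade.
    """
    grade = BDLGrade(grade)  # raises ValueError for values outside 1-7
    return f"{grade.value}{'B' if granulation_tissue else ''}"


def is_radiographic_improvement(grade):
    """Grades 1-4 denote radiographic improvement."""
    return BDLGrade(grade) <= BDLGrade.PARTIAL_REGRESSION_OVER_50


print(format_grade(BDLGrade.PARTIAL_REGRESSION_UNDER_50, granulation_tissue=True))
print(is_radiographic_improvement(6))
```

Keeping the modifier separate from the numeric grade mirrors the analysis in this study, in which the ordinal grade was tested for reliability and the modifier was not.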
The common standard offered by the BDL system is designed to include a number of advantages over current consensus recommendations that include “LM volume” as the only standardized radiographic indicator of treatment outcome.10 By treating the LM-LVM response to therapy as a percentage change, it avoids direct estimation of LM-LVM volume, which, in the absence of advanced segmentation methods, is prone to significant error due to the irregular geometry commonly observed in these lesions. Furthermore, by considering the change across an imaging interval, the system captures treatment response as opposed to the absolute volume of the residual lesion after treatment. The BDL system was not designed to assess the clinical efficacy of embolization procedures but rather to provide a standardized system by which radiographic changes in head and neck LM-LVMs can be reported for the purposes of research and clinical communication. Nonetheless, it is structured in a way that allows future validation studies correlated to clinical outcomes.
The primary limitation of this study comes as a result of the validation cohort sample size, which is relatively small due to the rarity with which these malformations present for treatment. As a result, consistency and agreement were evaluated only with respect to the scale as a whole, and not within each grade of the scale individually. Similarly, the descriptive B, which in our proposed grading system represents the modifier used for the presence of granulation tissue within the treatment bed, was also excluded from analysis. Further studies evaluating this scale in a larger cohort could be conducted to overcome this limitation. Other promising areas for future study include validation of the scale with regard to open surgical treatment outcomes, correlation to clinical examination findings, and correlation with meaningful clinical end points, most notably recurrence rates.
CONCLUSIONS
The present study demonstrates that the BDL system for assessing the treatment of head and neck LM-LVMs has excellent consistency and rates of agreement. This is the only grading system to be put forward for assessing the response to therapy in patients with head and neck LM-LVMs treated by either surgical excision or percutaneous sclerotherapy. This system offers an effective means by which to streamline clinical communication and standardize radiographic outcome reporting for the treatment of these lesions.
Footnotes
Paper previously presented as a poster at: Annual Meeting of the American Society of Neuroradiology, May 30 to June 4, 2020; Virtual. Submission No. 2476.
Disclosures: Reade De Leacy—UNRELATED: Consultancy: Cerenovus. Deeksha Chada—UNRELATED: Employment: Icahn School of Medicine at Mount Sinai, Comments: I am a salaried full-time employee (as a Data Analyst) within the Department of Neurosurgery at Icahn School of Medicine at Mount Sinai; Other: Columbia University Mailman School of Public Health, Comments: I was a graduate student research assistant in the Department of Epidemiology at the Mailman School of Public Health from Fall 2018 to Spring 2019 (paid hourly for approximately 20 hours a week). Kambiz Nael—UNRELATED: Consultancy: Olea Medical. Alejandro Berenstein—UNRELATED: Board Membership: Bendit Technology, EndoStream; Consultancy: Ceranova, Magneto; Payment for Lectures Including Service on Speakers Bureaus: Ceranova, Comments: educational Webinar; Royalties: AngioDynamics; Payment for Development of Educational Presentations: Ceranova; Stock/Stock Options: Bendit Technology, Magneto, EndoStream, MIVI Neuroscience, Scientia, Rapid; Other: MicroVention. *Money paid to the institution.
- Received December 9, 2020.
- Accepted after revision May 28, 2021.
- © 2021 by American Journal of Neuroradiology