Abstract
BACKGROUND: Publication bias and/or true heterogeneity can skew aggregate impressions from scientific literature. To better determine aggregate measures for unruptured intracranial aneurysm (UIA) treatment, we analyzed adverse outcome rates of surgical clipping and endovascular coil embolization.
METHODS: Two independent reviewers searched MEDLINE for studies publishing adverse outcome rates for endovascular coiling and surgical clipping between January 1990 and July 2003. Studies were classified as single-center, multicenter, or community-based. We defined adverse outcome rates as combined all-cause early or in-hospital morbidity and mortality. We determined cumulative adverse outcome rates by plotting precision measure (sample size) against trial-specific effect (adverse outcome rate).
FINDINGS: We included 4 endovascular coiling multicenter/community-based studies (1019 patients) and 13 single-center studies (810 patients) and 5 surgical clipping multicenter/community-based studies (10,541 patients) and 23 single-center studies (1759 patients). Cumulative adverse outcome rates for endovascular coiling and surgical clipping were 8.8% (95% confidence interval [CI] 7.6%–10.1%) and 17.8% (95% CI 17.2%–18.6%).
INTERPRETATION: Scattergram distribution illustrated the magnitude of bias in current literature reporting UIAs. Major parts of the literature may have underestimated surgical clipping morbidity and mortality, which can be attributed to bias from smaller retrospective studies. Neuroradiologic coiling studies were less likely to include factors contributing to inaccurate adverse outcome rates.
For many physicians, clinical practice is based on scientific literature that often lacks data from prospective, randomized trials. In these circumstances, an aggregate impression is formed from a dataset that consists primarily of single-center case series and retrospective series. Although not immediately apparent, influences such as publication bias and dataset heterogeneity may collectively skew the aggregate impression and thus have a profound impact on clinical management.
Currently, no independently blinded, randomized clinical trials address whether coiling or clipping is optimal in the treatment of unruptured intracranial aneurysms (UIAs). Prospective, nonrandomized data now available from the International Study of Unruptured Intracranial Aneurysms (ISUIA) provide an important assessment of treatment options and address many issues with data heterogeneity by providing uniform definitions of starting points, aneurysmal characteristics, end points, and follow-up of patients. It nevertheless remains important to place these new results in context with previously published data to address unexpected outcomes and understand how new information may influence our thinking.
One way to frame the body of literature concerning the treatment of UIAs is to adapt previously described scattergram techniques (1). These provide a simple graphical test to outline the summed magnitude of publication bias, dataset heterogeneity, and other confounding factors that, though not immediately apparent, may significantly skew our decisions for therapy and cause conflict and controversy in our recommendations for treatment.
Here we systematically reviewed literature reporting clipping and coiling adverse outcome rates (AORs) and performed funnel plot analysis to further understand how the magnitude of biases affected reported AORs. Understanding the magnitude of these biases is important for determining the optimal treatment strategies for specific patient populations, guiding future research, and highlighting the need for more rigorous clinical trials.
Methods
We searched MEDLINE for studies using the keywords intracranial aneurysm, embolization, therapeutic, surgery, and treatment outcome. We reviewed bibliographies for citations until no further citations were found. Included studies fulfilled several criteria: publication between January 1990 and July 2003, UIA treatment with coiling or clipping, UIA treatment outcome data, separate reporting of ruptured intracranial aneurysms and UIA treatment outcomes, >10 patients, and publication in English.
Two reviewers (T.L., M.B.) independently extracted data, and an adjudicator (J.P.-S.) resolved discrepancies. Data were separated by study design, baseline characteristics, and clinical outcome. We recorded mean age, time interval between procedure and outcome assessment by specialty, patient number, definition of morbidity, AOR, and procedural mortality. Studies were classified as single-center, multicenter, or community-based.
We defined AOR as combined all-cause early or in-hospital morbidity and mortality. We recalculated AORs by using data from the modified Rankin scale, Glasgow Outcome Scale, or patient outcome data. Authors’ AOR criteria were used if data recalculation was not possible. We used the time interval for follow-up closest to 3 months. Data were recalculated to include patients with UIAs only.
To evaluate bias, risks of clipping and coiling were tabulated by plotting precision measure (sample size) on the Y axis and trial-specific effect (AOR) on the X axis. Linear regression of AOR with time was performed by using the median study date.
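As an illustration of how a cumulative AOR with a 95% CI might be computed from study-level counts, the following minimal sketch pools adverse outcomes across studies and applies a Wilson score interval. The interval method is an assumption, since the text does not state which one was used, and the function names and the (events, patients) pairs are hypothetical, not data from the reviewed studies.

```python
from math import sqrt

def pooled_aor(studies):
    """Pool (events, patients) pairs: total events over total patients."""
    events = sum(e for e, _ in studies)
    patients = sum(n for _, n in studies)
    return events, patients, events / patients

def wilson_ci(events, patients, z=1.96):
    """Wilson score interval for a proportion (assumed method, not the authors')."""
    p = events / patients
    denom = 1 + z**2 / patients
    centre = (p + z**2 / (2 * patients)) / denom
    half = z * sqrt(p * (1 - p) / patients + z**2 / (4 * patients**2)) / denom
    return centre - half, centre + half

# Hypothetical (adverse outcomes, patients) per study, for illustration only.
example_studies = [(2, 25), (5, 60), (9, 110), (30, 400)]

events, patients, rate = pooled_aor(example_studies)
low, high = wilson_ci(events, patients)
print(f"pooled AOR {rate:.1%} (95% CI {low:.1%} to {high:.1%})")
```

Pooling by total patients gives large studies proportionally more weight, which is why a few multicenter series can dominate a cumulative rate.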
Results
Coiling of UIAs
Nineteen coiling studies met selection criteria. From a total of 1925 patients, 4 multicenter or community-based studies described treatment of 1019 patients (2, 3). Two publications focused on basilar bifurcation aneurysms exclusively (4, 5). Baseline characteristics and AOR data appear in Tables 1 and 2.
More than half of the studies addressed only UIAs, and all reported AORs or data allowing for recalculation of AORs. Classification of AOR varied widely, and some studies omitted baseline information for age, aneurysm size, and site distribution. One study presented total morbidity without defining outcome criteria (6). Follow-up time interval spanned from the immediate postoperative period to 16.3 months. AORs varied from 0.0% (7) to 30.0% (8). Mortality varied from 0.0% to 12.0%. Peak procedural mortality was 9.7%. In most studies, procedural mortality was reported as 0.0%. Figure 1 shows coiling AORs versus sample size. The cumulative coiling AOR is 8.8% (95% confidence interval [CI] 7.6%–10.1%). Coiling AOR for single-center retrospective studies is 8.1% (95% CI 6.2%–10.4%). Coiling AORs have declined since coils were introduced in the early 1990s (Fig 2). This trend occurs consistently when AORs are plotted against median study date.
Clipping of UIAs
Thirty clipping studies met selection criteria. Meta-analyses were excluded (9, 10). Two continued follow-up with the same patient cohort (3, 11). Eight multicenter or community-based studies included 9579 of 11,363 patients. Twenty-one were retrospective, single-center studies, and only one was a prospective, single-center study. One study reported only giant intracranial aneurysms (12) whereas another discussed anterior circulation aneurysms exclusively (13). Fourteen studies presented UIA size or site distribution characteristics independently (Table 3), and all presented AORs or allowed for recalculation of AOR data. Clipping AORs varied from 0.0% (14) to 25.1% (2). Plotting clipping AORs against median study date shows increasing morbidity and mortality in the 1990s (Fig 3). This trend is largely due to recent publication of multicenter and community-based data.
In 11,363 subjects, the cumulative clipping AOR is 17.8% (95% CI 17.2%–18.6%). For prospective, multicenter, and community-based studies, the cumulative AOR is 19.7% (95% CI 18.9%–20.5%), including only prospective ISUIA data (3). The cumulative AOR in retrospective studies is 7.9% (95% CI 6.7%–9.3%). The clipping AOR scattergram is shown in Figure 4.
Discussion
The optimal treatment strategy for UIAs is currently unknown because of the absence of large, randomized clinical trials. Although a clinical trial for ruptured intracranial aneurysms has been completed in the form of the International Subarachnoid Aneurysm Trial (15), and prospective comparative data for unruptured aneurysms based on a standardized approach to patient entry and outcomes are available from the ISUIA (3), current recommendations for management of UIAs still depend on data from heterogeneous series. AORs vary widely, and lack of comparability between studies hinders accurate aggregate impressions of the literature. Previous reports have discussed publication bias in surgical clipping studies (16). In Figs 1 and 4, we used a simple scattergram technique, adapted from previously published methods (1), to demonstrate the magnitude by which literature biases have influenced our outlook on UIAs. This technique estimates bias by plotting a measure of precision, such as the number of subjects, against a trial-specific effect, such as AOR. In the absence of bias, the plot resembles an inverted symmetrical funnel, with small studies scattering at the bottom of the plot and more precise studies clustering around the true treatment effect. The presence of bias is suggested by visually apparent asymmetry (1).
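The scattergram logic described above can be sketched numerically. The example below is a rough illustration only, not the authors' method, which relied on visual inspection: it computes a patient-weighted central rate and counts how many small studies fall on each side of it. The function names, the size cutoff, and the (sample size, AOR) points are all hypothetical.

```python
def pooled_rate(studies):
    """Patient-weighted AOR across (sample_size, aor) pairs."""
    total = sum(n for n, _ in studies)
    return sum(n * aor for n, aor in studies) / total

def asymmetry_summary(studies, small_cutoff):
    """Count small studies below and above the pooled rate.

    A symmetric funnel would show a roughly even split; a lopsided
    split hints at bias or heterogeneity but does not prove either.
    """
    centre = pooled_rate(studies)
    small = [aor for n, aor in studies if n < small_cutoff]
    below = sum(1 for aor in small if aor < centre)
    above = sum(1 for aor in small if aor > centre)
    return centre, below, above

# Hypothetical (sample size, AOR) points, for illustration only.
points = [(20, 0.00), (35, 0.03), (50, 0.04), (80, 0.05),
          (120, 0.06), (900, 0.18), (2500, 0.20)]

centre, below, above = asymmetry_summary(points, small_cutoff=200)
print(f"pooled rate {centre:.1%}; small studies below: {below}, above: {above}")
```

In this made-up data every small study sits below the pooled rate, which is the kind of one-sided scatter that the funnel plot makes visible.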
Our results demonstrated significant asymmetry in the scattergram of clipping AORs, attributable to a substantial difference in cumulative AORs between the retrospective series (7.9%; 95% CI 6.7%–9.3%) and the remaining prospective, multicenter, and community-based studies, inclusive of the ISUIA (19.7%; 95% CI 18.9%–20.5%). The causes underlying this discrepancy are difficult to determine, but it seems likely that they would include publication bias—ie, a tendency to publish studies with positive outcomes—and true heterogeneity stemming from differences in patient population, such as age, aneurysm size, aneurysm site, and predisposing social and genetic factors, or from differences in the definitions of measured outcomes (17).
Other possibilities that may explain the large discrepancy seen in the retrospective series include differences in practitioner experience, procedure difficulty, and patient selection. Arguments that this may be the case, however, are difficult to substantiate by examining the data from the various studies. With respect to practitioner experience and procedure difficulty, one might surmise that the preponderance of lower AORs is attributable to centers with neurosurgeons particularly adept at clipping aneurysms or to selection of patient subpopulations less likely to have significant morbidity and mortality; however, because these studies did not have homogeneous definitions of patient outcomes or selection criteria a priori, data supporting this reasoning are incomplete. Nevertheless, it is interesting to note that few authors publishing the results of the smaller of these series later report their clinical experience as the number of reportable cases increases, which raises the question of whether such reports may have been censored because of publication bias. Also of note, several studies reported outcomes for groups having subpopulations with prior SAH, a practice that would tend to increase overall AOR rather than decrease it.
The scattergram of coiling AORs appears significantly less asymmetric. Although this does not exclude the presence of publication bias or heterogeneity in the published literature for the coiling of UIAs, it does appear that the cumulative AORs for coiling cluster around a central tendency, making it more likely that they approximate a single treatment effect (1). This appears reasonable, in light of the fact that populations with UIAs may be more homogeneous than a randomly selected group from the general population in that they are often asymptomatic, healthy, middle-aged individuals with relatively few confounding comorbidities. Even among groups with UIAs these individuals are more likely to have similarities among themselves in that they are often selected for coiling after screening for surgical clipping candidates has taken place and tend to include the subset of the population likely to have increased periprocedural complication rates (3). With respect to end points, some variation in the definition of adverse outcomes may have been offset by the fact that the adverse outcomes being measured are often severe, such as severe neurologic deficit or death, and would have been recorded regardless of what definition was being used.
As suggested above, the presence of heterogeneity or bias cannot always be detected on the basis of asymmetry in the scattergram. As seen in Fig 3, there appears to be a positive correlation between AOR and median publication date for clipping. This result is not unexpected, in view of the later publication dates of the prospective and multicenter or community-based studies. It likely reflects the same effects that skew the scattergram seen in Fig 4. In contrast, there appears to be a negative correlation between AOR and median publication date for coiling. This trend reflects the tendency for earlier studies to report higher AORs, which may reasonably be attributed to improvements in technique, refinement of practitioner skill, or better selection of treatment candidates. As we saw in the discussion of the clipping data, however, determining the cause of this trend proves problematic, again because of difficulty in standardizing starting points and outcomes. This highlights the distinct possibility that other undetected and unanticipated biases may underlie the current data for both coiling and clipping, bringing out another limitation of aggregate data in comparison with data from well-controlled, randomized prospective studies.
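The direction of such a time trend can be checked with a simple least-squares fit of AOR on median study date. The sketch below assumes an unweighted ordinary least-squares regression, since the text does not specify whether studies were weighted. The function name and the year and AOR values are hypothetical.

```python
def ols_slope_intercept(xs, ys):
    """Unweighted ordinary least-squares fit of ys on xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

# Hypothetical median study years and AORs, for illustration only.
years = [1992, 1994, 1996, 1998, 2000]
aors = [0.14, 0.12, 0.10, 0.08, 0.06]

slope, intercept = ols_slope_intercept(years, aors)
print(f"change in AOR per year: {slope:+.3f}")
```

A negative slope corresponds to AORs declining over time, and a positive slope to AORs rising. Neither identifies the cause of the trend.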
Comparison between coiling and clipping results by using data from a review of this type is problematic. As evidenced by recent data from the ISUIA, adverse outcome rates vary widely with patient age, aneurysm characteristics, and site distribution (3). Variability in the definitions and assessment of outcomes and in the reporting of baseline aneurysm and patient characteristics in retrospective series makes analysis of the nature of the differences in reported AORs difficult, because the cumulative data are often incomplete. Furthermore, the degree to which differences such as these affect reported values in individual series is also difficult to assess. Prospective data now available from the ISUIA offer the first insight into the relationship between baseline characteristics and clinical end points by offering a standardized prospective assessment of the procedures with uniform definitions of starting points, aneurysmal characteristics, end points, and follow-up of patients. Although the endovascular and surgical groups are not comparable, the ISUIA data do make it possible to outline the magnitude of the differences and adjust for them statistically.
Aggregate analysis as performed with the scattergram techniques demonstrated here is unable to correct for biases that may be present in individual studies, but it does allow us to see the cumulative effect of these biases as they influence the body of literature as a whole. Our data clearly demonstrate that there is a significant process at work skewing our impression of adverse outcomes associated with surgical clipping. It is important to note that the number of subjects in many of the retrospective series on surgical clipping is not insignificant. Thus, the number of patients in such series cannot necessarily be relied on as a measure of reliability of the result without closer scrutiny of the underlying methodology. As suggested by the prior discussion regarding the variability in outcomes with respect to patient and aneurysm characteristics, any estimate of outcomes by using aggregate data must be treated with caution. Indeed, even though the cumulative aggregate AOR is appreciably higher than that reported by the ISUIA, it is still possible that these results are consistent, especially considering the older cohort in the aggregate data.
Conclusion
Efforts to distinguish and identify forms of bias, such as publication bias, have increased in the past 2 decades. In this scattergram analysis, we visually illustrate factors that implicate biases in the UIA literature. This aggregate analysis indicates that bias may influence reported clipping AORs to a larger extent than it does in coiling studies. Although we obtained corrected AOR estimates for both treatments, quantifying factors to account for the wide AOR variability was difficult. Ultimately, our results highlight the need for a well-designed prospective, randomized trial to avoid these biases and compare both types of treatment more accurately.
Acknowledgments
We are indebted to Drs. Alexander Khaw, Ralph Sacco, S. Claiborne Johnston, Joseph Lau, Mitchell Berman, William Friedewald, and Shing Lee for their guidance and manuscript review.
Footnotes
Funding is from the Weitzman Family Fellows Research Program for the Study of Malformations of the Vasculature (grant 642212) and the Richard and Jenny Levine Fund, neither of which had a role in study design, data collection, data analysis, data interpretation, or writing of this report.
J.P.-S. had the initial idea and oversaw all aspects of this study. T.L. and M.B. performed the independent review of literature, analyzed the data, and prepared the manuscript. R.S. oversaw the statistics for this review. J.P.M. contributed to interpretation and analysis of the data for this study.
The authors have no conflicts of interest related to this study.
References
- Received September 13, 2004.
- Accepted after revision March 1, 2005.
- Copyright © American Society of Neuroradiology