Under its Evidence-based Practice Program, the Agency for Healthcare Research and Quality (AHRQ) is developing scientific information for other agencies and organizations on which to base clinical guidelines, performance measures, and other quality improvement tools. Contractor institutions review all relevant scientific literature on assigned clinical care topics and produce evidence reports and technology assessments, conduct research on methodologies and the effectiveness of their implementation, and participate in technical assistance activities.
Diseases of the pancreas and biliary tree are common in the United States. An estimated 6 per 100,000 people are afflicted with common bile duct stones, representing only a small fraction of those with gallstones. There are approximately 57,400 newly diagnosed cases of malignancy of the pancreas, gallbladder, or extrahepatic biliary tract each year, and the prognosis is usually poor. Pancreatitis can occur in an acute, acute recurrent, or chronic pattern, with common etiologic factors including alcohol consumption and choledocholithiasis.
This report is the product of a systematic literature review of the evidence on the diagnostic and therapeutic effectiveness of endoscopic retrograde cholangiopancreatography (ERCP) focusing on four clinical conditions:
In addition, the evidence describing patient, procedure, or operator determinants of complications of ERCP is systematically reviewed. The evidence on the prediction of common bile duct stones is reviewed as well.
The clinical topic areas addressed in this evidence report were developed by the planning committee for the National Institutes of Health State-of-the-Science Conference (January 2002) on Endoscopic Retrograde Cholangiopancreatography. For each major topic, there are several key questions that address the most pertinent diagnostic and therapeutic issues.
a. What is the diagnostic performance of ERCP in detecting common bile duct stones in comparison to alternatives? Alternatives include endoscopic ultrasound (EUS), magnetic resonance cholangiopancreatography (MRCP), or computed tomography cholangiography (CTC).
b. What are the outcomes of treatment using ERCP strategies compared to using surgical or medical management?
c. What is the diagnostic value of specific risk factors or predictive models for assessing the likelihood of having a common bile duct stone?
a. What is the comparative diagnostic performance of ERCP tissue sampling techniques in establishing a tissue biopsy diagnosis of pancreaticobiliary malignancy, and how do these techniques compare to alternative nonsurgical tissue sampling techniques (e.g., endoscopic ultrasound-guided fine-needle aspiration (FNA) or percutaneous FNA)?
b. What is the diagnostic performance of ERCP in diagnosing the presence of malignant pancreaticobiliary obstruction in comparison to other imaging alternatives (e.g., EUS or MRCP)?
c. What are the outcomes of treatment using ERCP strategies to treat malignant pancreaticobiliary obstruction compared to using surgical or interventional radiology treatment?
a. What is the diagnostic performance of ERCP in detecting underlying causes or complications of pancreatitis that are amenable to treatment in comparison to alternatives (e.g., EUS or MRCP)?
b. What are the outcomes of treatment using ERCP strategies compared to using surgical or medical therapy?
a. What is the diagnostic performance of ERCP with sphincter of Oddi manometry in identifying a pancreaticobiliary origin of pain in comparison to alternatives (e.g., biliary scintigraphy, EUS, or MRCP)?
b. What are the outcomes of treatment using ERCP strategies compared to using surgical or medical therapy?
The protocol for this review was designed prospectively to define study objectives, search strategy, patient populations of interest, study selection criteria, outcomes of interest, data elements to be abstracted and methods for abstraction, and methods for study quality assessment.
One reviewer performed primary data abstraction of all data elements into the evidence tables, and a second reviewer checked accuracy of the evidence tables. Disagreements were resolved between the two reviewers, or if necessary, in consultation with the Evidence-based Practice Center Director or members of the Technical Advisory Group.
The National Library of Medicine (NLM) staff conducted a comprehensive literature search for journal articles on ERCP from the PubMed®/MEDLINE®, BIOSIS, EMBASE, and SciSearch® databases with a publication date from 1980 through August 13, 2001. Articles which had been indexed to the NLM Medical Subject Heading (MeSH®) "cholangiopancreatography, endoscopic retrograde" as well as those containing the following list of ERCP synonyms and textword combinations were retrieved:
The "?" is a truncation symbol used to permit retrieval of variant word endings, such as cholangiopancreatography and cholangiopancreatographic.
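As a rough illustration of how a trailing truncation symbol behaves, the sketch below converts a truncated term into a regular expression. It assumes that "?" stands for any number of trailing word characters; the stem `cholangiopancreatograph?` is a hypothetical term chosen to match the two variants named above, not a term taken from the actual search strategy, and the function name is illustrative.

```python
import re


def truncation_to_regex(term: str) -> re.Pattern:
    """Convert a search term with an optional trailing '?' into a regex.

    Assumption: '?' means "zero or more trailing word characters".
    """
    if term.endswith("?"):
        stem = re.escape(term[:-1])
        return re.compile(rf"\b{stem}\w*\b", re.IGNORECASE)
    return re.compile(rf"\b{re.escape(term)}\b", re.IGNORECASE)


# Hypothetical stem; the report's actual term list is not reproduced here.
pattern = truncation_to_regex("cholangiopancreatograph?")

candidates = [
    "cholangiopancreatography",
    "cholangiopancreatographic",
    "cholangiography",
]
matches = [word for word in candidates if pattern.fullmatch(word)]
```

Here `matches` keeps the two variant endings and drops "cholangiography", which does not share the full stem.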
Excluded from the search results were articles that:
The literature search for Topic 1c on prediction of common bile duct stones and for additional studies selected by the secondary selection criteria for Topics 3 and 4 used a streamlined search process to identify key articles addressing the clinical issue of interest. Reference lists from these articles were reviewed, focused MEDLINE searches were performed, and related articles were identified.
The Technical Advisory Group and peer reviewers for this project were asked to inform the project team of any studies relevant to the key questions addressed in this evidence report that were not retrieved by either of the search strategies.
The online searches of the PubMed, EMBASE, BIOSIS, and SciSearch databases in conjunction with additional citations identified through manual searching yielded a total of 5,698 titles and abstracts for review. Based on review of abstracts, 789 articles were selected for review in full text. Approximately 117 of these articles were excluded as review articles. Primary and secondary selection criteria were applied to articles identified as potential clinical trial reports. This process yielded a total of 149 included studies for the review of evidence.
The selection criteria for all topics in this report were:
To keep readers informed of ongoing studies, studies published only in abstract form since 1999 and judged to be important are noted in this systematic review. Data were not abstracted into the evidence tables.
Studies of diagnostic performance met the following additional selection criteria:
Studies of therapeutic outcomes met the following additional selection criteria:
Studies of predictors of ERCP complications met the following additional selection criteria:
Studies on the prediction of common bile duct stones met the following additional selection criteria:
Secondary Selection Criteria
There was a paucity of literature that met the primary selection criteria for questions on ERCP treatment of chronic pancreatitis (Topic 3b) and ERCP treatment of chronic abdominal pain of possible pancreaticobiliary origin (Topic 4b). To examine these questions, the original study selection criteria were relaxed for these topics to include:
For diagnostic performance studies, the outcomes of interest were test performance characteristics (i.e., sensitivity, specificity) in diagnosing clinically relevant findings.
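For readers less familiar with these test performance characteristics, the following minimal sketch computes sensitivity and specificity from the four cells of a two-by-two table. The function names and the counts in the example are hypothetical and are not drawn from any study in this review.

```python
def sensitivity(true_pos: int, false_neg: int) -> float:
    """Proportion of patients with the condition whom the test detects."""
    return true_pos / (true_pos + false_neg)


def specificity(true_neg: int, false_pos: int) -> float:
    """Proportion of patients without the condition whom the test clears."""
    return true_neg / (true_neg + false_pos)


# Hypothetical counts for illustration only.
example_sensitivity = sensitivity(true_pos=45, false_neg=5)    # 45 of 50 detected
example_specificity = specificity(true_neg=90, false_pos=10)   # 90 of 100 cleared
```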
For therapeutic outcome studies, the primary outcomes of interest include:
For studies of factors predicting ERCP complications, the primary outcomes of interest were measures of relative risk or predictive value associated with patient, procedure, or operator factors.
The approach to assessing the quality of evidence used domains commonly recognized as important in the literature on study quality. Quality criteria were developed for each of the three types of studies included in this systematic review:
For many topics addressed in this evidence review, studies meeting the most rigorous standards of quality do not exist. Thus, the main purpose of quality assessment in this systematic review is to discriminate between the better and lesser quality studies in the available evidence base.
For studies of therapeutic efficacy, the approach to quality assessment was adapted from that of the U.S. Preventive Services Task Force. Study quality domains of interest were:
A study was rated as "Good" if it clearly met all quality parameters. A study was rated "Fair" if it reasonably met these parameters and had no fatal flaw. A study was rated "Poor" if it was fatally flawed on one or more parameters (e.g., if comparable groups were not assembled or maintained or outcome measures were invalid or not applied equally among groups).
For studies of diagnostic performance, criteria for assessing study quality were developed using key references in the field of study quality assessment. The selection criteria used for this systematic review eliminated poor quality studies from inclusion. Study quality domains of interest to discriminate between good and fair quality studies were: enrollment of representative subjects (includes appropriate spectrum of patients, unbiased enrollment, complete enrollment of eligible patients, accounting for all eligible subjects); ERCP interpreted independently of diagnostic alternative; and diagnostic alternative interpreted independently from ERCP. As relevant, issues of suitability and interpretation of reference standards are addressed qualitatively in the discussion of each question.
For multivariable logistic regression analysis studies, the quality domains of interest were the degree of over-fitting present in the multivariable models, the nature of statistical reporting, and the use of procedures to establish internal validity. Degree of over-fitting was assessed using the ratio of the number of endpoints to the number of candidate variables in the model, and models were classified on a scale ranging from satisfactory (ratio >10) to severe (ratio <4).
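The over-fitting ratio can be sketched as follows. Only the two end points of the scale (ratio >10 and ratio <4) are stated in this summary; the label returned for ratios in between is a placeholder assumption, as are the function name and the example counts.

```python
def overfitting_class(n_endpoints: int, n_candidate_variables: int) -> str:
    """Classify over-fitting from the endpoints-per-candidate-variable ratio."""
    if n_candidate_variables <= 0:
        raise ValueError("at least one candidate variable is required")
    ratio = n_endpoints / n_candidate_variables
    if ratio > 10:
        return "satisfactory"
    if ratio < 4:
        return "severe"
    # The summary does not name the categories between the two end points.
    return "intermediate (label assumed)"


# Hypothetical models, each testing 10 candidate variables.
few_endpoints = overfitting_class(n_endpoints=30, n_candidate_variables=10)
many_endpoints = overfitting_class(n_endpoints=120, n_candidate_variables=10)
```

With 10 candidate variables, 30 observed endpoints give a ratio of 3 (severe), while 120 endpoints give a ratio of 12 (satisfactory).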
Diagnostic Performance of ERCP Compared to Alternatives
The search and selection process yielded 10 studies on MRCP (total n=834), 9 studies on EUS (total n=601), and 6 studies with 7 sets of findings on CTC (total n=266), but reference standards were not consistent among studies.
Individual studies were relatively small and unlikely to have adequate power to detect a statistically significant difference, and no studies reported tests of statistical significance. Thus, it is not possible to determine with confidence whether the diagnostic performance of these alternatives is similar to or poorer than that of ERCP, or to accurately quantify any difference.
The evidence comparing EUS to ERCP employs a reference standard that permits inferences regarding comparative performance. The evidence suggests that EUS is similar to ERCP in detecting common bile duct stones.
MRCP has a degree of concordance with ERCP that results in sensitivities and specificities greater than 90 percent in most studies. Concordance of CTC with ERCP appears to be lower, with sensitivities as low as 80 percent in some studies.
The role of alternative tests in the management of patients with suspected common bile duct stones cannot be determined strictly by diagnostic performance. The costs and risks of the tests, and the costs and risks of actions based on test results, along with the pretest probability of stones must all be considered to determine the optimal management strategy.
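One way to see why pretest probability matters is Bayes' theorem, which combines it with sensitivity and specificity to give the probability of stones after a test result. The sketch below is a standard textbook calculation, not a method taken from the report; the function name and the input values are hypothetical.

```python
def post_test_probability(
    pretest: float, sens: float, spec: float, test_positive: bool
) -> float:
    """Probability of disease after a test result, by Bayes' theorem."""
    for value in (pretest, sens, spec):
        if not 0.0 <= value <= 1.0:
            raise ValueError("probabilities must lie between 0 and 1")
    if test_positive:
        with_disease = sens * pretest                      # true positives
        without_disease = (1.0 - spec) * (1.0 - pretest)   # false positives
    else:
        with_disease = (1.0 - sens) * pretest              # false negatives
        without_disease = spec * (1.0 - pretest)           # true negatives
    return with_disease / (with_disease + without_disease)


# Hypothetical inputs: 50 percent pretest probability, 90 percent
# sensitivity and specificity.
after_positive = post_test_probability(0.5, 0.9, 0.9, test_positive=True)
after_negative = post_test_probability(0.5, 0.9, 0.9, test_positive=False)
```

The same test moves a low pretest probability much less than a moderate one, which is one reason diagnostic performance alone cannot settle the choice of strategy.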
ERCP Treatment Strategies Compared to Surgical or Medical Management
In order to evaluate ERCP treatment strategies, studies must account for patients through the diagnostic and treatment process, including additional procedures needed when initial treatment fails, and total morbidity of the alternative strategies. Overall, the literature is very thin and spread out over many different comparisons of interest, preventing strong conclusions about any specific comparison of treatment strategies.
The limited evidence available suggests that: laparoscopic common bile duct exploration may be better than ERCP strategies to manage cholecystectomy patients with the least resource use; definitive surgery with cholecystectomy prevents long term complications at acceptable short-term morbidity when compared to sphincterotomy alone in high-risk surgical patients with suspected common bile duct stones; and endoscopic treatment of acute cholangitis reduces short-term mortality when compared to emergency surgery.
Limited evidence suggests that the following techniques have similar stone removal rates and short-term complications: intracorporeal and extracorporeal lithotripsy methods for removing large common bile duct stones; balloon dilation and sphincterotomy; and needle-knife fistulotomy and needle-knife precut papillotomy.
Diagnostic Value of Specific Risk Factors or Predictive Models for Assessing the Likelihood of Having A Common Bile Duct Stone
The probability of a common duct stone is one important factor in determining diagnostic and treatment strategies. When preoperative probability is high, ERCP may be preferred. When probability is low, expectant management is preferred. Additional diagnostic tests may be used to discriminate among patients in the middle range of probability. The exact probability cutoffs depend on the risks and benefits of the diagnostic and treatment alternatives. The risk factor or prediction model with the best receiver-operating characteristics (ROC) would make the best decision rule if the cutoff threshold were set correctly.
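The three-way decision rule described above can be written as a short function. This is a sketch only: the report does not give numeric cutoffs, so both thresholds are caller-supplied, the values in the example are hypothetical, and the treatment of probabilities exactly at a cutoff is an assumption.

```python
def triage_by_probability(
    probability: float, low_cutoff: float, high_cutoff: float
) -> str:
    """Map an estimated probability of a common bile duct stone to a strategy.

    The cutoffs are not specified in the report; they depend on the risks
    and benefits of the diagnostic and treatment alternatives.
    """
    if not 0.0 <= low_cutoff < high_cutoff <= 1.0:
        raise ValueError("require 0 <= low_cutoff < high_cutoff <= 1")
    if probability >= high_cutoff:
        return "ERCP"
    if probability <= low_cutoff:
        return "expectant management"
    return "additional diagnostic test"


# Hypothetical cutoffs chosen purely for illustration.
strategies = [
    triage_by_probability(p, low_cutoff=0.05, high_cutoff=0.60)
    for p in (0.02, 0.30, 0.80)
]
```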
Thirteen studies (total n=7,409) reported multiple findings of sensitivities and specificities of a single or combination of risk factors to predict the presence of common bile duct stones. The single risk factors most commonly assessed were: clinical jaundice or elevated bilirubin, liver function tests, and ultrasound findings of a dilated common bile duct. All have significant associations with the presence of common duct stones, but none have both high sensitivity and specificity. Of the four studies testing prediction rules based on combinations of risk factors, only one study was a validation of an independently developed prediction rule. Multivariable prediction rules appear to have superior ROCs compared to individual risk factors.
The absence of any risk factors for stones (or a discriminant function indicating absence of stones) is a very strong predictor of the absence of stones. Absence of any risk factor produces probabilities of stones that are in the same range as a negative ERCP exam in a patient with risk factors for stones (0 percent to 17 percent).
Diagnostic performance of ERCP Tissue Sampling Techniques In Establishing A Tissue Biopsy Diagnosis of Pancreaticobiliary Malignancy Compared To Each Other and To Alternative Nonsurgical Tissue Sampling Techniques
Twelve studies comparing at least two tissue sampling techniques were identified in this systematic review. The available studies are limited by small size and do not consistently compare techniques in the same group of patients. Most studies do not report statistical tests, so it is not possible to determine with confidence whether reported differences in sensitivity are significantly different. While available evidence is suggestive, larger studies are needed to draw conclusions on relative performance of tissue sampling techniques.
The available evidence suggests that sensitivity for detecting malignancy is similar or higher for brush cytology vs. bile aspiration cytology, similar for fine-needle aspiration (FNA) cytology vs. brush cytology, and similar or higher for forceps biopsy vs. brush cytology. Using combinations of two or more sampling techniques may increase overall sensitivity. No comparative studies evaluated whether incremental improvement could also be achieved by repeated sampling using the same technique.
In the absence of comparative studies of endoscopic ultrasound (EUS)-FNA and ERCP-FNA, indirect comparison of single-arm studies was attempted. Results from 10 studies including at least 400 subjects with pancreatic mass suggest a range of sensitivity in detecting pancreatic malignancy of 60-94 percent with a specificity of 100 percent. Two studies of ERCP-FNA including 164 subjects with various pancreaticobiliary tumors reported sensitivities ranging from 25 percent to 62 percent. While sensitivity reported in these studies appears to be lower than that for EUS-FNA, such a comparison is not valid due to differences in study populations, cytology techniques, and study settings.
Diagnostic Performance of ERCP Compared to Alternatives in Detecting Malignant Pancreaticobiliary Obstruction
Treatment Outcomes Using ERCP Strategies To Treat Malignant Pancreaticobiliary Obstruction Compared To Using Surgical or Interventional Radiology Treatment
Five studies compared endoscopic stent drainage with surgical bypass for palliation of malignant obstructive jaundice, and a randomized controlled trial of 204 patients provided the most robust evidence. There were no significant differences in overall survival, relief of jaundice, technical success, total hospitalization days, or perioperative mortality. Major complications were more frequent in the surgery group than in the stent group (29 percent vs. 11 percent, p=0.02), and stent replacement was required in 37 percent of patients treated with ERCP stents.
Two randomized controlled trials (total n=206) and one nonrandomized trial (n=165) compared metal to plastic stents placed by ERCP for palliation of biliary obstruction due to malignancy. Both types of stents offer initial relief of jaundice and the available evidence does not conclusively show any difference in perioperative adverse events. Overall patient survival is not significantly different when stent occlusions are treated with stent exchange as needed. Total resource utilization including need for repeat ERCP, total hospital days, and costs was reported to be lower with metal stents compared with plastic stents.
Six studies (total n=782) addressed preoperative stenting compared to no stenting prior to surgery for malignant pancreaticobiliary obstruction. The available evidence is of poor methodologic quality and fails to demonstrate that preoperative stenting improves health outcomes. Few studies report overall complications including both those related to the preoperative stent and the surgery, and these suggest that when complications of preoperative endoscopic stenting are considered along with the perioperative complications of surgery, preoperative stenting is associated with more complications. Preoperative stenting does appear to significantly improve elevated bilirubin and liver function tests, but the available evidence does not suggest that surgical outcomes are improved as a result.
Diagnostic Performance of ERCP Compared To Alternatives To Detect Underlying Causes or Complications of Pancreatitis That Are Amenable To Treatment
Treatment Outcomes of ERCP Strategies Compared To Surgical or Medical Therapy
For treatment of acute pancreatitis, three randomized controlled trials (total n=554) compared early ERCP to delayed or selective ERCP. The available evidence suggests that early ERCP reduces complications in patient populations with acute pancreatitis and signs and symptoms suggesting biliary obstruction. In patients with low likelihood of biliary obstruction, delayed or selective ERCP permits many patients to avoid the procedure, and may result in lower complication rates. In addition, one retrospective associational study of a Veterans Administration database of patients with acute pancreatitis (n=2,075) suggests that outcomes of ERCP treatment are similar to those of surgery.
For ERCP treatment in patients with acute recurrent or chronic pancreatitis, study selection criteria were relaxed as described above. Although the available evidence is sparse and largely uncontrolled, it suggests that ERCP treatment reduces emergency room visits and hospitalization in patients with pancreas divisum and acute recurrent pancreatitis. Evidence on ERCP drainage of pseudocysts is also sparse and poorly controlled, but suggests that pain relief with ERCP is similar to results of surgery.
Diagnostic Performance of ERCP with Sphincter of Oddi Manometry Compared with Alternatives To Identify A Pancreaticobiliary Origin of Pain
Treatment Outcomes of ERCP Strategies Compared To Surgical or Medical Therapy
Two randomized controlled trials (total n=128) show that endoscopic sphincterotomy relieves pain in patients with pancreaticobiliary pain, sphincter of Oddi dysfunction, and elevated basal sphincter of Oddi pressure on manometry (greater than 40 mm Hg). The results of five single-arm studies (total n=183) corroborate these data and suggest that patients with a dilated common bile duct and/or delayed contrast emptying may also benefit from endoscopic sphincterotomy.
There is insufficient evidence to determine whether endoscopic sphincterotomy improves outcomes in patients with normal manometry findings. For this group, the small studies included in this review do not report significant improvements in pain with endoscopic sphincterotomy.
Thirteen studies reported on multivariable logistic regression analyses of factors associated with complications of ERCP. The four largest studies each included more than 1,800 patients, and the total number of complications observed in these studies ranged from 98 to 229. Overall, the methodologic quality of the available analyses is limited by over-fitting, i.e., testing an excessive number of factors relative to the number of complications observed. Consequently, this literature is exploratory in nature. Reported magnitudes of association are not reliable, significant independent variables may have been overlooked, and some significant associations may be misleading. Moreover, the existing studies do not use common, standardized definitions for the complications and factors of interest. Thus, caution should be used in drawing inferences for clinical practice from these studies.
Several of the more robust studies identified patient, procedure, and operator factors that were significantly associated with complications. Younger age (using various cut-offs, but generally 60 years or less) was significantly associated with total complications and with pancreatitis, as was suspected sphincter of Oddi dysfunction. Precut endoscopic sphincterotomy was the procedure-related factor most commonly associated with total complications or pancreatitis; a significant association with difficulty in cannulation was also reported, but less frequently. Multiple pancreatic contrast injections were associated with pancreatitis. For hemorrhage, the clearest association was with patient factors related to coagulopathy. Case volume was the only operator-related factor found to be significantly associated with complications. These studies used various cut-offs to define lower volume centers: one or fewer procedures per endoscopist per week; fewer than 40 endoscopic sphincterotomies per endoscopist per year; and fewer than 150 procedures per year.
Recommendations for future research include the following:
Rigorous studies are required in order to reliably quantify the relative performance of diagnostic ERCP compared to alternatives. Existing studies do not consistently use common reference standards and frequently do not report tests of statistical significance. Thus, assumptions about equivalence or difference among alternative diagnostic technologies are not supported by robust empirical evidence.
Comparative studies of alternative diagnostic and treatment strategies are urgently needed. It is imperative to use a comprehensive approach to outcomes assessment, taking into account the total burden of morbidity and resource utilization.
Evidence on treatment of chronic pancreatitis and relapsing or recurrent pancreatitis is sparse. Rigorously designed controlled trials are needed to assess the outcomes of treatment for this debilitating condition.
Risk factors for complications of diagnostic and therapeutic ERCP have been explored using multivariable model analysis. Such analyses generate hypotheses for reducing complications, but cannot demonstrate cause and effect. Thus, interventions intended to reduce complications should incorporate prospectively defined studies to evaluate the results.
The full evidence report from which this summary was derived was prepared for AHRQ by the Technology Evaluation Center under contract number 290-97-001-5. Printed copies may be obtained free of charge from the AHRQ Publications Clearinghouse or by calling 1-800-358-9295. Requestors should ask for Evidence Report/Technology Assessment No. 50, Endoscopic Retrograde Cholangiopancreatography.
The Evidence Report is also online on the National Library of Medicine Bookshelf, or can be downloaded as a set of PDF files or as a zipped file.
AHRQ Publication Number 02-E008
Current as of January 2002
Internet Citation:
Endoscopic Retrograde Cholangiopancreatography. Summary, Evidence Report/Technology Assessment: Number 50. AHRQ Publication No. 02-E008, January 2002. Agency for Healthcare Research and Quality, Rockville, MD. http://www.ahrq.gov/clinic/epcsums/ercpsum.htm.