Author(s): Naranjo CA, Busto U, Sellers EM, Sandor P, Ruiz I
The estimation of the probability that a drug caused an adverse clinical event is usually based on clinical judgment. The lack of a method for establishing causality generates large between-raters and within-raters variability in assessment. Using the conventional categories and definitions of definite, probable, possible, and doubtful adverse drug reactions (ADRs), the between-raters agreement of two physicians and four pharmacists who independently assessed 63 randomly selected alleged ADRs was 38% to 63%, kappa (κ, a chance-corrected index of agreement) varied from 0.21 to 0.40, and the intraclass correlation coefficient of reliability (R(est)) was 0.49. Six weeks (testing) and 22 weeks (retesting) later, the same observers independently reanalyzed the 63 cases by assigning a weighted score (ADR probability scale) to each of the components that must be considered in establishing causal associations between drug(s) and adverse events (e.g., temporal sequence). The cases were randomized to minimize the influence of learning. Each event was assigned a probability category from its total score. The between-raters reliability (range: percent agreement = 83% to 92%; κ = 0.69 to 0.86; r = 0.91 to 0.95; R(est) = 0.92) and within-raters reliability (range: percent agreement = 80% to 97%; κ = 0.64 to 0.95; r = 0.91 to 0.98) improved (p < 0.001). The between-raters reliability was maintained on retesting (range: r = 0.84 to 0.94; R(est) = 0.87). The between-raters reliability of three attending physicians who independently assessed 28 other prospectively collected cases of alleged ADRs was very high (range: r = 0.76 to 0.87; R(est) = 0.80). It was also shown that the ADR probability scale has consensual, content, and concurrent validity. This systematic method offers a sensitive way to monitor ADRs and may be applicable to postmarketing drug surveillance.
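The agreement statistics reported above can be illustrated with a minimal sketch. It computes percent agreement and the standard two-rater (Cohen) form of κ, in which observed agreement is corrected by the agreement expected from each rater's category frequencies. The function names and the four example ratings are hypothetical and are not taken from the study, and the abstract does not state which variant of κ the authors computed.

```python
from collections import Counter


def percent_agreement(ratings_a, ratings_b):
    """Fraction of cases on which two raters chose the same category."""
    if len(ratings_a) != len(ratings_b) or not ratings_a:
        raise ValueError("ratings must be non-empty and of equal length")
    matches = sum(a == b for a, b in zip(ratings_a, ratings_b))
    return matches / len(ratings_a)


def cohen_kappa(ratings_a, ratings_b):
    """Chance-corrected agreement: (p_o - p_e) / (1 - p_e)."""
    n = len(ratings_a)
    p_o = percent_agreement(ratings_a, ratings_b)
    counts_a = Counter(ratings_a)
    counts_b = Counter(ratings_b)
    # Expected agreement if the raters assigned categories independently,
    # each at their own observed category frequencies.
    p_e = sum(counts_a[c] * counts_b[c] for c in counts_a) / (n * n)
    if p_e == 1.0:
        raise ValueError("kappa is undefined when expected agreement is 1")
    return (p_o - p_e) / (1.0 - p_e)


# Hypothetical ratings of four cases by two raters (illustration only).
rater_a = ["probable", "probable", "possible", "possible"]
rater_b = ["probable", "possible", "possible", "possible"]

print(percent_agreement(rater_a, rater_b))  # 0.75
print(cohen_kappa(rater_a, rater_b))        # 0.5
```

In this toy example the raters agree on 75% of cases, but half of that agreement would be expected by chance, so κ is 0.5. This is why the abstract reports κ alongside raw percent agreement.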