Meta-Analysis for Clinical Researchers

An Introduction to Systematic Reviews & Meta-analysis
Fredric M. Wolf, Ph.D.
University of Washington
Dept of Medical Education & Biomedical Informatics
http://www.dme.washington.edu/
Goal of This Short Course
To provide a conceptual understanding of the quantitative methods used to synthesize evidence across independent trials or studies
 Major Topics
Methods for pooling evidence across independent studies
Pooling binary and continuous outcomes
Differences between fixed and random effects models
Guidelines for appraising published systematic reviews/meta-analyses
Topics – Session 1:
General Issues
What is a systematic review/meta-analysis?
History of meta-analysis (very abbreviated)
Methods for pooling evidence across independent studies - examples
Pooling binary and continuous outcomes
Topics – Session 2:
“Sticky Wickets”
Cumulative meta-analysis vs. standard meta-analysis
Differences between fixed and random effects models; heterogeneity
Sources of bias
Guidelines for appraising published systematic reviews/meta-analyses
 Methods for Pooling Evidence Across Independent Studies
Why do it? Why pool?
What is a systematic review/meta-analysis?
Brief history of meta-analysis (very abbreviated)
Examples
Why Meta-analysis/Systematic Reviews?
“. . . the mass of new information makes it difficult for practicing physicians to follow the literature in all areas that might be relevant to their practices. New methods to synthesize and present information from widely dispersed publications are needed . . . .”

Jerome Kassirer.  Clinical trials and meta-analysis: what do they do for us?   N Engl J Med 1992; 327:273-4.
Why Need Meta-analysis? Information Explosion
10-fold Increase in Number of Professional Journals
Psychology Journals: 
91 (1951) --> 1,175 (1992)
Math Science Journals: 
91 (1953) --> 920 (1992)
Biomedical Journals: 
2,300 (1940)--> 23,000 (1993)
Problem – Conflicting Information
Not only is there more information, but . . .
Not all information is of equal quality
Information does not necessarily = evidence
There is often conflicting information & reports
Traditional narrative reviews can be very “impressionistic”
Problems With Traditional Literature Reviews Addressed in Meta-analysis
Selective inclusion of studies, often based on the reviewer's own impressionistic view of the quality of the study
Differential subjective weighting of studies in the interpretation of a set of findings
Misleading interpretations of study findings
Failure to examine characteristics of the studies as potential explanations for disparate or inconsistent results across studies
Failure to examine moderating variables in the relationship under examination
Brief History of Meta-analysis
(very abbreviated)
1900’s: Karl Pearson pooled correlation coefficients (enteric fever & inoculation rates in the British Army)
1930’s: Ronald Fisher pooled p-values
1972: Richard Peto: log rank test for combining (binary) data from different trials; Mantel, Thomas Chalmers
1976: Gene Glass coins the term “meta-analysis”; effects of psychotherapy
1989: Murray Enkin, Marc Keirse, Iain Chalmers: Effective Care in Pregnancy & Childbirth
1992/1993: UK Cochrane Centre/Cochrane Collaboration created
Primary, Secondary and Meta-analysis of Research
 “Primary analysis is the original analysis of data in a research study . . . .
Secondary analysis is the re-analysis of data for the purpose of answering the original research question with better statistical techniques, or answering new questions with old data . . . .”
GV Glass 1976, p. 3
Primary, Secondary and Meta-analysis of Research
 “Meta-analysis refers to the analysis of analyses . . . . the statistical analysis of a large collection of analysis results from individual studies for the purpose of integrating the findings.  It connotes a rigorous alternative to the casual, narrative discussions of research studies which typify our attempts to make sense of the rapidly expanding research literature.”
GV Glass 1976, p. 3
Rationale for Systematic Reviews
“provide summaries of what we know, and do not know, that are as free from bias as possible.” (Chalmers et al 1999)
“research that uses explicit & transparent methods to synthesise relevant studies, allowing others to comment on, criticise or attempt to replicate the conclusions reached. Systematic reviews follow same set of procedures as any individual study, & are often reported in the same way. . . .” (Petrosino et al 1999)
4 Basic Questions That a SR/MA Tries to Answer
Are the results of the different studies similar?
To the extent that they are similar, what is the best overall estimate of effect?
How precise and robust is this estimate?
Can dissimilarities be explained?
Lau J, Ioannidis JPA, Schmid CH. Quantitative Synthesis in Systematic Reviews. Annals of Internal Medicine 1997; 127:820-826.
What is a Systematic Review?
State objectives of the review, & outline eligibility criteria
Search for studies that seem to meet eligibility criteria
Tabulate characteristics of each study identified & assess its methodological quality
Apply eligibility criteria & justify any exclusions
What is a Systematic Review?
Assemble the most complete dataset feasible, with involvement of investigators
Analyse results of eligible studies. 
Use statistical synthesis of data (meta-analysis) if appropriate & possible
Perform sensitivity analyses, if appropriate & possible (including subgroup analyses)
Prepare a structured report of the review, stating aims, describing materials & methods, & reporting results
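One common form of the sensitivity analysis listed above is a leave-one-out check: re-pool the data with each study omitted in turn and see whether the summary estimate moves much. The sketch below is illustrative only. It assumes inverse-variance fixed-effect pooling, and the effect sizes and variances are hypothetical, not taken from any review.

```python
import math

def pool_fixed(effects, variances):
    """Inverse-variance fixed-effect pooled estimate and its standard error."""
    weights = [1.0 / v for v in variances]
    total = sum(weights)
    estimate = sum(w * y for w, y in zip(weights, effects)) / total
    return estimate, math.sqrt(1.0 / total)

def leave_one_out(effects, variances):
    """Re-pool k times, omitting one study each time; returns k pooled estimates."""
    results = []
    for i in range(len(effects)):
        kept_effects = effects[:i] + effects[i + 1:]
        kept_variances = variances[:i] + variances[i + 1:]
        results.append(pool_fixed(kept_effects, kept_variances)[0])
    return results

# Hypothetical effect sizes (e.g., SMDs) and variances, for illustration only.
effects = [-0.40, -0.10, -0.25, -0.60]
variances = [0.04, 0.02, 0.05, 0.10]

overall, se = pool_fixed(effects, variances)
loo = leave_one_out(effects, variances)
for i, estimate in enumerate(loo, start=1):
    print(f"omit study {i}: pooled = {estimate:.3f} (all studies: {overall:.3f})")
```

If one omission shifts the pooled estimate markedly, that study deserves a closer look before the summary is trusted.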
Cochrane Collaboration
Comparison with human genome project in potential impact for clinical medicine
Naylor CD.  Grey zones of clinical practice: some limits to evidence-based medicine.  Lancet 1995; 345:840-2.
What is the Cochrane Collaboration & why is it important?
   Cochrane Library           CD (& WWW)
Cochrane Database of Systematic Reviews (CDSR)
Database of Abstracts of Reviews of Effectiveness (DARE)
Cochrane Central Register of Controlled Trials (CENTRAL)
Cochrane Review Methodology Database
Health Technology Assessment DB (HTA)
NHS Economic/Evaluation Database (NHS EED)
Sources of Evidence in “Real-time”: Cochrane Library (Issue 1, 2003)
CDSR  (Cochrane Database of Systematic Reviews)
>50 CRGs,   >1500 complete reviews,  >1200 protocols, >200 new reviews/year
DARE (DB of Abstracts of Reviews of Effectiveness)
 >3800 abstracts
Cochrane Central Register of Controlled Trials (CENTRAL)  >353,000 RCTs/CCTs
HTA (Health Technology Assessment Database) >2800
NHS Economic /Evaluation DB >10,000
Retrieval Problems
50% of RCTs not found in MEDLINE
44% of all RCTs (17% - 76%)
72% of RCTs in MEDLINE (32% - 91%)
Publication Type (PT) indexing
RCT (1991)
CCT (1995)
Meta-analysis (1993)
Hand Searching Medical Journals
>630 journals
19,266 RCTs tagged (1991-93)
14,964 CCTs tagged (1985-90)
MEDLINE updated annually to include or re-tag these studies as RCTs or CCTs
Distinguishing Feature of Cochrane Reviews
“The fundamental distinction between Cochrane Reviews and other reports of systematic reviews is that their authors are expected to update and amend them in the light of relevant additional data, and criticisms or other comments.”

Sir Iain Chalmers 1999
Another Distinguishing Feature of Cochrane Reviews
Consistency in format across all reviews, both in the sections of the review & particularly for tables/figures
Enables readers to better understand without shifting cognitive frame of reference for each review
However there are costs/trade-offs associated with this standardization

Wolf FM. Lessons to be learned from evidence-based medicine: Practice and promise of evidence-based medicine and evidence-based education. Medical Teacher 2000; 22(3): 251-259.
Prophylactic Corticosteroids for Preterm Birth
“Antenatal corticosteroid therapy is associated with a significant reduction in the incidence of neonatal death or infant death.
The magnitude of this effect was greater in the earlier years of antenatal corticosteroid use, when the case-fatality rates were high, however even with increasingly low fatality rates for respiratory distress syndrome the effect is still statistically significant.”
Effects of Asthma Self Management Educational Interventions in
Children and Adolescents: A Systematic Review and Meta-analysis
James P Guevara, MD, MPH (a), Fredric M Wolf, PhD (b), Cyril M Grum, MD (c), & Noreen M Clark, PhD (c)
(a) University of Pennsylvania, Philadelphia, USA
(b) University of Washington, Seattle, USA
(c) University of Michigan, Ann Arbor, USA
Cochrane Library 2003 (1);  BMJ 2003; in press.
Objectives/Purpose
To determine the effectiveness of asthma self-management education programs on health outcomes in children
Outcomes:
Lung Function
Asthma Morbidity
Health Care Utilization
Self-reported Perceptions of Self-care Abilities
Objectives/Purpose
To conduct subgroup analyses to examine the impact of
type of educational intervention (individual vs. group)
intensity of educational intervention (no. of sessions)
 self-management strategy (symptom vs. peak flow)
degree of asthma severity
length of follow-up
study quality (adequacy of allocation concealment, withdrawal rates, and type of trial, i.e., RCT vs. CCT)

 Inclusion Criteria for Studies
Randomized controlled trials (RCTs) or controlled clinical trials (CCT)
Children & adolescents ages 2 to 18 years old
Educational intervention designed to teach one or more self-management strategies related to prevention, attack management, or social skills
Included outcomes on pulmonary function tests, morbidity, or health care utilization
Search Strategy – References & Databases
Studies were identified from
Cochrane Airways Group's Special Register of Controlled Trials comprised of references from
MEDLINE (1966-2000)
EMBASE (1980-2000)
CINAHL (1982-2000)
hand searched airways-related journals
PsycINFO
Reference lists from relevant review articles that were identified (ancestry approach)
Search Strategy - Terms
asthma OR wheez*
AND
education* OR self management OR self-management
AND
placebo* OR trial* OR random* OR double-blind OR double blind OR single-blind OR single blind OR controlled study OR comparative study.
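The three concept groups above (condition, intervention, study design) are OR-ed within each group and AND-ed across groups. A minimal sketch of assembling such a query string follows. Quoting multi-word phrases and wrapping groups in parentheses are assumptions about a generic search interface; actual syntax varies by database.

```python
def or_group(terms):
    """Join synonyms with OR, quote multi-word phrases, wrap in parentheses."""
    quoted = [f'"{t}"' if " " in t else t for t in terms]
    return "(" + " OR ".join(quoted) + ")"

# Term lists copied from the search strategy; the grouping names are illustrative.
condition = ["asthma", "wheez*"]
intervention = ["education*", "self management", "self-management"]
design = ["placebo*", "trial*", "random*", "double-blind", "double blind",
          "single-blind", "single blind", "controlled study", "comparative study"]

query = " AND ".join(or_group(group) for group in (condition, intervention, design))
print(query)
```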
Identification of Trials
Potentially relevant studies from literature search and hand searches (n=318)
Excluded on basis of abstract, e.g., not randomised or controlled clinical trials (n=273)
Articles selected for full text review (n=45)
Excluded after full text review (n=13)
Eligible trials (n=32)
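The counts in this flow can be checked with simple arithmetic. The snippet below only restates the numbers on the slide; the variable names are illustrative.

```python
identified = 318          # potentially relevant studies from literature and hand searches
excluded_abstract = 273   # excluded on the basis of the abstract
excluded_full_text = 13   # excluded after full-text review

full_text_reviewed = identified - excluded_abstract
eligible = full_text_reviewed - excluded_full_text

print(full_text_reviewed, eligible)
```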
Main Outcome Measures
Lung Function (pulmonary function tests):
FEV1
PEFR (peak flow)
Asthma Morbidity:
asthma exacerbations
days of school absence
days of restricted activity
nights disturbed by asthma
asthma severity
Main Outcome Measures
Health Care Utilization:
physician visits
emergency department (ED) visits
hospitalizations
Self-reported Perceptions of Self-care Abilities:
self-efficacy
Subgroup Analyses
Time of enrollment in the intervention
(1-6 mo, 7-12 mo, or ≥12 months)
Self-management strategy
(peak flow-based vs. symptom-based)
Intervention type
(individual vs. group)
Intensity of intervention
(single vs. multiple sessions)
Study quality
(RCT vs. CCT; random allocation procedures)
Methods
32 eligible trials were abstracted & coded (by 2 independent coders)
Approximately 3706 patients
All eligible studies were abstracted onto preprinted data collection forms
Authors of studies were contacted & asked to provide missing data
Statistical Analyses (Meta-analysis): Pooled Effect Size Measures
Standardized weighted mean differences (SMD) for continuous outcomes (vs. WMD)
Pooled odds ratios for binary outcome measures
+ estimates of NNT
(number of patients needed to treat to prevent 1 bad outcome or result in 1 favorable outcome)
95% confidence intervals & tests of homogeneity of effects are reported
Both fixed-effects & random effects models
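To make the continuous-outcome measure concrete, the sketch below computes a standardized mean difference for each trial and pools the results with inverse-variance (fixed-effect) weights. It assumes the Hedges small-sample correction and the usual large-sample variance approximation; the slides do not say which SMD variant was used. The trial summaries are hypothetical.

```python
import math

def smd(mean_t, sd_t, n_t, mean_c, sd_c, n_c):
    """Standardized mean difference (Hedges' g) and its approximate variance."""
    df = n_t + n_c - 2
    sd_pooled = math.sqrt(((n_t - 1) * sd_t ** 2 + (n_c - 1) * sd_c ** 2) / df)
    d = (mean_t - mean_c) / sd_pooled
    j = 1.0 - 3.0 / (4.0 * df - 1.0)  # small-sample correction factor
    g = j * d
    var_g = j ** 2 * ((n_t + n_c) / (n_t * n_c) + d ** 2 / (2.0 * (n_t + n_c)))
    return g, var_g

def pool_fixed(effects, variances, z=1.96):
    """Fixed-effect pooled estimate with a 95% confidence interval."""
    weights = [1.0 / v for v in variances]
    estimate = sum(w * y for w, y in zip(weights, effects)) / sum(weights)
    se = math.sqrt(1.0 / sum(weights))
    return estimate, (estimate - z * se, estimate + z * se)

# Hypothetical trials: (mean, SD, n) for the education arm, then the control arm,
# e.g., days of school absence. Not data from the review.
trials = [
    (3.1, 2.0, 40, 4.0, 2.2, 42),
    (5.0, 3.5, 25, 5.6, 3.0, 27),
    (2.4, 1.8, 60, 3.0, 2.0, 58),
]
results = [smd(*t) for t in trials]
pooled, ci = pool_fixed([g for g, _ in results], [v for _, v in results])
print(f"pooled SMD = {pooled:.2f}, 95% CI {ci[0]:.2f} to {ci[1]:.2f}")
```

Because the SMD is unit-free, trials that measured the same construct on different scales can be combined, which is the main reason to prefer it over the weighted mean difference (WMD) in that situation.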
Overall Results: Lung Function
4 trials involving 141 patients
Significant improvement associated with self-management education:
FEV1 (1 trial, n = 110)
(SMD –0.46, 95% CI –0.84 to –0.08)
Peak flow (PEFR) (3 trials, n = 148)
(SMD –0.53, 95% CI –0.87 to –0.20)
Pooled FEV1 & PEFR (4 trials, n = 258)
(SMD –0.50, 95% CI –0.76 to –0.25)
Lung Function
Overall Results: Asthma Morbidity
18 trials involving 1695 patients
Significant reductions in
days of restricted activity (6 trials, n = 378)
(SMD –0.25,    95% CI –0.46 to –0.05)
days of school absence (16 trials, n = 1626)
 (SMD –0.14,    95% CI –0.23 to –0.04)
nights disturbed by asthma (3 trials, n = 202)
(SMD –0.34, 95% CI –0.62 to –0.05) fixed effects
(SMD –0.39, 95% CI –1.07 to +0.28) random effects
No reduction in proportion of patients experiencing an asthma exacerbation
Days of Restricted Activity
Days of School Absence
Nights Disturbed by Asthma
Health Care Utilization Outcomes
16 trials, 1781 patients
Significant reductions in
number of emergency dept visits (11 trials)
(SMD –0.19,     95% CI –0.31 to –0.07)
No reduction in
proportion of patients visiting emergency dept
number of physician visits
risk of hospitalization
number of hospitalizations
Emergency Department Visits
General Practitioner Visits
Overall Results: Self-reported Perceptions of Self-care Abilities
Significant improvement associated with self-management education:
Self-efficacy (4 trials involving 272 patients)
 (SMD –0.47,     95% CI –0.71 to –0.23)
Subgroup Results
There is not enough evidence to reliably discern differences in effectiveness of self-management education as a function of
asthma severity
number of educational sessions
individual versus group sessions
peak flow vs. symptom management interventions
Conclusions & Future Implications
Evidence from existing clinical trials supports conclusion that self-management educational interventions for children with asthma compared to usual care results in
improved lung function
declines in health care utilization
decreased asthma morbidity (limited effects)
This suggests desirability of incorporating self-management education into routine asthma care for children
Conclusions & Future Implications
Almost all educational programs included prevention & attack management planning components. A small subset included a social skills component.
Many studies were either poorly reported or of less than desired quality, or both
When designing future studies, much more attention needs to be paid to better reporting & higher quality study design
Conclusions & Future Implications
Evidence was unavailable for sufficient numbers of patients to reliably estimate effects for many important subgroups that could inform provider & patient decision making
Because evidence supports the conclusion that education is effective when compared to no education, future studies should directly test alternative interventions against one another rather than against no education controls
Topics – Session 2
Cumulative meta-analysis vs. standard meta-analysis
Differences between fixed and random effects models
Sources of bias
Guidelines for appraising published systematic reviews/meta-analyses
 Standard Meta-analysis (Left) & Cumulative Meta-analysis (Right)
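A standard meta-analysis pools all studies once. A cumulative meta-analysis re-pools each time a study is added, usually in order of publication year, so the reader can see when the evidence became conclusive. A minimal sketch, assuming fixed-effect inverse-variance pooling and hypothetical (year, effect, variance) triples:

```python
import math

def cumulative_meta_analysis(studies):
    """studies: list of (year, effect, variance).

    Returns one row per study in chronological order:
    (year, pooled estimate so far, lower 95% limit, upper 95% limit).
    """
    ordered = sorted(studies, key=lambda s: s[0])
    sum_w = 0.0
    sum_wy = 0.0
    rows = []
    for year, effect, variance in ordered:
        w = 1.0 / variance
        sum_w += w
        sum_wy += w * effect
        estimate = sum_wy / sum_w
        se = math.sqrt(1.0 / sum_w)
        rows.append((year, estimate, estimate - 1.96 * se, estimate + 1.96 * se))
    return rows

# Hypothetical log odds ratios and variances, for illustration only.
studies = [(1985, -0.10, 0.20), (1979, -0.60, 0.30),
           (1990, -0.35, 0.05), (1982, -0.45, 0.25)]

rows = cumulative_meta_analysis(studies)
for year, estimate, low, high in rows:
    print(f"{year}: {estimate:.2f} ({low:.2f} to {high:.2f})")
```

The last row equals the standard (all-studies) fixed-effect result; the earlier rows show how the estimate and its interval evolved.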
Sources of Bias
RCTs (primary studies)
Selection bias
Performance bias
Exclusion bias
Detection bias
Meta-analyses
Publication bias
Language bias
Coder bias
“Apples & oranges” vs “fruit” (heterogeneity)
Quality bias (small studies)
Multiple publication bias
Sources of Bias in a Meta-analysis
Publication bias
Fail-safe N
Funnel Plots
Language bias
Coder bias
“Apples & oranges” vs. “fruit” (heterogeneity)
Quality bias (small studies)
Multiple publication bias
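The fail-safe N listed above asks how many unpublished null-result studies would be needed to make a significant pooled result non-significant. The sketch below assumes Rosenthal's formulation, which combines study z-scores by Stouffer's method and uses a one-tailed 0.05 criterion (z = 1.645). The z-scores are hypothetical.

```python
def fail_safe_n(z_scores, z_alpha=1.645):
    """Rosenthal's fail-safe N: (sum of z)^2 / z_alpha^2 - k, floored at zero."""
    k = len(z_scores)
    n = (sum(z_scores) ** 2) / (z_alpha ** 2) - k
    return max(0.0, n)

# Hypothetical z-scores from five published studies.
z_scores = [2.0, 2.5, 1.8, 3.0, 2.2]
n_fs = fail_safe_n(z_scores)
print(f"about {n_fs:.0f} null studies would be needed")
```

A large fail-safe N relative to the number of published studies suggests the result is less likely to be an artifact of unpublished null findings, although the measure says nothing about the size of the effect.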
An Inverted Funnel Plot to Detect Publication Bias
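A funnel plot places each study's effect estimate on the horizontal axis and a measure of its precision on the vertical axis. Without publication bias, small studies should scatter symmetrically around the pooled estimate. The sketch below only computes the plotting coordinates and flags studies falling outside pseudo 95% limits around a fixed-effect pooled estimate; the data are hypothetical and the choice of limits is an assumption.

```python
import math

def funnel_points(effects, variances):
    """Return the fixed-effect pooled estimate and per-study funnel coordinates."""
    weights = [1.0 / v for v in variances]
    pooled = sum(w * y for w, y in zip(weights, effects)) / sum(weights)
    points = []
    for y, v in zip(effects, variances):
        se = math.sqrt(v)
        points.append({
            "effect": y,                 # horizontal axis
            "se": se,                    # vertical axis (smaller SE plotted higher)
            "precision": 1.0 / se,       # alternative vertical axis
            "inside_funnel": abs(y - pooled) <= 1.96 * se,
        })
    return pooled, points

# Hypothetical data in which the smallest studies show the largest effects.
effects = [-0.90, -0.60, -0.50, -0.30, -0.25]
variances = [0.16, 0.09, 0.04, 0.01, 0.005]

pooled, points = funnel_points(effects, variances)
left_of_pooled = sum(1 for p in points if p["effect"] < pooled)
print(f"pooled = {pooled:.2f}; {left_of_pooled} of {len(points)} studies lie to the left")
```

An excess of imprecise studies on one side of the pooled estimate, as in this made-up example, is the visual asymmetry that raises suspicion of publication bias.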
Pooling Binary & Continuous Outcomes
General principles
Effect size
Confidence Intervals
Types of data
Sources of potential bias
Interpreting results
Examples
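For binary outcomes, a minimal sketch of pooling odds ratios and converting the result to a number needed to treat (NNT) follows. It assumes inverse-variance pooling on the log odds ratio scale (other methods, such as Mantel-Haenszel or Peto, exist), does not handle zero cells, and uses hypothetical 2x2 counts and a hypothetical control event rate.

```python
import math

def log_odds_ratio(events_t, n_t, events_c, n_c):
    """Log odds ratio and its variance from a 2x2 table (no zero cells)."""
    a, b = events_t, n_t - events_t
    c, d = events_c, n_c - events_c
    return math.log((a * d) / (b * c)), 1.0 / a + 1.0 / b + 1.0 / c + 1.0 / d

def pooled_odds_ratio(trials):
    """Fixed-effect pooled odds ratio with a 95% confidence interval."""
    pairs = [log_odds_ratio(*t) for t in trials]
    weights = [1.0 / v for _, v in pairs]
    pooled_log = sum(w * y for w, (y, _) in zip(weights, pairs)) / sum(weights)
    se = math.sqrt(1.0 / sum(weights))
    return math.exp(pooled_log), (math.exp(pooled_log - 1.96 * se),
                                  math.exp(pooled_log + 1.96 * se))

def nnt_from_odds_ratio(odds_ratio, control_event_rate):
    """NNT implied by an odds ratio at a given control event rate."""
    cer = control_event_rate
    eer = odds_ratio * cer / (1.0 - cer + odds_ratio * cer)
    return 1.0 / (cer - eer)

# Hypothetical trials: (events, n) in the treated arm, then the control arm.
trials = [(10, 100, 20, 100), (15, 120, 25, 118), (8, 60, 12, 62)]
pooled_or, ci = pooled_odds_ratio(trials)
nnt = nnt_from_odds_ratio(pooled_or, control_event_rate=0.20)
print(f"OR = {pooled_or:.2f} ({ci[0]:.2f} to {ci[1]:.2f}), NNT at 20% baseline risk = {nnt:.0f}")
```

The NNT depends on the baseline risk, so the same pooled odds ratio implies different NNTs in different patient populations.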
Differences Between Fixed & Random Effects Models
So what’s in a model?
Why does it matter?
How to deal with heterogeneity?
Heterogeneity
Common, to be expected, not the exception
Should do test for homogeneity, but . . . interpret heterogeneity cautiously in spirit of exploratory data analysis
Exploring sources of heterogeneity can lead to insights about modification of apparent associations by various aspects of
Study design
Exposure measurements
Study populations
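The test for homogeneity mentioned above is usually Cochran's Q: the weighted sum of squared deviations of each study's effect from the fixed-effect pooled estimate, compared with a chi-square distribution on k – 1 degrees of freedom. The sketch below also reports I², a descriptive statistic that is not part of these slides and is included here as an assumption about what a reader might want alongside Q. The data are hypothetical.

```python
def heterogeneity(effects, variances):
    """Return Cochran's Q, its degrees of freedom, and I-squared (percent)."""
    weights = [1.0 / v for v in variances]
    pooled = sum(w * y for w, y in zip(weights, effects)) / sum(weights)
    q = sum(w * (y - pooled) ** 2 for w, y in zip(weights, effects))
    df = len(effects) - 1
    i_squared = max(0.0, (q - df) / q) * 100.0 if q > 0 else 0.0
    return q, df, i_squared

# Hypothetical, deliberately heterogeneous effects with equal variances.
q, df, i_squared = heterogeneity([-0.1, -0.5, -0.9], [0.01, 0.01, 0.01])
print(f"Q = {q:.1f} on {df} df, I^2 = {i_squared:.0f}%")
```

Q has low power when there are few studies, which is one reason to interpret a non-significant result cautiously.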
Heterogeneity
Relations discovered in process of exploring heterogeneity may be useful in planning & carrying out new studies
Excluding outliers solely on basis of disagreement with other studies can lead to seriously biased summary estimates (avoid)
Easier to interpret sources of heterogeneity when identified in advance of data analysis
(not when suggested only by data)
Fixed & Random Effects
Fixed effects models assume that an intervention has a single true effect
Random effects models assume that an effect may vary across studies
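The two assumptions can be written compactly. In this sketch, y_i is the observed effect in study i, v_i its within-study variance, and τ² the between-study variance; the normal distributions are the conventional assumption, not something stated on the slide.

```latex
\begin{aligned}
\text{Fixed effect:}\quad & y_i = \theta + \varepsilon_i,
  & \varepsilon_i &\sim N(0, v_i) \\
\text{Random effects:}\quad & y_i = \theta_i + \varepsilon_i,\quad
  \theta_i \sim N(\mu, \tau^2),
  & \varepsilon_i &\sim N(0, v_i)
\end{aligned}
```

When τ² = 0 every θ_i equals μ and the random effects model reduces to the fixed effect model.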
Random Effects
Assumes sample of studies randomly drawn from population of studies
This is NOT typically true because:
All trials are included
Trials are systematically (e.g., conveniently) sampled and not randomly sampled
Random Effects
Primary value of M-A is in search for predictors of between-study heterogeneity
Random-effects summary is last resort only when predictors or causes of between-study heterogeneity cannot be identified
Random-effects can conceal fact that summary estimate or fitted model is poor summary of the data
Sander Greenland.  Am J Epidemiol 1994;140;290-6.
Random Effects
Sometimes needed, but more sensitive to publication bias than fixed-effects
Random effects weights vary less across studies than fixed-effects weights
w = 1/v (fixed) versus w* = 1/(v + τ²) (random), where v is the within-study variance and τ² the between-study variance
Leads to reduced variation in weights
Thus smaller studies given larger relative weights when random effects models used
Thus influenced more strongly by any tendency NOT to publish small statistically insignificant studies
→ biased estimate, spuriously strong associations
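The effect of adding the between-study variance to every study's variance can be seen by comparing relative weights. In this sketch the three variances (a small, a medium, and a large study) and the value of τ² are hypothetical.

```python
def relative_weights(variances, tau2=0.0):
    """Normalized weights 1/(v + tau2); tau2 = 0 gives fixed-effect weights."""
    raw = [1.0 / (v + tau2) for v in variances]
    total = sum(raw)
    return [w / total for w in raw]

variances = [0.20, 0.05, 0.01]  # small, medium, large study (hypothetical)
fixed_w = relative_weights(variances)
random_w = relative_weights(variances, tau2=0.05)

for label, f, r in zip(("small", "medium", "large"), fixed_w, random_w):
    print(f"{label:6s} fixed = {f:.2f}  random = {r:.2f}")
```

The smallest study's share of the total weight rises under random effects while the largest study's share falls, which is why selective non-publication of small studies distorts a random effects summary more.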
Random Effects
Fixed effects weights vs. random effects weights
w = 1/v versus w* = 1/(v + τ²)
Identical when there is little or no between study variation
When they differ, confidence intervals are wider for random effects than for fixed effects
Smaller studies are given larger relative weights in random effects models & have greater influence
Conversely, influence of larger studies is less
May result in a type II (beta) error, i.e., finding no significant difference when one truly exists
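The sketch below pools the same hypothetical data both ways to show the wider random effects interval. It assumes the DerSimonian-Laird moment estimator for the between-study variance τ², which is one common choice and is not named on the slide.

```python
import math

def pool(effects, variances, tau2=0.0):
    """Inverse-variance pooled estimate and 95% CI; tau2 > 0 gives random effects."""
    weights = [1.0 / (v + tau2) for v in variances]
    estimate = sum(w * y for w, y in zip(weights, effects)) / sum(weights)
    se = math.sqrt(1.0 / sum(weights))
    return estimate, (estimate - 1.96 * se, estimate + 1.96 * se)

def dersimonian_laird_tau2(effects, variances):
    """Moment estimate of the between-study variance, truncated at zero."""
    weights = [1.0 / v for v in variances]
    sum_w = sum(weights)
    pooled = sum(w * y for w, y in zip(weights, effects)) / sum_w
    q = sum(w * (y - pooled) ** 2 for w, y in zip(weights, effects))
    c = sum_w - sum(w ** 2 for w in weights) / sum_w
    return max(0.0, (q - (len(effects) - 1)) / c)

# Hypothetical, heterogeneous SMDs and variances; not data from the review.
effects = [-0.90, -0.05, -0.20]
variances = [0.04, 0.03, 0.05]

tau2 = dersimonian_laird_tau2(effects, variances)
fixed_est, fixed_ci = pool(effects, variances)
random_est, random_ci = pool(effects, variances, tau2=tau2)
print(f"fixed : {fixed_est:.2f} ({fixed_ci[0]:.2f} to {fixed_ci[1]:.2f})")
print(f"random: {random_est:.2f} ({random_ci[0]:.2f} to {random_ci[1]:.2f})")
```

With these made-up numbers the fixed effects interval excludes zero while the random effects interval includes it, the same pattern reported for nights disturbed by asthma.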
Nights Disturbed by Asthma
Methodologic Choices & Their Implications in Dealing With Heterogeneous Data in a Meta-analysis
Guidelines for Appraising Published Systematic Reviews/Meta-analyses
User’s guide criteria
Sources of bias
Making Sense of Evidence: 10 Questions to Help You Make Sense of a Review
Three broad issues need to be considered when appraising research:
A:  Are the results of the study valid?
B:  What are the results?
C:   Will the results help locally?
Based on worksheet developed by the Critical Assessment Skills Programme (CASP).  http://www.phru.org.uk/~casp/
Questions are adapted from Oxman AD et al. Users’ Guides to the Medical Literature: VI. How to use an overview. JAMA 1994; 272 (17): 1367-1371.
Screening Questions
A: Are the results of the study valid?
(Yes/ Can’t tell/ No)
1. Did the review address a clearly focused research question?
HINT: A research question should be “focused” in terms of: 1. population studied, 2. intervention given or exposure, 3. outcomes considered
 2. Did the review include the right type of studies?
HINT: These would: 1. address the review’s research question, 2. have an appropriate study design
 Is it worth continuing based on answers to above?
Detailed Questions (Yes/ Can’t tell/ No)
3. Did the reviewers try to identify all relevant studies?
 HINT: Look for: 1. which bibliographic databases were used, 2. follow-up from reference lists, 3. personal contact with experts, 4. search for unpublished studies, 5. search for non-English language studies
4. Did reviewers assess quality of the included studies?
 HINT: A clear, pre-determined strategy should be used to determine which studies are included.  Look for: 1. a scoring system, 2. more than one assessor
Detailed Questions (Yes/ Can’t tell/ No)
5.  If results of studies have been combined, was it reasonable to do so?
HINT: Consider whether: 1. results of each study are clearly displayed, 2. results were similar from study to study (look for tests of heterogeneity), 3. reasons for any variations in results are discussed
B: What are the results?
6. What are the main results of the review?
HINT: Consider 1. how the results are expressed (e.g., odds ratio, relative risks etc.), 2. what the results are
7. Could these results be due to chance?
HINT: Look for tests of statistical significance (p-values) and confidence intervals (CIs)
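When a review reports only an estimate and its 95% confidence interval, an approximate two-sided p-value can be recovered. The sketch below assumes a normal sampling distribution and a symmetric interval; for ratio measures such as odds ratios the calculation should be done on the log scale. The example values are the emergency department visit result reported earlier (SMD –0.19, 95% CI –0.31 to –0.07).

```python
import math

def p_value_from_ci(estimate, lower, upper, z_crit=1.96):
    """Approximate z statistic and two-sided p-value from an estimate and its 95% CI."""
    se = (upper - lower) / (2.0 * z_crit)
    z = estimate / se
    p = math.erfc(abs(z) / math.sqrt(2.0))  # two-sided normal tail probability
    return z, p

z, p = p_value_from_ci(-0.19, -0.31, -0.07)
print(f"z = {z:.2f}, p = {p:.4f}")
```

An interval that excludes the null value (0 for a difference, 1 for a ratio) corresponds to p < 0.05, so the two checks in the hint carry the same information about chance.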
C: Will the Results Help Locally? (Yes/ Can’t Tell/ No)
8. Can the results be applied to the local population? Can the results be applied to your cousin?
HINT: Consider whether: 1. the population sample covered by the review could be sufficiently different from your population to cause concern, 2. your local setting is likely to differ much from that of the review
9. Were all important outcomes considered?
HINT: Consider outcomes from the point of view of the: 1. individual, 2. policy makers and practitioners, 3. family/carers, 4. wider community
C: Will the Results Help Locally? (Yes/ Can’t Tell/ No)
10. Should policy or practice change as a result of evidence contained in this review?
HINT: Consider whether benefits are worth the harms and costs
Additional Topics - Not Covered
Meta-regression
Bayesian MA
Pooling non-randomized data
Pooling/MA of diagnostic test data
Individual patient data MA
Consort MA guidelines
Missing data – imputation, bias?
For Additional Information, Please Contact:
Fredric M. Wolf, Ph.D.
Professor & Chair, Dept. of Medical Education
& Biomedical Informatics
Adjunct Professor of Health Services
University of Washington
E-312 Health Sciences/Box 357240
Seattle, WA  98195-7240   USA
Tel: 206.543.2259  FAX: 206.543.3461
E-mail:  wolf@u.washington.edu