FAQ

From CAMIH

Level of Evidence

Version Oxford 2011

The Oxford Centre for Evidence-Based Medicine (OCEBM) 2011 Levels of Evidence is a framework used to rate the quality and strength of evidence in medical research and clinical practice. This system was created by the Oxford Centre for Evidence-Based Medicine to guide clinicians and researchers in assessing the reliability of evidence when making medical decisions. The 2011 version offers a granular approach, categorizing levels based on the type of clinical question (e.g., therapy vs. diagnosis), which makes it useful for selecting studies specific to various types of medical queries.

Clinicians use the OCEBM Levels of Evidence to assess the quality of studies when reviewing medical literature. Higher-level evidence (Levels 1 and 2) is generally prioritized in clinical decision-making, whereas lower-level evidence (Levels 4 and 5) is relied upon when stronger studies are unavailable. This system helps clinicians make evidence-based choices, supporting patient outcomes through research-backed practices.


Oxford Centre for Evidence-Based Medicine (OCEBM) 2011 Levels of Evidence
Each question below lists the preferred study designs from Step 1 (Level 1) to Step 5 (Level 5).

How common is the problem?

  • Step 1 (Level 1): Local and current random sample surveys (or censuses)
  • Step 2 (Level 2): Systematic review of surveys that allow matching to local circumstances⁎⁎
  • Step 3 (Level 3): Local non-random sample⁎⁎
  • Step 4 (Level 4): Case-series⁎⁎
  • Step 5 (Level 5): n/a

Is this diagnostic or monitoring test accurate? (Diagnosis)

  • Step 1 (Level 1): Systematic review of cross sectional studies with consistently applied reference standard and blinding
  • Step 2 (Level 2): Individual cross sectional studies with consistently applied reference standard and blinding
  • Step 3 (Level 3): Non-consecutive studies, or studies without consistently applied reference standards⁎⁎
  • Step 4 (Level 4): Case-control studies, or “poor or non-independent reference standard”⁎⁎
  • Step 5 (Level 5): Mechanism-based reasoning

What will happen if we do not add a therapy? (Prognosis)

  • Step 1 (Level 1): Systematic review of inception cohort studies
  • Step 2 (Level 2): Inception cohort studies
  • Step 3 (Level 3): Cohort study or control arm of randomized trial⁎⁎
  • Step 4 (Level 4): Case-series or case-control studies, or poor quality prognostic cohort study⁎⁎
  • Step 5 (Level 5): n/a

Does this intervention help? (Treatment Benefits)

  • Step 1 (Level 1): Systematic review of randomized trials or n-of-1 trials
  • Step 2 (Level 2): Randomized trial or observational study with dramatic effect
  • Step 3 (Level 3): Non-randomized controlled cohort/follow-up study⁎⁎
  • Step 4 (Level 4): Case-series, case-control studies, or historically controlled studies⁎⁎
  • Step 5 (Level 5): Mechanism-based reasoning

What are the COMMON harms? (Treatment Harms)

  • Step 1 (Level 1): Systematic review of randomized trials, systematic review of nested case-control studies, n-of-1 trial with the patient you are raising the question about, or observational study with dramatic effect
  • Step 2 (Level 2): Individual randomized trial or (exceptionally) observational study with dramatic effect
  • Step 3 (Level 3): Non-randomized controlled cohort/follow-up study (post-marketing surveillance) provided there are sufficient numbers to rule out a common harm. (For long-term harms the duration of follow-up must be sufficient.)⁎⁎
  • Step 4 (Level 4): Case-series, case-control, or historically controlled studies⁎⁎
  • Step 5 (Level 5): Mechanism-based reasoning

What are the RARE harms? (Treatment Harms)

  • Step 1 (Level 1): Systematic review of randomized trials or n-of-1 trial
  • Step 2 (Level 2): Randomized trial or (exceptionally) observational study with dramatic effect
  • Step 3 (Level 3): Non-randomized controlled cohort/follow-up study (post-marketing surveillance) provided there are sufficient numbers to rule out a common harm. (For long-term harms the duration of follow-up must be sufficient.)⁎⁎
  • Step 4 (Level 4): Case-series, case-control, or historically controlled studies⁎⁎
  • Step 5 (Level 5): Mechanism-based reasoning

Is this (early detection) test worthwhile? (Screening)

  • Step 1 (Level 1): Systematic review of randomized trials
  • Step 2 (Level 2): Randomized trial
  • Step 3 (Level 3): Non-randomized controlled cohort/follow-up study⁎⁎
  • Step 4 (Level 4): Case-series, case-control, or historically controlled studies⁎⁎
  • Step 5 (Level 5): Mechanism-based reasoning


Level may be graded down on the basis of study quality, imprecision, indirectness (the study PICO does not match the question's PICO), inconsistency between studies, or a very small absolute effect size; Level may be graded up if there is a large or very large effect size.

⁎⁎ As always, a systematic review is generally better than an individual study.


Reference: http://www.cebm.net/index.aspx?o=5653; OCEBM Table of Evidence Working Group = Jeremy Howick, Iain Chalmers (James Lind Library), Paul Glasziou, Trish Greenhalgh, Carl Heneghan, Alessandro Liberati, Ivan Moschetti, Bob Phillips, Hazel Thornton, Olive Goddard and Mary Hodgkinson.

Version Oxford 2008

The Oxford Centre for Evidence-Based Medicine (OCEBM) 2008 Levels of Evidence is an earlier framework designed to categorize the quality of medical evidence based on study design and reliability. Like the 2011 update, it was created to help clinicians assess the strength of evidence when making healthcare decisions. The 2008 version ranks evidence according to study type.

The 2008 levels are organized from Level 1 (highest quality) to Level 5 (lowest quality).


Level Therapy / Prevention, Aetiology / Harm
1a SR (with homogeneity) of RCTs
1b Individual RCT (with narrow confidence interval)
1c All or none
2a SR (with homogeneity) of cohort studies
2b Individual cohort study (including low quality RCT; e.g., <80% follow-up)
2c “Outcomes” Research; Ecological studies
3a SR (with homogeneity) of case-control studies
3b Individual Case-Control Study
4 Case-series (and poor quality cohort and case-control studies)
5 Expert opinion without explicit critical appraisal, or based on physiology, bench research or “first principles”
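
Because the 2008 levels mix digits and letters (1a, 2b, 4, ...), ranking a reading list by level needs a small sort key. The following is a minimal sketch, not part of the OCEBM material: the function name `level_sort_key` and the study labels are hypothetical, and the set of valid codes is taken from the table above.

```python
# Level codes as listed in the 2008 Therapy / Prevention, Aetiology / Harm table.
VALID_LEVELS = {"1a", "1b", "1c", "2a", "2b", "2c", "3a", "3b", "4", "5"}


def level_sort_key(level: str) -> tuple:
    """Return a key that orders OCEBM 2008 level codes from highest to lowest quality."""
    code = level.strip().lower()
    if code not in VALID_LEVELS:
        raise ValueError(f"unknown OCEBM 2008 level: {level!r}")
    # The leading digit is the main level; the optional letter is the sub-level.
    return int(code[0]), code[1:]


# Hypothetical reading list: (study label, assigned level).
studies = [
    ("case series", "4"),
    ("individual cohort study", "2b"),
    ("expert opinion", "5"),
    ("systematic review of RCTs", "1a"),
    ("individual case-control study", "3b"),
    ("individual RCT", "1b"),
]

ranked = sorted(studies, key=lambda study: level_sort_key(study[1]))
for label, level in ranked:
    print(level, label)
```

The key only orders the codes; it does not account for grading a study up or down on quality grounds.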


Reference: Produced by Bob Phillips, Chris Ball, Dave Sackett, Doug Badenoch, Sharon Straus, Brian Haynes, Martin Dawes since November 1998. Updated by Jeremy Howick March 2009.


Cochrane RoB Tool 2.0

The Cochrane Risk of Bias Tool 2.0 (RoB 2.0) is an updated tool developed by the Cochrane Collaboration to assess the risk of bias in randomized controlled trials (RCTs). It is part of the Cochrane Handbook for Systematic Reviews of Interventions and is used to evaluate the methodological quality of studies included in systematic reviews.

Focus on Specific Domains

RoB 2.0 evaluates five key domains that could introduce bias in RCTs:

  • Randomization process: Assessing how participants were randomly assigned to groups.
  • Deviations from intended interventions: Considering whether participants received the interventions they were assigned to.
  • Missing outcome data: Evaluating how missing data were handled and their impact on the results.
  • Measurement of the outcome: Checking if the outcome measurements were conducted in a way that minimized bias.
  • Selection of the reported result: Assessing whether the results reported were pre-specified and consistent with the planned analysis.

Rating each domain

Each domain is rated as:

  • Low risk of bias
  • Some concerns
  • High risk of bias
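
The domain ratings are usually combined into one overall judgement per result. The sketch below follows the rule described in the RoB 2 guidance as we understand it: low overall only if every domain is low, high if any domain is high, otherwise some concerns. The guidance also allows an overall rating of high when several domains raise some concerns in a way that substantially lowers confidence; that is a reviewer judgement, so it is passed in here as a flag. The function and parameter names are hypothetical and are not part of the tool.

```python
DOMAINS = (
    "randomization process",
    "deviations from intended interventions",
    "missing outcome data",
    "measurement of the outcome",
    "selection of the reported result",
)
RATINGS = ("low", "some concerns", "high")


def overall_risk_of_bias(ratings, concerns_lower_confidence=False):
    """Combine the five domain ratings into one overall judgement.

    ratings: dict mapping each name in DOMAINS to a value in RATINGS.
    concerns_lower_confidence: reviewer judgement (assumed input) that
        "some concerns" in multiple domains substantially lowers confidence.
    """
    missing = [d for d in DOMAINS if d not in ratings]
    if missing:
        raise ValueError(f"missing domain ratings: {missing}")
    values = [ratings[d] for d in DOMAINS]
    invalid = [v for v in values if v not in RATINGS]
    if invalid:
        raise ValueError(f"invalid ratings: {invalid}")

    if "high" in values:
        return "high"
    n_concerns = values.count("some concerns")
    if n_concerns >= 2 and concerns_lower_confidence:
        return "high"
    if n_concerns >= 1:
        return "some concerns"
    return "low"


example = {d: "low" for d in DOMAINS}
example["missing outcome data"] = "some concerns"
print(overall_risk_of_bias(example))  # some concerns
```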

Purpose and Importance

The tool encourages reviewers to consider the context of the study and the potential impact of bias on the study's findings. It can be used for various types of outcomes (e.g., dichotomous, continuous) and different intervention types. By systematically assessing the risk of bias in included studies, reviewers can better determine the reliability of the findings and make informed conclusions about the effectiveness of interventions.


Reference: Sterne JAC, Savović J, Page MJ, Elbers RG, Blencowe NS, Boutron I, Cates CJ, Cheng H-Y, Corbett MS, Eldridge SM, Hernán MA, Hopewell S, Hróbjartsson A, Junqueira DR, Jüni P, Kirkham JJ, Lasserson T, Li T, McAleenan A, Reeves BC, Shepperd S, Shrier I, Stewart LA, Tilling K, White IR, Whiting PF, Higgins JPT. RoB 2: a revised tool for assessing risk of bias in randomised trials. BMJ 2019; 366: l4898.


Analysis

Intent-to-treat (ITT) and per protocol (PP) analyses are both used in clinical trials to estimate the effect of an intervention, but they differ in which participants they include and in how they handle non-adherence to the study protocol.

Intent-to-Treat (ITT) Analysis

ITT analysis includes all randomized participants in the groups to which they were originally assigned, regardless of whether they completed the study, adhered to the protocol, or received the intended intervention.

Advantages:

  • By preserving randomization, ITT helps prevent bias that could arise from differential dropout rates or non-adherence.
  • ITT analysis provides a more realistic estimate of how the intervention would perform in a typical clinical setting, where not all patients adhere to treatment.

Disadvantages:

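  • ITT can underestimate the effect of actually receiving the treatment, because outcomes from participants who did not start or complete the intervention are counted in the treatment group and dilute the difference between groups.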
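
The dilution of the ITT estimate can be illustrated with simple arithmetic. This is a sketch under strong simplifying assumptions that are ours, not the source's: non-adherent participants in the treatment arm get no benefit and otherwise have the same event risk as the control arm. All numbers are made up for illustration.

```python
def itt_risk_difference(control_risk, adherent_treated_risk, adherence):
    """Risk difference (control minus treatment arm) as seen by an ITT analysis.

    Assumption: non-adherers in the treatment arm have the control-arm risk.
    adherence is the fraction of the treatment arm that takes the treatment.
    """
    if not 0.0 <= adherence <= 1.0:
        raise ValueError("adherence must be between 0 and 1")
    treatment_arm_risk = (
        adherence * adherent_treated_risk + (1.0 - adherence) * control_risk
    )
    return control_risk - treatment_arm_risk


# Hypothetical risks: 40% event risk without treatment, 20% with full treatment.
full = itt_risk_difference(0.40, 0.20, adherence=1.0)
diluted = itt_risk_difference(0.40, 0.20, adherence=0.8)
print(f"{full:.2f}")     # 0.20
print(f"{diluted:.2f}")  # 0.16
```

With 80% adherence the ITT estimate shrinks from a 20-point to a 16-point risk reduction, which is closer to what a clinic with similar adherence would observe.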

Modified Intent-to-Treat (mITT) Analysis

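mITT analysis starts from all randomized participants but allows certain predefined exclusions. There is no single standard definition, so the exact rules vary between trials and should be stated in the protocol. Common modifications include: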

  • Excluding participants who did not meet specific eligibility criteria after randomization.
  • Including only those who received at least some treatment (e.g., at least one dose of the intervention).
  • Excluding participants who withdrew consent or were lost to follow-up in a way that is predefined in the study protocol.

Per Protocol (PP) Analysis

Per protocol analysis includes only those participants who completed the study according to the original protocol. This means they adhered to the assigned treatment and followed all study procedures.

Advantages:

  • By focusing on participants who adhered to the protocol, per protocol analysis can provide a clearer picture of the efficacy of the intervention, that is, its effect when taken as intended.

Disadvantages:

  • Excluding non-compliant participants can introduce bias, as the reasons for non-compliance may be related to the treatment’s effectiveness or safety.
  • Per protocol results may not reflect real-world scenarios, as they do not account for non-adherence and dropouts common in clinical practice.
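
The three analysis populations can be compared on a toy dataset. This is a minimal sketch with invented participants: the field names, the "at least one dose" rule for mITT, and the assumption that outcomes are known for everyone (including dropouts) are illustrative choices, not definitions from a specific trial.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Participant:
    pid: str
    assigned: str                 # arm assigned at randomization
    doses_received: int           # doses of the assigned intervention taken
    completed_per_protocol: bool  # adhered and followed all study procedures
    event: bool                   # whether the outcome event occurred


def itt_population(participants):
    """Everyone randomized, analysed in the arm they were assigned to."""
    return list(participants)


def mitt_population(participants):
    """One possible mITT rule (assumed here): at least one dose received."""
    return [p for p in participants if p.doses_received >= 1]


def pp_population(participants):
    """Only participants who completed the study according to the protocol."""
    return [p for p in participants if p.completed_per_protocol]


def event_risk(participants, arm):
    """Proportion of participants in the given arm who had the event."""
    in_arm = [p for p in participants if p.assigned == arm]
    return sum(p.event for p in in_arm) / len(in_arm)


# Hypothetical trial with four participants per arm.
trial = [
    Participant("T1", "treatment", 10, True, False),
    Participant("T2", "treatment", 10, True, False),
    Participant("T3", "treatment", 3, False, True),
    Participant("T4", "treatment", 0, False, True),
    Participant("C1", "control", 10, True, True),
    Participant("C2", "control", 10, True, False),
    Participant("C3", "control", 10, True, True),
    Participant("C4", "control", 5, False, True),
]

for name, select in [("ITT", itt_population), ("mITT", mitt_population), ("PP", pp_population)]:
    population = select(trial)
    difference = event_risk(population, "control") - event_risk(population, "treatment")
    print(f"{name}: n={len(population)}, risk difference={difference:.2f}")
```

In this invented example the apparent benefit grows as non-adherent participants are removed, which shows why per protocol results can look more favourable than ITT results and why the exclusions themselves can introduce bias.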