Update on Methods: Insufficient Evidence


Preface

By Diana Petitti, MD, MPH;a Steven M. Teutsch, MD;b Mary B. Barton, MD, MPP;c George F. Sawaya, MD;d Judith K. Ockene, PhD, MEd;e Thomas DeWitt, MD,f on behalf of the United States Preventive Services Task Force

This article was first published in the Annals of Internal Medicine.


Abstract

The United States Preventive Services Task Force (USPSTF) seeks to provide reliable and accurate evidence-based recommendations to primary care clinicians. However, clinicians indicate frustration with the lack of guidance provided by the USPSTF when the evidence is insufficient to make a recommendation. This publication describes a new USPSTF plan to commission its Evidence-based Practice Centers to collect information in four "domains" pertinent to clinical decisions about prevention and to report this information routinely. The four domains are: potential preventable burden, potential harm of the intervention, costs (both monetary and opportunity), and current practice. The process and rationale used to select these domains are presented, along with illustrations of the potential use of the information by clinicians to guide clinical decisionmaking when evidence is insufficient.

Introduction

The United States Preventive Services Task Force (USPSTF) is an independent panel of experts that is convened and supported by the Agency for Healthcare Research and Quality (AHRQ). The U.S. Congress has charged it to review the scientific evidence for clinical preventive services and develop evidence-based recommendations about their delivery. In its recommendations, the USPSTF seeks to maximize population health benefits while simultaneously minimizing harms. The primary care clinician is the target audience for USPSTF recommendations, but the recommendations are widely used by others.1 The USPSTF processes and methods are continually examined; recent updates have been published.1-3 The USPSTF Procedure Manual has been posted on the AHRQ website.4

In the current issue of Annals of Internal Medicine, the USPSTF reports that it concluded that the evidence to determine whether the benefits of skin cancer screening outweigh the harms was insufficient.5 No recommendation was made, and no letter grade was assigned. Instead, the USPSTF issued an I Statement.2 Insufficiency of evidence is a common occurrence for topics considered by the USPSTF. Even for screening topics that pertain to all or a large majority of adults, children, or adolescents, evidence is often insufficient (Table 1).6

The release of the I Statement for skin cancer screening provides an opportunity for the USPSTF to describe its plan to expand the kinds of information it commissions to be collected and reported routinely by Evidence-based Practice Centers, to describe the process that led to selection of this information, and to illustrate the uses of the information with examples.

Problem Statement

Primary care physicians and their professional societies expressed frustration with the frequency with which the USPSTF concluded that "evidence is insufficient" to make a recommendation. In the past, the USPSTF coupled this conclusion with a "recommendation" worded as follows:

...the USPSTF concludes that the evidence is insufficient to recommend for or against routine provision of xxx service.

Clinicians pointed out that this wording is not a recommendation. Anecdotally, the statement was characterized as "useless" and sometimes as "worse than useless."

In focus groups with practicing primary care providers, a common request was for USPSTF guidance on a course of action with individual patients in situations where evidence about net benefit is insufficient. Professional society representatives reinforced the need for guidance.

The USPSTF and other bodies have generally held that the strongest argument for providing an intervention is based on scientific evidence provided by multiple large and well-conducted randomized clinical trials (RCTs). However, for the majority of clinical preventive services, this standard of evidence is unattainable.

Requirements for RCTs of behavioral counseling interventions are especially problematic, because the study interventions in gold-standard RCTs may be artificial. Tucker and Roth discuss this problem in the context of behavioral interventions for substance abuse.7 They point out that requiring "fidelity" in treatment delivery in a "gold-standard" RCT may eliminate the contextual aspects of the treatment experience and the adaptation of treatment to individual needs that underlie treatment success. Requiring a no-treatment or usual-treatment control condition may produce insurmountable barriers to recruitment. Requiring a control condition with the same number of contact hours as the treatment condition can compromise retention.

For different reasons, the conduct of RCTs for preventive services delivered to infants, children, and adolescents also presents challenges. For some preventive services, the long timeline required to improve health outcomes makes RCTs impractical. Consensus is lacking about the appropriate outcomes for preventive interventions in children, although agreement is universal that decreasing mortality is not the only goal of preventive services provided to this age group.

Finally, many valuable preventive interventions will never be subjected to RCT evaluation because a trial would be too expensive, recruiting enough participants is not feasible, or investigator interest or funding is lacking.

Acknowledging the Contribution of Non-RCT Evidence and Persisting Issues

Recognizing the paucity of evidence from RCTs, the USPSTF and other groups consider evidence from non-RCT study designs (such as cohort, cross-sectional, case-control, or quasi-experimental) as a standard strategy. Use of an analytic framework, which is an organizing principle for all recent or current USPSTF systematic reviews, permits incorporation of evidence from studies with a variety of designs and yields certainty that can theoretically approach the certainty of evidence derived from RCTs.

In constructing an analytic framework, clinical problems are conceptualized in terms of a sequence of key questions.8 A systematic review is generally done to answer each key question. When considered together, the key question evidence forms a chain of evidence that permits firm conclusions about net benefit.

In the case of skin cancer, notwithstanding the use of an analytic framework and consideration of study designs other than RCTs, the USPSTF could not conclude with even moderate certainty that the benefits of skin cancer screening by inspection outweighed the harms or that the harms outweighed benefits, making the evidence insufficient to make a recommendation.9 Even with consideration of non-RCT evidence within a structured causal framework, the problem of insufficient evidence persists.

A New Approach: Process and Outcome

In response to the concerns about the frequency of I statements, the frustration expressed by clinicians, and the call for guidance, the USPSTF held a workshop in spring 2005 to consider how better to meet the needs of its constituents when evidence is insufficient. The workshop involved members of the USPSTF, AHRQ staff, and scientists from the Evidence-based Practice Center supporting the USPSTF. The charge to workshop participants was to develop a strategy to reduce confusion created by the wording of the "I recommendation," and to consider whether the USPSTF should "nuance the I" (that is, make a suggestion in favor of or against providing the service). At the workshop, attendees heard presentations from AHRQ and Evidence-based Practice Center staff involved in the efforts of the Grading of Recommendations, Assessment, Development, and Evaluation (GRADE) working group.10-12 Other publications about approaches to grading recommendations were identified and reviewed.13

The workshop resulted in a decision to transform what had been called an "I recommendation" into an "I statement," as described in a prior publication.2 The workshop also led to a group decision to reject the proposal to "nuance the I." Nuancing would have resulted in recommendations that clinicians act routinely to offer a service even in the absence of at least moderate certainty that there are net benefits of the preventive service at the population level, thus violating an underlying principle guiding the work of the USPSTF: avoidance of overall harm.

Prior to the workshop, attendees were charged with bringing to the meeting suggestions for criteria that might be used to "nuance the I," with practice relevance for clinicians, patients, or systems as the basis for these suggestions. During the workshop, further criteria were identified using a brainstorming technique. The product of these processes was a list of possible factors that could be used to "nuance the I." After rejecting the idea of "nuancing the I," the USPSTF decided to explore, after the workshop, whether provision of information about the factors identified during this process might be useful to clinicians. This follow-up work was delegated to the Methods Workgroup and members of the Evidence-based Practice Center.

The follow-up group noted that almost all of the factors fell into 1 of 4 groups. The 4 information groups came to be described as "domains," a term that connotes hierarchical ranking and is apt, even if accidental. That is, these information "domains" constitute a limited number of broadly applicable collections of factors, considerations, or attributes pertinent to decisionmaking about preventive services when evidence is insufficient to conclude, with certainty, that there is net benefit or net harm.

After more deliberation, in 2006 the Methods Workgroup proposed that the Evidence-based Practice Center gather pertinent information in the four domains as part of its evidence retrieval process. The full USPSTF accepted the proposal, and members provided further input to the descriptions of the domains. The current authors were asked to prepare a manuscript about the process and product on behalf of the USPSTF.

Domains and Rationale

The first domain is potential preventable burden of suffering from the condition. When evidence is insufficient, provision of a preventive intervention designed to prevent a serious condition (such as dementia) might be viewed more favorably than provision of a service designed to prevent a condition that does not cause as much suffering (such as skin rash). The USPSTF recognized that "burden of suffering" is subjective and involves judgment. In clinical settings, it should be informed by patient values and concerns.

The second domain is potential harm of the intervention. In the face of insufficient evidence, an intervention with a large potential for harm (such as major surgery) might be viewed less favorably than an intervention with a small potential for harm (such as advice to "watch less television"). The USPSTF again acknowledges the subjective nature and the difficulty of assessing potential harms. For example, how bad is a "mild" stroke?

The third domain is cost—not just monetary cost, but opportunity cost, in particular the amount of time a provider spends in order to provide the service, the amount of time the patient spends in order to partake of it, and the benefits that might derive from alternative uses of the time or money either for patients or clinicians or systems. Consideration of clinician time is especially important for preventive services with only insufficient evidence because providing them could "crowd out" provision of preventive services with proven value, services for conditions that require immediate action, or services more desired by the patient. For example, a decision to routinely inspect the skin could take up the time available to discuss smoking cessation, or to address an acute problem or a minor injury that the patient considers important.

The fourth domain is current practice. This domain was chosen because it is important to clinicians for at least 2 reasons. Clinicians justifiably fear that not doing something that is done on a widespread basis in the community may lead to litigation.14-15 More important, addressing patient expectations constitutes a crucial part of the clinician-patient relationship with respect to building trust and developing a collaborative therapeutic relationship. The consequences of not providing a service that is neither widely available nor widely used are less serious than not providing a service accepted by the medical profession and thus expected by patients. Furthermore, ingrained care practices are difficult to change, and efforts should preferentially be directed to changing those practices for which the evidence to support change is compelling.

Although the reviewers did not explicitly recognize it when these domains were chosen, the domains all involve consideration of the potential consequences (for patients, clinicians, and systems) of providing or not providing a service. Others writing about medical decisionmaking in the face of uncertainty have suggested that the consequences of action or inaction should play a prominent role in decisions.16-17

Decision Making

Decisionmakers do not have the luxury of waiting for certain evidence. Even though evidence is insufficient, the clinician must still provide advice, patients must make choices, and policymakers must establish policies.

Decisionmakers appropriately consider a broad array of information when making policies and recommendations for different settings.18-22 The Grading of Recommendations, Assessment, Development, and Evaluation (GRADE) working group approach10-12 contrasts with that of the USPSTF. For every topic, the GRADE methodology results in a recommendation that is assigned 1 of 4 grades: weakly or strongly positive or negative. In arriving at the 4 recommendation grades, GRADE, like the USPSTF, assesses the quality of evidence about benefits, reviews the evidence about harms, and arrives at a judgment on the magnitude of benefits and harms and its confidence about them. Unlike the USPSTF system, the GRADE recommendation grades also take into account the importance of the outcome that the treatment prevents, the burdens of the therapy, the monetary costs of the therapy, and the guideline developers' estimation of the average person's values and preferences. In the GRADE approach, there is no category corresponding to the USPSTF I statement.

After deliberation, the USPSTF decided not to adopt the GRADE approach. It is beyond the scope of this article to provide an in-depth discussion of the differences in perspective and problems between GRADE and the USPSTF that are at the root of the decision to take a different approach. We hypothesize that the exigencies of clinical practice in the case of treatment (and diagnosis) are much greater than for decisions about prevention. Choosing whether to treat or to pursue a diagnosis is a necessity for every patient with a disease or a complaint suggesting illness. In contrast, prevention is offered to an asymptomatic person as a putative "extra" good. If a test or service exists but there is no evidence of net benefit, a decision not to offer the service is perfectly acceptable, as the patient is healthy and free of symptoms.

The USPSTF does not intend to synthesize the information either within or across domains. When considering clinical preventive services, clinicians probably already consider these factors as part of their thinking process when making clinical judgments. The USPSTF seeks to make available information that might not otherwise be known and would not necessarily be easy to retrieve quickly and reliably. It is hoped that ready availability of information in each domain will make easier an explicit thinking process and that discussions with patients will be better informed and of higher quality.

Application of the 4 Domains in Clinical Practice

Skin Cancer Screening

This issue of Annals of Internal Medicine includes a recommendation statement and accompanying evidence report on skin cancer screening.5,9 Table 2 shows the 4 domains, and Table 3 shows how the domains pertain to clinical decisionmaking for skin cancer screening.23

Considering this information in a particular clinical situation, a clinician could decide against routinely inspecting the skin to detect skin cancer because the burden of suffering due to skin cancer is comparatively low, skin inspection takes time, and the practice of routine screening is not widespread. Alternatively, for a given patient with high lifetime exposure to ultraviolet light or in places where most people have high ultraviolet light exposure (for example, Arizona and Florida), the clinician might choose to recommend the service.

 Colorectal Cancer Screening

The recently published USPSTF recommendation on colorectal cancer screening included two technologies for which evidence on net benefit was deemed insufficient.24 Information in the four domains was provided for computerized tomography (CT) colonography in that publication, and it is repeated here (Table 4). Considering this information, a clinician might opt to discuss CT colonography with a patient with objections to other established screening modalities if high-quality CT colonography was available locally.

 Contrast: Screening for Lung Cancer Using Helical CT

 Screening for lung cancer using helical CT is another topic for which the USPSTF concluded that evidence of net benefit is insufficient.25 Table 5 shows information in the four domains for this service.26-29 Considering this information, a clinician might decide against recommending lung cancer screening even to a smoker because the test finds suspicious lesions that turn out not to be cancer in a high percentage of people screened and the evaluation of suspicious lesions can be invasive. Moreover, the test is neither widely available nor in widespread use.

 Contrast: Application to Screening for High Blood Pressure in Adolescents

Screening for high blood pressure in adolescents is another topic for which the USPSTF concluded in 2003 that evidence was insufficient.30 This conclusion was based on the lack of evidence on the long-term outcomes of treatment for high blood pressure starting in adolescence and concern about the harms of long-term pharmacologic treatment started early. Considering this information, a clinician might decide in favor of assessing blood pressure routinely in adolescents because this requires little effort when added to the routine assessment of growth using height and weight and because identification of high blood pressure might provide an impetus for dietary and lifestyle changes that would alter the trajectory of blood pressure. Table 6 shows how the new domains would facilitate this decisionmaking process.

Conclusions

For many topics considered by the USPSTF, the scientific evidence from research encompassing a variety of research designs does not permit even moderate certainty about the net benefit of the preventive service. Evidence about the net benefits of preventive services in subgroups defined by age, gender, race and other factors is likely to remain perpetually uncertain because additional subgroup questions are defined once evidence is obtained.

The challenge of decisionmaking under conditions of uncertainty is a recurring issue in medicine.31 When uncertainty about what course of action to recommend persists even after a thorough systematic review of evidence about clinical benefits and harms, the USPSTF will begin routinely to seek and provide structured information in the four domains selected for their relevance to prevention. The USPSTF intends that this information strategy will help guide the clinician's decision and enhance the discussion between the clinician and the patient and the patient's confidence about the decision.

The USPSTF recognizes that these domains do not define the universe of domains applicable to clinical prevention problems. It acknowledges the role of judgment in selection of the domains and the factors that comprise the domains. The USPSTF is receptive to consideration and explication of other domains that would be important for problems other than clinical prevention as well as to suggestions of alternatives to the domains selected by the USPSTF for clinical prevention. The USPSTF also hopes to generate more and sustained interest in developing a deeper understanding of what information best serves the goal of sound decisionmaking in conditions of uncertainty.

Notes

Author Affiliation

a Dr. Petitti: Department of Biomedical Informatics, Arizona State University, Phoenix, Arizona
b Dr. Teutsch: Outcomes Research, Merck & Co., Inc., West Point, Pennsylvania
c Dr. Barton: Center for Primary Care, Prevention and Clinical Partnerships, Agency for Healthcare Research and Quality
d Dr. Sawaya: Departments of Obstetrics, Gynecology and Reproductive Sciences, University of California, San Francisco, California
e Dr. Ockene: Department of Medicine, University of Massachusetts Medical School, Worcester, Massachusetts
f Dr. DeWitt: Department of Pediatrics, Cincinnati Children's, University of Cincinnati College of Medicine, Cincinnati, Ohio

Disclaimer

Recommendations made by the USPSTF are independent of the U.S. government. They should not be construed as an official position of the Agency for Healthcare Research and Quality or the U.S. Department of Health and Human Services.

Copyright and Source Information

This document is in the public domain within the United States.

Requests for linking or to incorporate content in electronic resources should be sent via the USPSTF contact form.

Source: Petitti, D., Teutsch, S.M., Barton, M.B., et al. Update on the Methods of the U.S. Preventive Services Task Force: Insufficient Evidence. Originally published in Ann Intern Med 2009;150:199-205.

References

  1. Guirguis-Blake J, Calonge N, Miller T, Siu A, Teutsch S, Whitlock E; U.S. Preventive Services Task Force. Current processes of the U.S. Preventive Services Task Force: refining evidence-based recommendation development. Ann Intern Med 2007;147:117-22.
  2. Barton MB, Miller T, Wolff T, Petitti D, LeFevre M, Sawaya G, Yawn B, Guirguis-Blake J, Calonge N, Harris R; U.S. Preventive Services Task Force. How to read the new recommendation statement: methods update from the U.S. Preventive Services Task Force. Ann Intern Med 2007;147:123-7.
  3. Sawaya GF, Guirguis-Blake J, LeFevre M, Harris R, Petitti D; U.S. Preventive Services Task Force. Update on the methods of the U.S. Preventive Services Task Force: estimating certainty and magnitude of net benefit. Ann Intern Med 2007;147:871-5.
  4. U.S. Preventive Services Task Force Procedure Manual. December 2016. Agency for Healthcare Research and Quality, Rockville, Maryland. Accessed at Procedure Manual on June 30, 2016.
  5. Screening for Skin Cancer: U.S. Preventive Services Task Force Recommendation Statement. Ann Intern Med 2009;150: XXX-XX.
  6. Guide to Clinical Preventive Services, 2008: Recommendations of the U.S. Preventive Services Task Force. AHRQ Publication No. 08-05122. Agency for Healthcare Research and Quality, Rockville, Maryland. Accessed at https://www.ahrq.gov/prevention/guidelines/index.html on March 4, 2021.
  7. Tucker JA, Roth DL. Extending the evidence hierarchy to enhance evidence-based practice for substance use disorders. Addiction 2006;101:918-32.
  8. Harris RP, Helfand M, Woolf SH, Lohr KN, Mulrow CD, Teutsch SM, et al. Current methods of the U.S. Preventive Services Task Force: a review of the process. Am J Prev Med 2001;20:21-35.
  9. Wolff T, Tai E, Miller T. Screening for skin cancer: an update of the evidence for the U.S. Preventive Services Task Force. Ann Intern Med 2009;150:194-8.
  10. Atkins D, Best D, Briss PA, Eccles M, Falck-Ytter Y, Flottorp S, et al. GRADE Working Group. Grading quality of evidence and strength of recommendations. BMJ 2004;328:1490.
  11. Atkins D, Eccles M, Flottorp S, Guyatt GH, Henry D, Hill S, et al. GRADE Working Group. Systems for grading the quality of evidence and the strength of recommendations I: critical appraisal of existing approaches. BMC Health Serv Res 2004;4:38.
  12. Atkins D, Briss PA, Eccles M, Flottorp S, Guyatt GH, Harbour RT, et al. GRADE Working Group. Systems for grading the quality of evidence and the strength of recommendations II: pilot study of a new system. BMC Health Serv Res 2005;5:25.
  13. Ebell MH, Siwek J, Weiss BD, Woolf SH, Susman J, Ewigman B, et al. Strength of recommendation taxonomy (SORT): a patient-centered approach to grading evidence in the medical literature. J Am Board Fam Pract 2004;17:59-67.
  14. Merenstein D. A piece of my mind. Winners and losers. JAMA 2004;291:15-6.
  15. Krist AH, Woolf SH, Johnson RE. How physicians approach prostate cancer screening before and after losing a lawsuit. Ann Fam Med 2007;5:120-5.
  16. Feinstein AR. The "chagrin factor" and qualitative decision analysis. Arch Intern Med 1985;145:1257-9.
  17. Djulbegovic B, Hozo I, Schwartz A, McMasters KM. Acceptable regret in medical decisionmaking. Medical Hypotheses 1999;53:253-9.
  18. Lomas J, Culyer T, McCutcheon C, McAuley L, Law S. Conceptualizing and Combining Evidence for Health System Guidance. Ottawa, Canada: Canadian Health Services Research Foundation, 2005.
  19. Steinberg EP, Luce BR. Evidence based? Caveat emptor! Health Aff (Millwood) 2005;24:80-92.
  20. Teutsch S, Berger M. Evidence synthesis and evidence-based decisionmaking: related but distinct processes. Med Decision Making 2005;25:487-9.
  21. Clancy CM, Cronin K. Evidence-based decisionmaking: global evidence, local decisions. Health Aff (Millwood) 2005;24:151-62.
  22. Atkins D, Siegel J, Slutsky J. Making policy when the evidence is in dispute. Health Aff (Millwood) 2005;24:102-13.
  23. Bruce AJ, Brodland DG. Overview of skin cancer detection and prevention for the primary care physician. Mayo Clin Proc 2000;75:491-500.
  24. U.S. Preventive Services Task Force. Screening for colorectal cancer: U.S. Preventive Services Task Force recommendation statement. Ann Intern Med 2008;149:627-37.
  25. U.S. Preventive Services Task Force. Lung Cancer Screening: recommendation statement. Ann Intern Med 2004;140:738-9.
  26. Henschke CI, Yankelevitz DF, Libby DM, Pasmantier MW, Smith JP, Miettinen OS. International Early Lung Cancer Action Program Investigators. Survival of patients with stage I lung cancer detected on CT screening. N Engl J Med 2006;355:1763-71.
  27. Diederich S, Lenzen H. Radiation exposure associated with imaging of the chest: comparison of different radiographic and computed tomography techniques. Cancer 2000;89:2457-60.
  28. Brenner DJ, Elliston CD. Estimated radiation risks potentially associated with full-body CT screening. Radiology 2004;232:735-8.
  29. Swedish Lung Cancer Screening Program. Seattle, WA; 2008. http://www.swedish.org/body.cfm?id=347. Accessed November 24, 2008.
  30. U.S. Preventive Services Task Force Recommendation on High Blood Pressure Screening, 2003. In Guide to Clinical Preventive Services, 2005. Archive, Section 2. Recommendations for Adults: Heart and Vascular Diseases: Screening for High Blood Pressure.
  31. Helfand M. Using evidence reports: progress and challenges in evidence-based decisionmaking. Health Aff (Millwood) 2005;24:123-7.

Current as of: February 2009

Internet Citation: Update on Methods: Insufficient Evidence. U.S. Preventive Services Task Force. April 2017.