Procedure Manual Appendix VII. Criteria for Assessing External Validity (Generalizability) of Individual Studies

Each study identified as providing evidence to answer a key question is assessed for external validity (generalizability) using the following criteria.

Study population: The degree to which a study's subjects constitute a special population—either because they were selected from a larger eligible population or because they do not represent persons who are likely to seek or be candidates for the preventive service. The selection has the potential to affect the following:

  • Absolute risk: The background rate of outcomes in the study could be greater or less than what might be expected in asymptomatic persons because of the inclusion/exclusion criteria, nonparticipation, or other reasons.
  • Harms: The harms observed in the study could be greater or less than what might be expected in asymptomatic persons.

The following are features of the study population and the study design that may cause a participant's experience in the study to be different from what would be observed in the U.S. primary care population:

  • Demographic characteristics (e.g., age, sex, ethnicity, education, income): the criteria for inclusion/exclusion or nonparticipation do not encompass the range of persons who are likely to be candidates for the preventive service in the U.S. primary care population.
  • Comorbid conditions: the frequency of comorbid conditions in the study population does not represent the frequency likely to be encountered in persons who seek the preventive service in the U.S. primary care population.
  • Special inclusion/exclusion criteria: there are other special inclusion/exclusion criteria that make the study population not representative of the U.S. primary care population.
  • Refusal rate (i.e., ratio of included to not included but eligible participants): the refusal rate among eligible study subjects is high, making the study population not representative of the U.S. primary care population, even among eligible enrollees.
  • Adherence (e.g., run-in phase, frequent contact to monitor adherence): the study design has features that may increase the effect of the intervention in the study beyond what would be expected in a clinically observed population.
  • Stage or severity of disease: the selection of subjects for the study includes persons at a disease stage that is earlier or later than would be found in persons who are candidates for the preventive service.
  • Recruitment: the sources for recruiting subjects for the study and/or the effort and intensity of recruitment may distort the characteristics of the study subjects in ways that could increase the effect of the intervention as it is observed in the study.
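The refusal-rate ratio described in the list above can be illustrated with a small calculation. This is a minimal sketch, not part of the manual: the function names and the example counts are hypothetical, and the second function (the share of eligible persons who were not included) is an assumed companion measure that the criteria do not define.

```python
def included_to_not_included_ratio(included: int, eligible_not_included: int) -> float:
    """Ratio of included participants to eligible persons who were not included.

    Follows the wording of the refusal-rate criterion. Returns infinity
    when every eligible person was included.
    """
    if included < 0 or eligible_not_included < 0:
        raise ValueError("counts must be non-negative")
    if eligible_not_included == 0:
        return float("inf")
    return included / eligible_not_included


def share_not_included(included: int, eligible_not_included: int) -> float:
    """Assumed companion measure: fraction of all eligible persons not included."""
    if included < 0 or eligible_not_included < 0:
        raise ValueError("counts must be non-negative")
    total_eligible = included + eligible_not_included
    if total_eligible == 0:
        raise ValueError("at least one eligible person is required")
    return eligible_not_included / total_eligible


# Hypothetical study: 200 persons enrolled, 800 eligible persons not enrolled.
ratio = included_to_not_included_ratio(200, 800)
share = share_not_included(200, 800)
```

A low ratio (or a high share not included) only flags the study for closer review; whether the enrolled group is still representative remains a matter of judgment.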

Study setting: The degree to which the clinical experience in the setting in which the study was conducted is likely to be reproduced in other settings:

  • Health care system: the clinical experience in the system in which the study was conducted is not likely to be the same as that experienced in other systems (e.g., the system provides essential services for free when these services are only available at a high cost in other systems).
  • Country: the clinical experience in the country in which the study was conducted is not likely to be the same as that in the United States (e.g., services available in the United States are not widely available in the other country or vice versa).
  • Selection of participating centers: the clinical experience in the centers in which the study was conducted is not likely to be the same as in offices/hospitals/settings where the service is delivered to the U.S. primary care population (e.g., the center provides ancillary services that are not generally available).
  • Time, effort, and system cost for the intervention: the time, effort, and cost to develop the service in the study are more than would be available outside the study setting.

Study providers: The degree to which the providers in the study have the skills and expertise likely to be available in general settings:

  • Training to implement the intervention: providers in the study are given special training not likely to be available or required in U.S. primary care settings.
  • Expertise or skill to implement the intervention: providers in the study have expertise and/or skills at a higher level than would likely be encountered in typical settings.
  • Ancillary providers: the study intervention relies on ancillary providers who are not likely to be available in typical settings.

Global Rating of External Validity (Generalizability)

External validity is rated "good" if:

  • The study differs minimally from the U.S. primary care population/setting/providers and only in ways that are unlikely to affect the outcome; it is highly probable (>90%) that the clinical experience with the intervention observed in the study will be attained in the U.S. primary care setting.

External validity is rated "fair" if:

  • The study differs from the U.S. primary care population/setting/providers in a few ways that have the potential to affect the outcome in a clinically important way; it is moderately probable (50% to 89%) that the clinical experience with the intervention observed in the study will be attained in the U.S. primary care setting.

External validity is rated "poor" if:

  • The study differs from the U.S. primary care population/setting/providers in many ways that have a high likelihood of affecting the clinical outcome; the probability is low (<50%) that the clinical experience with the intervention observed in the study will be attained in the U.S. primary care setting.
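
The probability bands attached to the three ratings can be sketched as a simple lookup. This is illustrative only: the function name is hypothetical, and the manual describes a reviewer's judgment, not a computed probability. The criteria also do not say how to treat values above 89% up to and including 90%, so the sketch assumes they fall under "fair".

```python
def rate_external_validity(probability: float) -> str:
    """Map an estimated probability (0.0 to 1.0) to a global rating.

    The probability is the reviewer's estimate that the clinical
    experience observed in the study will be attained in the
    U.S. primary care setting.
    """
    if not 0.0 <= probability <= 1.0:
        raise ValueError("probability must be between 0.0 and 1.0")
    if probability > 0.90:
        return "good"  # highly probable (>90%)
    if probability >= 0.50:
        # Moderately probable (50% to 89%). ASSUMPTION: values above 89%
        # up to and including 90% are also treated as "fair", because
        # the criteria leave that range unassigned.
        return "fair"
    return "poor"  # low probability (<50%)
```

For example, `rate_external_validity(0.75)` returns `"fair"` and `rate_external_validity(0.40)` returns `"poor"`.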

Current as of: July 2017
Internet Citation: Appendix VII. Criteria for Assessing External Validity (Generalizability) of Individual Studies. U.S. Preventive Services Task Force. July 2017.
