A Review of the Evidence for the U.S. Preventive Services Task Force
Release Date: March 2011
By Roger Chou, MD; Tracy Dana, MLS; Christina Bougatsos, BS; Craig Fleming, MD; and Tracy Beil, MS
The information in this article is intended to help clinicians, employers, policymakers, and others make informed decisions about the provision of health care services. This article is intended as a reference and not as a substitute for clinical judgment.
This article may be used, in whole or in part, as the basis for the development of clinical practice guidelines and other quality enhancement tools, or as a basis for reimbursement and coverage policies. AHRQ or U.S. Department of Health and Human Services endorsement of such derivative products may not be stated or implied.
This article was first published in Annals of Internal Medicine on March 1, 2011 (Ann Intern Med 2011;154:347-355; http://www.annals.org).
Background: Hearing loss is common in older adults. Screening could identify untreated hearing loss and lead to interventions to improve hearing-related function and quality of life.
Purpose: To update the 1996 U.S. Preventive Services Task Force evidence review on screening for hearing loss in primary care settings in adults aged 50 years or older.
Data Sources: MEDLINE (1950 to July 2010) and the Cochrane Library (through the second quarter of 2010).
Study Selection: Randomized trials, controlled observational studies, and studies on diagnostic accuracy were selected.
Data Extraction: Investigators abstracted details about the patient population, study design, data analysis, follow-up, and results and assessed quality by using predefined criteria.
Data Synthesis: Evidence on benefits and harms of screening for and treatments of hearing loss was synthesized qualitatively. One large (2305 participants) randomized trial found that screening for hearing loss was associated with increased hearing aid use at 1 year, but screening was not associated with improvements in hearing-related function. Good-quality evidence suggests that common screening tests can help identify patients at higher risk for hearing loss. One good-quality randomized trial found that immediate hearing aids were effective compared with wait-list control in improving hearing-related quality of life in patients with mild or moderate hearing loss and severe hearing-related handicap. We did not find direct evidence on harms of screening or treatments with hearing aids.
Limitation: Non–English-language studies were excluded, and studies of diagnostic accuracy in high-prevalence specialty settings were included.
Conclusion: Additional research is needed to understand the effects of screening for hearing loss compared with no screening on health outcomes and to confirm benefits of treatment under conditions likely to be encountered in most primary care settings.
Primary Funding Source: Agency for Healthcare Research and Quality.
The prevalence of hearing loss is 20% to 40% in adults aged 50 years or older and more than 80% for those aged 80 years or older (1–4). Hearing loss can affect quality of life and ability to function (5). Age-related hearing loss (presbycusis) is typically gradual, progressive, and bilateral (1). Other factors contributing to hearing loss in older adults include genetic factors, exposure to loud noises, exposure to ototoxic agents, history of inner ear infections, and presence of systemic diseases (such as diabetes mellitus) (6–8).
Older adults may not realize that they have hearing loss because it is relatively mild or slowly progressive; they may perceive hearing loss but not seek evaluation for it; or they may have difficulty recognizing or reporting hearing loss owing to comorbid conditions, such as cognitive impairment. Only 10% to 20% of older adults with hearing loss have ever used hearing aids (2, 9). Screening could identify people who could benefit from therapies for hearing loss.
In 1996, the USPSTF issued a recommendation to screen adults aged 50 years or older for hearing loss (grade B recommendation) (10). In 2009, the USPSTF commissioned a new evidence review to update its recommendation. The purpose of this report is to systematically evaluate the current evidence on screening for hearing loss in adults aged 50 years or older in primary care settings. (The full evidence review is available on the USPSTF Web site, www.uspreventiveservicestaskforce.org.) The key questions, analytic framework (Appendix Figure), and scope of the report were developed in accordance with previously published USPSTF processes and methods (12–14). Additional details on study selection are provided in the Appendix.
The key questions were:
- Does screening of asymptomatic adults aged 50 years or older lead to improved health outcomes?
- How accurate are the hearing-loss screening methods among older adults, including questionnaires, clinical techniques (whispered voice test), and hand-held audiometry?
- How efficacious is the treatment of (screen-detected) hearing loss, namely amplification, in improving health outcomes?
- What are the adverse effects of hearing-loss screening in adults aged 50 years or older?
- What are the adverse effects of treatment of (screen-detected) hearing loss in adults aged 50 years or older?
We searched Ovid MEDLINE from 1950 to July 2010 and the Cochrane Database of Systematic Reviews and Central Register of Controlled Trials through the second quarter of 2010 to identify relevant articles. Appendix Table 1 contains the full search strategy. We also reviewed reference lists of relevant articles.
The Figure shows the flow of studies from initial identification to final inclusion or exclusion. We selected studies pertaining to screening, diagnosis, and treatment of hearing loss in adults aged 50 years or older by using predefined inclusion and exclusion criteria. (For details on study selection, go to the Appendix and Appendix Table 2.) Two reviewers evaluated each study to determine eligibility for inclusion. We restricted our review to published, English-language studies.
We used randomized, controlled trials (RCTs) and controlled observational studies to assess the effectiveness and harms of screening and treatment. For diagnostic accuracy, we included studies that compared a screening test with a reference standard.
Data Extraction and Quality Assessment
We abstracted details on patient population, study design, data analysis, follow-up, and results. One author abstracted data, and another verified the data. Two authors independently rated the internal validity of each study as “good,” “fair,” or “poor” by using predefined criteria developed by the USPSTF (Appendix Table 3) (14, 15). We also evaluated the applicability of studies to primary care screening on the basis of whether patients were recruited from primary care settings, prevalence and severity of hearing loss, proportion of patients with perceived hearing loss, and access to hearing aids (such as availability of free hearing aids). We resolved discrepancies in quality ratings by discussion and consensus.
For diagnostic accuracy studies, we used the diagti procedure in Stata, version 10 (StataCorp, College Station, Texas), to calculate sensitivities, specificities, and likelihood ratios. For studies that reported diagnostic accuracy based on more than 1 definition of hearing loss, we estimated median values on the basis of the Ventry and Weinstein criteria (for >40-dB hearing loss), the Speech Frequency Pure-Tone Average criteria (for >25-dB hearing loss), or the definition most similar to those used by other relevant studies. We used the cci procedure in Stata to calculate diagnostic odds ratios with exact 95% CIs.
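As an illustration of the accuracy measures named above, the following Python sketch derives sensitivity, specificity, likelihood ratios, and the diagnostic odds ratio from a 2×2 table. It is not the Stata diagti or cci procedure and does not compute confidence intervals; the class name, function name, and counts are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass
class TwoByTwo:
    """Counts from comparing a screening test with a reference standard."""
    tp: int  # screen positive, hearing loss on reference standard
    fp: int  # screen positive, no hearing loss on reference standard
    fn: int  # screen negative, hearing loss on reference standard
    tn: int  # screen negative, no hearing loss on reference standard


def accuracy_measures(t: TwoByTwo) -> dict:
    """Return point estimates only; a zero cell raises ZeroDivisionError."""
    sensitivity = t.tp / (t.tp + t.fn)
    specificity = t.tn / (t.tn + t.fp)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        # How much a positive result raises the odds of hearing loss.
        "positive_lr": sensitivity / (1 - specificity),
        # How much a negative result lowers the odds of hearing loss.
        "negative_lr": (1 - sensitivity) / specificity,
        # Cross-product ratio; equals positive_lr / negative_lr.
        "diagnostic_or": (t.tp * t.tn) / (t.fp * t.fn),
    }


# Hypothetical counts, not taken from any included study.
example = TwoByTwo(tp=80, fp=30, fn=20, tn=170)
measures = accuracy_measures(example)
for name, value in measures.items():
    print(f"{name}: {value:.2f}")
```

A positive LR well above 1 and a negative LR well below 1 both indicate a more informative test, which is how the LR estimates in the results can be read.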
Data Synthesis and Analysis
We assessed the overall strength of the body of evidence for each key question (“good,” “fair,” or “poor”) by using methods developed by the USPSTF on the basis of the number, quality, and size of studies; consistency of results among studies; and directness of evidence (14). We did not quantitatively pool results on diagnostic accuracy because of differences across studies in populations evaluated, definitions of hearing loss, screening tests evaluated, and screening cutoffs applied. Instead, we created descriptive statistics with the median sensitivity, specificity, and likelihood ratios (16), as well as associated ranges. We chose the total range, rather than the interquartile range, because certain outcomes were reported by only a few studies and the summary range highlights the greater uncertainty in the estimates. Too few randomized trials of hearing loss treatments were available to perform meta-analysis.
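The descriptive summary described above (a median with its total range) can be sketched in a few lines of Python. The study-level values below are hypothetical and are not estimates from the included studies; the function name is likewise an assumption made for illustration.

```python
from statistics import median


def summarize(estimates):
    """Return (median, minimum, maximum) for study-level estimates."""
    if not estimates:
        raise ValueError("need at least one estimate")
    return median(estimates), min(estimates), max(estimates)


# Hypothetical positive LRs from four studies (illustrative only).
positive_lrs = [1.8, 3.2, 4.0, 6.5]
med, low, high = summarize(positive_lrs)
print(f"median positive LR, {med:.1f} (range, {low} to {high})")
```

With only a handful of studies, the minimum and maximum convey the spread more transparently than an interquartile range would, which is the rationale given above for reporting the total range.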
Role of the Funding Source
This study was funded by the Agency for Healthcare Research and Quality under a contract to support the work of the USPSTF. Agency staff and USPSTF members helped develop the scope of this work and reviewed draft manuscripts. Agency approval was required before this manuscript could be submitted for publication, but the authors are solely responsible for the content and the decision to submit it for publication.
Key Question 1
Does screening of asymptomatic adults aged 50 years or older for hearing loss lead to improved health outcomes?
We identified 1 randomized trial (17) of screening for hearing loss (Table 1 and Appendix Table 4). We rated the SAI-WHAT (Screening for Auditory Impairment—Which Hearing Assessment Test) trial as fair quality primarily because of high loss to follow-up and unclear blinding status of outcomes assessors. It compared 3 screening strategies (the AudioScope [Welch Allyn, Skaneateles Falls, New York], which is based on inability to hear a 40-dB tone at 2000 Hz in either ear; the Hearing Handicap Inventory for the Elderly—Screening Version [HHIE-S] [10 items; score ≥10; range, 0 to 40]; or the AudioScope plus the HHIE-S) with usual care without screening in 2305 predominantly male (94%) patients aged 50 years or older (mean age, 61 years) at a Veterans Affairs (VA) medical center. All enrollees were eligible to receive free, VA-issued hearing aids. About three quarters of patients reported perceived hearing loss at enrollment (on the basis of the question, “Do you think you have a hearing loss?”).
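The two screening rules described above can be expressed as simple decision functions. The Python sketch below is illustrative only. It assumes the commonly described HHIE-S item scoring (yes = 4, sometimes = 2, no = 0), which is consistent with the 10-item, 0-to-40 range stated above but is not spelled out in this article, and it reads "in either ear" as failure to hear the tone in at least one ear. All function names are hypothetical.

```python
# Assumed HHIE-S item scoring; not stated in this article.
HHIE_S_POINTS = {"yes": 4, "sometimes": 2, "no": 0}


def hhie_s_score(responses):
    """Sum the 10 item responses into a 0-to-40 total."""
    if len(responses) != 10:
        raise ValueError("the HHIE-S has 10 items")
    return sum(HHIE_S_POINTS[r] for r in responses)


def hhie_s_positive(responses, cutoff=10):
    """Positive screen when the total score is at or above the cutoff."""
    return hhie_s_score(responses) >= cutoff


def audioscope_positive(heard_left, heard_right):
    """Positive screen when the 40-dB, 2000-Hz tone is missed in at least one ear."""
    return not (heard_left and heard_right)


# Hypothetical patient: two "yes", one "sometimes", seven "no" answers.
answers = ["yes", "yes", "sometimes"] + ["no"] * 7
print(hhie_s_score(answers), hhie_s_positive(answers))
print(audioscope_positive(heard_left=True, heard_right=False))
```

Other studies in this review used different HHIE-S cutoffs (for example, scores greater than 8), so the cutoff is left as a parameter.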
Rates of positive screenings were 19% in the AudioScope group, 59% in the HHIE-S group, and 64% in the combined group. Hearing aid use at 1 year, the primary outcome, was 6.3% in the AudioScope group, 4.1% in the HHIE-S group, 7.4% in the combined group, and 3.3% in the control group (P = 0.03 for between-group difference). In a post hoc stratified analysis, hearing aid use was greater among patients with perceived hearing loss (5.7% to 9.6% in screened groups vs. 4.4% in control group), but hearing aid use was minimal regardless of screening status among patients without perceived hearing loss (0% to 1.6%).
The proportion of patients who had a minimum clinically important difference (≥6-point improvement on a 0- to 100-point scale) on the Inner Effectiveness of Aural Rehabilitation scale (a measure of hearing-related function), a secondary outcome of the trial, did not differ at 1 year (36% to 40% in the screened groups vs. 36% in the unscreened group; P = 0.39). Post hoc analyses also showed no differences in the proportion who had improvements in hearing-related function according to whether patients had perceived hearing loss, except in a subgroup that was also 65 years of age or older (54% in the AudioScope group, 34% in the HHIE-S group, 40% in the combined group, and 34% in the control group).
Key Question 2
How accurate are the hearing-loss screening methods among older adults, including questionnaires, clinical techniques (whispered voice test), and hand-held audiometry?
Twenty studies evaluated the diagnostic accuracy of various screening tests (Appendix Table 5) (22–41). Four studies evaluated clinical tests (23, 26, 31, 37), 8 evaluated single-question screening (23, 25, 28, 33, 34, 36, 38, 40), 9 evaluated a hearing questionnaire (28–30, 32, 33, 35, 36, 39, 41), and 6 evaluated a hand-held audiometric device (22, 24, 26, 27, 30, 32). Four studies were population-based (25, 28, 33, 36), 4 recruited patients from primary care or community-based settings (30, 32, 35, 41), and the remainder recruited patients from specialty or other high-prevalence settings or evaluated nursing-home residents (24, 40).
We rated the quality of 7 studies as good (23, 25, 28, 30, 32, 33, 35) and the remainder as fair (Appendix Table 6). The most common methodological shortcomings were failure to describe a representative spectrum of patients, failure to report interpretation of the reference standard blinded to results of the screening test, and failure to describe a random or consecutive series of patients. All studies except for 1 used pure-tone audiometry as the reference standard, and 4 studies used a portable audiometer instead of standard audiometry (24, 34, 38, 40). One study performed an audiometric examination but used an audiologist assessment as the reference standard (41). Table 2 summarizes the main results on diagnostic accuracy.
Whispered Voice, Finger Rub, and Watch Tick Tests
One good-quality (23) and 3 fair-quality (26, 31, 37) studies evaluated the whispered voice test at 2 feet for identifying hearing loss greater than 25 or 30 dB (Appendix Table 7). Likelihood ratio (LR) estimates varied, with a median positive LR of 5.1 (range, 2.3 to 7.4) and median negative LR of 0.03 (range, 0.007 to 0.73). The good-quality study reported the weakest LRs (positive LR, 2.3 [95% CI, 1.3 to 3.8]; negative LR, 0.73 [CI, 0.61 to 0.87]) (23). One fair-quality study found inability to hear a whispered voice at 6 inches (positive LR, 72 [CI, 4.6 to 1140]) or a conversation voice at 2 feet (positive LR, 46 [CI, 2.9 to 740]) to be more useful than inability to hear a whispered voice at 2 feet (positive LR, 5.7 [CI, 3.1 to 11]) for identifying hearing loss, but estimates were imprecise and overlapped (30). Normal results with the first 2 tests were less useful than the whispered voice test at 2 feet for identifying persons without hearing loss (negative LRs, 0.27 [CI, 0.19 to 0.39] and 0.53 [CI, 0.43 to 0.66], respectively, vs. 0.008 [CI, 0.0005 to 0.13]).
The good-quality study also evaluated the accuracy of the finger rub and watch tick tests at 6 inches for detecting hearing loss greater than 25 dB (23). Compared with the whispered voice test, inability to hear a finger rub or watch tick was more useful for identifying hearing loss (positive LR, 10 [CI, 2.6 to 43] and 70 [CI, 4.4 to 1120], respectively); normal results were similarly useful for identifying persons without hearing loss (negative LR, 0.75 [CI, 0.68 to 0.84] and 0.57 [CI, 0.46 to 0.66]).
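To show how likelihood ratios such as these translate into probabilities for an individual patient, the following Python sketch applies the standard odds form of Bayes' theorem. The 30% pretest probability is an assumption chosen from within the 20% to 40% prevalence range cited in the introduction; the LRs are the median values reported above for the whispered voice test at 2 feet. The function name is hypothetical, and the output is illustrative arithmetic rather than a finding of this review.

```python
def post_test_probability(pre_test_probability, likelihood_ratio):
    """Convert a pretest probability to a posttest probability via odds."""
    if not 0 < pre_test_probability < 1:
        raise ValueError("pretest probability must be between 0 and 1")
    pre_test_odds = pre_test_probability / (1 - pre_test_probability)
    post_test_odds = pre_test_odds * likelihood_ratio
    return post_test_odds / (1 + post_test_odds)


PRE_TEST = 0.30          # assumed pretest probability of hearing loss
MEDIAN_POSITIVE_LR = 5.1  # whispered voice test at 2 feet, median
MEDIAN_NEGATIVE_LR = 0.03

after_positive = post_test_probability(PRE_TEST, MEDIAN_POSITIVE_LR)
after_negative = post_test_probability(PRE_TEST, MEDIAN_NEGATIVE_LR)
print(f"after a positive test: {after_positive:.1%}")
print(f"after a negative test: {after_negative:.1%}")
```

Because the ranges around the median LRs are wide, the same calculation at the ends of those ranges gives noticeably different posttest probabilities.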
Five good-quality (23, 25, 28, 33, 36) and 3 fair-quality (34, 38, 40) studies evaluated a single screening question about perceived hearing difficulties (Appendix Table 8). For detection of hearing loss greater than 25 dB, 6 studies found that a positive response to a single question increased the likelihood of hearing loss (median positive LR, 3.0 [range, 2.4 to 3.8]) (23, 25, 33, 36, 38). Usefulness of a negative response varied (median negative LR, 0.40 [range, 0.33 to 0.82]). For detection of hearing loss greater than 40 dB, 3 good-quality studies found a median positive LR of 2.5 (range, 2.1 to 3.1) and median negative LR of 0.26 (range, 0.13 to 0.41) (25, 28, 36). One fair-quality study of nursing home residents reported a weaker positive LR (1.4 [CI, 1.2 to 1.8]) and similar negative LR (0.61 [CI, 0.43 to 0.87]) compared with studies of community-dwelling older adults (40).
Five good-quality (28, 30, 32, 33, 36) and 3 fair-quality (35, 39, 41) studies evaluated the HHIE-S, and 1 fair-quality study evaluated the American Academy of Otolaryngology—Head and Neck Surgery 5-minute hearing test (29) (Appendix Table 9).
On the basis of an HHIE-S cutoff score greater than 8, 4 good-quality studies reported a median positive LR of 3.5 (range, 2.4 to 11) and negative LR of 0.52 (range, 0.43 to 0.70) for detection of hearing loss greater than 25 dB (30, 32, 33, 36). One fair-quality study reported a somewhat lower positive LR and similar negative LR (2.3 and 0.38, respectively [CIs not calculable]) based on an audiologist evaluation reference standard rather than audiometry (41). Studies on the accuracy of HHIE-S cutoff scores greater than 8 for identifying hearing loss greater than 40 dB reported similar likelihood ratios (28, 30, 32, 36, 39). Changing the HHIE-S threshold from greater than 8 to greater than 24 increased the positive LR for identification of hearing loss greater than 40 dB from 3.1 to 10 and increased the negative LR from 0.37 to 0.77 in 1 good-quality study (30) but had little effect on LR estimates in another good-quality study (32).
One fair-quality study found that the 5-minute hearing test had positive LRs ranging from 1.1 to 9.9 and negative LRs ranging from 0.47 to 0.76 for detection of hearing loss greater than 25 dB, depending on the cutoff score evaluated (29).
Hand-Held Audiometric Devices
Two good-quality (30, 32) and 4 fair-quality (22, 24, 26, 27) studies evaluated the AudioScope hand-held audiometric device (Appendix Table 10). The frequencies and intensities of the tones tested with the AudioScope varied. For detection of hearing loss greater than 25 dB (based on Speech Frequency Pure-Tone Average criteria), 1 good-quality study found that the AudioScope (based on the ability to hear a 2000-Hz tone at 40 dB) had a positive LR of 5.8 (CI, 3.4 to 9.8) and a negative LR of 0.40 (CI not calculable) (32). For detection of hearing loss greater than 30 dB, a fair-quality study found that the AudioScope (based on ability to hear 500-, 1000-, 2000-, and 4000-Hz tones at 25 dB) had a positive LR of 3.1 and a negative LR of 0.10 (CIs not calculable) (22). For detection of hearing loss greater than 40 dB, 3 studies of community-dwelling older adults found that the AudioScope (based on ability to hear tones between 500 and 4000 Hz at 40 dB) had a median positive LR of 3.4 (range, 1.7 to 4.9) and median negative LR of 0.05 (range, 0.03 to 0.08) (26, 30, 32). One fair-quality study of nursing-home residents found that the AudioScope (based on failure to hear a 1000- or 2000-Hz tone in both ears) was associated with a much weaker positive LR (1.3 [CI, 1.0 to 1.5]) but similar negative LR (0.08 [CI, 0.01 to 0.61]) (24).
Direct Comparisons of Different Types of Screening Tests
Six good-quality studies directly compared the diagnostic accuracy of different screening tests (23, 28, 30, 32, 33, 36). One study found that the whispered voice test and single-question screening had similar positive LRs (2.3 [CI, 1.3 to 3.8] and 2.5 [CI, 1.0 to 5.9], respectively) and negative LRs (0.73 [CI, 0.61 to 0.87] and 0.82 [CI, 0.68 to 0.99]), but the watch tick and finger rub tests had substantially stronger positive LRs (70 [CI, 4.4 to 1120] and 10 [CI, 2.6 to 43], respectively) and similar negative LRs (0.57 [CI, 0.49 to 0.66] and 0.75 [CI, 0.68 to 0.84]) (23). Three studies showed a consistent tradeoff with the HHIE-S compared with single-question screening, with somewhat stronger positive and weaker negative LRs (23, 28, 33, 36). Two studies found that normal results on AudioScope were generally associated with stronger negative LRs (0.05 and 0.24) compared with the HHIE-S (0.37 and 0.76), although LR estimates varied depending on the HHIE-S cutoff score evaluated and the criteria used to define hearing loss (30, 32).
Key Question 3
How efficacious is the treatment of (screen-detected) hearing loss, namely amplification, in improving health outcomes?
We identified 4 RCTs on treatment of hearing loss (Table 1) (18–21). We rated the quality for 1 trial as good (19), 2 as fair (18, 21), and 1 as poor (20) (Appendix Table 4). Shortcomings of the fair-quality trials included potentially important baseline differences between groups and failure to describe intention-to-treat analysis (21) and failure to describe randomization or allocation concealment methods or loss to follow-up (18). The poor-quality trial did not describe allocation concealment, use of intention-to-treat analysis, or loss to follow-up and reported outcomes incompletely (20). All of the trials had characteristics that could limit generalizability to screening in primary care, including recruitment of mostly white male veterans (19, 21), restriction to patients eligible for free hearing aids (21), inclusion of patients referred for suspected hearing problems (19), enrollment of dependent older adults (20), and inclusion of patients using hearing aids (18).
The good-quality RCT (n = 194) randomly assigned veterans (mean age, 72 years) to immediate hearing aids or wait-list control for 4 months (19). About two thirds of patients were enrolled from a primary care setting on the basis of a positive AudioScope screening result for hearing loss greater than 40 dB. The others were referred to the trial because of suspected hearing problems. Mean pure-tone threshold was 52 dB and was similar among screening-detected and referred patients. Mean baseline HHIE score was about 50 (25 items; range, 0 to 100), indicating severe effects on hearing-related quality of life and function (42).
At 4 months, HHIE or Quantified Denver Scale (QDS) (a measure of perceived communication difficulties) scores did not change from baseline in the control group. In the hearing aid group, the HHIE score improved from a mean of 49 at baseline to 15 at 4 months and the QDS score improved from 59 to 36. The mean between-group difference in change from baseline was 34 (CI, 27 to 41) on the HHIE and 24 (CI, 17 to 31) on the QDS. Results were similar in the subgroup of screening-detected patients. Statistically significant but small (<1 point) effects on the Geriatric Depression Scale (0- to 15-point scale) and Short Portable Mental Status Questionnaire (0- to 10-point scale) scores were also observed in the hearing aid group, but baseline scores indicated only mild depression or cognitive dysfunction. A follow-up study found that improvements in HHIE and QDS scores were sustained in the hearing aid group through 12 months (43).
A second, fair-quality trial (n = 64) enrolled veterans (mean age, 68 years) with less severe hearing loss (mean pure-tone threshold of 32 dB) (21). Patients eligible for free VA-issued hearing aids (n = 30) were randomly assigned to a standard nondirectional (n = 14) or a programmable, directional digital hearing aid (n = 16). Those ineligible for free hearing aids were randomly assigned to no treatment (n = 15) or an assistive listening device (n = 15). Baseline differences across the intervention groups in Abbreviated Profile of Hearing Aid Benefit (APHAB) score (0- to 100-point scale) were statistically significant (range, 38 to 52; P = 0.04) and were likely to be clinically significant for baseline HHIE scores (range, 28 to 50).
At 3-month follow-up, trivial improvements from baseline on HHIE scores occurred in the no-treatment and assistive listening device groups (mean change, 2.2 and 4.4 points, respectively), but both types of hearing aids were associated with clinically significant improvements (mean, 17 and 31 points with standard and programmable hearing aids, respectively). Changes in APHAB scores were small in the assistive listening device and no-treatment groups (6 and 3 points, respectively), with no change in Revised QDS scores. Although both hearing aid groups had greater improvements in hearing-related outcomes than the no-treatment and assistive listening device groups, there were baseline differences between groups, and results are subject to additional confounding because patients were randomly assigned separately on the basis of eligibility for free hearing aids.
Another fair-quality crossover trial (n = 80) found, in a subgroup of patients not using hearing aids at enrollment (mean pure-tone threshold hearing loss of 37 dB and mean HHIE score of 30), no clear differences on HHIE scores and other measures of function or quality of life when hearing aids, an assistive listening device, or both were compared with no amplification (18). A poor-quality trial (n = 133) found that older adults who were randomly assigned to hearing aids did not experience improvement in Geriatric Depression Scale scores at 6 months; it did not report results for adults randomly assigned to no hearing aids (20).
Key Question 4
What are the adverse effects of hearing-loss screening in adults aged 50 years or older?
No randomized trials or controlled observational studies evaluated potential harms (such as anxiety, labeling, or other psychosocial effects) associated with screening for hearing loss.
Key Question 5
What are the adverse effects of treatment of (screen-detected) hearing loss in adults aged 50 years or older?
Harms were not reported in any trial of hearing aids, and we identified no controlled observational studies on potential harms. Adverse effects described in case reports include dermatitis, accidental retention of molds, cerumen impaction, otitis externa, or associated middle ear problems (44–46).