Jessica R. Sullivan*, Christina Carrano and Homira Osman
Department of Speech and Hearing Sciences, University of Washington, Seattle, Washington, USA
Received date: February 25, 2015; Accepted date: May 15, 2015; Published date: May 22, 2015
Citation: Sullivan JR, Carrano C, Osman H (2015) Working Memory and Speech Recognition Performance in Noise: Implications for Classroom Accommodations. Commun Disord Deaf Stud Hearing Aids 3:136. doi:10.4172/2375-4427.1000136
Copyright: © 2015 Sullivan JR, et al. This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
Purpose: The aim of this study was to compare children’s performance on speech recognition and working memory tasks with two noise source configurations: back and side. Method: Children with normal hearing between 8 and 10 years of age participated in this study. Working memory and speech recognition in noise tasks were administered in a counterbalanced manner across listening conditions. Results: Speech recognition performance in noise was significantly poorer when noise was presented from 180° than from 90° azimuth. There was no effect of noise source configuration on working memory performance. However, working memory performance in noise, regardless of noise position, was significantly poorer than in quiet. No relationship was present between auditory working memory in noise and speech recognition in noise, when noise was presented at 90° azimuth. Conclusion: Children use perceptual cues and cognitive resources based on the difficulty of the task and audibility of the signal. Cognitive resources are largely called upon when listening conditions are more adverse and tasks become complex.
Working memory; Speech recognition in noise; Children; Cognition
Elevated noise levels can be detrimental to the learning and academic performance of children with typical development, with sensory impairments, and those with learning difficulties. Children experience trouble decoding and processing auditory information in noise because of their need for a higher sound pressure level (SPL), higher audibility index [5,6], lower reverberation times [7,8] and more favorable signal-to-noise ratios (SNR) when compared to adults [9,10]. Unfortunately, children spend a majority of their time in listening environments, such as at school, where interference from external (e.g., automobile traffic) and internal (e.g., individuals talking, movement of tables and chairs) noise sources is consistently present. The combinations of these noise types make classroom environments acoustically challenging for children.
Adverse listening conditions result in poorer speech recognition in children compared to adults (e.g., [12,13]). Despite the American National Standards Institute recommendation of an SNR of at least +15 dB for U.S. classrooms, studies indicate that classroom SNRs actually range from +5 to -7 dB, providing children with less than ideal learning environments. Studies indicate that these adverse conditions support word identification scores of no greater than 60% correct for children [8,15]. A study by Sato and Bradley examined the speech recognition in noise performance of children aged 6, 8, and 11 years at school in their own classrooms. In each classroom, rather than altering the level of the noise, the speech playback level was changed relative to the existing natural ambient noise to vary the signal-to-noise ratio experienced by the children. For 80% of the children in each age group to exhibit near-ideal speech recognition performance, which was defined as scores of 95% or higher, signal-to-noise ratios of +20, +18, and +15 dB would be required for the 6-year-olds, the 8-year-olds, and the 11-year-olds, respectively. These findings demonstrate a developmental effect: young children are less able to identify speech in the presence of noise, especially if the noise is competing speech, and adult-like performance is not achieved until adolescence. However, children are not identifying isolated single words in classrooms, but are instead being asked to integrate, analyze, and comprehend new complex information. To do this, children are required to use sensory processes that extract the acoustic-phonetic information from the speech signal while simultaneously drawing upon cognitive processes that efficiently map this information onto memory representations. It is well established that working memory is a cognitive process that plays a role in speech recognition and can be negatively affected by external distractors such as noise.
Working memory is a cognitive process that is conceptualized as a limited capacity system, which temporarily maintains and stores information while also supporting human thought processes by providing an interface between perception, long-term memory, and action [19,20]. One component of this model is auditory working memory, which is defined as a temporary system under attentional control that stores and processes sound-based material, and is positively correlated with an individual’s speech perception in noise [21,22]. A study by Osman and Sullivan evaluated children’s auditory working memory through measures of forward digit recall, backward digit recall, and listening recall in the presence of four-talker babble. Results showed that auditory working memory performance was systematically reduced at 0 dB SNR and -5 dB SNR in a group of 8-10 year old children with normal hearing. While the effect of unfavorable SNRs has been studied for working memory and speech in noise (SIN), it is unknown how the effect of noise changes as spatial location or noise source configuration changes. Because children are unable to anticipate, understand, and cope with degraded listening environments as well as adults, it is important to understand the specific effects of noise on tasks relevant for classroom learning.
The degree of challenge a child experiences in noise is governed by the complexity of the listening condition (e.g., SNR, type of noise) and the task requirements (e.g., word identification, comprehension of a passage) [9,23]. As listening becomes more challenging, sensory processes become less effective and top-down processing or cognitive resources are more necessary. Adults are able to utilize their working memory, linguistic structure, contextual cues, and prior knowledge to support listening in noise, compared to children who do not have as much experience to help provide support. Klatte, Lachmann, and Meis studied the effects of classroom noise, background speech, and reverberation on speech recognition and auditory comprehension tasks in a group of school-aged children and adults. Adults were unaffected by noise and background speech on both speech perception and auditory comprehension tasks. In children, by contrast, background speech had a substantial effect on auditory comprehension but only a small effect on speech perception when compared to classroom noise. This suggests that background noise interferes more with the higher-order cognitive processes required for children’s comprehension than with simple speech recognition. Noise reduced children’s auditory comprehension by disrupting the temporary representation of the incoming speech in working memory. This is consistent with Valente and colleagues’ findings: increasing levels of background noise and reverberation negatively affected performance in comprehension tasks compared to minimal effects of noise in measures of sentence recognition for 8-12 year old children. Together, these results suggest that as complexity of task increases, there is a greater need for explicit engagement of top-down processes.
Children’s performance on peripheral auditory tasks, such as speech recognition in noise, improves with use of spatial separation, frequency separation, asynchronous temporal onset and modulation cues for separation of target signal and noise. Studies have demonstrated that both adults and children show improved speech intelligibility when target speech and competing speech are spatially separate [26-29]. Looking specifically at children, Litovsky assessed 4.5- and 7-year-old children’s spondaic word identification in noise. The location of noise varied between the front speaker (0° azimuth) and the side speaker (90° azimuth).
Children had a spatial advantage: they had better identification scores when noise was from the side compared to the front speaker. Improved performance in the side condition was attributed to the differential SNRs at the two ears. This spatial advantage has been documented in children as young as 3 years old. This suggests that in a complex acoustic listening environment, such as a noisy classroom, children might find it easier to attain information if the source of interest is spatially segregated from noise sources. Spatial release from masking extends to other tasks such as sentence recognition in noise (e.g., the Hearing in Noise Test (HINT)). While perceptual-sensory cues lead to improvements in speech-in-noise performance, it is unknown whether these cues alone are adequate for higher-level cognitive tasks.
To date, there is limited information on the effect of noise and its location on the relationship between auditory working memory and speech recognition in the pediatric population. While Osman and Sullivan found that auditory working memory performance in children with normal hearing decreased substantially as SNR became unfavorable, the exact relationship between working memory in noise and speech recognition in noise and the effects of noise source configurations on that relationship were not examined. It is proposed that background noise places an explicit demand on cognitive resources, leaving fewer resources for storage and retrieval. Children spend a majority of their time in school environments that are acoustically complex, with multiple sources varying in location, amplitude, and time. Given that listening in such multi-source environments involves both sensory and cognitive processes, the purpose of this study was to examine the effects of background noise configuration on both auditory and cognitive-focused tasks relevant for classroom learning. With this knowledge, we as clinicians and educators can develop strategies to help ease the challenges that children experience in educational and real life settings. We hypothesize that noise from the back speaker will affect working memory and speech recognition performance to a greater extent than noise from the side, given the availability and benefit of sensory-spatial cues.
Ten children with normal hearing between 8 and 10 years of age (mean age 9 years and 3 months, SD 78 months) participated in this study. Hearing status was verified using a screening audiometer at 20 dB HL at frequencies including 500 Hz, 1000 Hz, 2000 Hz, and 4000 Hz. According to parent report, all participants had normal speech and language development and were not enrolled in any special education services. The participants were monolingual English speakers with no reported history of neurological or cognitive disorders/difficulties. The University of Washington Communication Studies Participant Pool was used to recruit participants. Prior to the beginning of the study, each participant and their parent received a verbal description of the tasks to be performed. Appropriate consent and assent were obtained from each participant in accordance with the policies of the University of Washington Institutional Review Board.
The entire test protocol was administered to each participant in a single one-hour test session. All testing was completed in a double-walled sound booth and the child was seated 1.5 meters from the front speaker, 1.5 meters from the side speaker, and 1.5 meters from the back speaker. For both speech recognition and working memory measures, the noise was held constant at 65 dBA as measured at the location of the participant’s head through the use of a Sound Quest Pro sound level meter. Participants were given practice trials prior to the onset of each task in order to ensure that the children understood the directions of the task. All of the tasks and conditions were completely counterbalanced, to account for possible carryover effects and to ensure that each participant completed each task in each condition.
Speech recognition in noise
Speech recognition in noise was assessed using the automated version of the Hearing in Noise Test (HINT). The speech stimuli for the HINT are sentences adapted from the Bamford-Kowal-Bench (BKB) sentences and recorded by an English-speaking male. Given the adaptive nature of this task, the level for each sentence varied based on the response of the participant. The noise, a speech-shaped masker that matched the average long-term spectrum of the sentences, remained constant at 65 dBA throughout the entirety of the test. The competing noise was presented from the side speaker (90°) and the back speaker (180°). Key words were scored for 20 sentences to determine the child’s SNR-50, the SNR at which they accurately repeated the sentences 50% of the time.
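The adaptive logic described above can be illustrated with a short sketch. The Python below is a simplified one-down/one-up staircase with a fixed step, not the HINT's own adaptive rule or scoring. The function name `track_snr50`, the 2 dB step, and the 0 dB starting SNR are illustrative assumptions; only the fixed 65 dBA noise level and the 20-sentence list come from the text.

```python
def track_snr50(responses, start_snr_db=0.0, step_db=2.0, noise_dba=65.0):
    """Estimate SNR-50 with a simplified one-down/one-up staircase.

    responses: list of booleans, one per sentence (True = repeated correctly).
    start_snr_db, step_db: ASSUMED values for illustration only.
    noise_dba: masker level, held constant (65 dBA in the study).
    Returns (estimated SNR-50 in dB, speech level in dBA for each sentence).
    """
    if not responses:
        raise ValueError("at least one scored sentence is required")
    snr_db = start_snr_db
    visited_snrs = []
    speech_levels_dba = []
    for repeated_correctly in responses:
        visited_snrs.append(snr_db)
        # The noise stays fixed, so only the speech level moves with the SNR.
        speech_levels_dba.append(noise_dba + snr_db)
        # Correct response: make the next sentence harder (lower SNR).
        # Incorrect response: make it easier (higher SNR).
        snr_db += -step_db if repeated_correctly else step_db
    # Averaging the visited SNRs approximates the 50%-correct point.
    return sum(visited_snrs) / len(visited_snrs), speech_levels_dba
```

A lower (more negative) estimate means the child tolerated more noise, which is why the negative mean in the noise-side condition reflects better performance than the positive mean in the noise-back condition.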
Auditory working memory
The backward digit recall subtest from the Working Memory Test Battery for Children was administered to the child in three listening conditions: quiet, noise-side (90°), and noise-back (180°). This subtest was selected because it involves both processing and storage, which is important for measuring auditory working memory accurately. To ensure consistency across tasks, the same speech-shaped noise from the HINT was used. The speech stimulus was presented via monitored live voice by an English-speaking female: the level was held constant at 65 dBA from the front speaker.
Participants were given a set of digits, ranging from two to six, and were asked to recall the sequence of numbers in reverse order. For example, the sequence 6-2-1 was correctly recalled as 1-2-6. The task was administered in a span procedure: the length of the sequence would increase by one for every four correct trials. When the child committed two errors in a set of six trials, the testing stopped and their score was calculated based on total correct trials across spans (sets). A second examiner verified the examiner’s judgments.
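The span procedure above can be sketched as follows. This is a minimal Python illustration under stated assumptions: the function names are hypothetical, trial outcomes are supplied as booleans in the order administered, and trials not administered receive no credit. The advance-after-four and stop-after-two-errors thresholds are taken from the description above and exposed as parameters.

```python
def is_correct_backward(presented, recalled):
    """True when the recalled digits are the presented digits in reverse order."""
    return list(recalled) == list(reversed(presented))


def score_backward_digit_recall(results_by_span, advance_after=4, stop_after_errors=2):
    """Total correct trials across spans (hypothetical scoring helper).

    results_by_span: dict mapping span length -> list of trial outcomes
    (True = correct), in the order the trials were administered.
    """
    total_correct = 0
    for span in sorted(results_by_span):
        correct = errors = 0
        for trial_ok in results_by_span[span]:
            if trial_ok:
                correct += 1
            else:
                errors += 1
            # Stop this set once the child either earns the next span
            # or reaches the discontinue criterion.
            if correct == advance_after or errors == stop_after_errors:
                break
        total_correct += correct
        if errors >= stop_after_errors:
            break  # testing ends; longer spans are not administered
    return total_correct
```

For instance, a child who passes spans two and three with four correct trials each, then makes two errors after one correct trial at span four, would score nine under these assumptions.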
Statistical analyses were performed using SPSS software, Version 19. The means and standard deviations for the sample of children are shown in Table 1. A repeated-measures analysis of variance (ANOVA) revealed a statistically significant effect of listening condition (quiet, noise-back, noise-side) on working memory performance, F(2,18)=12.56, p<0.001, partial η²=0.583. This effect size is large according to Cohen’s (1988) standards, indicating that 58% of the variation in performance was accounted for by differences in listening condition. Pairwise comparisons using Bonferroni adjustment revealed significant differences in performance in quiet compared to the noise-back (p=0.010) and noise-side conditions (p=0.06), but no significant difference between the two noise position conditions (p=0.893).
|Condition|Mean|(SD)|
|---|---|---|
|HINT Noise Back (dB SNR)|3.14|(3.21)|
|HINT Noise Side (dB SNR)|-1.84|(4.06)|
|WM Quiet (correct trials)|14.7|(1.16)|
|WM Noise Back (correct trials)|11.3|(2.45)|
|WM Noise Side (correct trials)|12.00|(3.27)|
Table 1: Means and standard deviations for each experimental condition.
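As a consistency check, the effect sizes reported in this section can be recovered from the F ratios and their degrees of freedom using the standard identity for partial eta squared, η²p = (F × df_effect) / (F × df_effect + df_error). The short Python sketch below applies it; the function name is illustrative and is not part of the original SPSS analysis.

```python
def partial_eta_squared(f_ratio, df_effect, df_error):
    """Partial eta squared recovered from an F ratio and its degrees of freedom."""
    if df_effect <= 0 or df_error <= 0:
        raise ValueError("degrees of freedom must be positive")
    effect_term = f_ratio * df_effect
    return effect_term / (effect_term + df_error)


# Working memory ANOVA reported in the text: F(2,18) = 12.56
wm_effect = partial_eta_squared(12.56, 2, 18)
print(round(wm_effect, 3))
```

The same function applied to the speech recognition ANOVA, F(1,9)=111.01, gives a value that rounds to 0.925, so both reported effect sizes agree with the partial eta squared formula.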
Another repeated-measures ANOVA determined a statistically significant effect of noise source configuration on speech recognition (HINT) performance, F(1,9)=111.01, p<0.001, partial η²=0.925. Pairwise comparisons using Bonferroni adjustment revealed a significant difference in HINT performance between the noise-back and noise-side conditions (p<0.0001). Children performed significantly better in the noise-side condition than in the noise-back condition. The mean score for speech recognition with noise coming from 90° azimuth was -1.84 dB SNR, as compared to an average score of 3.17 dB SNR with noise coming from 180° azimuth.
To examine the relationship between speech recognition and auditory working memory in noise, Pearson correlations (two-tailed) at the significance level of 0.01 were calculated. No significant relationships were present between auditory working memory in any noise condition and speech recognition in any noise condition, which perhaps suggests differing underlying processes (Figures 1 and 2).
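For readers who want to see the computation, the sketch below calculates a Pearson correlation and its t statistic in plain Python. The scores are made-up placeholder values, not the study data, and the function names are illustrative. With ten participants (8 degrees of freedom), the two-tailed critical t at the 0.01 level is approximately 3.355 from standard t tables, which is included here as a stated constant rather than computed.

```python
import math


def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 3:
        raise ValueError("need two sequences of equal length, n >= 3")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    dx = [v - mean_x for v in x]
    dy = [v - mean_y for v in y]
    covariance_sum = sum(a * b for a, b in zip(dx, dy))
    return covariance_sum / math.sqrt(sum(a * a for a in dx) * sum(b * b for b in dy))


def t_statistic(r, n):
    """t statistic for testing r against zero with n - 2 degrees of freedom."""
    return r * math.sqrt((n - 2) / (1.0 - r * r))


# HYPOTHETICAL scores for ten children (illustration only, not study data).
wm_scores = [11, 12, 9, 14, 10, 13, 12, 8, 11, 13]          # correct trials
snr50_scores = [2.5, 4.0, 1.0, 3.5, 5.0, 2.0, 6.5, 3.0, 0.5, 4.5]  # dB SNR

r = pearson_r(wm_scores, snr50_scores)
t = t_statistic(r, len(wm_scores))
CRITICAL_T_DF8_ALPHA_01 = 3.355  # two-tailed, from standard t tables
print(f"r = {r:.3f}, t = {t:.3f}, significant = {abs(t) > CRITICAL_T_DF8_ALPHA_01}")
```

With a sample of ten and a 0.01 criterion, only very strong correlations can reach significance, which is worth keeping in mind when interpreting the null result.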
An error analysis of the backward digit recall task in noise was performed for each noise source location. Because there were no significant differences in errors between spatial locations, errors were collapsed across locations. Figure 3 illustrates the proportion of errors made across spans. The average span for children in this study was four, which had the highest proportion of errors. There were two categories of errors: item and order. Item errors occurred when digits not included in the target stimuli were recalled, while order errors occurred when the digits included in the target stimuli were recalled but in the incorrect order. Of all errors made in noise, the majority were identified as order errors; this indicates that the errors children make in noise are due to increased processing demands and are not a result of perceptual masking.
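The two error categories can be expressed as a small classification rule. The Python sketch below is one possible reading of the definitions above; the function name is hypothetical, and the "other" label for responses that omit or repeat digits is an assumption, since the text does not say how such responses were categorized.

```python
def classify_response(presented, recalled):
    """Label a backward digit recall response as correct, item, order, or other.

    item  : at least one recalled digit was not in the presented sequence
    order : exactly the presented digits were recalled, but in the wrong order
    other : omissions or repetitions (ASSUMED category, not defined in the text)
    """
    presented = list(presented)
    recalled = list(recalled)
    if recalled == presented[::-1]:
        return "correct"
    if set(recalled) - set(presented):
        return "item"
    if sorted(recalled) == sorted(presented):
        return "order"
    return "other"


print(classify_response([6, 2, 1], [1, 6, 2]))  # right digits, wrong sequence
print(classify_response([6, 2, 1], [1, 2, 7]))  # 7 was never presented
```

Under this rule, a response containing a digit that was never presented counts as an item error even if the remaining digits are also out of order.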
The purpose of this study was to compare the effect of spatial location on children’s speech recognition and working memory performance in noise. Children were assessed on an auditory task (speech recognition) and a cognitive task (working memory) relevant for classroom learning. For both tasks, target speech was presented from the front speaker while speech-shaped noise was presented from the right side speaker at 90° and the back speaker at 180° azimuth. The working memory task (backward digit recall) was also administered in quiet to allow for comparison with the noise conditions. The relationship between speech recognition and working memory in each noise condition was examined. We had hypothesized that children’s working memory and speech recognition performance would be most affected when noise was from the back, compared to the side. Our hypothesis was only partially supported: speech recognition performance was better in the side condition, but working memory performance was not improved by access to spatial cues. The speech recognition results suggest that children’s ability to take advantage of spatial cues increases when the processing demands are reduced, as was the case with the speech-in-noise task. It is widely accepted that listeners have better localization acuity when speech is presented from the front and noise is presented from either side. Researchers suggest that a potential explanation for these findings could be that spatial cues are perceptually more accessible for listeners when noise is from the side than when noise is from behind. This indicates that, when recalling auditory information in noise, spatial cues aid audibility and provide an advantage when processing demands are low.
At this stage, we can only speculate that working memory plays a greater role when tasks become more complex and listening environments become adverse. Working memory has been described as a capacity-limited system that involves the temporary storage and processing of information. Further examination of the types and frequency of errors made during the working memory task (backward digit recall) in noise demonstrated an increase in errors as a child approached the limits of his or her capacity. The greatest proportion of order errors occurred in the fourth span, which was the limit for most children; this is consistent with previous findings. The increased proportion of order errors across spans suggests capacity limits were exceeded because of the simultaneous processing demands of recalling digits in reverse order in the presence of noise [39-41]. Taken together, these results suggest that children may still struggle with complex listening tasks in acoustically adverse situations (e.g., following directions in a classroom).
Previous studies have found a positive association between working memory and speech recognition for adults in adverse listening conditions [41,42]. To date, a limited number of studies have investigated cognitive abilities in relation to spatial location (e.g., [37,43,44]). The present study suggests that in listening conditions where perceptual cues to aid audibility are limited and external factors increase processing demands, working memory resources are called upon more. However, in the noise condition where spatial cues are available, processing demands do not increase as much and the role of working memory is implicit. The current study was consistent with similar studies in the adult literature that found a strong relationship between speech recognition and working memory with a front-back spatial orientation [37,43,44]. Neher and colleagues suggested that when spatial cues are available due to a separation of target and masker (e.g., noise from the side and speech from the front), acoustic and perceptual cues are more available and the need for cognitive resources is reduced. When working memory was assessed with noise from the back and the side (right), the children did not demonstrate any advantage of spatial cues. Findings suggest that the increased processing demand added by having to reverse the numbers prevented any benefit from the spatial cues. However, the advantage of spatial separation was demonstrated on the speech recognition task.
The primary limitation of this study is its small sample size. Including children between the ages of 7 and 12 could have expanded the sample size and generalizability of this study. Another possible improvement to this study could have been including other noise types (e.g., competing talkers, fan or air conditioner noise) and spatial configurations (e.g., noise from the top, noise from the front speaker, noise from 45° azimuth, roving between speakers) to simulate a real classroom environment. This study limited the noise source conditions to the back and side speakers because they are the two most common sources of student-generated noise in the classroom (e.g., during group discussion) and to obtain a preliminary understanding of the effect of noise source on auditory and cognitive tasks. In the future, a systematic investigation of noise source should take place to carefully understand and tease apart the spatial advantage that may be provided with certain noise source configurations. Likewise, future studies might include tasks of greater complexity for this age group (e.g., letter-number sequencing working memory task, AZ-Bio speech recognition task, auditory comprehension). As with any pediatric behavioral study, it is important to note that attention may have contributed to the scores produced by children, but the within-subjects study design was helpful in minimizing the effect of inattention or fatigue on our findings, as each child was compared with himself or herself across conditions.
While the present study is laboratory-based, it provides a first look at the possible effects of noise on working memory when audibility is compromised in the classroom. In situations when the processing demands are increased and audibility is reduced, children with typical development are negatively affected by noise. Presently, our clinical assessments focus on a child’s ability to identify pure or FM warble tones, single words, and short sentences, all simple auditory-focused tasks. Assessments of working memory are currently not included in clinical protocols but could provide educational audiologists and teachers with useful information about the skills involved when listening in the challenging environments that children face, especially those with hearing loss. Accommodations can be made to reduce external factors that increase processing demands for all children, especially those with hearing loss and language impairments. For example, reducing the noise levels in the classroom by installing acoustic tiles and carpeting can aid in improving the overall signal-to-noise ratio and reverberation. In addition, the use of personal or group FM systems can also be a great benefit for improving audibility and reducing the cognitive processing demands brought about by noise during direct instruction in the classroom.
Differences in children’s performance on speech recognition and working memory tasks highlight the effect of noise source location. Children did not demonstrate an advantage of spatial separation of target and masker on the working memory task compared to the speech recognition task. The results of this study substantiate previous findings that children can benefit from spatial cues when performing simple auditory-based speech recognition or discrimination tasks in noise. Spatial cues aid in audibility and thereby reduce the processing demand for a simple auditory task, which in turn limits the negative effect of noise. For a complex task, which involves processing and storage aspects, spatial cues do not provide sufficient improvement in noise. Thus, the data in this study indicate that tasks requiring more cognitive processing are negatively affected by noise and its location. This implies that future studies should investigate the effect of noise on auditory tasks similar to daily classroom activities to obtain a true assessment of how children perform in the real world.
We are grateful to the University of Washington Pediatric Aural Habilitation Lab and to the P30 Grant for participant recruitment. Funding from the University of Washington Royalty Research Fund A73769 was provided to Jessica Sullivan.