Claro Digital Perception Processing


Sound processing with a human perspective

Introduction

Signal processing in hearing aids has always been directed towards amplifying signals according to physical sound levels. As more is learned about psychoacoustics and auditory processing, it is apparent that auditory perception goes beyond functioning as a simple microphone or sound level meter. The cochlea is a complex sound processor which actively influences the way that we perceive sounds. This "processor" is capable of handling a remarkable range of stimuli from very quiet (around 0 dB HL) to very loud (about 120 dB HL), resulting in a dynamic range better than that of an audio CD. In addition, the cochlea uses a complex filtering scheme to amplify sounds selectively. Sophisticated use of digital technology, coupled with a deeper understanding of psychoacoustics and cochlear signal processing, offers the opportunity to approximate the complexity of the normal auditory system with complex signal processing.


Introduction

In Claro we combine the most advanced digital technology with proven psychoacoustic models (Launer, 1995; Dau et al., 1996a; Dau et al., 1996b; Moore et al., 1999) based on the complex perceptual patterns in the cochlea. In the cochlea, gain and output characteristics are governed by perception and not physical quantities. The processing employed in Claro is modeled in this manner and is called Digital Perception Processing (DPP). Claro applies two psychoacoustic models: a model of normal cochlea function and a model of individual impaired cochlea function. The DPP algorithm uses the data from these models to set gain and power characteristics. The use of this strategy results in clearer sound and appropriate loudness perception in all listening situations. In order to understand DPP it is necessary to first examine the mechanisms for sound processing in the normal cochlea.

Auditory perception in the normal cochlea

As sound enters the ear, it stimulates the tympanic membrane and vibrates the ossicular chain in response to pressure changes. This energy is transmitted to the oval window of the cochlea, and a compression wave travels along the cochlea, moving the basilar membrane (BM). The organ of Corti, which sits on the BM, supports rows of sensors, the hair cells. When displaced, these hair cells generate neural impulses. This is the signal that is finally transmitted to the auditory cortex.

The complex sounds we hear (such as speech and music) are analyzed along the BM according to their different frequency components. Different frequencies are mapped to different places on the basilar membrane as well as in the brain. When the ear is stimulated by a pure tone, the basilar membrane is not stimulated at a single point only; the excitation spreads over a wider area. Other areas close to the characteristic site of the pure tone stimulus are also affected. This effect can be described as a bandpass filter with a distinct center frequency and a variable filter slope. The whole cochlea can be thought of as a filterbank which covers the entire audible frequency range.

Figure 1 Scales of length (mm), Claro critical band rate (Bark) and frequency (kHz) on the unwound cochlea, with the speech region marked. Note that the scales of length and critical band rate are linear, but that of the frequency is logarithmic. Adapted from Zwicker and Fastl (1990).
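The relation between the frequency scale and the critical band rate scale in Figure 1 can be illustrated numerically. The sketch below is not taken from this paper or from Claro: it assumes the widely used analytic approximations for critical band rate and critical bandwidth associated with Zwicker and co-workers, and the function names `hz_to_bark` and `critical_bandwidth_hz` are hypothetical. The band layout actually used in Claro may differ.

```python
import math


def hz_to_bark(f_hz: float) -> float:
    """Approximate critical band rate (Bark) for a frequency in Hz.

    Assumed approximation: 13*atan(0.00076 f) + 3.5*atan((f/7500)^2).
    """
    return 13.0 * math.atan(0.00076 * f_hz) + 3.5 * math.atan((f_hz / 7500.0) ** 2)


def critical_bandwidth_hz(f_hz: float) -> float:
    """Approximate critical bandwidth (Hz) at centre frequency f_hz.

    Assumed approximation: 25 + 75 * (1 + 1.4 (f/kHz)^2)^0.69.
    """
    return 25.0 + 75.0 * (1.0 + 1.4 * (f_hz / 1000.0) ** 2) ** 0.69


if __name__ == "__main__":
    # Equal steps in octaves give roughly equal steps in Bark at higher
    # frequencies, while the bandwidth in Hz keeps growing.
    for f in (125, 250, 500, 1000, 2000, 4000, 8000, 10000):
        print(f"{f:>6} Hz  {hz_to_bark(f):5.1f} Bark  "
              f"bandwidth ~{critical_bandwidth_hz(f):5.0f} Hz")
```

Under these assumed formulas the bandwidth stays near 100 Hz at low frequencies and widens steadily above about 500 Hz, which is the qualitative behaviour the figure describes.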

Critical bands

These cochlear filters are known as critical bands. If we unwind the cochlea, as shown schematically in Figure 1, we find that the audible speech range (from 0.125 kHz to 10 kHz) is represented by 20 critical bands. The bandwidth (Hz) of these filters is called the critical bandwidth and is dependent on both frequency and level. There are interactions and mutual dependencies in the firing patterns of different auditory neurons because the filters are overlapping and coupled to each other. The bandwidth of the critical band filters is frequency dependent and approximately logarithmically scaled. The bands become wider with increasing frequency (Figure 2). One critical band is roughly equivalent to 1.3 mm on the basilar membrane. The output of this auditory filter bank forms the basis of the cochlear excitation pattern and is the signal "sent" to the brain.

Figure 2 Excitation level at different center frequencies (fc = center frequency, ∆f = bandwidth). With increasing frequency the bandwidth ∆f also increases. Axes: frequency of test tone [kHz], level of test tone [dB SPL]. Adapted from Zwicker and Fastl (1990).

Excitation pattern and masking

The way that we perceive sound in terms of pitch and loudness is determined by the excitation patterns produced in the cochlea. An excitation pattern is the pattern of neural responses evoked by a signal. For a complex tone, every spectral component of the signal excites a characteristic region on the basilar membrane. For a single signal component the excitation pattern can be determined directly from the level-dependent shape of the cochlear excitation levels (Figure 4). The overall excitation pattern is the maximum excitation resulting from all frequency components of a complex signal.

Figure 3 shows the excitation pattern for a three-tone complex. The softer signal component has an excitation pattern which falls below that of the louder signal components. The quiet component is therefore masked by the louder components and is not perceived. This cochlear mechanism helps decipher the complex acoustic information that enters the ear, by processing and transmitting to the auditory cortex only the important components of a signal.

Psychoacoustic masking experiments have resulted in the development of a model of critical band filtering in the cochlea (Fletcher and Munson, 1930; Zwicker and Fastl, 1990; Moore, 1997). The frequency resolution of the critical bands as derived from these various psychoacoustic experiments is shown in Figure 1.

Figure 3 Excitation level for a three-tone complex sound. The bars represent the components of the input signal, the dotted lines show the excitation from all three tones, whereas the red curve depicts the overall excitation that results in auditory perception. The excitation pattern of the softest tone (orange) is masked and is not perceived. Axes: frequency [kHz], input level and excitation level [dB].
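The masking situation of Figure 3 can be sketched in a few lines. This is an illustration only, not the model used in Claro: it assumes a triangular excitation pattern on the critical band rate scale with fixed slopes, and the names `Tone`, `excitation_db`, `overall_excitation_db` and `masked_tones`, the slope values and the three example tones are all hypothetical. The one element taken from the text is the rule that the overall excitation is the maximum over all components.

```python
from dataclasses import dataclass

# Assumed slopes of the triangular pattern (dB per critical band).
LOWER_SLOPE_DB_PER_BARK = 27.0  # steep skirt towards lower frequencies
UPPER_SLOPE_DB_PER_BARK = 10.0  # shallower skirt towards higher frequencies


@dataclass(frozen=True)
class Tone:
    bark: float      # position on the critical band rate scale
    level_db: float  # input level of the component


def excitation_db(tone: Tone, z: float) -> float:
    """Excitation produced by a single tone at critical band rate z."""
    dz = z - tone.bark
    slope = UPPER_SLOPE_DB_PER_BARK if dz >= 0 else LOWER_SLOPE_DB_PER_BARK
    return tone.level_db - slope * abs(dz)


def overall_excitation_db(tones, z: float) -> float:
    """Overall excitation: the maximum over all components."""
    return max(excitation_db(t, z) for t in tones)


def masked_tones(tones):
    """Tones whose peak lies at or below the excitation of the other tones."""
    masked = []
    for tone in tones:
        others = [o for o in tones if o is not tone]
        if others and overall_excitation_db(others, tone.bark) >= tone.level_db:
            masked.append(tone)
    return masked


# Hypothetical three-tone complex: loud, soft, loud.
tones = [Tone(6.0, 80.0), Tone(9.0, 40.0), Tone(12.0, 75.0)]
print(masked_tones(tones))
```

With these assumed values the soft middle tone sits under the upper skirt of the loud low-frequency tone and is reported as masked, while both loud tones remain audible.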

Level dependency of critical band filters

The shape of the critical band filters is variable and their characteristics additionally depend on the signal level. Figure 4 shows the masking pattern of a 1 kHz tone deduced from psychoacoustic experiments (Zwicker and Fastl, 1990). With increasing signal level the slope of the excitation pattern towards higher frequencies becomes shallower. Towards lower frequencies the filter shape remains constant. This means that a loud low-frequency signal has a stronger masking effect on signals at higher frequencies than a soft signal of the same frequency. The excitation patterns are asymmetric, with a constant, steep slope on the low-frequency side and a shallow, level-dependent slope on the high-frequency side. This phenomenon is commonly referred to as the upward spread of masking.
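The level dependence described above can be made concrete with a small sketch. The slope rule used here is an assumption: it follows a form commonly attributed to Terhardt for pitch and masking models, and neither the constants nor the function names `upper_slope_db_per_bark` and `excitation_above_masker_db` come from this paper or from Claro.

```python
def upper_slope_db_per_bark(level_db: float, f_khz: float = 1.0) -> float:
    """Magnitude of the high-frequency slope of the excitation pattern.

    Assumed rule: 24 + 0.23 / f_khz - 0.2 * level_db (dB per Bark),
    floored at 1 dB per Bark so the slope never turns upwards.
    """
    return max(24.0 + 0.23 / f_khz - 0.2 * level_db, 1.0)


def excitation_above_masker_db(level_db: float, bark_distance: float,
                               f_khz: float = 1.0) -> float:
    """Excitation found bark_distance critical bands above the masker."""
    return level_db - upper_slope_db_per_bark(level_db, f_khz) * bark_distance


if __name__ == "__main__":
    for level in (40, 60, 80, 100):
        slope = upper_slope_db_per_bark(level)
        reach = excitation_above_masker_db(level, 3.0)
        print(f"L = {level:3d} dB  slope = {slope:5.2f} dB/Bark  "
              f"excitation 3 Bark above = {reach:6.1f} dB")
```

Under this assumed rule, raising the masker by 20 dB raises the excitation three critical bands higher up by more than 20 dB, which is the upward spread of masking in numerical form.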

Figure 4 Excitation level versus critical band rate for a noise one critical band wide with a center frequency of 1 kHz and the critical band levels (L = 20 to 100 dB) indicated. The broken line indicates the threshold in quiet. Axes: critical band rate; threshold and excitation level [dB]. Adapted from Zwicker and Fastl (1990).

Cochlea dynamic range

The very wide dynamic range of signals that occur in everyday life (as large as 120 dB) cannot be encoded directly into neural responses, since most neurons have a much smaller dynamic range (about 30 dB). The healthy cochlea compresses the dynamic range of incoming sounds to enable encoding by the auditory system. This compression is performed by the outer hair cells, which actively influence the mechanics of the cochlea for soft to medium-loud sounds (the cochlear amplifier). Figure 5 shows the input/output characteristic of the cochlea as a function of the velocity of basilar membrane displacement (Ruggero and Rich, 1991). The linear growth at very low levels becomes compressive in the mid-range and linear again at high signal levels. The compression ratio at mid-levels is about 3:1 to 5:1. With the outer hair cell damage present in sensorineural hearing loss, this normally compressive function becomes linear and the cochlea loses its compressive characteristics. The slope of the cochlear input/output curve becomes much steeper, and the curve is additionally shifted to higher input levels.
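The linear, compressive, linear input/output characteristic described above can be sketched as a piecewise-linear curve in dB. This is a schematic illustration, not a fit to the data of Ruggero and Rich: the kneepoints of 30 and 90 dB are assumptions, the ratio of 4 is simply a value inside the 3 to 5 range quoted in the text, and the function names are hypothetical. The impaired curve is modeled as a purely linear response that coincides with the healthy one at high levels.

```python
def healthy_response_db(level_db: float, low_knee: float = 30.0,
                        high_knee: float = 90.0, ratio: float = 4.0) -> float:
    """Healthy cochlea: linear, then compressive, then linear again (relative dB)."""
    if level_db <= low_knee:
        return level_db
    if level_db <= high_knee:
        return low_knee + (level_db - low_knee) / ratio
    return low_knee + (high_knee - low_knee) / ratio + (level_db - high_knee)


def impaired_response_db(level_db: float, low_knee: float = 30.0,
                         high_knee: float = 90.0, ratio: float = 4.0) -> float:
    """Without outer hair cell function: linear growth, shifted to higher input levels."""
    amplifier_gain_db = (high_knee - low_knee) * (1.0 - 1.0 / ratio)
    return level_db - amplifier_gain_db


if __name__ == "__main__":
    for level in (10, 30, 50, 70, 90, 110):
        print(f"{level:3d} dB in  healthy {healthy_response_db(level):6.1f}  "
              f"impaired {impaired_response_db(level):6.1f}")
```

In this sketch the two curves differ most at low input levels and meet at high levels, which mirrors the description of Figure 5: the loss of the compressive nonlinearity shows up mainly as reduced response to soft sounds and steeper growth with level.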

Figure 5 Input-output function on the basilar membrane immediately preceding (solid lines) and following (broken lines) an injection of a drug that disables the function of the outer hair cells in chinchillas. Shortly after the injection (11-19 minutes) the input-output function for the center frequency is markedly altered. The biggest alteration is at low sound levels, while at high levels the response is quite normal. This clearly shows that the compressive nonlinearity of the cochlea is lost with reduced outer hair cell function. Axes: sound level [dB SPL], velocity [µm/s]. Adapted from Ruggero and Rich (1991).

The result of reduced outer hair cell function is twofold: audibility is lost (threshold shift) and the highly specialized cochlear signal processing mechanisms of compression are lost (recruitment). To compensate for these effects, hearing instruments apply amplification and compression to the acoustic signal that is presented to the impaired ear.

Loudness summation

Psychoacoustic experiments have also shown that we perceive narrowband and broadband sounds differently (Zwicker et al., 1957; Moore, 1995). This is due to the filtering of sounds in critical bands and the non-linearity of the cochlea. In order to perceive a narrowband and a broadband signal as equally loud, the level of the narrowband signal has to be 10-20 dB above the level of the broadband sound. This is because the narrowband sound excites only one critical band filter, while the broadband sound stimulates the entire cochlea. Because the signal is compressed in normal cochlear processing, the loudness perception resulting from stimulating several bands is higher than when only one band is stimulated. This phenomenon is known as loudness summation.
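The link between compression and loudness summation can be shown with a deliberately simplified calculation. The sketch assumes that each stimulated band contributes a loudness proportional to its power raised to a compressive exponent, and that the contributions add; the exponent of 0.3, the even split of power and the function names are assumptions, and overlapping filters and thresholds are ignored. It shows the direction of the effect and is not intended to reproduce the 10-20 dB figure.

```python
import math


def summed_loudness(total_power: float, n_bands: int, exponent: float = 0.3) -> float:
    """Sum of compressed band powers when total_power is split evenly over n_bands."""
    per_band = total_power / n_bands
    return n_bands * per_band ** exponent


def narrowband_level_offset_db(n_bands: int, exponent: float = 0.3) -> float:
    """Extra level (dB) a one-band sound needs to match a sound spread over n_bands.

    From P_narrow**a = n * (P / n)**a it follows that P_narrow / P = n**((1 - a) / a).
    """
    return 10.0 * math.log10(n_bands) * (1.0 - exponent) / exponent


if __name__ == "__main__":
    for n in (1, 2, 4, 8):
        print(f"{n} band(s): loudness {summed_loudness(1.0, n):.2f}, "
              f"narrowband needs +{narrowband_level_offset_db(n):.1f} dB")
    # With no compression (exponent 1) the summation effect disappears.
    print(narrowband_level_offset_db(4, exponent=1.0))
```

Setting the exponent to 1 removes the effect entirely, which matches the statement in the text that loudness summation is a consequence of cochlear compression.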

Digital Perception Processing in Claro

Digital Perception Processing (DPP) within Claro incorporates a proven psychoacoustic model which calculates the perceptual patterns created by the normal cochlea (Launer, 1995; Dau et al., 1996a; Dau et al., 1996b; Moore et al., 1999). This model controls the loudness in twenty critical bands, rather than relying on physical sound parameters. As described in the previous section, the cochlea’s critical-band filters are asymmetric, coupled, level and frequency dependent. The 20 critical bands in Claro’s Digital Perception Processing are cochlea-like and therefore also asymmetric, coupled, level and frequency dependent. DPP in Claro amplifies sound to compensate more appropriately for the altered processing that occurs in the impaired cochlea. Claro does this by listening to the sounds with models of both the intact cochlea and the impaired cochlea.

Masking

One of the main benefits of the perception-based signal processing within Claro is the correct consideration of the upward spread of masking, which cannot be accounted for in multiband compression systems. In such systems, where the bands are uncoupled, symmetrical and not level dependent, high gain is applied to frequency bands with a low-level input signal, even when these signal components are spectrally masked in the normal cochlea. Therefore, sound components that are meant to remain inaudible are amplified, adding noise and affecting sound quality and clarity. Claro’s DPP model takes the spectral masking of sounds into account and amplifies sounds in a way that is perceptually correct. The consideration of the upward spread of masking reduces unnecessary amplification of low-level signal components. Figure 6 schematically shows the effect of DPP compared to a multiband compression system.

Figure 6 A three-tone signal that enters the cochlea generates an excitation pattern that causes masking of the soft signal component at mid-frequencies (top part of the diagram). A multiband compression system (middle part) applies high gain especially to soft signal components, unmasking those signals and increasing the noisiness of the perceived sound. DPP amplification in Claro (bottom part) calculates the excitation pattern in 20 critical bands and the masked components remain inaudible. This results in cleaner and clearer sound reproduction. Panels: Input Signal, Cochlea, Multiband system, Claro DPP. Axes: frequency [Hz], level and excitation [dB].

Loudness control

DPP in Claro continuously monitors the loudness of incoming sounds. This information is used to calculate the appropriate gain that is applied to the signal. A correct loudness impression for Claro users is achieved in all situations. In traditional multiband systems, loudness growth functions simply map input sound pressures to output sound levels. Hearing systems controlled on this basis do not always create the appropriate loudness perception. Some sounds may be over-amplified, others under-amplified. This is because other important parameters affecting loudness, such as the dynamic properties of the sound, spectral masking, and bandwidth, are not considered. Claro’s perception processing provides the perceptually correct amount of gain in all situations regardless of the bandwidth of the signal. With Claro, sounds that should be equally loud are heard with appropriate loudness.

Loudness limiting

As part of the Digital Perception Processing scheme, not only the gain but also the limitation of output levels is controlled by perception patterns. This is a significant improvement over physically based limiting systems. Such limiters do not take into consideration all of the parameters necessary to limit sounds based on their correct, natural perception. DPP, on the other hand, limits output to a specific loudness level that is individually calculated and tuned during the fitting process. In addition to the loudness limiting of DPP, Claro also incorporates an instantaneous, physically based compression limiting system with an adaptive recovery time. This avoids excessive sound pressure levels resulting from impulsive sounds and prevents the receiver from saturating.

Figure 7 Operating range of Claro DPP. In the range covering the usable speech frequencies, gain is controlled by loudness and perceptually appropriate compression is applied to the signal (green area). The black arrows show the gain that is applied to the input signal. The signals are mapped into the residual auditory range. For very high input levels (shaded in orange) the loudness limiter is active, restricting the output loudness to an acceptable level. In very soft environments a dynamic expansion (soft squelch) reduces inappropriately high gain (blue area). Regions: loudness limiting (loud sounds), compression (range of speech sounds), expansion (soft squelch for low-level signals). Axes: excitation level [dB SPL], output level [dB SPL], with threshold and UCL marked.
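The three regions of the operating range in Figure 7 (expansion, compression and limiting) can be sketched as a static input/output curve. This is a simplification under stated assumptions: it works on physical levels in dB, whereas DPP is described as being controlled by loudness, and every parameter value and the function names `output_level_db` and `gain_db` are hypothetical rather than Claro settings.

```python
def output_level_db(input_db: float, kneepoint_db: float = 40.0,
                    gain_at_knee_db: float = 30.0, compression_ratio: float = 2.0,
                    expansion_ratio: float = 0.5, max_output_db: float = 100.0) -> float:
    """Static input/output curve with expansion, compression and limiting."""
    knee_output_db = kneepoint_db + gain_at_knee_db
    if input_db < kneepoint_db:
        # Expansion: below the kneepoint the output falls faster than the input,
        # so gain is reduced for very soft sounds (soft squelch).
        out = knee_output_db - (kneepoint_db - input_db) / expansion_ratio
    else:
        # Compression: above the kneepoint the output grows more slowly than the input.
        out = knee_output_db + (input_db - kneepoint_db) / compression_ratio
    # Limiting: the output never exceeds the maximum level.
    return min(out, max_output_db)


def gain_db(input_db: float, **params: float) -> float:
    """Gain applied at a given input level."""
    return output_level_db(input_db, **params) - input_db


if __name__ == "__main__":
    for level in (20, 30, 40, 60, 80, 100, 120):
        print(f"{level:3d} dB in  {output_level_db(level):6.1f} dB out  "
              f"gain {gain_db(level):6.1f} dB")
```

In this sketch the gain is largest at the kneepoint and falls off on both sides: towards soft inputs because of the expansion, and towards loud inputs because of compression and limiting.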

Soft squelch

In addition to gain and output processing, the DPP model also includes a soft squelch for very soft input signals. This is a dynamic expansion below the kneepoint of compression. In very quiet situations, compression hearing instruments tend to turn the gain up. This causes environmental noise or internal noise to be amplified to an audible level. To reduce noise at low input levels, DPP uses an expansive characteristic (the opposite of compression) below the kneepoint instead of linear amplification. The useful range of operation extends from the kneepoint, which is set to the listener’s loudness perception of "very soft" (but can also be adjusted by the fitter), up to the limiting at very high input levels. Figure 7 schematically shows the operating range of Claro DPP.

Personalizing the DPP algorithm

To optimally match the perception models in Claro to the individual needs of the client, the Loudness Perception Profile should be measured. The individual slope of the loudness growth function, in addition to the hearing threshold, is an important parameter for optimizing the DPP algorithm. Knowledge of the individual loudness growth parameters increases the performance of DPP (see also Claro Background Story on the Loudness Perception Profile).

The DPP algorithms

Listening needs vary with each acoustic environment. Automatic program selection with AutoSelect enables Claro to react appropriately to changes in the acoustic environment (see also Claro Background Story on AutoSelect). Therefore two different modifications of the DPP algorithm have been developed.

QuietAdapt DPP is optimized for hearing in quiet. It ensures natural sound, undistorted by inappropriate amplification. By preserving the perceptual pattern of complex sounds, QuietAdapt ensures that soft signal components are not over-amplified and do not create additional noise.

NoiseAdapt DPP for optimum hearing performance in noise is designed to preserve audibility and to minimize the upward spread of masking. Here, DPP works in conjunction with Claro’s solutions for hearing in noise, the Fine-scale Noise Canceler and Adaptive digital AudioZoom. Sound is spectrally reshaped only when the upward spread of masking would otherwise prevent clear hearing. Gain is regulated to preserve audibility and maintain comfort.

Temporal aspects

Several studies have shown that individuals with similar hearing loss may not have the same preferences for the temporal aspects of compression. While some prefer fast-acting compression systems, others prefer slow time constants (Kiessling et al., 1997). These preferences cannot be predicted from the configuration of the hearing loss. For this reason, Phonak has developed two different DPP variations: fast adaptive DPP, designed to restore the syllabic structure of speech, and slow adaptive DPP, designed to restore long-term overall loudness. Depending on individual preference, either of these two schemes may be applied.
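The difference between fast and slow time constants can be illustrated with a simple level tracker. The sketch is generic and does not describe how the two DPP variations are implemented: it assumes a one-pole smoothing filter on a level track, and the function name `smooth_level`, the 1 kHz sampling of the level track and the 5 ms and 500 ms time constants are hypothetical choices.

```python
import math


def smooth_level(levels_db, sample_rate_hz: float, time_constant_s: float):
    """One-pole smoothing of a level track (dB) with the given time constant."""
    alpha = math.exp(-1.0 / (sample_rate_hz * time_constant_s))
    state = levels_db[0]
    smoothed = []
    for level in levels_db:
        # Move a fraction (1 - alpha) of the way towards the current level.
        state = alpha * state + (1.0 - alpha) * level
        smoothed.append(state)
    return smoothed


# Hypothetical level track sampled at 1 kHz: 100 ms at 50 dB, then 100 ms at 70 dB.
track = [50.0] * 100 + [70.0] * 100
fast = smooth_level(track, 1000.0, 0.005)  # 5 ms: follows syllable-scale changes
slow = smooth_level(track, 1000.0, 0.5)    # 500 ms: follows the long-term level

print(f"fast tracker after the step: {fast[-1]:.1f} dB")
print(f"slow tracker after the step: {slow[-1]:.1f} dB")
```

A gain rule driven by the fast tracker reacts within a syllable, while one driven by the slow tracker changes gain only as the overall level of the environment changes, which is the trade-off behind the individual preferences described above.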

Summary

Phonak’s Digital Perception Processing uses cochlea-like analysis in 20 critical bands corresponding to the normal ear's analysis in the speech frequencies. Signal processing in these 20 overlapping, asymmetric, level and frequency dependent critical bands is used to calculate the cochlear excitation patterns. These patterns are then applied to control the gain and output of Claro hearing instruments. The perceptually correct amount of gain is achieved irrespective of the type of input sound or the recruitment pattern. This results in more natural sound quality while the correct loudness perception is maintained in all situations, regardless of the input signal.

Bibliography

Dau T., Puschel D. and Kohlrausch A. A quantitative model of the "effective" signal processing in the auditory system. I. Model structure. J Acoust Soc Am 1996a; 99:6; 3615-22.

Dau T., Puschel D. and Kohlrausch A. A quantitative model of the "effective" signal processing in the auditory system. II. Simulations and measurements. J Acoust Soc Am 1996b; 99:6; 3623-31.

Fletcher H. and Munson W. A. The relation between masking and loudness. J Acoust Soc Am 1930; 9:1-10.

Kießling J., Margolf-Hackl S. and Hartmann A. Nutzen und Akzeptanz unterschiedlicher Kompressionssysteme. Eine Fallstudie zum Vergleich von SC+aRT und WDRC am Beispiel des Phonak Phona P2. Hörakustik 1997; 32:9; 4-5, 8-9, 12-14.

Launer S. Loudness Perception in Listeners with Sensorineural Hearing Impairment. Ph.D. Thesis, University of Oldenburg, 1995.

Moore B. C. J. Perceptual Consequences of Cochlear Damage. Oxford: Oxford Psychology Series; No. 28, 1995.

Moore B. C. J. An Introduction to the Psychology of Hearing. San Diego: Academic Press, 1997.

Moore B. C. J., Glasberg B. R. and Vickers D. A. Further evaluation of a model of loudness perception applied to cochlear hearing loss. J Acoust Soc Am 1999; 106:2; 898-907.

Ruggero M. A. and Rich N. C. Furosemide alters organ of Corti mechanics: Evidence for feedback of outer hair cells upon the basilar membrane. Journal of Neuroscience 1991; 11:1057-1067.

Zwicker E. and Fastl H. Psychoacoustics - Facts and Models. Berlin: Springer-Verlag, 1990.

Zwicker E., Flottorp G. and Stevens S. S. Critical band width in loudness summation. J Acoust Soc Am 1957; 29:548-557.

www.phonak.com

28146(GB)/0400/cu Printed in Switzerland © Phonak AG All rights reserved