On measuring speech understanding: Speech intelligibility measurements. Torben Poulsen 1
The five dimensions of hearing loss. Reduced sensitivity (= elevated hearing threshold): audiogram, hearing curve. Abnormal perception of loudness: recruitment (limited dynamic range). Reduced ability to hear where sounds come from: localization. Reduced ability to tell tones apart: loss of discrimination, poorer auditory filters. Reduced ability to separate sounds in time: loss of discrimination, cannot hear pauses. Reduced ability to understand speech. The audiogram is not enough! 2
... and what can a hearing aid do about it? Reduced sensitivity (= elevated hearing threshold): audiogram, hearing curve. Abnormal perception of loudness: recruitment (limited dynamic range). Reduced ability to hear where the sound comes from: localization. Reduced ability to tell tones apart: loss of discrimination, poorer auditory filters. Reduced ability to separate sounds in time: loss of discrimination, cannot hear pauses. (Slide labels: OK; Not yet.) 3
The hearing curve does not tell everything. 7 persons with the same hearing curve. We measured the ability to separate tones and the speech understanding in noise. [Figure labels: Normal hearing; Same hearing curve; Speech understanding; Good; Poor.] The hearing curve does not show how well one understands speech! 4
Overview. Principle of speech communication; The speech signal (level, spectrum etc.); Intelligibility (understanding) of speech; Measuring speech intelligibility; Dantale I; Statistical considerations; Dantale II; Danish HINT (Dantale III?); Sound examples; Closing. 5
Speech Communication. Complicated system. (Flanagan, 1997, Encyclopedia) 6
Body language, Content, Voice pitch 7
Speech production Flanagan, 1972 8
Speech production, voiced sounds. Time function: periodic signal, sawtooth shaped. Source spectrum: line spectrum, -12 dB/octave (male, fundamental frequency 125 Hz). [Figure: source spectrum on a linear frequency axis, 125 to 625 Hz, with 12 dB level steps.] (Borden & Harris, 1980) 9
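The -12 dB/octave source slope can be turned into harmonic levels with a few lines of Python. This is a minimal sketch, assuming an idealized line spectrum whose level falls exactly 12 dB per doubling of frequency relative to the fundamental; the function name and its parameters are illustrative and not taken from the slides.

```python
import math

def harmonic_levels(f0_hz=125.0, n_harmonics=5, slope_db_per_octave=-12.0):
    """Return (frequency in Hz, level in dB re. the fundamental) per harmonic.

    Assumption: an idealized source whose line spectrum falls by a fixed
    number of dB per octave above the fundamental f0.
    """
    levels = []
    for k in range(1, n_harmonics + 1):
        octaves_above_f0 = math.log2(k)
        # "+ 0.0" turns a possible -0.0 into 0.0 for cleaner printing
        level_db = slope_db_per_octave * octaves_above_f0 + 0.0
        levels.append((k * f0_hz, level_db))
    return levels

for freq_hz, level_db in harmonic_levels():
    print(f"{freq_hz:6.0f} Hz  {level_db:6.1f} dB")
```

With the slide's male fundamental of 125 Hz, the harmonics fall at 125, 250, 375, 500 and 625 Hz, which matches the frequency axis of the figure.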
Human vocal tract. Oral cavity: from velum to lips. Pharyngeal cavity: from larynx to velum. (Jusczyk, 1997, Encyclopedia) 10
Vowel spectrum: formants (frequencies in Hz)
Vowel   1st formant   2nd formant   3rd formant
i       225           2200          3000
a       700           1200          2500
u       250           700           2200
11
Spectrogram. Axes: Frequency, Time 12
Unvoiced (consonant). Source, Model, Spectrum. Example: /s/ (Flanagan, 1997, Encyclopedia) 13
Speech level. Average speech level in an anechoic room at 1 m distance: male 65 dB SPL, female 62 dB SPL. Dynamic range of speech: 30 dB (AI: +12 dB, -18 dB; STI: +15 dB, -15 dB). Vocal effort: raised +5 dB, loud +12 dB, shout +20 dB. 14
Speech spectrum, 1/3 octaves. LTASS: Long-Term Average Speech Spectrum, average of 17 different languages. Male and female spectra are almost identical above 200 Hz. 15
Speech spectrum, 1/3 octave bands, for four vocal efforts: Normal, Raised, Loud, Shout. [Figure: spectrum level in dB/Hz and 1/3 octave level in dB SPL versus frequency, 100 Hz to 10000 Hz.] Overall sound pressure level: Shout 82.3 dB, Loud 74.9 dB, Raised 68.3 dB, Normal 62.4 dB. From ANSI S3.5-1997, SII calculation. 16
Can one understand what is being said? 17
Segmenting. THEREDONATEAKETTLEOFTENCHIPS can be read as THE RED ON A TEA KETTLE OFTEN CHIPS or as THERE, DON ATE A KETTLE OF TEN CHIPS. Example in Danish: Koge over. Compared with Swedish and Norwegian, Danish is a difficult language in this respect. 18
Example, London Airport. What do we hear? Announcement: Arjevbin Fayed and Bybeiev Rhibodie, please contact Airport Information. Heard as: I've just been fired, and bye-bye everybody. Another example. Announcement: Makollig Jezvahted and Levdaroum DeBahzted, please go to the airport information desk. Heard as: My colleague just farted, and left the room, the bastard. 19
Speech Intelligibility. [Diagram: Text-in, Speaker, Transmission system, Listener, Text-out; Monitor; Experimenter; In the clinic.] % intelligibility is the ratio between Text-out and Text-in. Presentation: single words, words in a text. Speech elements: words, sentences, numbers, logatoms. Speaker: pronunciation, speed, vocal effort. Listener: hearing sensitivity, training, vocabulary. Transmission system: distortion, reverberation, noise. 20
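The ratio between Text-out and Text-in can be computed as a simple word score. The sketch below assumes a position-by-position, case-insensitive comparison of presented and repeated words; the function name and the listener responses are hypothetical, and the presented words are borrowed from the Dantale I examples.

```python
def word_score(text_in, text_out):
    """Percentage of presented words (text_in) repeated correctly (text_out).

    Assumption: responses are aligned with the presented words by position,
    and a missing response counts as an error.
    """
    if not text_in:
        raise ValueError("text_in must contain at least one word")
    presented = [word.lower() for word in text_in]
    repeated = [word.lower() for word in text_out]
    # Pad missing responses so every presented word is scored
    repeated += [""] * (len(presented) - len(repeated))
    correct = sum(p == r for p, r in zip(presented, repeated))
    return 100.0 * correct / len(presented)

# Hypothetical responses: the listener mishears one of four words
print(word_score(["RIS", "FOR", "SMAL", "KORT"], ["ris", "for", "small", "kort"]))
```

Scoring by position keeps the sketch short; a clinical scoring sheet would normally be filled in by the experimenter word by word, which amounts to the same count.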
Effect of HP- and LP-filtering. Intelligibility of high-pass (HP) and low-pass (LP) filtered CVC items (Fletcher, 1953). Sound example: full text, no filter, 20 kbs (file 313 KB). 21
Interrupted speech. Illustration shows a 1 second full signal and the same signal interrupted at 5 Hz. Modulation frequency and ON time: 5 kHz, 0.1 ms; 500 Hz, 1 ms; 50 Hz, 10 ms; 5 Hz, 100 ms; 0.5 Hz, 1 s. 22
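The ON times listed for each modulation frequency are consistent with a square-wave gate that is ON for half of each period. The following sketch assumes that 50 % duty cycle; the function names are illustrative, and the gate is applied to a list of samples rather than to a real recording.

```python
def on_time_s(interruption_hz, duty_cycle=0.5):
    """Duration of each ON burst in seconds (assumes a square-wave gate)."""
    return duty_cycle / interruption_hz

def interrupt(samples, sample_rate_hz, interruption_hz, duty_cycle=0.5):
    """Keep samples during the ON part of each period, zero them otherwise."""
    gated = []
    for n, x in enumerate(samples):
        # Position within the current interruption period, 0.0 .. 1.0
        phase = (n * interruption_hz / sample_rate_hz) % 1.0
        gated.append(x if phase < duty_cycle else 0.0)
    return gated

for freq_hz in (5000, 500, 50, 5, 0.5):
    print(f"{freq_hz:7.1f} Hz -> ON time {on_time_s(freq_hz) * 1000:8.1f} ms")

# 1 second of a constant "signal" at 1 kHz sampling, interrupted at 5 Hz
gated = interrupt([1.0] * 1000, 1000, 5)
print(sum(gated), "of 1000 samples remain")
```

At 5 Hz each burst lasts 100 ms, so exactly half of the one-second signal survives the gating.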
Physiological correlates of degraded temporal fine structure sensitivity with sensorineural hearing loss. Michael G. Heinz, Ph.D., Associate Professor of Speech, Language, Hearing Sciences and Biomedical Engineering, Purdue University, USA. Abstract: Recent perceptual studies have suggested that listeners with sensorineural hearing loss (SNHL) have a reduced ability to use temporal fine-structure cues for speech and pitch perception. These results have fueled an active debate about the role of temporal coding in normal and impaired hearing, and have important implications for improving the ability of hearing aids and cochlear implants to restore speech perception in noise. 23
Peak clipping of speech 24
A problem: understanding speech in noise. Speech is 10 dB weaker than the noise: SNR = -10 dB. Speech and noise have the same level: SNR = 0 dB. (dB = decibel) 25 May 2013, Hearing and hearing loss 25
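The SNR values on this slide follow from the ratio of the RMS levels of speech and noise. The sketch below assumes both signals are available as lists of samples; the helper names and the test signals are illustrative only.

```python
import math

def rms(samples):
    """Root-mean-square value of a list of samples."""
    return math.sqrt(sum(x * x for x in samples) / len(samples))

def snr_db(speech, noise):
    """Signal-to-noise ratio in dB from the RMS values of speech and noise."""
    return 20.0 * math.log10(rms(speech) / rms(noise))

def noise_gain_for_snr(speech, noise, target_snr_db):
    """Linear gain for the noise so that the mix reaches the target SNR."""
    return rms(speech) / (rms(noise) * 10.0 ** (target_snr_db / 20.0))

# Illustrative signals with equal level: SNR = 0 dB
speech = [0.1, -0.1] * 100
noise = [0.1, -0.1] * 100
print(round(snr_db(speech, noise), 1))

# Raise the noise until the speech is 10 dB weaker: SNR = -10 dB
gain = noise_gain_for_snr(speech, noise, -10.0)
louder_noise = [gain * x for x in noise]
print(round(snr_db(speech, louder_noise), 1))
```

A gain of about 3.16 (that is, 10 dB) on the noise moves the condition from SNR = 0 dB to SNR = -10 dB.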
Dantale I. Standardized speech material: 8 word lists, 25 words in each list. Equal representation of phonemes in the lists (not phonetically balanced). Single-syllable words. Female speaker. Examples: RIS, FOR, SMAL, KORT, SE, ØL. Dantale, 0 dB SNR. 26
Dantale I, spectrum. Spectrum of Dantale noise: speech-shaped noise, amplitude modulated corresponding to 4 concurrent speakers. Slope: -12 dB/octave. Spectrum of words WITHOUT pauses between words. 27
Dantale I, Variance of lists Dantale in DanNoise 28
Dantale I, individual results. Individual results from 10 subjects and their average, Dantale words in Dantale noise. Different persons score differently on the same material. 29
Standard deviation of results. Increasing the number of words decreases the standard deviation. Greatest variance around the 50% level. 30
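Both statements on this slide follow from a binomial model of the word score. The slides do not name the model, so the sketch below is an assumption: each word is treated as an independent trial with the same probability of being repeated correctly.

```python
import math

def score_std_percent(p_correct, n_words):
    """Standard deviation of a percent-correct score, in percentage points.

    Assumption: binomial model, every word is an independent trial with
    probability p_correct of being repeated correctly.
    """
    return 100.0 * math.sqrt(p_correct * (1.0 - p_correct) / n_words)

for n_words in (25, 50, 100):
    row = ", ".join(
        f"p={p:.1f}: {score_std_percent(p, n_words):4.1f}"
        for p in (0.1, 0.3, 0.5, 0.7, 0.9)
    )
    print(f"{n_words:3d} words  {row}")
```

Under this model a 25-word list has a standard deviation of 10 percentage points at the 50 % level, and quadrupling the list length to 100 words only halves it.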
Minimum difference between two results in order for the difference to be significant. [Figure: curves for the 25-word list (Dantale I) and a 50-word list, plotted against the highest of the two scores.] Example, Dantale, 25 words: without HA (hearing aid) 60 % intelligibility, with HA 80 % intelligibility. No significant change in intelligibility! 31
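The 60 % versus 80 % example can be checked with a rough significance test. The slide's own curves are not reproduced here; as a stand-in, the sketch below uses a pooled two-proportion z-test under a binomial assumption with a two-sided 5 % criterion (|z| > 1.96). This is a normal approximation and is only indicative for lists as short as 25 words.

```python
import math

def z_for_difference(score1_percent, score2_percent, n_words):
    """z statistic for the difference between two scores from n_words each.

    Assumption: pooled two-proportion z-test, binomial word scores.
    """
    p1 = score1_percent / 100.0
    p2 = score2_percent / 100.0
    pooled = (p1 + p2) / 2.0
    std_error = math.sqrt(pooled * (1.0 - pooled) * 2.0 / n_words)
    if std_error == 0.0:
        return 0.0  # both scores are 0 % or both are 100 %
    return (p2 - p1) / std_error

def is_significant(score1_percent, score2_percent, n_words, z_crit=1.96):
    """True if the difference exceeds the two-sided 5 % criterion."""
    return abs(z_for_difference(score1_percent, score2_percent, n_words)) > z_crit

print(round(z_for_difference(60, 80, 25), 2))  # below 1.96
print(is_significant(60, 80, 25))
```

With one 25-word list per condition, a 20 percentage point improvement stays below the criterion, in line with the conclusion on the slide.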
Averaging of results. [Figure: intelligibility score, %, versus SNR, dB, for test person 1 and test person 2, with the simple average and the parametric average.] Use parametric averaging. Typical parameters for averaging: SNR at 50% intelligibility; slope at 50% intelligibility. 32
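The difference between simple and parametric averaging can be illustrated with two psychometric functions. The sketch below assumes a logistic curve described by the SNR at 50 % and the slope at 50 %; the two test persons and their parameter values are hypothetical, not data from the slides.

```python
import math

def psychometric(snr_db, srt_db, slope_per_db):
    """Logistic intelligibility in %, 50 % at srt_db.

    slope_per_db is the slope at the 50 % point as a fraction per dB
    (0.15 means 15 percentage points per dB).
    """
    return 100.0 / (1.0 + math.exp(-4.0 * slope_per_db * (snr_db - srt_db)))

# Hypothetical test persons: (SNR at 50 % in dB, slope at 50 % per dB)
persons = [(-10.0, 0.15), (-4.0, 0.15)]

def simple_average(snr_db):
    """Average the scores of all persons at each SNR."""
    return sum(psychometric(snr_db, srt, s) for srt, s in persons) / len(persons)

mean_srt = sum(srt for srt, _ in persons) / len(persons)
mean_slope = sum(s for _, s in persons) / len(persons)

def parametric_average(snr_db):
    """Average the parameters first, then draw one curve from them."""
    return psychometric(snr_db, mean_srt, mean_slope)

def slope_at(curve, snr_db, step=0.01):
    """Numerical slope of a curve in percentage points per dB."""
    return (curve(snr_db + step) - curve(snr_db - step)) / (2.0 * step)

print(round(slope_at(simple_average, mean_srt), 1))      # flattened curve
print(round(slope_at(parametric_average, mean_srt), 1))  # keeps the slope
```

Both averages pass through 50 % at the mean SNR, but the simple average is about half as steep as either individual curve, which is why averaging the parameters is preferred.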
Dantale II. Two example lists of ten sentences. List 1: Per valgte otte pæne masker. Michael ejer seks nye huse. Linda solgte fem store kasser. Niels købte fjorten gamle jakker. Anders vandt tre fine blomster. Birgit ser tolv røde skabe. Kirsten får ni smukke ringe. Ingrid låner ti flotte planter. Henning havde syv hvide biler. Ulla finder tyve sjove gaver. List 2: Ulla ejer fem røde jakker. Birgit får tre store planter. Linda solgte otte flotte huse. Michael havde fjorten fine kasser. Kirsten ser ni pæne ringe. Niels finder tyve gamle masker. Anders valgte seks sjove gaver. Henning låner syv smukke skabe. Ingrid købte tolv hvide biler. Per vandt ti nye blomster. Five words in each sentence: name, verb, number, adjective, noun. Training effect! 33
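The example sentences are built from a matrix of ten names, ten verbs, ten numbers, ten adjectives and ten nouns, and each list on the slide uses every word exactly once. The sketch below reproduces that structure by shuffling each column; the word lists are read off the sentences above, while the random shuffling is an assumption and not the documented procedure for constructing Dantale II lists.

```python
import random

NAMES = ["Per", "Michael", "Linda", "Niels", "Anders",
         "Birgit", "Kirsten", "Ingrid", "Henning", "Ulla"]
VERBS = ["valgte", "ejer", "solgte", "købte", "vandt",
         "ser", "får", "låner", "havde", "finder"]
NUMBERS = ["otte", "seks", "fem", "fjorten", "tre",
           "tolv", "ni", "ti", "syv", "tyve"]
ADJECTIVES = ["pæne", "nye", "store", "gamle", "fine",
              "røde", "smukke", "flotte", "hvide", "sjove"]
NOUNS = ["masker", "huse", "kasser", "jakker", "blomster",
         "skabe", "ringe", "planter", "biler", "gaver"]

def make_list(rng):
    """Build ten sentences in which every matrix word occurs exactly once."""
    columns = [NAMES, VERBS, NUMBERS, ADJECTIVES, NOUNS]
    # Shuffle each column independently, then read the rows as sentences
    shuffled = [rng.sample(column, len(column)) for column in columns]
    return [" ".join(words) + "." for words in zip(*shuffled)]

for sentence in make_list(random.Random(2013)):
    print(sentence)
```

Because every sentence comes from the same closed set of 50 words, listeners learn the material during testing, which is the training effect the slide warns about.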
Dantale II. Computer-based self-recording of intelligibility (Ellen R. Pedersen). Five words in each sentence: name, verb, number, adjective, noun. Training effect! 34
HINT: Hearing In Noise Test. HINT resembles Dantale II, but HINT consists of natural sentences. Five words in each sentence. HINT uses sentence scoring instead of word scoring. [Figure: Spectrum] 35
Example of a HINT run (Jens Bo Nielsen) 36
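The difference between word scoring and sentence scoring can be shown on the same set of responses. In the sketch below a sentence counts as correct only if every word is repeated correctly, which is an assumption about the scoring rule; the function name and the responses are hypothetical, and two Dantale II sentences stand in for test material because no HINT sentences appear in the slides.

```python
def score_sentences(presented, responses):
    """Return (word score in %, sentence score in %).

    Assumption: words are compared by position, and a sentence is correct
    only when all of its words are correct.
    """
    words_total = 0
    words_correct = 0
    sentences_correct = 0
    for target, response in zip(presented, responses):
        target_words = target.lower().split()
        response_words = response.lower().split()
        hits = sum(t == r for t, r in zip(target_words, response_words))
        words_total += len(target_words)
        words_correct += hits
        if hits == len(target_words) and len(response_words) == len(target_words):
            sentences_correct += 1
    return (100.0 * words_correct / words_total,
            100.0 * sentences_correct / len(presented))

presented = ["Per valgte otte pæne masker", "Michael ejer seks nye huse"]
responses = ["Per valgte otte pæne kasser", "Michael ejer seks nye huse"]
print(score_sentences(presented, responses))
```

One wrong word out of ten costs 10 percentage points in word score but 50 in sentence score here, so sentence scoring is the stricter measure.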
Conclusion. Speech intelligibility measurements: many sources of variance; many repetitions are necessary; many test subjects are necessary; detailed analysis of results. In the clinic (where you have only one person/patient/client): instruction; look for the (very) big changes! 37
Happy end. The end, finale. Thank you for your attention. Questions before lunch? 38