Automatic recognition of second language speech-in-noise.

Seung-Eun Kim, Bronya R Chernyak, Olga Seleznova, Joseph Keshet, Matthew Goldrick, Ann R Bradlow
Published in: JASA Express Letters (2024)
Measuring how well human listeners recognize speech under varying environmental conditions (speech intelligibility) is a challenge for theoretical, technological, and clinical approaches to speech communication. The current gold standard, human transcription, is time- and resource-intensive. Recent advances in automatic speech recognition (ASR) systems raise the possibility of automating intelligibility measurement. This study tested four state-of-the-art ASR systems with second language speech-in-noise and found that one, Whisper, performed at or above human listener accuracy. However, the content of Whisper's responses diverged substantially from human responses, especially at lower signal-to-noise ratios, suggesting both opportunities and limitations for ASR-based speech intelligibility modeling.