Comparing human and machine speech recognition in noise with QuickSIN.

Malcolm Slaney, Matthew B Fitzgerald
Published in: JASA express letters (2024)
A test is proposed to characterize the performance of speech recognition systems. The QuickSIN test is used by audiologists to measure the ability of humans to recognize continuous speech in noise. This test yields the signal-to-noise ratio at which individuals can correctly recognize 50% of the keywords in low-context sentences. It is argued that a metric for automatic speech recognizers will ground the performance of automatic speech-in-noise recognizers to human abilities. Here, it is demonstrated that the performance of modern recognizers, built using millions of hours of unsupervised training data, is anywhere from normal to mildly impaired in noise compared to human participants.
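The abstract does not spell out how the 50% point is computed. As a rough illustration, the sketch below follows the scoring usually described for the clinical QuickSIN: one list of six sentences, five keywords each, presented from 25 dB down to 0 dB SNR in 5 dB steps, with SNR-50 estimated as 27.5 minus the total keywords correct and SNR loss taken relative to a 2 dB normal-hearing reference. These constants, the category cutoffs, and all function names are assumptions made for illustration, not details taken from the paper; the same arithmetic would apply whether the keyword counts come from a human listener or from a recognizer's transcript.

```python
from typing import Sequence

# Assumed QuickSIN list layout: six sentences, 25 dB down to 0 dB SNR.
SNRS_DB = (25, 20, 15, 10, 5, 0)
STEP_DB = 5
KEYWORDS_PER_SENTENCE = 5
NORMAL_SNR50_DB = 2.0  # assumed reference for normal-hearing listeners


def snr50(correct_per_sentence: Sequence[int]) -> float:
    """Estimate the SNR (dB) at which 50% of keywords are recognized.

    Uses the Spearman-Karber style estimate: highest SNR plus half a step,
    minus one step per fully correct sentence's worth of keywords.
    """
    if len(correct_per_sentence) != len(SNRS_DB):
        raise ValueError("expected one keyword count per SNR level")
    for count in correct_per_sentence:
        if not 0 <= count <= KEYWORDS_PER_SENTENCE:
            raise ValueError("keyword count out of range")
    total = sum(correct_per_sentence)
    return SNRS_DB[0] + STEP_DB / 2 - STEP_DB * total / KEYWORDS_PER_SENTENCE


def snr_loss(correct_per_sentence: Sequence[int]) -> float:
    """SNR loss (dB) relative to the assumed normal-hearing SNR-50."""
    return snr50(correct_per_sentence) - NORMAL_SNR50_DB


def category(loss_db: float) -> str:
    """Map SNR loss to a label, using commonly cited cutoffs (assumed)."""
    if loss_db <= 3:
        return "normal or near normal"
    if loss_db <= 7:
        return "mild"
    if loss_db <= 15:
        return "moderate"
    return "severe"


# Hypothetical keyword counts for one list, highest SNR first.
counts = [5, 5, 5, 4, 3, 1]
print(snr50(counts), snr_loss(counts), category(snr_loss(counts)))
```

With these assumed constants, a listener or recognizer that gets 23 of 30 keywords has an estimated SNR-50 of 4.5 dB and an SNR loss of 2.5 dB.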