A high-performance neuroprosthesis for speech decoding and avatar control.

Sean L Metzger, Kaylo T Littlejohn, Alexander B Silva, David A Moses, Margaret P Seaton, Ran Wang, Maximilian E Dougherty, Jessie R Liu, Peter Wu, Michael A Berger, Inga Zhuravleva, Adelyn Tu-Chan, Karunesh Ganguly, Gopala K Anumanchipalli, Edward F Chang
Published in: Nature (2023)
Speech neuroprostheses have the potential to restore communication to people living with paralysis, but naturalistic speed and expressivity are elusive [1]. Here we use high-density surface recordings of the speech cortex in a clinical-trial participant with severe limb and vocal paralysis to achieve high-performance real-time decoding across three complementary speech-related output modalities: text, speech audio and facial-avatar animation. We trained and evaluated deep-learning models using neural data collected as the participant attempted to silently speak sentences. For text, we demonstrate accurate and rapid large-vocabulary decoding with a median rate of 78 words per minute and median word error rate of 25%. For speech audio, we demonstrate intelligible and rapid speech synthesis and personalization to the participant's pre-injury voice. For facial-avatar animation, we demonstrate the control of virtual orofacial movements for speech and non-speech communicative gestures. The decoders reached high performance with less than two weeks of training. Our findings introduce a multimodal speech-neuroprosthetic approach that has substantial promise to restore full, embodied communication to people living with severe paralysis.
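For context on the text-decoding metric cited above, word error rate (WER) is conventionally defined as the word-level edit distance (substitutions + deletions + insertions) between the decoded sentence and the reference sentence, divided by the number of reference words. The Python sketch below illustrates that standard definition only; it is not the paper's evaluation code, and the example sentences are invented for illustration.

```python
# Minimal sketch of the standard word error rate (WER) metric,
# not the decoding or evaluation pipeline used in the paper.

def word_error_rate(reference: str, hypothesis: str) -> float:
    """WER = (substitutions + deletions + insertions) / reference word count."""
    ref, hyp = reference.split(), hypothesis.split()
    # Word-level Levenshtein distance via dynamic programming.
    d = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        d[i][0] = i
    for j in range(len(hyp) + 1):
        d[0][j] = j
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            cost = 0 if ref[i - 1] == hyp[j - 1] else 1
            d[i][j] = min(
                d[i - 1][j] + 1,         # deletion
                d[i][j - 1] + 1,         # insertion
                d[i - 1][j - 1] + cost,  # substitution (or match)
            )
    return d[len(ref)][len(hyp)] / max(len(ref), 1)


if __name__ == "__main__":
    # Hypothetical reference/decoded pair: 2 word errors out of 9 -> ~22% WER.
    ref = "the quick brown fox jumps over the lazy dog"
    hyp = "the quick brown fox jumped over a lazy dog"
    print(f"WER: {word_error_rate(ref, hyp):.1%}")
```

A 25% median WER on this definition means that roughly one in four reference words required a correction in the decoded output.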
Keyphrases
  • clinical trial
  • deep learning
  • high density
  • machine learning
  • artificial intelligence