Joint population coding and temporal coherence link an attended talker's voice and location features in naturalistic multi-talker scenes.

Kiki van der Heijden, Prachi Patel, Stephan Bickel, Jose L Herrero, Ashesh D Mehta, Nima Mesgarani

Published in: bioRxiv : the preprint server for biology (2024)

Cortical responses to a single talker exhibit a distributed gradient, ranging from sites that are sensitive to both a talker's voice and location (dual-feature sensitive sites) to sites that are sensitive to either voice or location (single-feature sensitive sites). Population response patterns of dual-feature sensitive sites encode the voice and location features of the attended talker in multi-talker scenes jointly and with equal precision. Although individual single-feature sensitive sites respond to only one feature, their population response patterns also encode a talker's location and voice features jointly, but with higher precision for the feature to which they are primarily sensitive. Neural sites that selectively track an attended speech stream concurrently encode the attended talker's voice and location features. Attention selectively enhances temporal coherence between voice-selective and location-selective sites over time. Joint population coding and temporal coherence mechanisms together underlie distributed multi-dimensional auditory object encoding in auditory cortex.

Keyphrases