Vowel and formant representation in the human auditory speech cortex

Yulia Oganian; Ilina Bhaya-Grossman; Keith Johnson; Edward F Chang

doi:10.1016/j.neuron.2023.04.004

Neuron. 2023 Jul 5;111(13):2105-2118.e4. doi: 10.1016/j.neuron.2023.04.004. Epub 2023 Apr 26.

Authors

Yulia Oganian¹, Ilina Bhaya-Grossman², Keith Johnson³, Edward F Chang⁴

Affiliations

¹ Department of Neurological Surgery, University of California, San Francisco, 675 Nelson Rising Lane, San Francisco, CA 94158, USA.
² Department of Neurological Surgery, University of California, San Francisco, 675 Nelson Rising Lane, San Francisco, CA 94158, USA; University of California, Berkeley-University of California, San Francisco Graduate Program in Bioengineering, Berkeley, CA 94720, USA.
³ Department of Linguistics, University of California, Berkeley, Berkeley, CA, USA.
⁴ Department of Neurological Surgery, University of California, San Francisco, 675 Nelson Rising Lane, San Francisco, CA 94158, USA. Electronic address: edward.chang@ucsf.edu.

Abstract

Vowels, a fundamental component of human speech across all languages, are cued acoustically by formants, resonance frequencies of the vocal tract shape during speaking. An outstanding question in neurolinguistics is how formants are processed neurally during speech perception. To address this, we collected high-density intracranial recordings from the human speech cortex on the superior temporal gyrus (STG) while participants listened to continuous speech. We found that two-dimensional receptive fields based on the first two formants provided the best characterization of vowel sound representation. Neural activity at single sites was highly selective for zones in this formant space. Furthermore, formant tuning is adjusted dynamically for speaker-specific spectral context. However, the entire population of formant-encoding sites was required to accurately decode single vowels. Overall, our results reveal that complex acoustic tuning in the two-dimensional formant space underlies local vowel representations in STG. As a population code, this gives rise to phonological vowel perception.

Keywords: auditory perception; intracranial electrophysiology; language; speech; speech normalization; vowel formants.

Publication types

Research Support, N.I.H., Extramural
Research Support, U.S. Gov't, Non-P.H.S.
Research Support, Non-U.S. Gov't

MeSH terms

Auditory Cortex*
Auditory Perception
Humans
Phonetics
Speech
Speech Perception*
