March 7, 2025 – A new Hebrew University study has uncovered how the brain seamlessly transforms sounds, speech patterns, and words into the flow of everyday conversation. The findings have the potential to improve speech recognition technology and help develop better tools for people with communication challenges.

The study, published in Nature Human Behaviour, analyzed over 100 hours of brain activity recorded during real-life conversations, revealing the neural pathways that allow people to speak and understand effortlessly.

The team, which included researchers from Google Research, the Princeton Neuroscience Institute, and the NYU Langone Comprehensive Epilepsy Center, used Whisper, a speech-to-text model, to break language down into three levels of representation: low-level sounds, speech patterns, and the meaning of words. These representations were then compared with recorded brain activity using computational encoding models to explore the neural basis of human conversation.
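For readers curious how such multi-level embeddings can be extracted in practice, here is a minimal sketch using the openly released Whisper weights on Hugging Face. The checkpoint name, audio file, and layer choices are illustrative assumptions, not the authors' exact pipeline:

```python
import torch
import librosa
from transformers import WhisperProcessor, WhisperModel

# Load a small public Whisper checkpoint (illustrative; the study's exact
# model size and layer selection are not specified here).
processor = WhisperProcessor.from_pretrained("openai/whisper-tiny")
model = WhisperModel.from_pretrained("openai/whisper-tiny").eval()

# "conversation.wav" is a hypothetical audio file; Whisper expects 16 kHz audio.
audio, sr = librosa.load("conversation.wav", sr=16000)
features = processor(audio, sampling_rate=sr, return_tensors="pt").input_features

with torch.no_grad():
    # Encoder hidden states: early layers track low-level sound,
    # later layers track higher-level speech structure.
    enc = model.encoder(features, output_hidden_states=True)
    acoustic_emb = enc.hidden_states[0]   # sound-level representation
    speech_emb = enc.hidden_states[-1]    # speech-level representation

    # Decoder hidden states approximate word-level (language) meaning.
    # For brevity this feeds only the start token; in practice one would
    # feed the transcribed word tokens.
    start = torch.tensor([[model.config.decoder_start_token_id]])
    dec = model.decoder(input_ids=start,
                        encoder_hidden_states=enc.last_hidden_state,
                        output_hidden_states=True)
    language_emb = dec.hidden_states[-1]  # language-level representation
```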

“Our findings help us understand how the brain processes conversations in real-life settings,” said lead researcher Dr. Ariel Goldstein from the Hebrew University Department of Cognitive and Brain Sciences and the Business School. “By connecting different layers of language, we’re uncovering the mechanics behind something we all do: naturally talking and understanding each other.”

The results showed that the new computational framework predicted brain activity with high accuracy. Even when applied to conversations held out from the model's training data, it correctly matched different parts of the brain to specific language functions. For example, regions involved in hearing and speaking aligned with the sound and speech levels, while areas involved in higher-level understanding aligned with the meaning of words.
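To illustrate the encoding-model logic behind these predictions, here is a minimal sketch assuming a simple linear (ridge) map from embeddings to a single electrode's activity, fit on some conversations and evaluated on a held-out one. All arrays are synthetic placeholders, not the study's data:

```python
import numpy as np
from scipy.stats import pearsonr
from sklearn.linear_model import RidgeCV

rng = np.random.default_rng(0)

# Synthetic placeholders: per-word embeddings (X) and one electrode's
# word-aligned activity (y), split into training and held-out conversations.
X_train = rng.standard_normal((5000, 384))
y_train = rng.standard_normal(5000)
X_test = rng.standard_normal((1000, 384))
y_test = rng.standard_normal(1000)

# Linear encoding model: ridge regression from embeddings to neural
# activity, with regularization strength chosen by cross-validation.
encoder = RidgeCV(alphas=np.logspace(-2, 6, 9)).fit(X_train, y_train)

# Encoding performance on the held-out conversation: correlation between
# predicted and observed activity (expected ~0 here, since data are random).
r, _ = pearsonr(encoder.predict(X_test), y_test)
print(f"held-out encoding correlation: r = {r:.3f}")
```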

The researchers also found that the brain processes language in a sequence. Before we speak, the brain shifts from representing the words we intend to say to forming the sounds of speech; after we listen, it works in the reverse direction, moving from sounds back to meaning. The new framework captured these dynamics more effectively than earlier methods.
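One common way to expose such a temporal sequence is to re-fit the encoding model at different lags between embeddings and neural activity, then see where prediction peaks. The sketch below shows the general technique under that assumption, not the paper's exact analysis; the helper function and variables are illustrative:

```python
import numpy as np
from scipy.stats import pearsonr
from sklearn.linear_model import Ridge

def encoding_r_at_lag(X, neural, lag):
    """Encoding correlation with neural activity shifted by `lag` samples
    relative to word onset (negative lag = activity before the word)."""
    if lag >= 0:
        Xl, yl = X[: len(X) - lag], neural[lag:]
    else:
        Xl, yl = X[-lag:], neural[: len(neural) + lag]
    half = len(Xl) // 2  # simple train/test split for illustration
    model = Ridge(alpha=100.0).fit(Xl[:half], yl[:half])
    return pearsonr(model.predict(Xl[half:]), yl[half:])[0]

rng = np.random.default_rng(1)
X = rng.standard_normal((4000, 384))   # word-aligned embeddings (synthetic)
neural = rng.standard_normal(4000)     # word-aligned electrode activity

# Sweeping lags shows when each embedding type best predicts the brain,
# e.g. language-level embeddings peaking before speech onset in production.
for lag in (-20, -10, 0, 10, 20):
    print(lag, round(encoding_r_at_lag(X, neural, lag), 3))
```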

The insights gained from the study mark an important step toward building tools that can track how the brain handles language in real-world situations, whether chatting with a friend or engaging in a debate, and toward translating that knowledge into better technology.

The research paper, titled “A unified acoustic-to-speech-to-language embedding space captures the neural basis of natural language processing in everyday conversations,” is available in Nature Human Behaviour.

DOI: 10.1038/s41562-025-02105-9

Researchers:

Ariel Goldstein [1,2], Haocheng Wang [3], Leonard Niekerken [3,4], Mariano Schain [2], Zaid Zada [3], Bobbi Aubrey [3], Tom Sheffer [2], Samuel A. Nastase [3], Harshvardhan Gazula [3,5], Aditi Singh [3], Aditi Rao [3], Gina Choe [3], Catherine Kim [3], Werner Doyle [6], Daniel Friedman [6], Sasha Devore [6], Patricia Dugan [6], Avinatan Hassidim [2], Michael Brenner [2,7], Yossi Matias [2], Orrin Devinsky [6], Adeen Flinker [6], Uri Hasson [3]

Institutions:

  1. Department of Cognitive and Brain Sciences and Business School, Hebrew University, Jerusalem, Israel
  2. Google Research, Mountain View, CA, USA
  3. Department of Psychology and the Princeton Neuroscience Institute, Princeton University, Princeton, NJ, USA
  4. Department of Cognitive Neuroscience, Faculty of Psychology and Neuroscience, Maastricht University, Maastricht, The Netherlands
  5. Athinoula A. Martinos Center for Biomedical Imaging, Massachusetts General Hospital and Harvard Medical School, Boston, MA, USA
  6. New York University School of Medicine, New York, NY, USA
  7. School of Engineering and Applied Science, Harvard University, Boston, MA, USA