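Nai Ding's team at Zhejiang University has reported a new advance: a study of constituent-constrained word prediction during language comprehension. The work was published in Nature Neuroscience on April 21, 2026.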
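In this study, the team tested the next-word-prediction conjecture by asking whether the brain predicts each upcoming word as precisely as possible while listening to connected speech. In three magnetoencephalography (MEG) experiments with Mandarin Chinese speakers, the researchers showed that the neural response related to word unpredictability, that is, word surprisal calculated using large language models, is significantly stronger for words within an ongoing constituent than for words across a major constituent boundary, and that this effect is further modulated by the certainty of the constituent boundary. The constituent-boundary effect was also observed behaviorally, except when speech was presented very slowly, and it was confirmed by analyzing a dataset of electrocorticography (ECoG) responses to natural English narratives. The effect indicates that the language system does not solely optimize word-prediction precision; rather, it balances the contributions of word prediction through constituent-constrained management of linguistic contextual representations.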
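By way of background, next-word prediction has been hypothesized to be the central computational objective of the human language system, akin to that of current large language models.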
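As a rough illustration of the surprisal measure mentioned above: surprisal is conventionally defined as the negative log probability of a word given its context, so less predictable words carry higher surprisal. The Python sketch below uses made-up probabilities. The names `surprisal_bits` and `toy_next_word_probs` are hypothetical, no actual language model is queried, and this is not the authors' analysis pipeline.

```python
import math


def surprisal_bits(probability: float) -> float:
    """Return the surprisal, in bits, of an event with the given probability."""
    if not 0.0 < probability <= 1.0:
        raise ValueError("probability must be in (0, 1]")
    return -math.log2(probability)


# Hypothetical next-word probabilities for some context (illustrative numbers only;
# in practice these would come from a language model's predicted distribution).
toy_next_word_probs = {"rice": 0.5, "noodles": 0.25, "stones": 0.001}

# Surprisal for each candidate word: the rarer the word, the larger the value.
toy_surprisal = {word: surprisal_bits(p) for word, p in toy_next_word_probs.items()}

for word, bits in toy_surprisal.items():
    print(f"{word}: {bits:.2f} bits")
```

In a study like the one summarized here, such per-word values would be related to the measured neural response; how that relation was modeled is not described in this digest.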
Appendix: original English text
Title: Constituent-constrained word prediction during language comprehension
Author: Jiajie Zou, David Poeppel, Nai Ding
Issue&Volume: 2026-04-21
Abstract: Next-word prediction has been hypothesized as the central computational objective of the human language system, akin to that of current large language models. Here we put this conjecture to the test, investigating whether the brain predicts each upcoming word as precisely as possible when listening to connected speech. In three magnetoencephalography experiments with Mandarin Chinese speakers, we demonstrate that the response related to word unpredictability, that is, word surprisal calculated using large language models, is significantly stronger for words within an ongoing constituent than words across a major constituent boundary, and this effect is further modulated by the certainty of a constituent boundary. This constituent-boundary effect is also observed behaviorally unless speech is very slowly presented, and it is confirmed by analyzing a dataset of electrocorticography responses to natural English narratives. The constituent-boundary effect demonstrates that the language system does not solely optimize word-prediction precision; rather, it balances word-prediction contributions by constituent-constrained management of linguistic contextual representations.
DOI: 10.1038/s41593-026-02272-6
Source: https://www.nature.com/articles/s41593-026-02272-6
Nature Neuroscience: founded in 1998 and published by Springer Nature. Latest impact factor: 28.771
Official website: https://www.nature.com/neuro/
Submission link: https://mts-nn.nature.com/cgi-bin/main.plex
