Learning at Criticality in Large Language Models for Quantum Field Theory and Beyond



Author: Xiaoke Robot (小柯机器人)
Published: 2025/11/18 18:12:30

This issue: Chinese Physics Letters, Volume 42, Issue 12

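Recently, Kun Chen's team at the Institute of Theoretical Physics, Chinese Academy of Sciences, studied learning at criticality in large language models for quantum field theory and beyond. The paper was published in Chinese Physics Letters on November 17, 2025.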

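Fundamental physics often faces complex symbolic calculations with few exemplars or established principles to guide them. Artificial intelligence shows promise, but its usual need for vast training data limits its use at these information-scarce frontiers.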

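To address this information scarcity, the team proposes learning at criticality (LaC), a reinforcement learning scheme that tunes large language models (LLMs) to a sharp learning transition. At this transition, LLMs achieve peak generalization from minimal data, as verified on 7-digit base-7 addition, a nontrivial test of arithmetic reasoning. To explain this peak, the team built a minimal concept-network model that captures the core mechanism by which LLMs link tokens. Trained on a single exemplar, this model likewise shows a sharp learning transition, and the transition exhibits hallmarks of a second-order phase transition, notably power-law-distributed solution path lengths.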
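To make the arithmetic test concrete, here is a minimal Python sketch of what a 7-digit base-7 addition problem looks like. It only illustrates the task format; the function names `to_base7` and `add_base7` are hypothetical and are not taken from the paper, whose actual prompts and evaluation code are not described in this summary.

```python
DIGITS = "0123456"  # the seven digits of base 7


def to_base7(n: int, width: int = 7) -> str:
    """Return non-negative n as a base-7 string, zero-padded to `width` digits."""
    if n < 0:
        raise ValueError("n must be non-negative")
    out = []
    while n:
        n, r = divmod(n, 7)
        out.append(DIGITS[r])
    return "".join(reversed(out)).rjust(width, "0")


def add_base7(a: str, b: str) -> str:
    """Add two base-7 digit strings column by column, carrying at 7."""
    width = max(len(a), len(b))
    a, b = a.rjust(width, "0"), b.rjust(width, "0")
    carry = 0
    digits = []
    # Walk from the least significant digit to the most significant one.
    for x, y in zip(reversed(a), reversed(b)):
        carry, r = divmod(int(x) + int(y) + carry, 7)
        digits.append(str(r))
    if carry:
        digits.append(str(carry))
    return "".join(reversed(digits))


# The carry propagates through all seven columns.
print(add_base7("6666666", "0000001"))  # prints 10000000
```

The column-wise carry rule is the "underlying operational rule" a model would have to extract; a result can be cross-checked with Python's built-in `int(s, 7)`.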

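At this critical point, the system maximizes a "critical thinking pattern" crucial for generalization, enabled by the underlying scale-free exploration. This suggests that LLMs reach peak performance by operating at criticality, where such exploration dynamics let them extract the underlying operational rules. The team validated LaC in quantum field theory: given a few exemplars of symbolic Matsubara sums, an 8-billion-parameter LLM tuned to its critical point by LaC solved unseen higher-order problems, significantly outperforming far larger models. LaC thus uses critical phenomena, a physical principle, to offer a new paradigm for AI tackling complex, data-sparse challenges in fundamental physics.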
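For readers unfamiliar with the term, the simplest textbook instance of a symbolic Matsubara sum is shown below. It is given only for orientation and is not one of the paper's examples; the notation is an assumption of this sketch, with inverse temperature $\beta$, fermionic frequencies $\omega_n = (2n+1)\pi/\beta$, and single-particle energy $\xi$.

```latex
\frac{1}{\beta} \sum_{n=-\infty}^{\infty} \frac{e^{i\omega_n 0^{+}}}{i\omega_n - \xi}
  = \frac{1}{e^{\beta \xi} + 1}
  = n_F(\xi)
```

The sum over discrete frequencies collapses to the Fermi-Dirac distribution $n_F(\xi)$. Higher-order problems involve products of several such propagators, which is where the symbolic manipulation becomes demanding.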

Appendix: original English text

Title: Learning-at-Criticality in Large Language Models for Quantum Field Theory and Beyond

Author: Xiansheng Cai, Sihan Hu, Tao Wang, Yuan Huang, Pan Zhang, Youjin Deng, Kun Chen

Issue&Volume: 2025-11-17

Abstract: Fundamental physics often confronts complex symbolic problems with few guiding exemplars or established principles. While artificial intelligence (AI) offers promise, its typical need for vast datasets to learn from hinders its use in these information-scarce frontiers. We introduce learning at criticality (LaC), a reinforcement learning scheme that tunes large language models (LLMs) to a sharp learning transition, addressing this information scarcity. At this transition, LLMs achieve peak generalization from minimal data, exemplified by 7-digit base-7 addition—a test of nontrivial arithmetic reasoning. To elucidate this peak, we analyze a minimal concept-network model designed to capture the essence of how LLMs might link tokens. Trained on a single exemplar, this model also undergoes a sharp learning transition. This transition exhibits hallmarks of a second-order phase transition, notably power-law distributed solution path lengths. At this critical point, the system maximizes a “critical thinking pattern” crucial for generalization, enabled by the underlying scale-free exploration. This suggests LLMs reach peak performance by operating at criticality, where such explorative dynamics enable the extraction of underlying operational rules. We demonstrate LaC in quantum field theory: an 8B-parameter LLM, tuned to its critical point by LaC using a few exemplars of symbolic Matsubara sums, solves unseen, higher-order problems, significantly outperforming far larger models. LaC thus leverages critical phenomena, a physical principle, to empower AI for complex, data-sparse challenges in fundamental physics.

DOI: 10.1088/0256-307X/42/12/120002

Source: http://cpl.iphy.ac.cn/en/article/doi/10.1088/0256-307X/42/12/120002?viewType=HTML

Journal information

Chinese Physics Letters (《中国物理快报》): founded in 1985 and affiliated with the Chinese Physical Society. Latest impact factor (IF): 3.5

Official website: https://cpl.iphy.ac.cn/EN/0256-307X/current.shtml
Submission link: https://editorial.iphy.ac.cn/journalx_cpl_cn/authorLogOn.action?mag_Id=4