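Xin Li of the Institute of Zoology, Chinese Academy of Sciences, and collaborators have reported notable progress. They developed GeneCompass, a knowledge-informed cross-species foundation model for deciphering universal gene regulatory mechanisms. The work was published online in Cell Research on October 8, 2024.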
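According to the authors, deciphering universal gene regulatory mechanisms across diverse organisms holds great potential for advancing the understanding of fundamental life processes and facilitating clinical applications. However, the traditional research paradigm focuses primarily on individual model organisms and does not integrate the various cell types across species. Recent breakthroughs in single-cell sequencing and deep learning offer an unprecedented opportunity to address this challenge.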
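The researchers built an extensive dataset of more than 120 million human and mouse single-cell transcriptomes. After data preprocessing, they obtained 101,768,420 single-cell transcriptomes and developed a knowledge-informed cross-species foundation model named GeneCompass. During pre-training, GeneCompass integrated four types of prior biological knowledge to enhance the understanding of gene regulatory mechanisms in a self-supervised manner.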
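After fine-tuning on multiple downstream tasks, GeneCompass outperformed state-of-the-art models in diverse single-species applications and opened up new avenues for cross-species biological investigation. The researchers also used GeneCompass to search for key factors associated with cell fate transition and showed that the predicted candidate genes could successfully induce human embryonic stem cells to differentiate toward the gonadal fate.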
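Overall, GeneCompass demonstrates the advantages of using artificial intelligence to decipher universal gene regulatory mechanisms and shows great potential for accelerating the discovery of critical cell fate regulators and candidate drug targets.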
Appendix: original English text
Title: GeneCompass: deciphering universal gene regulatory mechanisms with a knowledge-informed cross-species foundation model
Author: Yang, Xiaodong; Liu, Guole; Feng, Guihai; Bu, Dechao; Wang, Pengfei; Jiang, Jie; Chen, Shubai; Yang, Qinmeng; Miao, Hefan; Zhang, Yiyang; Man, Zhenpeng; Liang, Zhongming; Wang, Zichen; Li, Yaning; Li, Zheng; Liu, Yana; Tian, Yao; Liu, Wenhao; Li, Cong; Li, Ao; Dong, Jingxi; Hu, Zhilong; Fang, Chen; Cui, Lina; Deng, Zixu; Jiang, Haiping; Cui, Wentao; Zhang, Jiahao; Yang, Zhaohui; Li, Handong; He, Xingjian; Zhong, Liqun; Zhou, Jiaheng; Wang, Zijian; Long, Qingqing; Xu, Ping; Wang, Hongmei; Meng, Zhen; Wang, Xuezhi; Wang, Yangang; Wang, Yong; Zhang, Shihua; Guo, Jingtao; Zhao, Yi; Zhou, Yuanchun; Li, Fei; Liu, Jing; Chen, Yiqiang; Yang, Ge; Li, Xin
Issue&Volume: 2024-10-08
Abstract: Deciphering universal gene regulatory mechanisms in diverse organisms holds great potential for advancing our knowledge of fundamental life processes and facilitating clinical applications. However, the traditional research paradigm primarily focuses on individual model organisms and does not integrate various cell types across species. Recent breakthroughs in single-cell sequencing and deep learning techniques present an unprecedented opportunity to address this challenge. In this study, we built an extensive dataset of over 120 million human and mouse single-cell transcriptomes. After data preprocessing, we obtained 101,768,420 single-cell transcriptomes and developed a knowledge-informed cross-species foundation model, named GeneCompass. During pre-training, GeneCompass effectively integrated four types of prior biological knowledge to enhance our understanding of gene regulatory mechanisms in a self-supervised manner. By fine-tuning for multiple downstream tasks, GeneCompass outperformed state-of-the-art models in diverse applications for a single species and unlocked new realms of cross-species biological investigations. We also employed GeneCompass to search for key factors associated with cell fate transition and showed that the predicted candidate genes could successfully induce the differentiation of human embryonic stem cells into the gonadal fate. Overall, GeneCompass demonstrates the advantages of using artificial intelligence technology to decipher universal gene regulatory mechanisms and shows tremendous potential for accelerating the discovery of critical cell fate regulators and candidate drug targets.
DOI: 10.1038/s41422-024-01034-y
Source: https://www.nature.com/articles/s41422-024-01034-y
Cell Research (Chinese title: 《细胞研究》): founded in 1990 and published by Springer Nature. Latest IF: 20.057
Official website: https://www.nature.com/cr/
Submission link: https://mts-cr.nature.com/cgi-bin/main.plex