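Shengqi Wang of the Bioinformatics Center, Academy of Military Medical Sciences, and collaborators have recently reported an important advance. They propose that zero-shot prediction of mutation effects with multimodal deep representation learning can guide protein engineering. The findings were published online in Cell Research on July 5, 2024.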
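By way of background, mutations in amino acid sequences can change protein function. Accurate, unsupervised prediction of mutation effects is critical in biotechnology and biomedicine, but it remains a fundamental challenge.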
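The researchers present the Protein Mutational Effect Predictor (ProMEP), a general method that requires no multiple sequence alignment and enables zero-shot prediction of mutation effects. The multimodal deep representation learning model embedded in ProMEP was developed to learn both sequence and structure contexts from about 160 million proteins. ProMEP achieves state-of-the-art performance in mutation effect prediction and is also much faster, enabling efficient and intelligent protein engineering.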
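As a rough illustration of what "zero-shot" scoring can look like, the sketch below scores a mutant by the log-ratio of the probabilities a pretrained model assigns to the mutant and wild-type residues. This is a common formulation for zero-shot predictors in general and is an assumption here; the summary above does not describe ProMEP's actual scoring function. The function name `mutation_score` and the toy probability table are hypothetical and exist only for illustration.

```python
import math

def mutation_score(probs, wild_type, mutations):
    """Sum of log(p(mutant residue) / p(wild-type residue)) over mutated sites.

    probs      -- list of dicts, one per position, mapping residue -> probability
                  (hypothetical model output; made-up values below)
    wild_type  -- wild-type sequence as a string
    mutations  -- list of (position, new_residue) pairs, 0-indexed
    """
    score = 0.0
    for pos, new_residue in mutations:
        p_wild = probs[pos][wild_type[pos]]
        p_mutant = probs[pos][new_residue]
        score += math.log(p_mutant) - math.log(p_wild)
    return score

# Toy two-residue "protein" with invented per-position probabilities.
toy_probs = [
    {"A": 0.7, "V": 0.2, "D": 0.1},
    {"A": 0.1, "V": 0.3, "D": 0.6},
]
wild_type = "AD"

no_change = mutation_score(toy_probs, wild_type, [])          # no mutations
a1d = mutation_score(toy_probs, wild_type, [(0, "D")])        # A -> D at site 0
print(no_change, a1d)
```

No labeled fitness data enters the calculation, which is what makes such a score zero-shot: a negative value means the model finds the mutant residue less plausible than the wild-type residue in that context.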
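Specifically, ProMEP accurately predicted the consequences of mutations in the gene-editing enzymes TnpB and TadA, and successfully guided the development of high-performance gene-editing tools based on their engineered variants. A 5-site mutant of TnpB reached a gene-editing efficiency of up to 74.04% (versus 24.66% for the wild type). The base editing tool built on a TadA 15-site mutant (in addition to the A106V/D108N double mutation that confers deoxyadenosine deaminase activity on TadA) showed an A-to-G conversion frequency of up to 77.27% (versus 69.80% for ABE8e, a previous TadA-based adenine base editor), with significantly reduced bystander and off-target effects compared with ABE8e. ProMEP thus shows superior performance in predicting the effects of mutations on proteins and a strong capability to guide protein engineering.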
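To put the reported figures side by side, the short calculation below derives the fold change and percentage-point gains from the four numbers quoted above. The values are the "up to" maxima reported in the abstract, so this compares peak values only; the variable names are illustrative.

```python
# Reported peak values (percent), taken from the abstract.
tnpb_mutant, tnpb_wild_type = 74.04, 24.66   # gene-editing efficiency
tada_editor, abe8e = 77.27, 69.80            # A-to-G conversion frequency

tnpb_fold = tnpb_mutant / tnpb_wild_type         # relative improvement
tnpb_gain_points = tnpb_mutant - tnpb_wild_type  # absolute gain, percentage points
tada_gain_points = tada_editor - abe8e           # absolute gain, percentage points

print(f"TnpB 5-site mutant vs wild type: {tnpb_fold:.2f}x, +{tnpb_gain_points:.2f} points")
print(f"TadA-based editor vs ABE8e: +{tada_gain_points:.2f} points")
```

The TnpB mutant is roughly threefold more efficient than the wild type at its peak, while the TadA-based editor's peak gain over ABE8e is smaller, at about 7.5 percentage points.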
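ProMEP therefore enables efficient exploration of the vast protein space and facilitates the practical design of proteins, advancing research in biomedicine and synthetic biology.
Appendix: original English text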
Title: Zero-shot prediction of mutation effects with multimodal deep representation learning guides protein engineering
Author: Cheng, Peng; Mao, Cong; Tang, Jin; Yang, Sen; Cheng, Yu; Wang, Wuke; Gu, Qiuxi; Han, Wei; Chen, Hao; Li, Sihan; Chen, Yaofeng; Zhou, Jianglin; Li, Wuju; Pan, Aimin; Zhao, Suwen; Huang, Xingxu; Zhu, Shiqiang; Zhang, Jun; Shu, Wenjie; Wang, Shengqi
Issue&Volume: 2024-07-05
Abstract: Mutations in amino acid sequences can provoke changes in protein function. Accurate and unsupervised prediction of mutation effects is critical in biotechnology and biomedicine, but remains a fundamental challenge. To resolve this challenge, here we present Protein Mutational Effect Predictor (ProMEP), a general and multiple sequence alignment-free method that enables zero-shot prediction of mutation effects. A multimodal deep representation learning model embedded in ProMEP was developed to comprehensively learn both sequence and structure contexts from ~160 million proteins. ProMEP achieves state-of-the-art performance in mutational effect prediction and accomplishes a tremendous improvement in speed, enabling efficient and intelligent protein engineering. Specifically, ProMEP accurately forecasts mutational consequences on the gene-editing enzymes TnpB and TadA, and successfully guides the development of high-performance gene-editing tools with their engineered variants. The gene-editing efficiency of a 5-site mutant of TnpB reaches up to 74.04% (vs 24.66% for the wild type); and the base editing tool developed on the basis of a TadA 15-site mutant (in addition to the A106V/D108N double mutation that renders deoxyadenosine deaminase activity to TadA) exhibits an A-to-G conversion frequency of up to 77.27% (vs 69.80% for ABE8e, a previous TadA-based adenine base editor) with significantly reduced bystander and off-target effects compared to ABE8e. ProMEP not only showcases superior performance in predicting mutational effects on proteins but also demonstrates a great capability to guide protein engineering. Therefore, ProMEP enables efficient exploration of the gigantic protein space and facilitates practical design of proteins, thereby advancing studies in biomedicine and synthetic biology.
DOI: 10.1038/s41422-024-00989-2
Source: https://www.nature.com/articles/s41422-024-00989-2
Cell Research: founded in 1990 and published by Springer Nature. Latest impact factor (IF): 20.057
Official website: https://www.nature.com/cr/
Submission link: https://mts-cr.nature.com/cgi-bin/main.plex