Scientists compile a deep catalogue of protein-coding variation in 983,578 individuals
By: 小柯机器人 | Published: 2024/5/22 16:14:35

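Suganthi Balasubramanian and colleagues at the Regeneron Genetics Center in the United States have compiled a deep catalogue of protein-coding variation in 983,578 individuals. The paper was published online in Nature on May 20, 2024.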

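The researchers present a catalogue of human protein-coding variation, derived from exome sequencing of 983,578 individuals across diverse populations. Of the Regeneron Genetics Center Million Exome data (RGC-ME), 23% comes from non-European individuals of African, East Asian, Indigenous American, Middle Eastern, and South Asian ancestry. The catalogue includes over 10.4 million missense variants and 1.1 million predicted loss-of-function (pLOF) variants. The researchers identified individuals with rare biallelic pLOF variants in 4,848 genes, 1,751 of which had not been previously reported.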

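From precise quantitative estimates of selection against heterozygous loss-of-function, the researchers identified 3,988 loss-of-function intolerant genes, including 86 that were previously assessed as tolerant and 1,153 lacking established disease annotation. They also defined regions of missense depletion at high resolution. Notably, 1,482 genes have regions depleted of missense variants despite being tolerant to pLOF variants.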

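Finally, the researchers estimate that 3% of individuals carry a clinically actionable genetic variant, and that 11,773 variants reported in ClinVar with unknown significance are likely to be deleterious cryptic splice sites. To facilitate variant interpretation and genetics-informed precision medicine, they have made this coding variation resource from the RGC-ME accessible through a public variant allele frequency browser.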

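By way of background, rare coding variants that significantly impact function provide insights into the biology of a gene. However, ascertaining their frequency requires large sample sizes.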

Appendix: original English text

Title: A deep catalogue of protein-coding variation in 983,578 individuals

Author: Sun, Kathie Y., Bai, Xiaodong, Chen, Siying, Bao, Suying, Zhang, Chuanyi, Kapoor, Manav, Backman, Joshua, Joseph, Tyler, Maxwell, Evan, Mitra, George, Gorovits, Alexander, Mansfield, Adam, Boutkov, Boris, Gokhale, Sujit, Habegger, Lukas, Marcketta, Anthony, Locke, Adam E., Ganel, Liron, Hawes, Alicia, Kessler, Michael D., Sharma, Deepika, Staples, Jeffrey, Bovijn, Jonas, Gelfman, Sahar, Di Gioia, Alessandro, Rajagopal, Veera M., Lopez, Alexander, Varela, Jennifer Rico, Alegre, Jesus, Berumen, Jaime, Tapia-Conyer, Roberto, Kuri-Morales, Pablo, Torres, Jason, Emberson, Jonathan, Collins, Rory, Cantor, Michael, Thornton, Timothy, Kang, Hyun Min, Overton, John D., Shuldiner, Alan R., Cremona, M. Laura, Nafde, Mona, Baras, Aris, Abecasis, Goncalo, Marchini, Jonathan, Reid, Jeffrey G., Salerno, William, Balasubramanian, Suganthi

Issue&Volume: 2024-05-20

Abstract: Rare coding variants that significantly impact function provide insights into the biology of a gene [1-3]. However, ascertaining their frequency requires large sample sizes [4-8]. Here, we present a catalogue of human protein-coding variation, derived from exome sequencing of 983,578 individuals across diverse populations. 23% of the Regeneron Genetics Center Million Exome data (RGC-ME) comes from non-European individuals of African, East Asian, Indigenous American, Middle Eastern, and South Asian ancestry. This catalogue includes over 10.4 million missense and 1.1 million predicted loss-of-function (pLOF) variants. We identify individuals with rare biallelic pLOF variants in 4,848 genes, 1,751 of which have not been previously reported. From precise quantitative estimates of selection against heterozygous loss-of-function, we identify 3,988 loss-of-function intolerant genes, including 86 that were previously assessed as tolerant and 1,153 lacking established disease annotation. We also define regions of missense depletion at high resolution. Notably, 1,482 genes have regions depleted of missense variants despite being tolerant to pLOF variants. Finally, we estimate that 3% of individuals have a clinically actionable genetic variant, and that 11,773 variants reported in ClinVar with unknown significance are likely to be deleterious cryptic splice sites. To facilitate variant interpretation and genetics-informed precision medicine, we make this important resource of coding variation from the RGC-ME accessible via a public variant allele frequency browser.

DOI: 10.1038/s41586-024-07556-0

Source: https://www.nature.com/articles/s41586-024-07556-0

Journal information

Nature: founded in 1869. Part of the Springer Nature publishing group. Latest IF: 69.504
Official website: http://www.nature.com/
Submission link: http://www.nature.com/authors/submit_manuscript.html