Author: Xiaoke Robot (小柯机器人)  Published: 2024/5/22 16:14:35

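Suganthi Balasubramanian of the Regeneron Genetics Center in the United States and colleagues have compiled a deep catalogue of protein-coding variation in 983,578 individuals. The paper was published online in Nature on May 20, 2024.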

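The researchers present a catalogue of human protein-coding variation derived from exome sequencing of 983,578 individuals across diverse populations. Of the Regeneron Genetics Center Million Exome (RGC-ME) data, 23% comes from non-European individuals of African, East Asian, Indigenous American, Middle Eastern, and South Asian ancestry. The catalogue includes more than 10.4 million missense variants and more than 1.1 million predicted loss-of-function (pLOF) variants. The researchers identified individuals with rare biallelic pLOF variants in 4,848 genes, 1,751 of which had not been previously reported.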





Title: A deep catalogue of protein-coding variation in 983,578 individuals

Author: Sun, Kathie Y., Bai, Xiaodong, Chen, Siying, Bao, Suying, Zhang, Chuanyi, Kapoor, Manav, Backman, Joshua, Joseph, Tyler, Maxwell, Evan, Mitra, George, Gorovits, Alexander, Mansfield, Adam, Boutkov, Boris, Gokhale, Sujit, Habegger, Lukas, Marcketta, Anthony, Locke, Adam E., Ganel, Liron, Hawes, Alicia, Kessler, Michael D., Sharma, Deepika, Staples, Jeffrey, Bovijn, Jonas, Gelfman, Sahar, Di Gioia, Alessandro, Rajagopal, Veera M., Lopez, Alexander, Varela, Jennifer Rico, Alegre, Jesus, Berumen, Jaime, Tapia-Conyer, Roberto, Kuri-Morales, Pablo, Torres, Jason, Emberson, Jonathan, Collins, Rory, Cantor, Michael, Thornton, Timothy, Kang, Hyun Min, Overton, John D., Shuldiner, Alan R., Cremona, M. Laura, Nafde, Mona, Baras, Aris, Abecasis, Goncalo, Marchini, Jonathan, Reid, Jeffrey G., Salerno, William, Balasubramanian, Suganthi

Issue&Volume: 2024-05-20

Abstract: Rare coding variants that significantly impact function provide insights into the biology of a gene [1-3]. However, ascertaining their frequency requires large sample sizes [4-8]. Here, we present a catalogue of human protein-coding variation, derived from exome sequencing of 983,578 individuals across diverse populations. 23% of the Regeneron Genetics Center Million Exome data (RGC-ME) comes from non-European individuals of African, East Asian, Indigenous American, Middle Eastern, and South Asian ancestry. This catalogue includes over 10.4 million missense and 1.1 million predicted loss-of-function (pLOF) variants. We identify individuals with rare biallelic pLOF variants in 4,848 genes, 1,751 of which have not been previously reported. From precise quantitative estimates of selection against heterozygous loss-of-function, we identify 3,988 loss-of-function intolerant genes, including 86 that were previously assessed as tolerant and 1,153 lacking established disease annotation. We also define regions of missense depletion at high resolution. Notably, 1,482 genes have regions depleted of missense variants despite being tolerant to pLOF variants. Finally, we estimate that 3% of individuals have a clinically actionable genetic variant, and that 11,773 variants reported in ClinVar with unknown significance are likely to be deleterious cryptic splice sites. To facilitate variant interpretation and genetics-informed precision medicine, we make this important resource of coding variation from the RGC-ME accessible via a public variant allele frequency browser.

DOI: 10.1038/s41586-024-07556-0

Source: https://www.nature.com/articles/s41586-024-07556-0

