Scientists build a genomic mutational constraint map using variation in 76,156 human genomes
By: Xiaoke Robot (小柯机器人) Published: 2023/12/8 15:12:32

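Konrad J. Karczewski of the Broad Institute in the United States and colleagues have built a genomic mutational constraint map using variation in 76,156 human genomes. The paper was published online in Nature on December 6, 2023.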

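The researchers aggregated, processed and released a dataset of 76,156 human genomes from the Genome Aggregation Database (gnomAD), currently the largest public open-access reference dataset of human genome allele frequencies, and used it to build a constraint map for the whole genome, called Gnocchi (genomic non-coding constraint of haploinsufficient variation). They present a refined mutational model that combines local sequence context with regional genomic features to detect depletions of variation. As expected, protein-coding sequences are on average more strongly constrained than non-coding regions.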
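As a rough illustration of what "detecting a depletion of variation" means, the sketch below compares the number of variants observed in a genomic window with the number a mutational model would expect. The Poisson-style z-score, the function name `depletion_z` and the window counts are all assumptions made for this example; they are not taken from the paper and should not be read as its exact formula or data.

```python
import math


def depletion_z(observed: int, expected: float) -> float:
    """Hypothetical depletion score for one genomic window.

    Positive values mean fewer variants were observed than the
    mutational model expected (a possible sign of constraint).
    The Poisson-style scaling by sqrt(expected) is an assumption.
    """
    if expected <= 0:
        raise ValueError("expected must be positive")
    return (expected - observed) / math.sqrt(expected)


# Made-up (observed, expected) counts for three illustrative windows.
windows = {
    "window_a": (80, 100.0),   # fewer variants than expected
    "window_b": (100, 100.0),  # matches expectation
    "window_c": (115, 100.0),  # more variants than expected
}

scores = {name: depletion_z(obs, exp) for name, (obs, exp) in windows.items()}

# Rank windows from most to least depleted.
for name, score in sorted(scores.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name}: z = {score:.2f}")
```

In this toy setup the expected count is simply given; in practice it would come from a mutational model such as the one described above, which accounts for local sequence context and regional genomic features.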

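Within the non-coding genome, constrained regions are enriched for known regulatory elements and for variants implicated in complex human diseases and traits, which helps triangulate biological annotation, disease association and natural selection in the analysis of non-coding DNA. More constrained regulatory elements tend to regulate more constrained protein-coding genes, which in turn suggests that non-coding constraint can help identify constrained genes not yet recognized by current gene constraint metrics. The study shows that this genome-wide constraint map can improve the identification and interpretation of functional human genetic variation.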

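By way of background, the depletion of disruptive variation caused by purifying natural selection (constraint) has been widely used to study protein-coding genes underlying human disorders, but attempts to assess constraint in non-protein-coding regions have proved more difficult.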

Appendix: original English abstract

Title: A genomic mutational constraint map using variation in 76,156 human genomes

Author: Chen, Siwei; Francioli, Laurent C.; Goodrich, Julia K.; Collins, Ryan L.; Kanai, Masahiro; Wang, Qingbo; Alföldi, Jessica; Watts, Nicholas A.; Vittal, Christopher; Gauthier, Laura D.; Poterba, Timothy; Wilson, Michael W.; Tarasova, Yekaterina; Phu, William; Grant, Riley; Yohannes, Mary T.; Koenig, Zan; Farjoun, Yossi; Banks, Eric; Donnelly, Stacey; Gabriel, Stacey; Gupta, Namrata; Ferriera, Steven; Tolonen, Charlotte; Novod, Sam; Bergelson, Louis; Roazen, David; Ruano-Rubio, Valentin; Covarrubias, Miguel; Llanwarne, Christopher; Petrillo, Nikelle; Wade, Gordon; Jeandet, Thibault; Munshi, Ruchi; Tibbetts, Kathleen; O'Donnell-Luria, Anne; Solomonson, Matthew; Seed, Cotton; Martin, Alicia R.; Talkowski, Michael E.; Rehm, Heidi L.; Daly, Mark J.; Tiao, Grace; Neale, Benjamin M.; MacArthur, Daniel G.; Karczewski, Konrad J.

Published online: 2023-12-06

Abstract: The depletion of disruptive variation caused by purifying natural selection (constraint) has been widely used to investigate protein-coding genes underlying human disorders [1-4], but attempts to assess constraint for non-protein-coding regions have proved more difficult. Here we aggregate, process and release a dataset of 76,156 human genomes from the Genome Aggregation Database (gnomAD), the largest public open-access human genome allele frequency reference dataset, and use it to build a genomic constraint map for the whole genome (genomic non-coding constraint of haploinsufficient variation (Gnocchi)). We present a refined mutational model that incorporates local sequence context and regional genomic features to detect depletions of variation. As expected, the average constraint for protein-coding sequences is stronger than that for non-coding regions. Within the non-coding genome, constrained regions are enriched for known regulatory elements and variants that are implicated in complex human diseases and traits, facilitating the triangulation of biological annotation, disease association and natural selection to non-coding DNA analysis. More constrained regulatory elements tend to regulate more constrained protein-coding genes, which in turn suggests that non-coding constraint can aid the identification of constrained genes that are as yet unrecognized by current gene constraint metrics. We demonstrate that this genome-wide constraint map improves the identification and interpretation of functional human genetic variation.

DOI: 10.1038/s41586-023-06045-0

Source: https://www.nature.com/articles/s41586-023-06045-0

Journal information

Nature: founded in 1869 and published by Springer Nature. Latest impact factor (IF): 69.504
Official website: http://www.nature.com/
Submission link: http://www.nature.com/authors/submit_manuscript.html