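Tobias Marschall's group at Heinrich Heine University has reported a new result: a study characterizing complex genetic variation in nearly complete human genomes. The work was published in Nature on 23 July 2025.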
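In this study, the team sequenced 65 diverse human genomes and built 130 haplotype-resolved assemblies (median continuity of 130 Mb), closing 92% of all previous assembly gaps and reaching telomere-to-telomere status for 39% of the chromosomes. They highlight complete sequence continuity across complex loci, including the major histocompatibility complex (MHC), SMN1/SMN2, NBPF8 and AMY1/AMY2, and fully resolve 1,852 complex structural variants.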
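In addition, the team completely assembled and validated 1,246 human centromeres. They found up to 30-fold variation in α-satellite higher-order repeat array length and characterized the pattern of mobile element insertions into these arrays. Although most centromeres predict a single site of kinetochore attachment, epigenetic analysis suggests that 7% of centromeres carry two hypomethylated regions. Combining the new data with the draft pangenome reference significantly improved genotyping accuracy from short-read data, enabling whole-genome inference to a median quality value of 45. With this approach, 26,115 structural variants were detected per individual, substantially increasing the number of structural variants now amenable to downstream disease association studies.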
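The researchers note that diverse sets of complete human genomes are required to construct a pangenome reference and to understand the extent of complex structural variation.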
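To give a sense of scale for the reported median quality value of 45, the sketch below converts a quality value into a per-base error rate. It assumes the figure follows the standard Phred convention, QV = -10 · log10(error rate); this summary does not state how the paper computes the value, so treat the conversion as illustrative. The function names are hypothetical and not taken from the paper.

```python
import math


def qv_to_error_rate(qv: float) -> float:
    """Convert a Phred-scaled quality value to an error probability.

    Assumes the Phred convention QV = -10 * log10(p); whether the paper's
    "quality value" is defined exactly this way is an assumption.
    """
    return 10 ** (-qv / 10)


def error_rate_to_qv(error_rate: float) -> float:
    """Inverse conversion: error probability to Phred-scaled quality value."""
    if not 0 < error_rate <= 1:
        raise ValueError("error_rate must be in the interval (0, 1]")
    return -10 * math.log10(error_rate)


if __name__ == "__main__":
    rate = qv_to_error_rate(45)
    print(f"QV 45 -> {rate:.2e} errors per base")          # 3.16e-05
    print(f"roughly 1 error per {1 / rate:,.0f} bases")    # 31,623
```

Under that assumption, a quality value of 45 corresponds to about one error in every 31,600 bases.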
Appendix: original English text
Title: Complex genetic variation in nearly complete human genomes
Author: Logsdon, Glennis A., Ebert, Peter, Audano, Peter A., Loftus, Mark, Porubsky, David, Ebler, Jana, Yilmaz, Feyza, Hallast, Pille, Prodanov, Timofey, Yoo, DongAhn, Paisie, Carolyn A., Harvey, William T., Zhao, Xuefang, Martino, Gianni V., Henglin, Mir, Munson, Katherine M., Rabbani, Keon, Chin, Chen-Shan, Gu, Bida, Ashraf, Hufsah, Scholz, Stephan, Austine-Orimoloye, Olanrewaju, Balachandran, Parithi, Bonder, Marc Jan, Cheng, Haoyu, Chong, Zechen, Crabtree, Jonathan, Gerstein, Mark, Guethlein, Lisbeth A., Hasenfeld, Patrick, Hickey, Glenn, Hoekzema, Kendra, Hunt, Sarah E., Jensen, Matthew, Jiang, Yunzhe, Koren, Sergey, Kwon, Youngjun, Li, Chong, Li, Heng, Li, Jiaqi, Norman, Paul J., Oshima, Keisuke K., Paten, Benedict, Phillippy, Adam M., Pollock, Nicholas R., Rausch, Tobias, Rautiainen, Mikko, Song, Yuwei, Sylev, Arda, Sulovari, Arvis, Surapaneni, Likhitha, Tsapalou, Vasiliki, Zhou, Weichen, Zhou, Ying, Zhu, Qihui, Zody, Michael C., Mills, Ryan E., Devine, Scott E., Shi, Xinghua, Talkowski, Michael E., Chaisson, Mark J. P., Dilthey, Alexander T.
Issue&Volume: 2025-07-23
Abstract: Diverse sets of complete human genomes are required to construct a pangenome reference and to understand the extent of complex structural variation. Here we sequence 65 diverse human genomes and build 130 haplotype-resolved assemblies (median continuity of 130 Mb), closing 92% of all previous assembly gaps [1,2] and reaching telomere-to-telomere status for 39% of the chromosomes. We highlight complete sequence continuity of complex loci, including the major histocompatibility complex (MHC), SMN1/SMN2, NBPF8 and AMY1/AMY2, and fully resolve 1,852 complex structural variants. In addition, we completely assemble and validate 1,246 human centromeres. We find up to 30-fold variation in α-satellite higher-order repeat array length and characterize the pattern of mobile element insertions into α-satellite higher-order repeat arrays. Although most centromeres predict a single site of kinetochore attachment, epigenetic analysis suggests the presence of two hypomethylated regions for 7% of centromeres. Combining our data with the draft pangenome reference [1] significantly enhances genotyping accuracy from short-read data, enabling whole-genome inference [3] to a median quality value of 45. Using this approach, 26,115 structural variants per individual are detected, substantially increasing the number of structural variants now amenable to downstream disease association studies.
DOI: 10.1038/s41586-025-09140-6
Source: https://www.nature.com/articles/s41586-025-09140-6
Nature: founded in 1869 and published by Springer Nature. Latest impact factor: 69.504
Official website: http://www.nature.com/
Submission link: http://www.nature.com/authors/submit_manuscript.html