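Mile Šikić's group at the A*STAR Genome Institute of Singapore reports telomere-to-telomere assembly using HERRO-corrected Simplex nanopore reads. The work was published in Nature on 27 April 2026.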
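The group shows that results comparable to those of multi-platform approaches can be obtained by error-correcting ultra-long ONT Simplex reads and then assembling them with state-of-the-art de novo assembly methods. To this end, they developed HERRO (Haplotype-aware ERRor cOrrection), a deep learning-based framework that corrects ONT Simplex reads while carefully preserving differences between related genomic sequences. By taking into account informative positions that distinguish haplotypes or genomic repeat copies, HERRO increases read accuracy by up to 100-fold for diploid human genomes. Combining HERRO with the Verkko assembler, the group reconstructed up to 32 chromosomes telomere-to-telomere, including chromosomes X and Y, and consistently obtained NGA50 values of 100 Mb or higher across several human genomes. HERRO supports both R9.4.1 and R10.4.1 ONT Simplex reads and generalizes well to other species. These results indicate that error-corrected ONT reads can lower sequencing costs and improve the quality of genomic analyses.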
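For readers unfamiliar with the NGA50 figure quoted above, the following is a minimal sketch of the underlying NG50 calculation. It is not taken from the paper: the function name `ng50` and the toy lengths are hypothetical. NGA50 is commonly computed the same way, but over aligned blocks (contigs split at misassembly breakpoints against a reference) rather than raw contigs; the alignment step is assumed to have happened elsewhere.

```python
def ng50(block_lengths, genome_size):
    """Return the NG50 of a set of block lengths.

    NG50 is the length L of the block at which the cumulative length of
    blocks, taken from longest to shortest, first reaches half of the
    (estimated) genome size. Returns 0 if the blocks never reach half.
    For NGA50, pass the lengths of aligned blocks instead of contigs.
    """
    half = genome_size / 2
    covered = 0
    for length in sorted(block_lengths, reverse=True):
        covered += length
        if covered >= half:
            return length
    return 0


# Hypothetical toy example (lengths in arbitrary units, not real data).
blocks = [400, 300, 200, 100]
print(ng50(blocks, genome_size=1600))  # prints 200
```

With a genome size of 1600, the running total reaches the halfway mark (800) only when the third-longest block is added, so the value is 200. A reported NGA50 of 100 Mb or higher therefore means that half of the genome lies in correctly aligned blocks at least that long.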
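According to the authors, telomere-to-telomere (T2T) phased assemblies are emerging as a benchmark for reference-quality genomes, though they remain technically and financially demanding, particularly at scale. Generating such assemblies for diploid and polyploid genomes typically involves combining high-accuracy long reads, such as PacBio HiFi [16] or the now-deprecated ONT Duplex [2] reads, with ultra-long ONT Simplex reads. Using multiple platforms or methods increases the cost and the required amount of genomic DNA.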
Appendix: original English text
Title: Telomere-to-Telomere Assembly Using HERRO-Corrected Simplex Nanopore Reads
Author: Stanojević, Dominik, Lin, Dehui, Nurk, Sergey, de Sessions, Paola Florez, Šikić, Mile
Issue&Volume: 2026-04-27
Abstract: Telomere-to-telomere (T2T) phased assemblies are emerging as a benchmark for reference-quality genomes [1,17], though they remain technically and financially demanding, particularly at scale. Generating such assemblies for diploid and polyploid genomes typically involves combining high-accuracy long reads, such as PacBio HiFi [16] or the now-deprecated ONT Duplex [2] reads, with ultra-long ONT Simplex reads. Using multiple platforms or methods increases the cost and the required amount of genomic DNA. Here, we show that comparable results are possible using error correction of ultra-long ONT Simplex reads and then assembling them using state-of-the-art de novo assembly methods. To achieve this, we have developed the deep learning-based HERRO (Haplotype-aware ERRor cOrrection) framework, which corrects ONT Simplex reads while carefully preserving differences in related genomic sequences. Taking into account informative positions that differentiate the haplotypes or genomic repeat copies, HERRO achieves an increase of read accuracy of up to 100-fold for diploid human genomes. By combining HERRO with the Verkko [17] assembler, we reconstruct up to 32 chromosomes telomere-to-telomere, including chromosomes X and Y, and consistently achieve NGA50 values of 100 Mb or higher across several human genomes. HERRO supports both R9.4.1 and R10.4.1 ONT Simplex reads and generalizes well to other species. These results show that error-corrected ONT reads can lower sequencing costs and improve the quality of genomic analyses.
DOI: 10.1038/s41586-026-10563-y
Source: https://www.nature.com/articles/s41586-026-10563-y
Nature: founded in 1869. Part of the Springer Nature publishing group. Latest IF: 69.504
Official website: http://www.nature.com/
Submission link: http://www.nature.com/authors/submit_manuscript.html
