Scientists use the OMArk tool to assess the quality of gene repertoire annotations
Author: Xiaoke Robot (小柯机器人) Published: 2024/2/25 12:19:31

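A team led by Yannis Nevers at the University of Lausanne, Switzerland, has recently reported work on assessing the quality of gene repertoire annotations with the OMArk tool. The results were published online in Nature Biotechnology on February 21, 2024.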

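According to the authors, in the era of biodiversity genomics it is crucial to ensure that annotations of protein-coding gene repertoires are accurate. Existing state-of-the-art tools for assessing genome annotations mostly measure the completeness of a gene repertoire and overlook other errors, such as gene overprediction or contamination.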

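The researchers developed OMArk, a software package that relies on fast, alignment-free sequence comparisons between a query proteome and precomputed gene families across the tree of life. OMArk assesses not only the completeness of the gene repertoire as a whole relative to closely related species, but also its consistency, and it reports likely contamination events. An analysis of 1,805 UniProt Eukaryotic Reference Proteomes with OMArk showed strong evidence of contamination in 73 proteomes and identified error propagation in avian gene annotation caused by the use of a fragmented zebra finch proteome as a reference.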
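
The idea of an alignment-free comparison against precomputed gene families can be illustrated with a toy k-mer sketch. This is not OMArk's implementation: the function names, the choice of k = 3, the containment score, and the 0.2 threshold are all assumptions made for illustration only.

```python
from typing import Dict, List, Optional, Set


def kmers(seq: str, k: int = 3) -> Set[str]:
    """Return the set of overlapping k-mers in a protein sequence."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def build_family_index(families: Dict[str, List[str]], k: int = 3) -> Dict[str, Set[str]]:
    """Precompute one k-mer set per gene family (hypothetical toy index)."""
    return {
        name: set().union(*(kmers(s, k) for s in seqs))
        for name, seqs in families.items()
    }


def best_family(query: str, index: Dict[str, Set[str]],
                k: int = 3, min_score: float = 0.2) -> Optional[str]:
    """Place a query protein in the family sharing the largest fraction
    of its k-mers, without computing any alignment.

    Returns None when no family reaches min_score (an assumed cutoff).
    """
    q = kmers(query, k)
    if not q:  # query shorter than k: nothing to compare
        return None
    best_name, best_score = None, 0.0
    for name, fam in index.items():
        score = len(q & fam) / len(q)  # containment of query k-mers
        if score > best_score:
            best_name, best_score = name, score
    return best_name if best_score >= min_score else None


def placement_summary(proteome: Dict[str, str], index: Dict[str, Set[str]],
                      k: int = 3) -> Dict[str, int]:
    """Count how many query proteins were placed in some family."""
    placed = sum(best_family(seq, index, k) is not None
                 for seq in proteome.values())
    return {"placed": placed, "unplaced": len(proteome) - placed}
```

In this sketch, proteins that fall into families expected for the species' lineage would count toward completeness, proteins placed in families from distant lineages would hint at contamination, and unplaced proteins would be candidates for overprediction. A real tool would need a far more careful index and scoring scheme.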

In summary, this study illustrates the importance of comparing and prioritizing proteomes based on their quality measures.

Appendix: original English text

Title: Quality assessment of gene repertoire annotations with OMArk

Author: Yannis Nevers, Alex Warwick Vesztrocy, Victor Rossier, Clément-Marie Train, Adrian Altenhoff, Christophe Dessimoz, Natasha M. Glover

Issue&Volume: 2024-02-21

Abstract: In the era of biodiversity genomics, it is crucial to ensure that annotations of protein-coding gene repertoires are accurate. State-of-the-art tools to assess genome annotations measure the completeness of a gene repertoire but are blind to other errors, such as gene overprediction or contamination. We introduce OMArk, a software package that relies on fast, alignment-free sequence comparisons between a query proteome and precomputed gene families across the tree of life. OMArk assesses not only the completeness but also the consistency of the gene repertoire as a whole relative to closely related species and reports likely contamination events. Analysis of 1,805 UniProt Eukaryotic Reference Proteomes with OMArk demonstrated strong evidence of contamination in 73 proteomes and identified error propagation in avian gene annotation resulting from the use of a fragmented zebra finch proteome as a reference. This study illustrates the importance of comparing and prioritizing proteomes based on their quality measures.

DOI: 10.1038/s41587-024-02147-w

Source: https://www.nature.com/articles/s41587-024-02147-w

Journal information

Nature Biotechnology: founded in 1996. Published by Springer Nature. Latest IF: 68.164
Official website: https://www.nature.com/nbt/
Submission link: https://mts-nbt.nature.com/cgi-bin/main.plex