New algorithms enable privacy-preserving genomic data analysis and querying
Author: 小柯机器人 (Xiaoke Robot)  Published: 2020/3/18 12:51:46

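The research group of S. Cenk Sahinalp at the US National Institutes of Health has developed sketching algorithms for genomic data analysis and querying in a secure enclave. The paper was published online in Nature Methods on March 4, 2020.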

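The researchers report SkSES (https://github.com/ndokmai/sgx-genome-variants-search), a hardware-software hybrid approach for privacy-preserving collaborative GWAS that improves on the running time of the most advanced cryptographic protocols by two orders of magnitude. SkSES builds on the trusted execution environments (TEEs) offered by current-generation microprocessors, in particular Intel's SGX. To overcome the severe memory limitation of TEEs, SkSES employs novel "sketching" algorithms that maintain essential statistical information on the genomic variants in the input VCF files. By additionally incorporating efficient data compression and methods for reducing population stratification, SkSES identifies the top k genomic variants in a cohort quickly, accurately and in a privacy-preserving manner.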
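
To give a feel for how a sketch can track top-k items in a small, fixed amount of memory, here is a minimal Python illustration using a count-min sketch. It is not the SkSES code and makes several simplifying assumptions: the names `CountMinSketch` and `top_k_variants` are hypothetical, the input is a plain stream of variant identifiers rather than VCF records, and variants are ranked by raw occurrence count, whereas a real GWAS would rank by an association statistic computed from case and control cohorts.

```python
import hashlib


class CountMinSketch:
    """Fixed-size counter table; estimates never fall below the true count."""

    def __init__(self, width=2048, depth=4):
        self.width = width
        self.depth = depth
        self.table = [[0] * width for _ in range(depth)]

    def _columns(self, key):
        # One column per row, derived from a BLAKE2b hash salted with the row index.
        for row in range(self.depth):
            digest = hashlib.blake2b(
                key.encode(), digest_size=8, salt=row.to_bytes(8, "little")
            ).digest()
            yield row, int.from_bytes(digest, "little") % self.width

    def add(self, key, count=1):
        for row, col in self._columns(key):
            self.table[row][col] += count

    def estimate(self, key):
        # The smallest counter is the one least inflated by hash collisions.
        return min(self.table[row][col] for row, col in self._columns(key))


def top_k_variants(stream, k, width=2048, depth=4):
    """Return up to k (variant, estimated_count) pairs, highest estimate first."""
    sketch = CountMinSketch(width, depth)
    best = {}  # at most k candidate variants mapped to their latest estimate
    for variant in stream:
        sketch.add(variant)
        est = sketch.estimate(variant)
        if variant in best or len(best) < k:
            best[variant] = est
        else:
            weakest = min(best, key=best.get)
            if est > best[weakest]:
                # Evict the weakest candidate in favour of the new variant.
                del best[weakest]
                best[variant] = est
    return sorted(best.items(), key=lambda kv: (-kv[1], kv[0]))
```

Memory use depends only on `width`, `depth` and `k`, not on the number of distinct variants seen, which is the property that makes this family of algorithms attractive when the working set must fit inside a memory-constrained enclave.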
 
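By way of background, genome-wide association studies (GWAS), especially those on rare diseases, may require the exchange of sensitive genomic data between multiple institutions. Because sharing genomic data is often infeasible owing to privacy concerns, cryptographic methods such as secure multiparty computation (SMC) protocols have been developed to offer privacy-preserving collaborative GWAS. Unfortunately, the computational overhead of these methods remains prohibitive for human-genome-scale data.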
 
Appendix: original English text

Title: Sketching algorithms for genomic data analysis and querying in a secure enclave

Author: Can Kockan, Kaiyuan Zhu, Natnatee Dokmai, Nikolai Karpov, M. Oguzhan Kulekci, David P. Woodruff, S. Cenk Sahinalp

Issue&Volume: 2020-03-04

Abstract: Genome-wide association studies (GWAS), especially on rare diseases, may necessitate exchange of sensitive genomic data between multiple institutions. Since genomic data sharing is often infeasible due to privacy concerns, cryptographic methods, such as secure multiparty computation (SMC) protocols, have been developed with the aim of offering privacy-preserving collaborative GWAS. Unfortunately, the computational overhead of these methods remain prohibitive for human-genome-scale data. Here we introduce SkSES (https://github.com/ndokmai/sgx-genome-variants-search), a hardware–software hybrid approach for privacy-preserving collaborative GWAS, which improves the running time of the most advanced cryptographic protocols by two orders of magnitude. The SkSES approach is based on trusted execution environments (TEEs) offered by current-generation microprocessors—in particular, Intel’s SGX. To overcome the severe memory limitation of the TEEs, SkSES employs novel ‘sketching’ algorithms that maintain essential statistical information on genomic variants in input VCF files. By additionally incorporating efficient data compression and population stratification reduction methods, SkSES identifies the top k genomic variants in a cohort quickly, accurately and in a privacy-preserving manner.

DOI: 10.1038/s41592-020-0761-8

Source: https://www.nature.com/articles/s41592-020-0761-8

Journal information

Nature Methods: founded in 2004 and published by Springer Nature. Latest IF: 28.467
Official website: https://www.nature.com/nmeth/
Submission link: https://mts-nmeth.nature.com/cgi-bin/main.plex