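Martin Hemberg's group at the Wellcome Sanger Institute in the UK has developed scfind, a program for fast searches of large collections of single-cell data. The study was published in Nature Methods on March 1, 2021.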
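To make single-cell data quick and intuitive to query, the researchers developed scfind, a single-cell analysis tool that supports fast searches of cell atlases for biologically or clinically relevant marker genes. Using transcriptome data from six mouse cell atlases, they show how scfind can be used to evaluate marker genes, perform in silico gating, and identify both cell-type-specific genes and housekeeping genes.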
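In addition, the researchers developed a subquery optimization routine to ensure that long and complex queries return meaningful results. To make scfind more user friendly, they use indices of PubMed abstracts and natural language processing techniques to support arbitrary queries. Finally, the study shows how scfind can be used for multi-omics analyses by combining single-cell ATAC-seq data with transcriptome data.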
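According to the authors, single-cell technologies have made it possible to profile millions of cells, but for these resources to be useful they must be easy to query and access.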
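As a rough illustration of the kinds of queries described above (marker gene evaluation, in silico gating, and spotting housekeeping genes), here is a minimal Python sketch built on a toy inverted index from genes to cells. It is not scfind's implementation or API: the cell data, the function names (`build_index`, `query_and`, `marker_scores`) and the use of precision and recall as marker scores are all assumptions made for illustration.

```python
from collections import defaultdict

# Hypothetical toy data: cell id -> (cell type label, genes detected in that cell).
CELLS = {
    "c1": ("T cell", {"Cd3e", "Cd4", "Actb"}),
    "c2": ("T cell", {"Cd3e", "Cd8a", "Actb"}),
    "c3": ("B cell", {"Cd19", "Actb"}),
    "c4": ("B cell", {"Cd19", "Cd4", "Actb"}),
}


def build_index(cells):
    """Build an inverted index: gene -> set of cell ids in which it is detected."""
    index = defaultdict(set)
    for cell_id, (_, genes) in cells.items():
        for gene in genes:
            index[gene].add(cell_id)
    return index


def query_and(index, genes):
    """Toy 'in silico gating': return the cells that express every queried gene."""
    hits = None
    for gene in genes:
        cells_with_gene = index.get(gene, set())
        hits = set(cells_with_gene) if hits is None else hits & cells_with_gene
    return hits or set()


def marker_scores(cells, index, gene, cell_type):
    """Score a gene as a marker for a cell type using precision and recall."""
    hits = index.get(gene, set())
    in_type = {cid for cid, (label, _) in cells.items() if label == cell_type}
    true_pos = len(hits & in_type)
    precision = true_pos / len(hits) if hits else 0.0
    recall = true_pos / len(in_type) if in_type else 0.0
    return precision, recall


INDEX = build_index(CELLS)

# Cells expressing both Cd3e and Cd4.
gated = query_and(INDEX, ["Cd3e", "Cd4"])

# A gene detected in every cell behaves like a housekeeping gene in this toy data.
housekeeping = [g for g, cells in INDEX.items() if len(cells) == len(CELLS)]
```

In this toy data, `Cd19` scores perfectly as a B cell marker, while `Cd4` is a poor T cell marker because it is also detected in one B cell. A real atlas would need a compressed index and expression values rather than plain Python sets.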
Appendix: original English text
Title: Fast searches of large collections of single-cell data using scfind
Author: Jimmy Tsz Hang Lee, Nikolaos Patikas, Vladimir Yu Kiselev, Martin Hemberg
Issue&Volume: 2021-03-01
Abstract: Single-cell technologies have made it possible to profile millions of cells, but for these resources to be useful they must be easy to query and access. To facilitate interactive and intuitive access to single-cell data we have developed scfind, a single-cell analysis tool that facilitates fast search of biologically or clinically relevant marker genes in cell atlases. Using transcriptome data from six mouse cell atlases, we show how scfind can be used to evaluate marker genes, perform in silico gating, and identify both cell-type-specific and housekeeping genes. Moreover, we have developed a subquery optimization routine to ensure that long and complex queries return meaningful results. To make scfind more user friendly, we use indices of PubMed abstracts and techniques from natural language processing to allow for arbitrary queries. Finally, we show how scfind can be used for multi-omics analyses by combining single-cell ATAC-seq data with transcriptome data. Advances in single-cell sequencing technologies enable generation of datasets of millions of cells. scfind facilitates efficient and sophisticated gene search in massive single-cell datasets.
DOI: 10.1038/s41592-021-01076-9
Source: https://www.nature.com/articles/s41592-021-01076-9
	Nature Methods: founded in 2004; published by Springer Nature. Latest IF: 28.467
	Official website: https://www.nature.com/nmeth/
	Submission link: https://mts-nmeth.nature.com/cgi-bin/main.plex
