Scientists use compact feature sets to classify non-TCGA cancer samples into TCGA molecular subtypes
Author: 小柯机器人 (Xiaoke Robot)  Published: 2025/1/3 17:56:30

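Peter W. Laird of the Van Andel Institute in the United States and collaborators have used compact feature sets to classify non-TCGA cancer samples into TCGA molecular subtypes. The work was published online in the international journal Cancer Cell on January 2, 2025.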

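Using multi-omic data from 8,791 tumor samples in The Cancer Genome Atlas (TCGA), comprising 106 subtypes from 26 different cancer cohorts, the researchers applied five different machine learning approaches to build models based on small numbers of features. These models can classify new samples into previously defined TCGA molecular subtypes, a step toward applying molecular subtypes in the clinic.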
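As a rough illustration of what "a model based on a small number of features" can look like, here is a minimal Python sketch. It is not the authors' pipeline: this brief does not name the five machine learning approaches, so the sketch assumes a simple stand-in (ranking features by between-class versus within-class variance, then classifying with a nearest-centroid rule). The function names, the toy expression values, and the subtype labels "A" and "B" are all hypothetical.

```python
from statistics import mean

def select_features(X, y, k):
    """Return the indices of the k features that best separate the classes.

    Score = between-class sum of squares / within-class sum of squares.
    This scoring rule is an assumption made for illustration only.
    """
    classes = sorted(set(y))
    scores = []
    for j in range(len(X[0])):
        overall = mean(row[j] for row in X)
        between = within = 0.0
        for c in classes:
            vals = [row[j] for row, label in zip(X, y) if label == c]
            m = mean(vals)
            between += len(vals) * (m - overall) ** 2
            within += sum((v - m) ** 2 for v in vals)
        scores.append(between / (within + 1e-9))  # small constant avoids division by zero
    ranked = sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)
    return sorted(ranked[:k])

def fit_centroids(X, y, features):
    """Compute one mean vector per subtype, restricted to the selected features."""
    centroids = {}
    for c in sorted(set(y)):
        rows = [row for row, label in zip(X, y) if label == c]
        centroids[c] = [mean(r[j] for r in rows) for j in features]
    return centroids

def predict(sample, centroids, features):
    """Assign a new sample to the subtype with the closest centroid."""
    point = [sample[j] for j in features]
    def squared_distance(c):
        return sum((p - q) ** 2 for p, q in zip(point, centroids[c]))
    return min(centroids, key=squared_distance)

# Hypothetical training data: 6 samples x 4 features, two made-up subtypes.
# Features 0 and 2 differ between subtypes; features 1 and 3 are noise.
X_train = [
    [1.0, 5.0, 0.2, 3.0],
    [1.2, 3.0, 0.1, 7.0],
    [0.9, 7.0, 0.3, 5.0],
    [3.0, 4.0, 2.1, 6.0],
    [3.2, 6.0, 1.9, 4.0],
    [2.9, 5.0, 2.0, 5.0],
]
y_train = ["A", "A", "A", "B", "B", "B"]

features = select_features(X_train, y_train, k=2)
centroids = fit_centroids(X_train, y_train, features)

# A "new" sample that was not part of the training cohort.
new_sample = [1.1, 6.0, 0.25, 4.0]
print(features, predict(new_sample, centroids, features))
```

On this toy data the selector keeps only the two informative features, and the new sample is assigned to the subtype whose centroid it sits closest to. A real classifier of this kind would need held-out evaluation, which is the role the external validation datasets play in the study.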

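The researchers validated selected classifiers using external datasets. Predictive performance and the features chosen by the classifiers offer insight into the different machine learning approaches and genomic data platforms. For each cancer and data type, the researchers provide containerized versions of the top-performing models as a public resource.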

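By way of background, molecular subtypes, such as those defined by TCGA, delineate a cancer's underlying biology and offer hope for informing a patient's prognosis and treatment plan. However, most approaches used to discover subtypes are not suitable for assigning subtype labels to new cancer samples from other studies or clinical trials.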

Appendix: original English text

Title: Classification of non-TCGA cancer samples to TCGA molecular subtypes using compact feature sets

Author: Kyle Ellrott, Christopher K. Wong, Christina Yau, Mauro A.A. Castro, Jordan A. Lee, Brian J. Karlberg, Jasleen K. Grewal, Vincenzo Lagani, Bahar Tercan, Verena Friedl, Toshinori Hinoue, Vladislav Uzunangelov, Lindsay Westlake, Xavier Loinaz, Ina Felau, Peggy I. Wang, Anab Kemal, Samantha J. Caesar-Johnson, Ilya Shmulevich, Alexander J. Lazar, Ioannis Tsamardinos, Katherine A. Hoadley, A. Gordon Robertson, Theo A. Knijnenburg, Christopher C. Benz, Joshua M. Stuart, Jean C. Zenklusen, Andrew D. Cherniack, Peter W. Laird

Issue&Volume: 2025-01-02

Abstract: Molecular subtypes, such as defined by The Cancer Genome Atlas (TCGA), delineate a cancer’s underlying biology, bringing hope to inform a patient’s prognosis and treatment plan. However, most approaches used in the discovery of subtypes are not suitable for assigning subtype labels to new cancer specimens from other studies or clinical trials. Here, we address this barrier by applying five different machine learning approaches to multi-omic data from 8,791 TCGA tumor samples comprising 106 subtypes from 26 different cancer cohorts to build models based upon small numbers of features that can classify new samples into previously defined TCGA molecular subtypes—a step toward molecular subtype application in the clinic. We validate select classifiers using external datasets. Predictive performance and classifier-selected features yield insight into the different machine-learning approaches and genomic data platforms. For each cancer and data type we provide containerized versions of the top-performing models as a public resource.

DOI: 10.1016/j.ccell.2024.12.002

Source: https://www.cell.com/cancer-cell/abstract/S1535-6108(24)00477-X

Journal information

Cancer Cell: founded in 2002 and published by Cell Press. Latest IF: 38.585
Official website: https://www.cell.com/cancer-cell/home
Submission link: https://www.editorialmanager.com/cancer-cell/default.aspx