Bregmannian consensus clustering for cancer subtypes analysis. Academic Article

Overview

abstract

  • Cancer subtype analysis, as an extension of cancer diagnosis, can be regarded as a consensus clustering problem. This analysis helps provide patients with more accurate treatment. Consensus clustering refers to a situation in which several different clusterings have been obtained for a particular data set, and it is desired to aggregate those clustering results to get a better clustering solution. In this paper, we propose to generalize the traditional consensus clustering methods in three ways: (1) we provide Bregmannian consensus clustering (BCC), where the loss between the consensus clustering result and all the input clusterings is generalized from a traditional Euclidean distance to a general Bregman loss; (2) we generalize BCC to a weighted case, where each input clustering has a different weight, providing a better solution for the final clustering result; and (3) we propose a novel semi-supervised consensus clustering, which adds some must-link and cannot-link constraints compared with the first two methods. We then obtain three cancer data sets (breast, lung, and colorectal cancer) from The Cancer Genome Atlas (TCGA). Each data set has three data types (mRNA, microRNA, methylation), and each is used in turn to test the clustering accuracy of the proposed algorithms. The experimental results demonstrate that the highest aggregation accuracy of the weighted BCC (WBCC) on the cancer data sets is 90.2%. Moreover, although the lowest accuracy is 62.3%, it is still higher than that of the other methods on the same data set. We therefore conclude that our method is more effective than the competing methods.
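
  The idea of aggregating weighted input clusterings under a Bregman loss can be illustrated with a minimal sketch. It is not the paper's BCC or WBCC algorithm, whose formulation is not given in this record. It assumes each input clustering is encoded as a co-association matrix and relies on the standard property that a weighted mean minimizes the weighted sum of Bregman divergences taken in the second argument. All function names, the toy labels, and the weights are hypothetical.

```python
import math

def coassociation(labels):
    """n x n matrix: 1.0 where two samples share a cluster, else 0.0."""
    return [[1.0 if a == b else 0.0 for b in labels] for a in labels]

def squared_euclidean(p, q):
    """Bregman divergence generated by phi(x) = x**2."""
    return (p - q) ** 2

def generalized_kl(p, q, eps=1e-12):
    """Bregman divergence generated by phi(x) = x*log(x) (I-divergence).
    Inputs are clamped to eps so that log() stays defined at 0."""
    p, q = max(p, eps), max(q, eps)
    return p * math.log(p / q) - p + q

def weighted_consensus(matrices, weights):
    """Entry-wise weighted mean of the input co-association matrices."""
    total = sum(weights)
    n = len(matrices[0])
    return [
        [sum(w * m[i][j] for w, m in zip(weights, matrices)) / total
         for j in range(n)]
        for i in range(n)
    ]

def total_loss(matrices, weights, consensus, divergence):
    """Weighted sum of entry-wise divergences from inputs to the consensus."""
    n = len(consensus)
    return sum(
        w * divergence(m[i][j], consensus[i][j])
        for w, m in zip(weights, matrices)
        for i in range(n)
        for j in range(n)
    )

# Toy input (assumed): three clusterings of four samples, with assumed weights.
clusterings = [[0, 0, 1, 1], [0, 0, 0, 1], [0, 1, 1, 1]]
weights = [0.5, 0.3, 0.2]
matrices = [coassociation(c) for c in clusterings]
consensus = weighted_consensus(matrices, weights)
```

  A final partition would still have to be extracted from `consensus`, for example by clustering its rows. That step, and the must-link and cannot-link constraints of the semi-supervised variant, are left out of this sketch.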

publication date

  • January 13, 2020

Research

keywords

  • Algorithms
  • Gene Expression Profiling
  • Neoplasms

Identity

Scopus Document Identifier

  • 85077945046

Digital Object Identifier (DOI)

  • 10.1016/j.cmpb.2020.105337

PubMed ID

  • 31962279

Additional Document Info

volume

  • 189