[Frontiers In Bioscience, Landmark, 22, 1697-1712, June 1, 2017]

Pathway-based classification of breast cancer subtypes

Alex Graudenzi1,2, Claudia Cava1, Gloria Bertoli1, Bastian Fromm3, Kjersti Flatmark3,4,5, Giancarlo Mauri2,6, Isabella Castiglioni1

1Institute of Molecular Bioimaging and Physiology of the Italian National Research Council (IBFM-CNR), Milan, Italy, 2Department of Informatics, Systems and Communication, University of Milan-Bicocca, Milan, Italy, 3Department of Tumor Biology, Norwegian Radium Hospital, Oslo University Hospital, Oslo, Norway, 4Department of Gastroenterological Surgery, Norwegian Radium Hospital, Oslo University Hospital, Oslo, Norway, 5Institute of Clinical Medicine, University of Oslo, Oslo, Norway, 6SYSBIO Centre of Systems Biology (SYSBIO), 20126 Milan, Italy


1. Abstract
2. Introduction
3. Biological background
4. Methods
4.1. Data sources
4.2. Multiclass classification of BC subtypes
4.2.1. Enrichment of relevant pathway for feature selection
4.2.2. SVM-based OvO classifier
4.2.3. Dataset preprocessing
4.2.4. Features selection
5. Results
5.1. Relevant pathway enrichment
5.2. Classification performance evaluation
5.2.1. Comparison with other techniques
6. Discussion
7. Acknowledgments
8. References


Cancer heterogeneity represents a major hurdle in the development of effective theranostic strategies, as it prevents the design of unique and maximally efficient diagnostic, prognostic and therapeutic procedures, even for patients affected by the same tumor type. Computational techniques can now leverage the huge and ever-increasing amount of (epi)genomic data to tackle this problem, thereby providing new and valuable decision-support instruments for biologists and pathologists in the broad sphere of precision medicine. In this context, we introduce a novel classifier of cancer subtypes from gene expression data and apply it to two different breast cancer datasets, from the TCGA and GEO repositories. The classifier is based on Support Vector Machines and relies on information about the relevant pathways involved in breast cancer development to reduce the huge variable space. Among the main results, we show that the classifier accuracy remains excellent even when the variable space is reduced 20-fold, hence providing a valuable tool for cancer patient profiling even when experimental resources are limited.
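
The two ingredients named in the abstract and in the outline (Sections 4.2.1 and 4.2.2), namely pathway-driven reduction of the variable space and a one-vs-one (OvO) multiclass scheme, can be illustrated with a minimal Python sketch. It is not the authors' implementation: a simple nearest-centroid rule stands in for each binary Support Vector Machine so that the example needs only the standard library, and the gene names, the pathway gene set, the subtype labels and the expression values are all hypothetical toy data.

```python
from collections import Counter
from itertools import combinations


def select_pathway_features(expression, gene_names, pathway_genes):
    """Keep only the columns whose gene belongs to a relevant pathway."""
    keep = [i for i, gene in enumerate(gene_names) if gene in pathway_genes]
    reduced = [[row[i] for i in keep] for row in expression]
    return reduced, [gene_names[i] for i in keep]


def centroid(rows):
    """Column-wise mean of a list of equally long feature vectors."""
    return [sum(col) / len(rows) for col in zip(*rows)]


def sq_dist(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def train_ovo(X, y):
    """Train one binary model per pair of classes (one-vs-one).

    Stand-in assumption: each binary model is a pair of class centroids,
    used here in place of a binary SVM.
    """
    models = {}
    for a, b in combinations(sorted(set(y)), 2):
        rows_a = [x for x, label in zip(X, y) if label == a]
        rows_b = [x for x, label in zip(X, y) if label == b]
        models[(a, b)] = (centroid(rows_a), centroid(rows_b))
    return models


def predict_ovo(models, x):
    """Each pairwise model casts one vote; the most voted class wins."""
    votes = Counter()
    for (a, b), (cent_a, cent_b) in models.items():
        winner = a if sq_dist(x, cent_a) <= sq_dist(x, cent_b) else b
        votes[winner] += 1
    return votes.most_common(1)[0][0]


# Hypothetical toy data: 6 samples x 4 genes, 3 subtypes.
gene_names = ["G1", "G2", "G3", "G4"]
pathway_genes = {"G1", "G4"}  # assumed output of a pathway enrichment step
expression = [
    [0.1, 5.0, 3.0, 0.2],
    [0.2, 1.0, 7.0, 0.1],
    [5.0, 4.0, 2.0, 0.3],
    [5.2, 0.5, 6.0, 0.2],
    [0.2, 3.0, 1.0, 5.1],
    [0.1, 2.0, 8.0, 4.9],
]
labels = ["subtype_A", "subtype_A", "subtype_B",
          "subtype_B", "subtype_C", "subtype_C"]

# Reduce the variable space, then train the pairwise models.
X_reduced, reduced_names = select_pathway_features(
    expression, gene_names, pathway_genes)
models = train_ovo(X_reduced, labels)

# Classify a new sample after applying the same feature reduction.
new_sample = [4.8, 9.0, 9.0, 0.4]
x_new = select_pathway_features([new_sample], gene_names, pathway_genes)[0][0]
predicted = predict_ovo(models, x_new)
print(reduced_names, predicted)
```

With three classes the OvO scheme trains three pairwise models; in general, k classes require k(k-1)/2 of them. In the toy data the non-pathway genes G2 and G3 carry no subtype signal, which is the situation in which discarding them would be expected to leave accuracy unaffected.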


1 Note that the complex interaction among the pathways ruling cancer development is sometimes referred to as a pathway cloud (30).

2 website: https://gdc-portal.nci.nih.gov/projects/TCGA-BRCA.

3 website: https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE58212.

4 As different classifiers might result in different sample-class associations, we do not show the classification results of each single classifier; instead, we provide a performance evaluation of the method based on the average values of accuracy, precision and recall.

Key Words: Cancer Subtypes Classification, Breast Cancer, BC, Pathway Enrichment, Differentially Expressed Genes, DEG, Review

Send correspondence to: Alex Graudenzi, Institute of Molecular Bioimaging and Physiology of the Italian National Research Council (IBFM-CNR), Milan, Italy, Tel: 390221717552, Fax: 390221717558, E-mail: alex.graudenzi@unimib.it