[Frontiers In Bioscience, Landmark, 22, 1697-1712, June 1, 2017]

Pathway-based classification of breast cancer subtypes

Alex Graudenzi1,2, Claudia Cava1, Gloria Bertoli1, Bastian Fromm3, Kjersti Flatmark3,4,5, Giancarlo Mauri2,6, Isabella Castiglioni1

1Institute of Molecular Bioimaging and Physiology of the Italian National Research Council (IBFM-CNR), Milan, Italy, 2Department of Informatics, Systems and Communication, University of Milan-Bicocca, Milan, Italy, 3Department of Tumor Biology, Norwegian Radium Hospital, Oslo University Hospital, Oslo, Norway, 4Department of Gastroenterological Surgery, Norwegian Radium Hospital, Oslo University Hospital, Oslo, Norway, 5Institute of Clinical Medicine, University of Oslo, Oslo, Norway, 6SYSBIO Centre of Systems Biology (SYSBIO), 20126 Milan, Italy

TABLE OF CONTENTS

1. Abstract
2. Introduction
3. Biological background
4. Methods
4.1. Data sources
4.2. Multiclass classification of BC subtypes
4.2.1. Enrichment of relevant pathway for feature selection
4.2.2. SVM-based OvO classifier
4.2.3. Dataset preprocessing
4.2.4. Features selection
5. Results
5.1. Relevant pathway enrichment
5.2. Classification performance evaluation
5.2.1. Comparison with other techniques
6. Discussion
7. Acknowledgments
8. References

1. ABSTRACT

Cancer heterogeneity represents a major hurdle in the development of effective theranostic strategies, as it prevents the design of unique and maximally efficient diagnostic, prognostic and therapeutic procedures, even for patients affected by the same tumor type. Computational techniques can now leverage the huge and ever-increasing amount of (epi)genomic data to tackle this problem, thereby providing new and valuable decision-support instruments to biologists and pathologists in the broad sphere of precision medicine. In this context, we introduce a novel cancer subtype classifier based on gene expression data and apply it to two different breast cancer datasets, from the TCGA and GEO repositories. The classifier is based on Support Vector Machines and relies on information about the relevant pathways involved in breast cancer development to reduce the huge variable space. Among the main results, we show that classifier accuracy remains excellent even when the variable space is reduced 20-fold, hence providing a valuable tool for cancer patient profiling even when experimental resources are limited.
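The general idea summarized above (restrict the expression matrix to genes belonging to relevant pathways, then train one binary SVM per pair of subtypes and classify by majority vote) can be illustrated with a minimal sketch. It is not the authors' implementation: the gene names, pathway gene sets, subtype labels and expression values are hypothetical toy data, the pathway gene sets are assumed to be already known, and the binary SVM is a simple linear one fitted by subgradient descent on the hinge loss, used here as a stand-in for a library solver.

```python
from collections import Counter
from itertools import combinations


def select_pathway_genes(expression, gene_names, pathways):
    """Keep only the columns whose gene belongs to at least one pathway."""
    relevant = set().union(*pathways.values())
    keep = [i for i, gene in enumerate(gene_names) if gene in relevant]
    reduced = [[row[i] for i in keep] for row in expression]
    return reduced, [gene_names[i] for i in keep]


def train_linear_svm(X, y, lam=0.01, eta=0.1, epochs=50):
    """Binary linear SVM (labels +1/-1) fitted by plain subgradient descent
    on the L2-regularised hinge loss (illustrative, not a tuned solver)."""
    w, b = [0.0] * len(X[0]), 0.0
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            margin = yi * (sum(wj * xj for wj, xj in zip(w, xi)) + b)
            w = [wj * (1.0 - eta * lam) for wj in w]  # regularisation step
            if margin < 1.0:  # hinge loss is active for this sample
                w = [wj + eta * yi * xj for wj, xj in zip(w, xi)]
                b += eta * yi
    return w, b


def train_ovo(X, labels):
    """One-vs-one: train one binary SVM per unordered pair of classes."""
    models = {}
    for pos, neg in combinations(sorted(set(labels)), 2):
        pair = [(x, 1 if lab == pos else -1)
                for x, lab in zip(X, labels) if lab in (pos, neg)]
        models[(pos, neg)] = train_linear_svm([x for x, _ in pair],
                                              [s for _, s in pair])
    return models


def predict_ovo(models, x):
    """Each pairwise model casts one vote; the most voted class wins."""
    votes = Counter()
    for (pos, neg), (w, b) in models.items():
        score = sum(wj * xj for wj, xj in zip(w, x)) + b
        votes[pos if score >= 0 else neg] += 1
    return votes.most_common(1)[0][0]


# Hypothetical toy data: 6 samples x 4 genes of standardised expression.
gene_names = ["GENE_A", "GENE_B", "GENE_C", "GENE_D"]
pathways = {"PATHWAY_1": {"GENE_A"}, "PATHWAY_2": {"GENE_C", "GENE_X"}}
expression = [
    [4.0, 0.3, 0.0, -1.2],   # LumA
    [5.0, -0.7, 0.0, 0.4],   # LumA
    [-4.0, 1.1, 0.0, 0.9],   # Basal
    [-5.0, 0.2, 0.0, -0.5],  # Basal
    [0.0, -0.9, 4.0, 1.3],   # Her2
    [0.0, 0.6, 5.0, -0.1],   # Her2
]
labels = ["LumA", "LumA", "Basal", "Basal", "Her2", "Her2"]

# Feature reduction: 4 genes -> the 2 genes found in the pathway gene sets.
X, kept = select_pathway_genes(expression, gene_names, pathways)
models = train_ovo(X, labels)

# A new sample goes through the same gene filter before classification.
new_X, _ = select_pathway_genes([[4.5, 0.1, 0.5, 0.0]], gene_names, pathways)
predicted = predict_ovo(models, new_X[0])
print(kept, predicted)
```

With three classes the one-vs-one scheme trains three pairwise models. In practice, a mature SVM library and cross-validated hyperparameters would replace the hand-written solver, and the pathway gene sets would come from an enrichment analysis rather than being fixed by hand.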

8. REFERENCES

1. M Gerlinger, AJ Rowan, S Horswell, J Larkin, D Endesfelder, E Gronroos, P Martinez, N Matthews, A Stewart, P Tarpey, I Varela, B Phillimore, S Begum, NQ McDonald, A Butler, D Jones, K Raine, C Latimer, CR Santos, M Nohadani, AC Eklund, B Spencer-Dene, G Clark, L Pickering, G Stamp, M Gore, Z Szallasi, J Downward, PA Futreal, C Swanton: Intratumor heterogeneity and branched evolution revealed by multiregion sequencing. N Engl J Med, 366(10):883-92 (2012)
DOI:10.1056/NEJMoa1113205

2. R Fisher, L Pusztai, C Swanton: Cancer heterogeneity: implications for targeted therapeutics. Br J Cancer, 108(3):479-85 (2013)
DOI:10.1038/bjc.2012.581

3. RA Burrell, C Swanton: Tumour heterogeneity and the evolution of polyclonal drug resistance. Mol Oncol, 8(6):1095-111 (2014)
DOI:10.1016/j.molonc.2014.06.005

4. R Mirnezami, J Nicholson, A Darzi: Preparing for precision medicine. N Engl J Med, 366(6):489-91 (2012)
DOI:10.1056/NEJMp1114866

5. National Cancer Institute; National Genome Research Institute (2015) The Cancer Genome Atlas (Natl Inst Health, Bethesda). Available at https://tcga-data.nci.nih.gov/tcga. Accessed Sept 30, 2016.

6. G Caravagna, A Graudenzi, D Ramazzotti, R Sanz-Pamplona, L De Sano, G Mauri, V Moreno, M Antoniotti, B Mishra: Algorithmic methods to infer the evolutionary trajectories in cancer progression. Proc Natl Acad Sci U S A, 113(28):E4025-34 (2016)
DOI:10.1073/pnas.1520213113

7. C Cava, G Bertoli, I Castiglioni: Integrating genetics and epigenetics in breast cancer: biological insights, experimental, computational methods and therapeutic potential. BMC Syst Biol, 9:62 (2015)
DOI:10.1186/s12918-015-0211-x

8. A Colaprico, TC Silva, C Olsen, L Garofano, C Cava, D Garolini, TS Sabedot, TM Malta, SM Pagnotta, I Castiglioni, M Ceccarelli, G Bontempi, H Noushmehr: TCGAbiolinks: an R/Bioconductor package for integrative analysis of TCGA data. Nucleic Acids Res, 44(8):e71 (2016)
DOI:10.1093/nar/gkv1507

9. C Cava, I Zoppis, M Gariboldi, I Castiglioni, G Mauri, M Antoniotti: Copy–Number Alterations for Tumor Progression Inference. Lecture Notes in Computer Science, 7885:104-109 (2013)
DOI:10.1007/978-3-642-38326-7_16

10. T Sorlie, CM Perou, R Tibshirani, T Aas, S Geisler, H Johnsen, T Hastie, MB Eisen, M van de Rijn, SS Jeffrey, T Thorsen, H Quist, JC Matese, PO Brown, D Botstein, PE Lonning, AL Borresen-Dale: Gene expression patterns of breast carcinomas distinguish tumor subclasses with clinical implications. Proc Natl Acad Sci U S A, 98(19):10869–10874 (2001)
DOI:10.1073/pnas.191367098

11. J Khan, JS Wei, M Ringner, LH Saal, M Ladanyi, F Westermann, F Berthold, M Schwab, CR Antonescu, C Peterson, PS Meltzer: Classification and diagnostic prediction of cancers using gene expression profiling and artificial neural networks. Nat Med, 7(6):673–679 (2001)
DOI:10.1038/89044

12. D Singh, PG Febbo, K Ross, DG Jackson, J Manola, C Ladd, P Tamayo, AA Renshaw, AV D’Amico, JP Richie, ES Lander, M Loda, PW Kantoff, TR Golub, WR Sellers: Gene expression correlates of clinical prostate cancer behavior. Cancer Cell, 1(2):203–209 (2002)
DOI:10.1016/S1535-6108(02)00030-2

13. LJ van ’t Veer, H Dai, MJ van de Vijver, YD He, AA Hart, M Mao, HL Peterse, K van der Kooy, MJ Marton, AT Witteveen, GJ Schreiber, RM Kerkhoven, C Roberts, PS Linsley, R Bernards, SH Friend: Gene expression profiling predicts clinical outcome of breast cancer. Nature, 415(6871):530–536 (2002)
DOI:10.1038/415530a

14. C Sotiriou, P Wirapati, S Loi, A Harris, S Fox, J Smeds, H Nordgren, P Farmer, V Praz, B Haibe-Kains, C Desmedt, D Larsimont, F Cardoso, H Peterse, D Nuyten, M Buyse, MJ Van de Vijver, J Bergh, M Piccart, M Delorenzi: Gene expression profiling in breast cancer: understanding the molecular basis of histologic grade to improve prognosis. J Natl Cancer Inst, 98(4):262–272 (2006)
DOI:10.1093/jnci/djj052

15. V Popovici, W Chen, BG Gallas, C Hatzis, W Shi, FW Samuelson, Y Nikolsky, M Tsyganova, A Ishkin, T Nikolskaya, KR Hess, V Valero, D Booser, M Delorenzi, GN Hortobagyi, L Shi, WF Symmans, L Pusztai: Effect of training-sample size and classification difficulty on the accuracy of genomic predictors. Breast Cancer Res, 12(1):R5 (2010)
DOI:10.1186/bcr2468

16. AV Ivshina, J George, O Senko, B Mow, TC Putti, J Smeds, T Lindahl, Y Pawitan, P Hall, H Nordgren, JE Wong, ET Liu, J Bergh, VA Kuznetsov, LD Miller: Genetic reclassification of histologic grade delineates new clinical subtypes of breast cancer. Cancer Res, 66(21):10292-301 (2006)
DOI:10.1158/0008-5472.CAN-05-4414

17. C Sotiriou, P Wirapati, S Loi, A Harris, S Fox, J Smeds, H Nordgren, P Farmer, V Praz, B Haibe-Kains, C Desmedt, D Larsimont, F Cardoso, H Peterse, D Nuyten, M Buyse, MJ Van de Vijver, J Bergh, M Piccart, M Delorenzi: Gene expression profiling in breast cancer: understanding the molecular basis of histologic grade to improve prognosis. J Natl Cancer Inst, 98(4):262-72 (2006)
DOI:10.1093/jnci/djj052

18. C Cava, G Bertoli, M Ripamonti, G Mauri, I Zoppis, PA Della Rosa, MC Gilardi, I Castiglioni: Integration of mRNA expression profile, copy number alterations, and microRNA expression levels in breast cancer to improve grade definition. PLoS One, 9(5):e97681 (2014)
DOI:10.1371/journal.pone.0097681

19. PC Miller, J Clarke, T Koru-Sengul, J Brinkman, D El-Ashry: A novel MAPK-microRNA signature is predictive of hormone-therapy resistance and poor outcome in ER-positive breast cancer. Clin Cancer Res, 21(2):373-85 (2015)
DOI:10.1158/1078-0432.CCR-14-2053

20. TM Severson, J Peeters, I Majewski, M Michaut, A Bosma, PC Schouten, SF Chin, B Pereira, MA Goldgraben, T Bismeijer, RJ Kluin, JJ Muris, K Jirström, RM Kerkhoven, L Wessels, C Caldas, R Bernards, IM Simon, S Linn: BRCA1-like signature in triple negative breast cancer: Molecular and clinical characterization reveals subgroups with therapeutic potential. Mol Oncol, 9(8):1528-38 (2015)
DOI:10.1016/j.molonc.2015.04.011

21. SG Zhao, M Shilkrut, C Speers, M Liu, K Wilder-Romans, TS Lawrence, LJ Pierce, FY Feng: Development and validation of a novel platform-independent metastasis signature in human breast cancer. PLoS One, 10(5):e0126631 (2015)
DOI:10.1371/journal.pone.0126631

22. C Cava, I Zoppis, G Mauri, M Ripamonti, F Gallivanone, C Salvatore, MC Gilardi, I Castiglioni: Combination of gene expression and genome copy number alteration has a prognostic value for breast cancer. Conf Proc IEEE Eng Med Biol Soc, 2013:608-11 (2013)
DOI:10.1109/embc.2013.6609573

23. VD Haakensen, V Nygaard, L Greger, MR Aure, B Fromm, IR Bukholm, T Lüders, SF Chin, A Git, C Caldas, VN Kristensen, A Brazma, AL Børresen-Dale, E Hovig, Å Helland: Subtype-specific micro-RNA expression signatures in breast cancer progression. Int J Cancer, 139(5):1117-28 (2016)
DOI:10.1002/ijc.30142

24. A Colaprico, C Cava, G Bertoli, G Bontempi, I Castiglioni: Integrative Analysis with Monte Carlo Cross-Validation Reveals miRNAs Regulating Pathways Cross-Talk in Aggressive Breast Cancer. Biomed Res Int, 2015:831314 (2015)
DOI:10.1155/2015/831314

25. J Tomfohr, J Lu, TB Kepler: Pathway level analysis of gene expression using singular value decomposition. BMC Bioinformatics, 6:225 (2005)
DOI:10.1186/1471-2105-6-225

26. F Rapaport, A Zinovyev, M Dutreix, E Barillot, JP Vert: Classification of microarray data using gene networks. BMC Bioinformatics, 8:35 (2007)
DOI:10.1186/1471-2105-8-35

27. J Su, BJ Yoon, ER Dougherty: Accurate and reliable cancer classification based on probabilistic inference of pathway activity. PLoS One, 4(12):e8161 (2009)
DOI:10.1371/journal.pone.0008161

28. E Lee, HY Chuang, JW Kim, T Ideker, D Lee: Inferring pathway activity toward precise disease classification. PLoS Comput Biol, 4(11):e1000217 (2008)
DOI:10.1371/journal.pcbi.1000217

29. L Yang, C Ainali, S Tsoka, LG Papageorgiou: Pathway activity inference for multiclass disease classification through a mathematical programming optimisation framework. BMC Bioinformatics, 15:390 (2014)
DOI:10.1186/s12859-014-0390-2

30. A Zhavoronkov, AA Buzdin, AV Garazha, NM Borisov, AA Moskalev: Signaling pathway cloud regulation for in silico screening and ranking of the potential geroprotective drugs. Front Genet, 5:49 (2014)
DOI:10.3389/fgene.2014.00049

31. E Senkus, S Kyriakides, F Penault-Llorca, P Poortmans, A Thompson, S Zackrisson, F Cardoso, ESMO Guidelines Working Group: Primary breast cancer: ESMO Clinical Practice Guidelines for diagnosis, treatment and follow-up. Ann Oncol, 24 Suppl 6:vi7-23 (2013)
DOI:10.1093/annonc/mdt284

32. Cancer Genome Atlas Network: Comprehensive molecular portraits of human breast tumours. Nature, 490(7418):61-70 (2012)
DOI:10.1038/nature11412

33. R Edgar, M Domrachev, AE Lash: Gene Expression Omnibus: NCBI gene expression and hybridization array data repository. Nucleic Acids Res, 30(1):207-10 (2002)
DOI:10.1093/nar/30.1.207

34. CM Perou, T Sørlie, MB Eisen, M van de Rijn, SS Jeffrey, CA Rees, JR Pollack, DT Ross, H Johnsen, LA Akslen, O Fluge, A Pergamenschikov, C Williams, SX Zhu, PE Lønning, AL Børresen-Dale, PO Brown, D Botstein: Molecular portraits of human breast tumours. Nature, 406(6797):747-52 (2000)
DOI:10.1038/35021093

35. T Sorlie, R Tibshirani, J Parker, T Hastie, JS Marron, A Nobel, S Deng, H Johnsen, R Pesich, S Geisler, J Demeter, CM Perou, PE Lønning, PO Brown, AL Børresen-Dale, D Botstein: Repeated observation of breast tumor subtypes in independent gene expression data sets. Proc Natl Acad Sci U S A, 100(14):8418-23 (2003)
DOI:10.1073/pnas.0932692100

36. JS Parker, M Mullins, MC Cheang, S Leung, D Voduc, T Vickery, S Davies, C Fauron, X He, Z Hu, JF Quackenbush, IJ Stijleman, J Palazzo, JS Marron, AB Nobel, E Mardis, TO Nielsen, MJ Ellis, CM Perou, PS Bernard: Supervised risk predictor of breast cancer based on intrinsic subtypes. J Clin Oncol, 27(8):1160-7 (2009)
DOI:10.1200/JCO.2008.18.1370

37. SG Wu, ZY He, Q Li, FY Li, Q Lin, HX Lin, XX Guan: Predictive value of breast cancer molecular subtypes in Chinese patients with four or more positive nodes after postmastectomy radiotherapy. Breast, 21(5):657-61 (2012)
DOI:10.1016/j.breast.2012.07.004

38. M Kyndi, FB Sørensen, H Knudsen, M Overgaard, HM Nielsen, J Overgaard, Danish Breast Cancer Cooperative Group: Estrogen receptor, progesterone receptor, HER-2, and response to postmastectomy radiotherapy in high-risk breast cancer: the Danish Breast Cancer Cooperative Group. J Clin Oncol, 26(9):1419-26 (2008)
DOI:10.1200/JCO.2007.14.5565

39. KD Voduc, MC Cheang, S Tyldesley, K Gelmon, TO Nielsen, H Kennecke: Breast cancer subtypes and the risk of local and regional relapse. J Clin Oncol, 28(10):1684-91 (2010)
DOI:10.1200/JCO.2009.24.9284

40. TO Nielsen, FD Hsu, K Jensen, M Cheang, G Karaca, Z Hu, T Hernandez-Boussard, C Livasy, D Cowan, L Dressler, LA Akslen, J Ragaz, AM Gown, CB Gilks, M van de Rijn, CM Perou: Immunohistochemical and clinical characterization of the basal-like subtype of invasive breast carcinoma. Clin Cancer Res, 10(16):5367-74 (2004)
DOI:10.1158/1078-0432.CCR-04-0220

41. MC Cheang, SK Chia, D Voduc, D Gao, S Leung, J Snider, M Watson, S Davies, PS Bernard, JS Parker, CM Perou, MJ Ellis, TO Nielsen: Ki67 index, HER2 status, and prognosis of patients with luminal B breast cancer. J Natl Cancer Inst, 101(10):736-50 (2009)
DOI:10.1093/jnci/djp082

42. Y Benjamini, Y Hochberg: Controlling the false discovery rate: a practical and powerful approach to multiple testing. Journal of the Royal Statistical Society. Series B: Methodological, 57(1):289–300 (1995)

43. A Fabregat, K Sidiropoulos, P Garapati, M Gillespie, K Hausmann, R Haw, B Jassal, S Jupe, F Korninger, S McKay, L Matthews, B May, M Milacic, K Rothfels, V Shamovsky, M Webber, J Weiser, M Williams, G Wu, L Stein, H Hermjakob, P D'Eustachio: The Reactome pathway Knowledgebase. Nucleic Acids Res, 44(D1):D481-7 (2016)
DOI:10.1093/nar/gkv1351

44. D Nishimura: BioCarta. Biotech Software & Internet Report, 2(3):117–120 (2001)
DOI:10.1089/152791601750294344

45. M Kanehisa, S Goto: KEGG: Kyoto encyclopedia of genes and genomes. Nucleic Acids Res, 28(1):27–30 (2000)
DOI:10.1093/nar/28.1.27

46. V Vapnik: The nature of statistical learning theory. Springer Science & Business Media (2013)

47. TS Furey, N Cristianini, N Duffy, DW Bednarski, M Schummer, D Haussler: Support vector machine classification and validation of cancer tissue samples using microarray expression data. Bioinformatics, 16(10):906-14 (2000)
DOI:10.1093/bioinformatics/16.10.906

48. XX Niu, CY Suen: A novel hybrid CNN–SVM classifier for recognizing handwritten digits. Pattern Recognition, 45(4):1318-1325 (2012)
DOI:10.1016/j.patcog.2011.09.021

49. A Falanga, MN Levine, R Consonni, G Gritti, F Delaini, E Oldani, JA Julian, T Barbui: The effect of very-low-dose warfarin on markers of hypercoagulation in metastatic breast cancer: results from a randomized trial. Thromb Haemost, 79(1):23-7 (1998)

50. P Marcato, CA Dean, CA Giacomantonio, PW Lee: Aldehyde dehydrogenase: its role as a cancer stem cell marker comes down to the specific isoform. Cell Cycle, 10(9):1378-84 (2011)
DOI:10.4161/cc.10.9.15486

51. W Shih, S Yamada: N-cadherin-mediated cell-cell adhesion promotes cell migration in a three-dimensional matrix. J Cell Sci, 125(Pt15):3661-70 (2012)
DOI:10.1242/jcs.103861

52. R Lamb, S Lehn, L Rogerson, RB Clarke, G Landberg: Cell cycle regulators cyclin D1 and CDK4/6 have estrogen receptor-dependent divergent functions in breast cancer migration and stem cell-like activity. Cell Cycle, 12(15):2384-94 (2013)
DOI:10.4161/cc.25403

53. A Nagarajan, P Malvi, N Wajapeyee: Oncogene-Directed Alterations in Cancer Cell Metabolism. Trends in Cancer, 2(7):365-377 (2016)
DOI:10.1016/j.trecan.2016.06.002

54. F Di Virgilio: Purines, purinergic receptors, and cancer. Cancer Res, 72(21):5441–7 (2012)
DOI:10.1158/0008-5472.CAN-12-1600

55. P Mehlen, C Delloye-Bourgeois, A Chédotal: Novel roles for Slits and netrins: axon guidance cues as anticancer targets? Nat Rev Cancer, 11(3):188–97 (2011)
DOI:10.1038/nrc3005

56. S Ramaswamy, P Tamayo, R Rifkin, S Mukherjee, CH Yeang, M Angelo, C Ladd, M Reich, E Latulippe, JP Mesirov, T Poggio, W Gerald, M Loda, ES Lander, TR Golub: Multiclass cancer diagnosis using tumor gene expression signatures. Proc Natl Acad Sci U S A, 98(26):15149-54 (2001)
DOI:10.1073/pnas.211566398

57. JC Ang, H Haron, HNA Hamed: Semi-supervised SVM-based feature selection for cancer classification using microarray gene expression data. In: International Conference on Industrial, Engineering and Other Applications of Applied Intelligent Systems. Springer International Publishing, 468-477 (2015)

58. Z Cai, D Xu, Q Zhang, J Zhang, SM Ngai, J Shao: Classification of lung cancer using ensemble-based feature selection and machine learning methods. Mol Biosyst, 11(3):791-800 (2015)
DOI:10.1039/C4MB00659C

59. N Bandyopadhyay, T Kahveci, S Goodison, Y Sun, S Ranka: Pathway-Based Feature Selection Algorithm for Cancer Microarray Data. Adv Bioinformatics, 2009:532989 (2009)
DOI:10.1155/2009/532989

60. W Engchuan, JH Chan: Pathway activity transformation for multi-class classification of lung cancer datasets. Neurocomputing, 165:81-89 (2015)
DOI:10.1016/j.neucom.2014.08.096

61. W Liu, X Bai, Y Liu, W Wang, J Han, Q Wang, Y Xu, C Zhang, S Zhang, X Li, Z Ren, J Zhang, C Li: Topologically inferring pathway activity toward precise cancer classification via integrating genomic and metabolomic data: prostate cancer as a case. Sci Rep, 5:13192 (2015)
DOI:10.1038/srep13192

62. H Wang, H Zhang, Z Dai, MS Chen, Z Yuan: TSG: a new algorithm for binary and multi-class cancer classification and informative genes selection. BMC Med Genomics, 6 Suppl 1:S3 (2013)
DOI:10.1186/1755-8794-6-S1-S3

63. HS Eo, JY Heo, Y Choi, Y Hwang, HS Choi: A pathway-based classification of breast cancer integrating data on differentially expressed genes, copy number variations and microRNA target genes. Mol Cells, 34(4):393-8 (2012)
DOI:10.1007/s10059-012-0177-0

64. S Kim, M Kon, C DeLisi: Pathway-based classification of cancer subtypes. Biol Direct, 7:21 (2012)
DOI:10.1186/1745-6150-7-21

65. M List, AC Hauschild, Q Tan, TA Kruse, J Mollenhauer, J Baumbach, R Batra: Classification of breast cancer subtypes by combining gene expression and DNA methylation data. J Integr Bioinform, 11(2):236 (2014)
DOI:10.2390/biecoll-jib-2014-236

Footnotes:

1 Note that the complex interaction among the pathways that govern cancer development is sometimes referred to as a pathway cloud (30).

2 website: https://gdc-portal.nci.nih.gov/projects/TCGA-BRCA.

3 website: https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE58212.

4 As different classifiers might produce different sample-class associations, we do not show the classification results for each individual classifier; instead, we provide a performance evaluation of the method based on average values of accuracy, precision and recall.

Key Words: Cancer Subtypes Classification, Breast Cancer, BC, Pathway Enrichment, Differentially Expressed Genes, DEG, Review

Send correspondence to: Alex Graudenzi, Institute of Molecular Bioimaging and Physiology of the Italian National Research Council (IBFM-CNR), Milan, Italy, Tel: 390221717552, Fax: 390221717558, E-mail: alex.graudenzi@unimib.it
