[Frontiers in Bioscience S2, 1127-1144, June 1, 2010] |
|
|
System biology analysis of cell cycle pathway involved in hepatocellular carcinoma Meiqian Sun1, Wenjuan Mo1, Xuping Fu1, Gang Wu1, Yan Huang1, Rong Tang1, Yi Guo1, Minyan Qiu2, Feng Zhao3, Lin Li4, Shengdong Huang3, Yumin Mao1, Yao Li1, Yi Xie1
1 TABLE OF CONTENTS
1. ABSTRACT To investigate genetic mechanisms of hepatocarcinogenesis and identify potential anticancer targets in hepatocellular carcinoma (HCC), we analyzed microarray gene expression profiles between 33 HCCs and their corresponding noncancerous liver tissues. Functional analysis of differentially-expressed genes in HCC indicated that cell cycle dysregulation plays an important role in hepatocarcinogenesis. Based on 14 differentially-expressed genes involved in cell cycle in HCC, we applied Structural Equation Modeling (SEM) to establish a potential genetic network which could assist understanding of HCC molecular mechanisms. siRNA-mediated knock-down of two significantly up-regulated genes, minichromosome maintenance protein 2 (MCM2) and cyclin B1 (CCNB1), in HCC cells (SMMC-7721 and QGY-7703) induced G2/M-phase arrest, apoptosis and antiproliferation in HCC. Some up-regulated cell cycle-related genes in HCC were down-regulated following specific depletion of MCM2 or/and CCNB1 in HCC cells, which might well validate and complement the reconstructed cell cycle network. This study may contribute to further disclose hepatocarcinogenesis mechanism through systematically analyzed the HCC-related-cell-cycle pathway. This study also shows that MCM2 and CCNB1 could be promising prognostic and therapeutic targets for HCC. 2. INTRODUCTION Hepatocellular carcinoma (HCC) is one of the most frequent human cancers. Incidence is increasing and HCC has risen to become the 5th commonest malignancy worldwide and the third leading cause of cancer-related death(1-3). The highest frequencies are found in sub-Saharan Africa and far eastern Asia, where hepatitis B virus (HBV) and hepatitis C virus (HCV) infections are endemic, and in regions where food contaminated with Aflatoxin B1 is consumed(1, 4). Although currently relatively low, the incidence of HCC is rising in developed western countries(1, 5-7). The major known risk factors for HCC are viral (chronic hepatitis B and hepatitis C), toxic (alcohol and aflatoxins), metabolic (diabetes and non-alcoholic fatty liver disease, hereditary haemochromatosis) and immune-related (primary biliary cirrhosis and autoimmune hepatitis)(1, 8). HCC is prevalently male associated with Male/Female ratios ranging from 1.3 to 12.9 according to the geographic area. It is a rapidly fatal disease, with a life expectancy of about 6 months from the time of diagnosis. Partial liver resection or liver transplantation are potentially curative, but only a minority of cases are amenable to these treatments for various reasons including the fact that most patients at risk are not diagnosed in time. So far, no first-line therapy has emerged for advanced HCC. Cytotoxic chemotherapy has proven ineffective(9). HCC is characterized by a very poor prognosis and is associated with high mortality(9-11). Therefore, it is very important to focus research efforts on disclosing the mechanism of HCC carcinogenesis, which could be helpful to improve the current diagnosis and therapy of HCC. Hepatocarcinogenesis is a slow process during which genomic changes progressively alter the hepatocellular phenotype to produce cellular intermediates that evolve into hepatocellular carcinoma. The development of HCC is a multistep process associated with changes in host gene expression, some of which correlate with the appearance and progression of tumor. An understanding of the molecular pathogenesis of HCC may provide new markers for tumor staging, for assessment of the relative risk of tumor formation, and open new opportunities for novel diagnosis and therapeutic intervention of HCC.(12, 13) Progress in basic scientific research has led to a better understanding of molecular mechanism responsible for HCC (12-15). For example, altered DNA methylation, genomic alterations (13, 14) (16, 17) and many deregulated genes such as HBx, TP53, IGF2, CDKN2A (p16INK4A), RB1, PTEN, DLC1, MMP, APC, CTNNB1, and AXIN1 (12-15, 18) may play roles in development of HCC. Biological pathways such as upregulation of mitogenic pathways, MAPK/ERK, ras/raf/MAPK, NFκ-B, ERBB2/NEU, JAK/STAT, and Wnt/β-catenin signal transduction pathways (14) have been found to be altered in HCC. Since uncontrolled cell proliferation is the hallmark of cancer and the dysregulations of cell cycle are directly related to uncontrolled cell proliferation, the dysregulations of cell cycle pathway may play a pivotal role in the mechanism of HCC carcinogenesis. HCC cells have typically acquired damage to genes that directly regulate their cell cycles. Many cell-cycle-related genes have been reported to be closely involved in the HCC carcinogenesis, such as p53(19), p21(20), ATM(21), activation of the CKI-CDK-Rb-E2F pathway(22), p16/INK4, cyclin/cdk complex(23), TGFbeta(24), p38 MAPK(25), Cyclin D1(26), CDC25A(27). Furthermore, Inhibition of some important cell-cycle-related oncogenes (CDC25A(27), mitogen-activated protein kinase activation(28), Cdk2 activity(29)) or activation of some important tumor suppressors (p53(30), p21(Cip1) and p27(Kip1)(28, 31)) have been reported to induce HCC cell cycle arrest, apoptosis and antiproliferation, which might be novel therapeutic targets for HCC therapies. These researches are very insightful; however, cell cycle pathway is so complex that systematical analysis of HCC-related cell cycle pathway is needed and could be helpful to the better understanding of mechanism of HCC carcinogenesis, and to the improving HCC diagnosis and therapy. Therefore, a reconstruction of gene interaction network for pivotal portion of cell cycle is very necessary, which belongs to the well known genetic network and describes how genes interact with each other 'in concert' to achieve specific phenotypic characters(32). Genetic network reconstruction is a major task in system biology, which has emerged and developed rapidly for the purpose of unraveling the mechanisms of biological systems(33-37). Nowadays, system biology is greatly facilitated by DNA microarray observation for mRNA expression combined with computational models. Among the computational methods, Structural Equation Model (SEM)(38, 39) has recently exerted high ability in genetic networks reconstruction. There are several advantages in SEM for being applied to reconstructing cell cycle gene interaction network in this study. First, many current computational methods for genetic network reconstruction solely focus on network structure, however, it is indispensable to measure quantitatively the relationship between genes when studying their regulatory properties(40). SEM accomplishes this by figuring out the coefficient parameters of each gene-gene interaction edge, which measure the regulatory effects of one gene on others or the strength of the gene-gene interactions. Second, not only quantifying the interacting relationship among genes, SEM can also confirm the causality, i.e., identifying the upstream and downstream direction in an interacting pair. Third, since most genetic networks are not fully connected(41), it is suitable to employ structural equations to describe those sparse networks. Fourth, it has been deduced that, linear structural equations are very desirable for construction of a first-order approximation model of a genetic network using steady-state gene expression measurements(38, 42). It is generally accepted that DNA microarrays and RNA interference (RNAi) are useful tools for identification and validation of biomarkers in disease research. In recent years, DNA microarrays have been used to identify genes involved in various diseases including HCC (43-56). To apply microarray data to clinical use, it is necessary to identify a small set of genes which can be used as clinical biomarkers. It is useful to combine DNA microarrays with RNAi. Being effective and highly specific, use of RNAi represents another powerful tool for exploring gene function and validating novel therapeutic targets (57-59). The molecular targeted therapy has been emerged. Many novel therapeutic agents for HCC have been evaluated in phase I and II studies, such as sorafenib (Nexavar), erlotinib (Tarceva), bevacizumab (Avastin), and flavopiridol(60). These targets of these drugs may be vascular endothelial growth factor (VEGF), tyrosine kinase, cyclin-dependant kinase, and so on. Flavopiridol is the cyclin-dependant kinase inhibitor. Therefore, HCC-cell-cycle-pathway-related genes might become promising therapeutic targets of HCC. In this study, we systematically combined cDNA microarrays and RNAi analyses to identify and validate novel biomarkers involved in HCC carcinogenesis. We used linear structural equations to model a network composed of 14 cell-cycle-associated genes, which were chosen after analyzing data from 33 HCC microarrays. Then an improved genetic algorithm was applied to calculate network parameters and finally a network was identified based on statistical and biological criteria. One known interaction was rebuilt in the resulted network and several novel interactions were mined out. The validation and complement of the reconstructed cell cycle network were confirmed by published literatures and some biological experiments. Our work facilitated the discovery of those novel connections among genes mainly involved in the cell cycle of tumors. In addition, we suggested a further framework about how to deal with large networks. Our findings may help further discover genetic mechanism of HCC, and provide clues for identifying novel prognostic, diagnostic and therapeutic targets. 3. MATERIALS AND METHODS 3.1. Tissue samples and RNA isolation All the 33 primary HCC samples and noncancerous liver tissues used in cDNA microarray analysis were obtained with informed consent from patients who underwent curative resection at different Chinese hospitals in Guangxi, with full institutional review board approval. They are predominantly male and hepatitis B surface antigen (HBsAg)-positive. All liver tissues were verified by pathological examination. HCC samples were histopathologically diagnosed following Edmonson's classification (61). Total RNA was extracted from each sample using TRIzol (GibcoBRL, Grand Island, NY) following manufacturer's instructions. 3.2. cDNA microarrays Fabrication of cDNA microarray containing 12800 genes, probe preparation, microarray hybridization, image detection and data normalization were carried out as previously described (45, 46). For convenience of comparison, ratios of Cy5 (tumor) to Cy3 (nontumor) were log2-transformed and then converted back to fold change. Differentially expressed (DE) genes in HCC were selected according to criteria of P<0.05 by one-way analysis of variance (ANOVA) test (62), false discovery rate (FDR)≤5% (63). Pathway analysis was performed using GenMAPP 2.0 software (available at http://www.genmapp.org) (64). P<0.05 calculated by MAPPFinder was considered significant. We used EASE software (available at http://david.abcc.ncifcrf.gov/ease/ease.jsp) to assign DE genes to "Gene Ontology (GO) Biological Process" categories and test statistically (EASE Score, modified Fisher's exact test) for significant overrepresentation of identified genes within each category (65, 66). EASE Score<0.05 was considered significant. Full cDNA microarray data followed MIAME guidelines and will be available in NCBI's Gene Expression Omnibus (GEO) database (http://www.ncbi.nlm.nih.gov/geo/. Accession number: GSE4108). 3.3. Reconstruction of cell-cycle genetic network for hepatocellular carcinoma The following scheme as illustrated in Figure 1 is proposed to reconstruct the genetic network for HCC cell-cycle. First, we select differentially expressed genes which are involved in the cell-cycle aberration of HCC by microarray data analyses. Second, using these genes, we model candidate networks with linear structural equations, and then use an improved genetic algorithm to work out the regulatory coefficients of a network for both normal and abnormal cells. Third, we identify the most appropriate genetic network according to two criteria. One criterion is derived from the combination of the two ideas: the difference between sample covariance and model covariance should be as small as possible; at the same time, the network should not be too complex. The other criterion is obtained from biological supposition that the accumulation of regulatory variation ultimately induces cancer occurrence. Fourth, we confirm our network with gene-gene interactions in published literatures and biological experiments and exploit the biological significance of our result. Here it's presumed that the framework of genetic network for both normal and abnormal cases keep consistent, only interaction coefficients vary. 3.3.1. Gene selection Since the turbulence of cell cycle is crucial for the occurrence of cancer, only a very small part of genes, which are critical to the aberration of cell cycle, are chosen for the reconstruction. The complete gene selection consists of two steps. An initial selection is done by identifying differentially expressed genes from comparing gene expression profiles of primary HCC samples with those of corresponding noncancerous liver samples. The whole process is based on both controlled false discovery rate(67) and one-way analysis of variance (ANOVA) test(62). In the subsequent selection, these differentially expressed genes are associated with different biological pathways using Gene Microarray Pathway Profile software (GenMAPP 2.0)(64). In addition, these genes are grouped into GO categories to reveal overrepresented gene functions. Finally, a small list of significant genes are found to be involved in cell-cycle function and chosen as basic elements for genetic network reconstruction. 3.3.2. Modeling genetic network Usually, a reconstructed genetic network can be represented by an edgeweighted graph. It contains two parts. One part is the framework, which consists of nodes representing genes and edges representing the existence of some relationship between the two linked genes, and the framework can be either directed or undirected, with or without loops. The other part is the definition of the coefficients, i.e. weight, associated with each edge. Structural equation model represents the genetic network with a directed path diagram, and it can contain feedback loops(68), but to avoid too much complexity, in this study, we do not encourage this. The nodes in the diagram denote genes, the edges represent interactive effects of one gene on others, the positive coefficient of an edge corresponds to an induced effect, and the negative one correspond to repressed effect. The genetic network can be modeled with linear structural equations as follows < where Y is a vector of p endogenous variables and X is a vector of q exogenous variables. Endogenous variables and exogenous variables are both observed variables, here representing gene expression levels with asymptotically normal distribution. Exogenous variables lie outside the model, e.g. they often act as the initiators of pathways, and endogenous variables are determined through joint interaction with other variables within the system. The terms exogenous and endogenous are model dependent, i.e. an exogenous variable in one model may play the role as an endogenous variable in another model. e denotes the residual error variables that cannot be measured and represents all other un-modeled causes to the observed variables. We assume that E (e) = 0 and e is uncorrelated with observed variables, and < The basic hypothesis for general structural equation is < where < < (38, 68). It has been well known that the method of maximum likelihood (ML) estimators is consistent and asymptotically unbiased(69), therefore, the unknown parameters in B, < < where p and q are the numbers of endogenous and exogenous variables, and Tr denotes the trace of a matrix. To find out the maximum likelihood estimate of the model parameter values, we use genetic algorithm (GA) due to its intelligent ability of automatic learning(71). GA is a simulation model based on Darwinian evolution mechanism, and a fitness function f can be used to measure the evolvement. Relation between fitness function< < Individuals with higher fitness have larger opportunities for propagation, and the fitness will finally converge to its maximum value through the process of inheritance, recombination and variation from generation to generation. To avoid the usual drawback of premature convergence to local extremum and overmuch time consumption when applying traditional GA, an improved algorithm proposed by Mao is applied in our study, which makes improvements in several aspects(72). First, fitness scale transformation for each individual in the population can prevent the fitness values from being too large or too small, otherwise the whole population would be occupied by the individuals with considerably large fitness, and therefore traditional GA could not terminate since fitness values in a generation are so close to each other. Second, instead of random selection, reservation of optimal individual insures the right direction of evolvement. Third, introducing some new individuals with equably random distribution can expand the searching range and effectively ameliorate the problem of premature convergence. Fourth, parallel rather than ordinal performance of inheritance, recombination and variation on the same parental generation insures that the filial generation is composed of optimal individuals with large fitness values. One run of the genetic algorithm leads to a local solution because of randomly generating individuals in the initial step, even with the improved algorithm. Generally, using many runs with different initializations allows exploring the entire data space for the best solution. In any possible path diagrams, if the interaction between two genes has been published or validated, we retain those recognized effects. In addition, at the biological level, a gene only interacts with a small number of other genes. So we calculate correlation coefficients among genes, and refer to the correlation coefficient as 'mutual information', which measures the inter-dependence between two genes. Only if the coefficient is above a certain value, the corresponding two genes are assumed to be interacted and are linked with a directed edge in the path diagram. Thus, the number of candidate hypothetical networks is reduced based on prior knowledge. This step efficiently avoids the agonizing puzzledom of too much time consumption, and furthermore, it does not lose the necessary information of reconstructing genetic network, as shown in the validation section. 3.3.3. Identification of genetic network It's a great challenge to pick out the best fit genetic network from many hypothetical ones. At present, there are several different standards(38), each has its own advantages and disadvantages, and should be chosen according to special circumstances. Here we carry out the identification based on two criteria, one being statistical significance and the other being biological significance. A widely used approach to measure the goodness of a model is the Akaike information criterion (AIC)(68), which is defined as < where N is the number of samples, < However, AIC information cannot be employed to testify whether the identified genetic network is biologically feasible. Since functional mutations in the genes will often cause changes in regulatory effects, we assume that the accumulation of mutations is the key factor for cell alteration from normality to abnormality. Then we propose a simple and effective criterion for the measurement of the differentially regulatory accumulation (DRA), the average value of absolute differences of coefficients between normal and abnormal cell, which can be written as < where n is the total number of regulatory coefficients, < 3.4. Cell culture Two HCC cell lines (SMMC-7721 and QGY-7703) and one normal liver cell line (L02) were obtained from American Type Culture Collection (ATCC) (Manassas, VA). Cells were maintained in Dulbecco's modified Eagle medium (DMEM) supplemented with 10% fetal calf serum (FCS) (PAA, Pasching, Austria) in humidified 37�C incubator with 5% CO2. 3.5. RNA interference Small interfering RNA (siRNA) oligonucleotides with 2-nt (2'deoxy) uridine at 3' end of the sequence were designed for MCM2 (Genbank accession number: NM_004526) (sense, 5'-GUCGCAGUUUCUCAAGUAU-3'), CCNB1 (Genbank accession number: NM_031966) (sense, 5'-GCCUGAGCCUAUUUUGGUU-3'), and negative control (sense, 5'-UUCUCCGAACGUGUCACGU-3'). These siRNA duplexes were synthesized by GeneChem (Shanghai, China), and transfected into SMMC-7721 and QGY-7703 cells using Lipofectamine 2000 (Invitrogen, Carlsbad, CA) following manufacturer's instructions. siRNA final concentration was 50 nM or 100 nM per well. Cells were harvested at different time points for further analysis. 3.6. Quantitative real-time reverse transcription-PCR Quantitative real-time RT-PCR was used to validate microarray data and gene knockdowns with RNAi. Total RNA was extracted from untransfected and transfected cells with TRIzol (GibcoBRL). Reverse transcription was performed using Reverse Transcriptase Kit (TaKaRa, Kyoto, Japan) following manufacturer's instructions. Quantitative real-time PCR of MCM2 and CCNB1 was performed in triplicate using Sybr Green Mastermix on ABI Prism 7900 Sequence Detection System (Applied Biosystems, Foster City, CA). GAPDH was used as endogenous control. PCR conditions were 95�C for 15 min, 40 cycles of 95�C for 15 s and 60�C for 1 min. PCR primer sequences were as follows: MCM2 5'-CTCAACCAGATGGACCAGGA-3'(sense), 5'-GATGTGCCGCACCGTAATG-3'(antisense); CCNB1 5'-TGCAGCACCTGGCTAAGAATG-3'(sense), 5'-AGTGCAGAATTCAGCTGTGGTAGAG-3'(antisense); and GAPDH 5'-GCACCGTCAAGGCTGAGAAC-3'(sense), 5'-ATGGTGGTGAAGACGCCAGT-3'(antisense). 3.7. Cell cycle analysis Cells were trypsinized for cell cycle analysis at 48 hours after siRNA transfection. Cell cycle was analyzed with propidium iodine (PI) staining using CycleTEST PLUS DNA Reagent Kit (Becton Dickinson, San Jose, CA) following manufacturer's instructions. Approximately 20,000 cells were acquired and analyzed using CellQuest software on Becton Dickinson FACSCalibur flow cytometry. Cell cycle distributions were calculated with ModFit LT 2.0 software (Becton Dickinson). 3.8. Apoptosis detection using flow cytometry Cells were trypsinized for apoptosis analysis at 72 hours after siRNA transfection. Apoptosis was detected using APO-BRDU Kit (Pharmingen, Becton Dickinson) following manufacturer's instructions. After incubation with PI staining, approximately 20,000 cells were acquired and analyzed using CellQuest software on Becton Dickinson FACSCalibur flow cytometry. 3.9. MTT assay Cells (1�104 cells/well) were seeded in triplicate in 96-well plates. After siRNA transfection for 24, 48, and 72 hours, cells were stained with 20 μL 3-(4,5-dimethylthiazolyl-2)-2,5-diphenyltetrazolium bromide (MTT) (5 mg/mL, Sigma, St. Louis, MO) for 4 h at 37�C. Then culture medium was removed, 150 μL DMSO (Sigma) was added to each well and thoroughly mixed for 10 min. Absorbance values at 490 nm (A490) were measured on Spectra Microplate Reader (ELx800, Bio-Tek, Winooski, VT). Cell growth inhibitory rate was calculated according to the following formula: Inhibitory rate (%) = ((A490control-A490sample)/A490control)�100% A490control: Absorption of cells transfected with negative control siRNA; A490sample: Absorption of cells transfected with MCM2 siRNA or/and CCNB1 siRNA. 3.10. Statistical analysis All RNAi experiments were performed in triplicate. Data were expressed as mean�SD. Comparison between each experimental siRNA treatment and negative control siRNA treatment was determined using one-way ANOVA followed by Dunnett's t test. All analyses were performed using SPSS 12.0 software (Chicago, IL). Ps<0.05 were considered significant. 4. RESULTS 4.1. Identification and functional analysis of differentially expressed genes in HCC cDNA microarray analysis of gene expression profiles between 33 HCCs and the corresponding noncancerous liver tissues identified 1067 DE genes (P<0.05 by ANOVA; FDR≤5%) in HCC patients. Pathway analysis using GenMAPP 2.0 indicated 13 pathways were significantly deregulated in HCCs comparing to normal liver tissues (P<0.05), such as Cell_cycle, G_Protein_Signaling, Alanine_and_aspartate_metabolism, Fatty_Acid_Synthesis, and Complement_and_Coagulation_Cascades pathways. Notably, 14 up-regulated and 3 down-regulated genes in HCC were involved in cell cycle (Figure 2). The majority of these genes play important roles in regulating cell cycle progression - for example, MCM2 and CDC7 in S phase; CCNB1, CCNB2 and CDC25C in G2 phase; BUB1B and MAD2L1 in M phase. MCM2 is phosphorylated, and regulated by CDC7(73). CDC25C directs dephosphorylation of CCNB-bound CDC2 and triggers entry into mitosis. BUB1B and MAD2L1 act cooperatively to prevent premature sister hromatids separation by directly inhibiting anaphase-promoting complex. Using EASE analysis, we further identified overrepresented GO biological process categories (EASE score<0.05) with the above 17 cell cycle-related DE genes in HCCs. GO and pathway analysis of DE genes in HCC revealed that up-regulated genes are mainly associated with cell cycle progression, while most down-regulated genes contribute to immune response. 4.2. Results of the cell-cycle network reconstruction We assumed that the 17 cell cycle-related genes made up of a local network. So these 17 genes were chosen for genetic network reconstruction. It should be noticed that, in the 17 chosen genes, MCM2, MCM4 and MCM7 belong to the same gene family, while CCNB1 and CCNB2 belong to another gene family. Genes in the same family have similar functions and regulatory effects. Hence, the number of candidate genes can be reduced to 14, based on the assumption that MCM2 represents MCM family and CCNB1 represents CCNB family. Additionally, cell cycle is a very complicated process. Here we made a simplified assumption. The cell cycle is made up of four consecutive phases, i.e. G1, S, G2 and M. Some genes are found to function mostly in one certain phase, therefore are called phase genes. We supposed that only genes belonging to neighboring phases can interact, i.e. a G1 phase gene can only interact with one S phase gene, and so do other phase genes. In addition, since we could not infer recurrent genetic network, then we presumed that M phase genes cannot interact with G1 phase genes. Based on the assumptions, we could plot hundreds of path diagrams. Generally, two genes are considered to interact if their expression profiles are similar. Therefore, only two genes are linked in diagram if their absolute value of correlation coefficient calculated from microarray data is above a certain value. Then many interactions were filtered out and excluded in path diagram plotting. The complexity of the network was greatly reduced and a large amount of computational time could be saved. For each diagram, we use SEM for modeling and apply the improved GA to the calculation of model parameters. We perform 30 runs of the algorithm, and we kept the model parameter values which correspond to the lowest value of the criterion< < The coefficients show direct or indirect interactive effects. Positive/negative coefficient means an increase/decrease expression of genes at the left side equation due to the effect of gene in the right side of equation. For example, as the expression of G1 phase gene GSK3B increases, the expression of S phase gene E2F5 decreases. 4.3. Reconstructed cell cycle network validation and complement 4.3.1. Rebuilding a published interaction In order to testify our method, we intentionally let the edge between CDC7 and MCM2, whose interaction was reported by several publications(74), to be unknown. CDC7 and the MCM2 are both essential for the initiation of eukaryotic DNA replication. Previous literatures showed that MCM2 contains several phosphorylation sites which can be phosphorylated by CDC7. At the beginning of S phase, MCM2 is directly regulated by CDC7. Expectedly, in the identified network, the relationship between CDC7 and MCM2 was rebuilt, which is a direct support of our model. 4.3.2. Unclosing some interesting interactions Except for the relationship between CDC7 and MCM2, some other interactions, which can be confirmed or inferred from researches and publications, are unclosed. These interactions are also evidences supporting the correctness of our model. MCM2-7 are related to each other and interact to form a stable heterohexamer(75). MCM2-7 proteins bind to chromatin in a cell-cycle dependent manner, being tightly bound in G1 while being removed in S- and G2-phases(76). It functions as a primary helicase opening up the DNA at replication origins. PCNA is a member of the so-called DNA sliding clamp family(77). When chromatin was untied by MCM2-7, PCNA acts as a DNA sliding clamp for replicative DNA polymerases in S-phase. There is no report of direct physical relationship within them, but the result discovers the biological event by the time order and suggests MCM2 may regulate PCNA. PCNA in mammalian cells also appears to play a key role in controlling several other reactions together with different partners, including cell cycle regulation(77). It can bind to cyclin-CDK complexes(78). Biochemical studies show that PCNA interacts with CDK2-cyclin A complex. In many literatures, PCNA-CDK2-cyclin-A complex has been mentioned as an important regulatory mechanism in cell cycle control(77). CDK1-Cyclin B1(CCNB1) complex is essential for eukaryotic cell cycle to enter M-phase, with which PCNA forms complex with ATR/Chk1-dependent checkpoint(77). Then PCNA inhibit the start of M-phase. In our result, the arrow from PCNA to CCNB1 is rebuilt well to support this kind of cell cycle control model. MAD2L1 is a component of the mitotic spindle assembly checkpoint that prevents the onset of anaphase until all chromosomes are properly aligned at the metaphase plate. Although no direct evidence shows CCNB1 has physical relationship with MAD2L1, which is specially expressed in metaphase. It implies that only after CCNB1 started up M-phase can MAD2L1 be expressed afterwards. Our result finds out this time sequential control relationship and suggests that CCNB1 may indirectly affect the expression of MAD2L1 through unknown regulation mechanisms. 4.3.3. Experiment validation and complement of the network 4.3.3.1. Silencing of MCM2 and CCNB1 in HCC cells by siRNA The expression levels of MCM2 and CCNB1 were significantly higher (≥2-fold) in two HCC cell lines (SMMC-7721 and QGY-7703) than that in normal liver cell line (L02) with quantitative real-time RT-PCR. Quantitative real-time RT-PCR results are consistent with that in microarray experiments (Figure 4A). The inferred network can be validated by changing a gene's expression and observing its downstream gene responses. In this reconstructed network, two types of gene relations exist. One is the known relations and the other is inferred from SEM model. Here several inferred relations are focused on and experimentally validated. We investigated the relations between MCM2 and CCNB1. MCM2 was overexpressed in HCC and we also observed the overexpression of CCNB1. The results are consistent with prediction of the structural equation model for the network. In the model for HCC case described above, we can infer that regulation of MCM2 on CCNB1 through positive control of MCM2 on PCNA and PCNA on CCNB1. Therefore, the model predicts that overexpression of MCM2 would increase expression of CCNB1. For RNAi studies, siRNAs targeting MCM2 and CCNB1 were transfected separately or cotransfected into the two HCC cells, and their effects on altering expression levels were examined using quantitative real-time RT-PCR at 24 hours after transfection (Figure 4B). The reason for using two cell lines is to avoid the cell strain specificity. Gene expression changes were evaluated relative to negative control siRNA-treated cells. With both final siRNA concentrations of 50 nM and 100 nM, the specific suppression of MCM2 and CCNB1 expression was over 70%. We used 50 nM for further research. These results indicated that RNAi could significantly suppress the overexpression of MCM2 and CCNB1 in HCC cells (P<0.001). Furthermore, cotransfection of siRNAs targeting both genes was as effective as separate transfection (P<0.001). Cell division cycle 7 (CDC7) was another significantly up-regulated gene involved in cell cycle pathway. Interestingly, CDC7 and some other up-regulated genes involved in cell cycle pathway was found to be down-regulated (≥1.5-fold) following specific depletion of MCM2 or/and CCNB1 in HCC cells using quantitative real-time RT-PCR (Figure 5A and B), indicating that these genes might be downstream genes of MCM2 and CCNB1. ASK, MAD2L1 and PCNA were down-regulated following depletion of MCM2. PCNA and ORC2L were down-regulated following depletion of CCNB1. ASK, CDC25C, MAD2L1, PCNA and PTTG1 were down-regulated following depletion of MCM2 and/or CCNB1. These results might somewhat validate and complement the reconstructed network. For example, PCNA might be downstream gene of MCM2, which might validate the network. It is suggested that there may exist a feedback from CCNB1 to PCNA, which might complement the network. 4.3.3.2. Induction of G2/M-phase arrest, apoptosis and antiproliferation in HCC cells by depletion of MCM2 or/and CCNB1 In HCC cells (SMMC-7721 and QGY-7703), we used flow cytometry to observe influence of MCM2 or/and CCNB1 siRNA on cell cycle and on apoptosis, respectively. Cell cycle and apoptosis were analyzed at 48 and 72 hours after transfection, respectively. siRNA targeting each gene separately resulted in significant increase in G2/M-phase (P<0.022) and eventually led to obvious apoptosis (P<0.003) compared with negative control siRNA treatment. Notably, the G2/M-phase arrest (P<0.001) and apoptosis (P<0.001) phenotypes could be enhanced by simultaneous cotransfection of siRNA targeting both genes (Figure 6A-B). It is shown in the inferred model that between G2 phase genes and M phase genes, only two genes are predicted to be interdependent, i.e. CCNB1 regulates negatively MAD2L1 in abnormal state. Thus we supposed that the inhibition of CCNB1 would lead to the cell cycle arrest in G2/M phase in hepatocellular carcinoma cells. The influence of CCNB1 RNAi on the cell cycle validated the supposition. In both cell lines, after the negative control siRNA treatment, less than 10% cells are in G2/M phase, however, after CCNB1 siRNA treatment, the cells arrested in G2/M phase increase up to 20%. It is speculated that inhibition of the gene CCNB1 in G2 phase leads to the loss of the regulation effect of CCNB1 on MAD2L1 and consequent cell cycle arrest in G2/M phase. The experiment results are in accord with the supposition from the inferred network. Since MCM2 and CCNB1 are two of the key components required to regulate cell proliferation, we analyzed antiproliferative impact of MCM2 or/and CCNB1 RNAi in HCC cells (SMMC-7721 and QGY-7703) using MTT analysis. We found that absorbance at 490 nm decreased gradually in targets siRNA treatments from 24 hours to 72 hours after transfection, but increased gradually in negative control siRNA group (Figure 6C). Therefore, depletion of MCM2 or CCNB1 resulted in reduction in proliferation in HCC cells (P<0.043). When both genes were targeted simultaneously, the antiproliferative effect was more pronounced (P<0.011). Thus, RNAi-mediated depletion of MCM2 or CCNB1 induced G2/M-phase arrest, apoptosis and antiproliferation in HCC cells, and depletion of both genes enhanced the phenotypes. 5. DISCUSSION We analyzed gene expression profiles in HCC using cDNA microarrays. GO biological process and pathway analysis of DE genes involved in HCC carcinogenesis revealed that up-regulated genes are mainly associated with cell cycle, while down-regulated genes are mainly associated with immune response, which is consistent with previous report (53). Cell cycle-related genes were important because they might be directly bound up with tumor development. Some of these genes might be effective anticancer targets, such as the up-regulated genes associated with cell cycle pathway. Overexpression of these genes might contribute to activation of cell cycle pathway and play critical roles in HCC carcinogenesis. We reconstructed a genetic network for cell cycle in HCC based on a SEM framework. In the resulted network, a known edge (CDC7 to MCM2) is rebuilt, which we had intentionally made it unknown to test the goodness of our approach. Moreover, an inferred gene-gene relation in the network and a supposition of G2/M arrest in cell cycle can be validated experimentally. Some novel and interesting gene interactions can be mined out. For example, MCM2 may regulate PCNA in G1-phase when chromatin replicates, and CCNB1 may regulate MAD2L1 through controlling the onset of M-phase. In cell cycle, these genes are strictly expressed in special phase. Our result rebuilds such time-dependent relationship and implies these molecules may have a long-distance regulation. Although our model cannot provide the details of the reactions between two or more molecules, it still benefits further researches for discovering the regulation connection of these molecules. In our model, several assumptions were applied to simplify network model and save computation. The system of true cell is so complex that these assumptions would be apart from the real conditions. How to do the supposition to make computed network closer to real biological condition is always a great challenge to scientists. However, the resulted HCC cell cycle network from our model can be validated by some evidences, so our scheme to reconstruct network can also be easily applied to study other pathways and other complex diseases. In addition, structural equation modeling can also be used in other challenging work. Usually, general structural equations are made up of measurement model and structural model, i.e. path model. Measurement model describes relations between latent variables and observed variables, while structure model only deals with latent variables themselves. This sheds new light on reconstructing global network, which includes hundreds of genes involved in many different pathways. A global network can be divided into several local ones, which independently associate with a certain pathway. Gene expression data in a local network can be considered as observed variables. We can define few factors related to the disease, mainly involved in the function of the networks. The factor corresponding to a local network can be regarded as a latent variable. Therefore, measurement model describes the contributions of genes in a local network to the corresponding latent variable, and relations among latent variables are described by structure model. Then main relations among factors of a disease can be unraveled in light of the reconstructed network. Since the contributions of those factors to the disease can be quantitated, a great perspective of individual therapy is provided. Among the up-regulated cell cycle-related genes, MCM2 and CCNB1 were chosen for further analysis. Reason for overexpression of MCM2 and CCNB1 may be that they map to frequent cytogenetic gains of 3q21 and 5q12 in HCC (16, 17), respectively; while their overexpression may drive selection for the chromosomal gains. Moreover, as both MCM2 and CCNB1 are cell cycle regulated (79), their overexpression are likely to be due to increased cell proliferation and cycling in cancer cells; while their overexpression may induce cell proliferation. Recent studies have shown that MCM2 and CCNB1 are overexpressed in various tumors but present at low levels in normal tissues, indicating that they may be specific anticancer targets (80, 81). In HCC research, increased MCM2 mRNA levels have been reported (82). CCNB1 protein overexpression related to poor-differentiation in HCC has been reported (83). Therefore, MCM2 and CCNB1 may be potential diagnostic and therapeutic targets involved in HCC carcinogenesis and tumor progression. Recently, knockdown of MCM2 or CCNB1 alone has been reported to inhibit tumor cell growth (84-86), suggesting that downregulation of MCM2 or CCNB1 might become an interesting strategy for antitumor intervention. Here, this is the first report of RNAi analysis of MCM2 or/and CCNB1 in HCC cells. Our results show that RNAi can significantly silence the overexpression of MCM2 and CCNB1 in two HCC cells (SMMC-7721 and QGY-7703), either separately or together. Specific depletion of MCM2 or CCNB1 alone induces G2/M-phase arrest, apoptosis and antiproliferation in HCC cells, and depletion of both genes enhances the phenotypes, indicating the potency and effectiveness of siRNA against MCM2 or/and CCNB1 as a new strategy for HCC therapy. MCM2 is one of the MCM proteins which are essential for initiating and elongating replication forks during S-phase (87). Moreover, MCM proteins affect chromosome structure, which is consistent with the evidence that most MCM proteins don't colocalize with DNA synthesis sites. Even modest reductions in MCM proteins levels confer chromosome instability (88). Thus, G2/M-phase arrest resulted from MCM2 depletion may be due to the fact that loss of MCM2 function causes DNA damage and genome instability. CCNB1 complexes with CDC2 to form M-phase promoting factor (MPF), which is essential for G2/M phase transitions of cell cycle (89). This may be responsible for the effect that CCNB1 depletion resulted in G2/M-phase arrest. Because MCM2 and CCNB1 have a direct effect on mitosis, their overexpression in HCC may lead to uncontrolled cell proliferation and tumorigenesis. After performing RNAi, G2/M-phase arrest may be responsible for the effect that depletion of MCM2 or/and CCNB1 eventually led to apoptosis and antiproliferation in HCC cells. Interestingly, compared with depletion of MCM2 or CCNB1 alone, depletion of both genes enhances the phenotypes of G2/M-phase arrest, apoptosis and antiproliferation, indicating a relationship between MCM2 and CCNB1. The plausible explanations for the effects are as follows. It has been reported that MCM2 may interact with CCNB2 (90), suggesting that MCM2 may interact with CCNB1. Both CCNB1 and CCNB2 associate with p34cdc2 and are essential components of cell cycle regulatory machinery. B1 and B2 differ in their subcellular localization. Moreover, MCM2 protein is phosphorylated and regulated by protein kinases CDC2 and CDC7 (73, 91), while CCNB1 complexes with CDC2 to form MPF (89), suggesting that CDC2 and CDC7 might function as bridges between MCM2 and CCNB1. CDC7 was another significantly up-regulated gene involved in cell cycle pathway. Interestingly, in this study we found that CDC7 was down-regulated following specific depletion of MCM2 or/and CCNB1 in HCC cells, indicating that MCM2 and CCNB1 might be upstream genes of CDC7 and act on CDC7. Therefore, MCM2 and CCNB1 may be useful in HCC diagnosis and therapy, and it's more efficient to disrupt functions of both genes in therapy. Furthermore, we found that some other up-regulated genes involved in cell cycle pathway were down-regulated following specific depletion of MCM2 or/and CCNB1 in HCC cells, indicating that ASK, MAD2L1 and PCNA might be downstream genes of MCM2; PCNA and ORC2L might be downstream genes of CCNB1; and ASK, CDC25C, MAD2L1, PCNA and PTTG1 might be downstream genes of MCM2 or/and CCNB1, and their down-regulation might be involved with the interaction with MCM2 and CCNB1. It has been reported that MCM2 may interact with ASK, and CCNB1 may interact with PCNA, which are consistent with our results. Here, our work is the first report of the other interactions. These interaction results also validated and complemented some results of the cell-cycle network reconstruction. Some interaction results might well validate the network reconstruction, such as PCNA and MAD2L1 might be downstream genes of MCM2, while CDC25C, MAD2L1, and PTTG1 might be downstream genes of CCNB1. In addition, some other interaction results might complement the network reconstruction, such as feedbacks from CCNB1 to PCNA and ORC2L, and feedbacks from MCM2 to CDC7 and ASK. In conclusion, our studies may be helpful to further disclose the mechanism of HCC carcinogenesis through systematically analyzed the HCC-related-cell-cycle pathway. Our studies also show that MCM2 and CCNB1 may be potential biomarkers involved in HCC carcinogenesis. Importantly, MCM2 and CCNB1 may serve as promising molecular targets for HCC therapy. 6. ACKNOWLEDGMENT Meiqian Sun and Wenjuan Mo equally contributed to this article. This study supported by grants from the China Mega-Projects of Science Research for the 11th Five Year Plan (no.2009ZX10004-313) and the China 863 project (2006AA02Z324). We thank Dr. Zhao-yong Mao of Northwestern Polytechnical University for his helpful discussion about genetic algorithm and generous provision of the program. 7. REFERENCES 1. A. I. Gomaa, S. A. Khan, M. B. Toledano, I. Waked and S. D. Taylor-Robinson: Hepatocellular carcinoma: epidemiology, risk factors and pathogenesis. World J Gastroenterol, 14(27), 4300-8 (2008) 51. M. J. Callahan, Z. Nagymanyoki, T. Bonome, M. E. Johnson, B. Litkouhi, E. H. Sullivan, M. S. Hirsch, U. A. Matulonis, J. Liu, M. J. Birrer, R. S. Berkowitz and S. C. Mok: Increased HLA-DMB expression in the tumor epithelium is associated with increased CTL infiltration and improved prognosis in advanced-stage serous ovarian cancer. Clin Cancer Res, 14(23), 7667-73 (2008) Abbreviations: HCC: hepatocellular carcinoma; RNAi: RNA interference; DE: differentially expressed; FDR: false discovery rate; ANOVA: analysis of variance; GO: Gene Ontology; MCM2: minichromosome maintenance protein 2; siRNA: small interfering RNA; CCNB1: cyclin B1; RT-PCR: reverse transcription polymerase chain reaction; MTT: 3-(4,5-dimethylthiazolyl-2)-2,5-diphenyltetrazolium bromide; SEM: structural equation model; ML: maximum likelihood; GA: genetic algorithm; AIC: Akaike information criterion; DRA: differentially regulatory accumulation. Key Words: Hepatocellular Carcinoma, Microarray, Expression Profile, Structural Equation Modeling, Genetic Network, Genetic Algorithm, Cell Cycle Pathway, RNA interference, MCM2, CCNB1 Send correspondence to: Yi Xie, Institute of Genetics, School of Life Science, Fudan University, Shanghai 200433, PR China, Tel: 86-021-55520025, Fax: 86-021-65642502, E-mail: yxie@fudan.edu.cn |