[Frontiers in Bioscience 2, a31-36, November 1, 1997]

FUNCTIONAL BIOINFORMATICS: THE CELLULAR RESPONSE DATABASE

James Sorace1,2,3, Kip Canfield1, Steven Russell1

1 Department of Information Systems, University of Maryland Baltimore County, 2 The Department of Pathology and Laboratory Service, Baltimore VA Medical Center, 3 Department of Pathology, University of Maryland at Baltimore School of Medicine, Baltimore, Maryland

Received 4/3/97 Accepted 10/24/97

3. METHODS

This project required three distinct phases of development. First, it was necessary to conceptually model the type of experimental design used in this field. Many cellular response experiments can be modeled as in figure 1.

Figure 1: Experimental notation for the Cellular Response Database. Agents A1 to An are added to identical test cell populations at time (T) and concentrations (Cx). The test agent At (bold arrow) is added only to the experimental group. The control agent Ac (dashed arrow) may be added to the control group. A biological response between the groups is then measured.

Three types of biological entities are involved: the test cell population, the target gene/protein whose activity is assayed, and the agents that are tested to see whether they alter the target gene's expression or activity. These agents include protein molecules such as cytokines, as well as sterol hormones and drugs. Their common property is that the investigator adds them at a predetermined concentration at a specific time point in the experiment. Various combinations of agents may be used within a single experiment, and dose response and kinetic data may be presented. However, appropriate experimental design allows comparisons between two groups of test cells that differ only by treatment with one agent. This agent is the test agent, shown in figure 1 with a bold arrow. In the thalidomide example noted above, LPS is present in both the experimental and control groups. Thalidomide is present only in the experimental group and is therefore the test agent.

Other experimental designs require a control agent in addition to the test agent. In figure 1, this is shown with a dashed arrow. The control agent is added only to the control population of cells. In the thalidomide example, a biologically inert thalidomide analog can be used as a control agent. In antisense experiments, the control agent may be an oligonucleotide of identical composition to the test agent but with a different sequence. Alternatively, if the test agent is an antibody, the control agent would be an antibody of the same isotype with a different specificity.

An experiment must always have one test agent. It may also have several other agents, one of which may be a control agent. Table 1 gives several examples of how this notation is used.

Table 1: Examples of Agent Notation

EXPERIMENT 1
Description: The investigator wishes to determine if the production of TNF-alpha in human PBMC that is induced by cytokine A is inhibited by drug B.
Number of agents: 2 (cytokine A, Drug B)
Test agent: Drug B
Control agent: None
Target gene/protein: TNF-alpha
Test cell population: Human PBMC

EXPERIMENT 2
Description: As in Experiment 1, except that drug B is not water soluble. Instead it is dissolved in DMSO, a chemical that is known to alter gene expression in some circumstances. The experimenter controls for this by adding an equal concentration of DMSO to the control group. However because it is present in both the experimental and control groups DMSO is not a control agent.
Number of agents: 3 (Cytokine A, Drug B, DMSO)
Test agent: Drug B
Control agent: None
Target gene/protein: TNF-alpha
Test cell population: Human PBMC

EXPERIMENT 3
Description: The investigator has determined that IFN-gamma induces the expression of a transcription factor. Using an anti-sense oligonucleotide A directed against the transcription factor, the investigator wishes to see if IFN-gamma induction of IL-12 is down regulated in the murine RAW 264.7 macrophage cell line. As a control an oligonucleotide B of identical composition but differing sequence to A will be used. Since oligonucleotide B is found only in the control group it is the control agent.
Number of agents: 3 (IFN-gamma, oligonucleotide A, oligonucleotide B)
Test agent: Oligonucleotide A
Control agent: Oligonucleotide B
Target gene/protein: IL-12
Test cell population: murine RAW 264.7 macrophage cell line

EXPERIMENT 4
Description: The investigator wishes to determine if an anti-Met oncogene monoclonal antibody (Mab 1), inhibits hepatocyte growth factor induced MAP kinase phosphorylation in NIH 3T3 cells. As a control the investigator will use a monoclonal antibody directed against an irrelevant antigen (Mab 2).
Number of agents: 3 (Mab 1, Mab 2, hepatocyte growth factor)
Test agent: Mab 1
Control agent: Mab 2
Target gene/protein: MAP Kinase
Test cell population: NIH 3T3 Cells
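The rule that separates test agents, control agents, and agents shared by both groups in Table 1 can be expressed as set differences between the experimental and control groups. The Python sketch below is illustrative only: the `Experiment` class and its field names are assumptions made for this example and are not part of the Cellular Response Database itself.

```python
from dataclasses import dataclass
from typing import FrozenSet, Optional


@dataclass(frozen=True)
class Experiment:
    """One two-group comparison in the notation of figure 1 (hypothetical class)."""

    experimental_group: FrozenSet[str]  # agents added to the experimental cells
    control_group: FrozenSet[str]       # agents added to the control cells
    target: str                         # target gene/protein that is assayed
    test_cells: str                     # test cell population

    @property
    def agents(self) -> FrozenSet[str]:
        # Every agent used in the experiment, whichever group received it.
        return self.experimental_group | self.control_group

    @property
    def test_agent(self) -> str:
        # The test agent is the one agent present only in the experimental group.
        only_experimental = self.experimental_group - self.control_group
        if len(only_experimental) != 1:
            raise ValueError("an experiment must have exactly one test agent")
        return next(iter(only_experimental))

    @property
    def control_agent(self) -> Optional[str]:
        # A control agent, if any, is present only in the control group.
        only_control = self.control_group - self.experimental_group
        if len(only_control) > 1:
            raise ValueError("at most one control agent is allowed")
        return next(iter(only_control), None)


# Experiment 2 of Table 1: DMSO is in both groups, so it is not a control agent.
exp2 = Experiment(
    experimental_group=frozenset({"cytokine A", "drug B", "DMSO"}),
    control_group=frozenset({"cytokine A", "DMSO"}),
    target="TNF-alpha",
    test_cells="human PBMC",
)

# Experiment 3 of Table 1: oligonucleotide B is found only in the control group.
exp3 = Experiment(
    experimental_group=frozenset({"IFN-gamma", "oligonucleotide A"}),
    control_group=frozenset({"IFN-gamma", "oligonucleotide B"}),
    target="IL-12",
    test_cells="murine RAW 264.7 macrophage cell line",
)
```

Under these assumptions, `exp2.control_agent` is `None` while `exp3.control_agent` is oligonucleotide B, matching the corresponding entries of Table 1.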

The next phase of development consisted of several rounds of data modeling based on the experimental model discussed above. This resulted in the relational database schema presented in figure 2. This schema enables queries on the test cell populations, the target protein, and the agents being tested. We have attempted to store the important information associated with each of the experimental entities.

First, the test cell population may be a cell line (cloned or polyclonal) or a primary culture (e.g. murine peritoneal exudate cells). The data model supports searching by species, cell name, cell type, ATCC number, or organ of origin. Additional relevant data, such as cellular concentration or culture conditions, can be stored in a memo field in the database or as linked information in an electronic manuscript.

Secondly, the gene/protein target can be referenced by name, species, or GenBank number, thus linking the target gene of interest to current sequence databases. The gene/protein target may be detected experimentally in several ways. In the thalidomide examples noted above, these included enzyme linked immunosorbent assay (ELISA) measurement of the protein target (TNF-alpha), Northern blot measurement of the mRNA, and bioassay (L929 cell cytotoxicity). This broad range of possible detection systems, and the lack of numerical quantitation in many of them (e.g. bands on a Northern blot), is one of the major challenges for the database designer. Our implementation addresses this in two ways. First, a qualitative description of the change is entered (table 2) along with the type of detection assay used (ELISA, cytotoxicity, Northern blotting). Secondly, quantitative values and their units can also be defined and entered.

Figure 2: The data model for the Cellular Response Database: The most recent data model for this database is displayed.

Table 2: Patterns of target gene response

PATTERN #    DESCRIPTION
1            Up regulated, not detectable before treatment
2            Up regulated, but detectable before treatment
3            Down regulated
4            Basal level of expression unchanged
5            Not detected before or after treatment
6            Variable depending on dose

The database stores the qualitative interpretation of the response pattern in the Assay Table. Numerical data can also be stored in the database.
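One way to picture how a qualitative pattern code and optional quantitative values could sit side by side is the small relational sketch below, written with Python's built-in `sqlite3` module. It is not the schema of figure 2: the table and column names are assumptions chosen for illustration, and the inserted assay row is a placeholder rather than a result from the literature.

```python
import sqlite3

# Hypothetical two-table layout: a lookup table for the patterns of Table 2
# and an assay table that references it. All names are illustrative.
SCHEMA = """
CREATE TABLE response_pattern (
    pattern_id  INTEGER PRIMARY KEY,
    description TEXT NOT NULL
);
CREATE TABLE assay (
    assay_id    INTEGER PRIMARY KEY,
    test_agent  TEXT NOT NULL,
    target      TEXT,
    test_cells  TEXT NOT NULL,
    assay_type  TEXT NOT NULL,
    pattern_id  INTEGER NOT NULL REFERENCES response_pattern(pattern_id),
    value       REAL,   -- optional quantitative result
    unit        TEXT    -- unit of the quantitative result
);
"""

# The six qualitative response patterns listed in Table 2.
PATTERNS = [
    (1, "Up regulated, not detectable before treatment"),
    (2, "Up regulated, but detectable before treatment"),
    (3, "Down regulated"),
    (4, "Basal level of expression unchanged"),
    (5, "Not detected before or after treatment"),
    (6, "Variable depending on dose"),
]


def build_demo_db():
    """Create an in-memory database holding the schema and the pattern codes."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.executemany("INSERT INTO response_pattern VALUES (?, ?)", PATTERNS)
    return conn


def record_assay(conn, test_agent, target, test_cells, assay_type,
                 pattern_id, value=None, unit=None):
    """Store one assay; the quantitative value and unit may be omitted."""
    conn.execute(
        "INSERT INTO assay (test_agent, target, test_cells, assay_type,"
        " pattern_id, value, unit) VALUES (?, ?, ?, ?, ?, ?, ?)",
        (test_agent, target, test_cells, assay_type, pattern_id, value, unit),
    )


def responses_to(conn, test_agent):
    """Return (target, assay type, pattern description) for one test agent."""
    return conn.execute(
        "SELECT a.target, a.assay_type, p.description"
        " FROM assay AS a JOIN response_pattern AS p USING (pattern_id)"
        " WHERE a.test_agent = ? ORDER BY a.assay_id",
        (test_agent,),
    ).fetchall()


conn = build_demo_db()
# Placeholder row for illustration only; the pattern is not a reported result.
record_assay(conn, "drug B", "TNF-alpha", "human PBMC", "ELISA", pattern_id=3)
rows = responses_to(conn, "drug B")
```

Keeping the pattern descriptions in a lookup table means that a purely qualitative result, such as bands on a Northern blot, can be recorded with the value and unit left empty.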

In some cases, an assay measures a general biological property rather than a specific gene/protein target. For example, considerable public domain data exist on the growth inhibition of tumor cell lines (6). The CRD can handle this type of data without modification by defining the gene/protein target as "none" (a null default value) and the assay type as GI50 (for growth inhibition 50%). Finally, a figure and caption can be linked. Also, as noted above, additional information can be stored in text fields or obtained, if necessary, by linking to a manuscript.

In the final phase, a Microsoft Access database was developed and populated with data derived from several literature references. Several queries of this database are accessible through a CGI interface.
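The use of a null gene/protein target for assays such as GI50 has one practical consequence for queries: an inner join between assays and targets would silently drop those rows. The sketch below, again using Python's `sqlite3` module with hypothetical table names and placeholder rows, shows a left join that keeps them. It illustrates the idea only and does not reproduce the queries of the Access implementation.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE target (
    target_id INTEGER PRIMARY KEY,
    name      TEXT NOT NULL
);
CREATE TABLE assay (
    assay_id   INTEGER PRIMARY KEY,
    test_agent TEXT NOT NULL,
    target_id  INTEGER REFERENCES target(target_id),  -- NULL means "none"
    assay_type TEXT NOT NULL
);
-- Placeholder rows for illustration only.
INSERT INTO target VALUES (1, 'TNF-alpha');
INSERT INTO assay VALUES (1, 'drug B', 1, 'ELISA');
INSERT INTO assay VALUES (2, 'drug B', NULL, 'GI50');
""")


def assays_for(conn, test_agent):
    """List (assay type, target name) pairs, reporting a null target as 'none'."""
    return conn.execute(
        """
        SELECT a.assay_type, COALESCE(t.name, 'none')
        FROM assay AS a
        LEFT JOIN target AS t ON t.target_id = a.target_id
        WHERE a.test_agent = ?
        ORDER BY a.assay_id
        """,
        (test_agent,),
    ).fetchall()


rows = assays_for(conn, "drug B")

# For comparison: an inner join keeps only assays that have a target.
inner_count = conn.execute(
    "SELECT COUNT(*) FROM assay AS a JOIN target AS t ON t.target_id = a.target_id"
).fetchone()[0]
```

Here `rows` contains both the ELISA assay and the GI50 assay, whereas the inner join counts only the assay that has a target.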