[Frontiers in Bioscience 3, d973-984, September 1, 1998]
EVOLUTION AND PHYLOGENY OF DEFENSE MOLECULES ASSOCIATED WITH INNATE IMMUNITY IN HORSESHOE CRAB
Sadaaki Iwanaga and Shun-ichiro Kawabata
Department of Biology, Faculty of Science, Kyushu University, Fukuoka 812-8581, Japan
Received 7/13/98 Accepted 8/21/98
3. DEFENSE MOLECULES
3.1. Defense molecules found in horseshoe crab hemolymph
In horseshoe crabs, one of the major defense systems is carried by the hemolymph, which contains at least two types of hemocytes, granular and non-granular cells, distinguished by cell morphology (12). However, there appears to be effectively only one type of hemocyte in the systemic circulation of the adult intermolt animal, the so-called amoebocyte/granulocyte, since the non-granular hemocytes account for only 1% of the total cells. The granular hemocyte is an oval, plate-shaped structure, 15-20 µm in its longest dimension. It contains numerous dense granules classed into two types: large (L) and small (S) granules. The L-granules are larger (up to 1.5 µm in diameter) and less electron dense than the S-granules (0.6 µm in diameter).
Table 1 summarizes the proteins and peptides so far found in the hemocytes and hemolymph plasma (8,10). These components, including clotting factors, protease inhibitors, antibacterial substances, lectins and others, are closely associated with the host defense of this animal. Unlike mammalian blood plasma, horseshoe crab hemolymph plasma contains relatively few proteins, of which three are major: hemocyanin (13), C-reactive proteins (14), and α2-macroglobulin (15). On the other hand, the circulating hemocytes, which are filled with the two types of secretory granules (L and S), contain many kinds of defense proteins and peptides (6).
Table 1. Defense molecules found in hemocytes and hemolymph plasma of the horseshoe crab.
LICI, limulus intracellular coagulation inhibitor; LTI, Limulus trypsin inhibitor; LEBP-PI, Limulus endotoxin-binding protein-protease inhibitor; GNB, Gram-negative bacteria; GPB, Gram-positive bacteria; FN, fungus; LPS, lipopolysaccharide; LAF, Limulus 18-kDa agglutination-aggregation factor; KDO, 2-keto-3-deoxyoctonic acid; PC, phosphorylcholine; PE, phosphorylethanolamine; SA, sialic acid; TTA, Tachypleus tridentatus agglutinin; LCRP, Limulus C-reactive protein; TCRP, Tachypleus C-reactive protein; HLA, hemolytic activity; LTA, lipoteichoic acid; ND, not determined.
As is well known, the horseshoe crab hemocytes are extremely sensitive to lipopolysaccharides (LPS), a major cell wall component of Gram-negative bacteria, and respond to LPS-mediated stimulation by degranulating, that is, by releasing their granular components (12,16,17). This response is thought to be very important for host defense, which involves engulfing and killing invading microbes in addition to preventing the leakage of hemolymph. The L-granules contain at least 24 proteins, a majority of which are clotting factors, serpins, and various lectins. In contrast, the S-granules contain at least 6 proteins with molecular masses of less than 30 kDa, in addition to an antimicrobial peptide, tachyplesin, and its analogues (18-25). More recent studies on the S-granules demonstrate that they selectively store not only tachyplesins but also big defensin (26), tachycitin (27, 28), and tachystatins (28), all of which show antimicrobial activities against Gram-negative and Gram-positive bacteria and fungi (table 1). These components of the S-granules have recently been found to bind specifically to chitin, a polymer of N-acetyl-D-glucosamine, which is the main constituent of the helmet and exoskeleton of the horseshoe crab (Kawabata et al., unpublished data).
Figure 1 shows one defense system linked with the clotting cascade found in hemocytes of the horseshoe crab T. tridentatus (8). When Gram-negative bacteria invade the hemolymph, the hemocytes detect LPS on the bacterial surface and then release the contents of their granules through rapid exocytosis. The released granular components include two biosensors of the clotting reaction, factor C (29-35) and factor G (36-38). These serine protease zymogens are autocatalytically activated by LPS and (1,3)-β-D-glucan, respectively, the latter of which is a major cell wall component of fungi. The activation of these two zymogens triggers the clotting cascades, resulting finally in the conversion of coagulogen to an insoluble coagulin gel. In parallel with the process of coagulin gel formation, various agglutinins/lectins that induce cell aggregation are released from L-granules. Thus, the invaders in the hemolymph are engulfed and immobilized by the clot, and subsequently killed by antimicrobial substances that are also released from the two types of granules. The clot is softer than a mammalian fibrin clot.
Figure 1. Defense systems in horseshoe crab hemocytes. The hemocytes detect LPS on Gram-negative bacteria and initiate exocytosis of the large and small granules. The clotting factors thus released are activated by LPS or (1,3)-β-D-glucan on the pathogens, which results in hemolymph coagulation. The pathogens are then agglutinated by various lectins and subsequently killed by antibacterial substances. The large granules also contain protease inhibitors, such as serpins, α2-macroglobulin, and cystatin, and an azurocidin-like pseudoserine protease with antibacterial activity, named factor D.
3.2. Molecular evolution of coagulogen and coagulin
The clottable protein, coagulogen, has a functional similarity with vertebrate fibrinogen (39) and is known to play a central role in the horseshoe crab clotting system (figure 1). Unlike fibrinogen, the soluble precursor of this protein is absent from the plasma but is instead sequestered in L-granules of the hemocytes (12). Coagulogen consists of a 175-amino acid single basic polypeptide chain and has a calculated molecular weight of 19,700. This protein contains three segments, A chain, peptide C and B chain (40). The horseshoe crab clotting enzyme cleaves the Arg-Gly and Arg-Thr linkages, both located at the NH2-terminal region (41). The Arg-Gly linkage cleaved by the clotting enzyme is the same type as that cleaved by α-thrombin in the transformation of mammalian fibrinogen to fibrin. Moreover, the COOH-terminal octapeptide sequences of A chain and peptide C exhibit high sequence similarity, and their sequences are very similar to that of primate fibrinopeptide B (40).
The amino acid sequences of four coagulogens isolated from the hemocytes of American (L. polyphemus) (42), Japanese (T. tridentatus) (40,41), and two Southeast Asian (T. gigas and C. rotundicauda) horseshoe crabs (43,44) are shown in figure 2. Upon gelation of all the coagulogens by the clotting enzyme, peptide C (28 residues) is released from the inner portion of the parent molecules. The resulting gel consists of two chains of A (18 residues) and B (129 residues), bridged by two disulfide linkages. The overall sequences of the four coagulogens have considerable sequence similarity, particularly in the A and B chains, consisting of 147 residues (44). In these regions, 73-92% of the sequences contain the same residues in identical positions. In addition, 16 half-cystine residues of the four are in the same positions in the sequences. In contrast, the sequence similarity of the peptide C region is less than that of the A and B chain regions and the peptide C sequences between the four coagulogens are 43-57% identical (44).
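The segment lengths quoted above can be cross-checked with a few lines of arithmetic. The sketch below is only an illustration: it assumes the three segments are laid out in the order A chain, peptide C, B chain, as implied by the release of peptide C from the inner portion of the molecule, and the names `segments` and `residue_ranges` are hypothetical.

```python
# Segment lengths as stated in the text (residues).
segments = [("A chain", 18), ("peptide C", 28), ("B chain", 129)]


def residue_ranges(segs):
    """Return 1-based inclusive residue ranges for consecutive segments."""
    ranges = {}
    start = 1
    for name, length in segs:
        ranges[name] = (start, start + length - 1)
        start += length
    return ranges


ranges = residue_ranges(segments)
total = sum(length for _, length in segments)  # full coagulogen chain
gel_residues = 18 + 129                        # A + B chains left in coagulin

for name, (first, last) in ranges.items():
    print(f"{name}: residues {first}-{last}")
print("total residues:", total)
print("residues retained in the gel:", gel_residues)
```

Under these assumptions the lengths add up to the 175-residue chain, the A and B chains together give the 147 residues compared across species, and peptide C falls on residues 19 to 46.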
Figure 2. Amino acid sequence identities among Asian (T. t, T. g., C. r.) and American (L. p.) horseshoe crabs. Chemically identical residues in the sequences are bold faced. The bonds cleaved by horseshoe crab clotting enzyme are indicated by arrows. The region of residue No. 19 to No. 46 corresponds to the peptide C. T. t, Tachypleus tridentatus, T. g., Tachypleus gigas, C. r., Carcinoscorpius rotundicauda, L. p., Limulus polyphemus.
Two cDNA clones for coagulogen, named precoagulogen I and precoagulogen II, have been isolated from a cDNA library of T. tridentatus (45). The amino acid sequence encoded by the precoagulogen I cDNA agrees completely with that previously established by amino acid sequence analysis (41). The predicted amino acid sequence for precoagulogen II is in good agreement with the sequence determined by the protein-based study, except for the two residues at positions 82 and 86, both of which are identical to those found in coagulogen isolated from T. gigas, indicating that coagulogens are translated from at least two very similar but distinct mRNAs.
Based on the data on amino acid differences and minimum base substitutions among the sequences of coagulogens from the four species, the evolution rate and phylogenetic relationships have been estimated as follows: the mutation distance between Limulinae and Tachypleinae is approximately 67.3, that between T. gigas and the other Asian species is 25, and that between T. tridentatus and C. rotundicauda is 20. A cladogram (based on the mutation distances among coagulogen sequences) indicating the phylogeny for the living horseshoe crab species is illustrated in figure 3. These data agree with previous reports (48) that T. tridentatus and C. rotundicauda are phylogenetically more closely related to each other than to the other species of horseshoe crab. The fact that L. polyphemus is evolutionarily distant from the other three species is supported by the results of immunochemical studies on coagulogen (49,52) and sialic acid binding lectin (50). According to Shuster (51), divergence from the common ancestor of Limulinae and Tachypleinae occurred in the late Jurassic or early Cretaceous period, corresponding to 110-160 million years BP. The calculated divergence time between T. gigas and the other Asian species is 52.5 million years BP and that between T. tridentatus and C. rotundicauda is 36.3 million years BP. These values differ from those suggested by Shuster (51) and also from those estimated based on the amino acid sequences of peptide C of coagulogen (52). The approximate mutation rate per amino acid site in coagulogen per year (R) can be calculated by the method of Kimura using the following equation: $R = A / (13.5 \times 10^{7} \times B \times 2)$, where A is the average number of amino acid replacements separating Limulinae and Tachypleinae, and B is the number of amino acid residues in coagulogen. The value of $13.5 \times 10^{7}$ is the average time in years from the late Jurassic and early Cretaceous periods to the present.
Taking A and B as 54.7 and 175, respectively, R for coagulogen is $1.16 \times 10^{-9}$. This value is approximately half of that obtained for peptide C, $2.03 \times 10^{-9}$, as shown in table 2.
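The rate calculation can be reproduced numerically. The following is a minimal sketch using only the values quoted in the text. The function name `kimura_rate` is a hypothetical label, and the factor of 2 is read here as accounting for substitutions accumulating along both lineages since their divergence, which is an interpretation rather than something the text states.

```python
def kimura_rate(avg_replacements, n_residues, divergence_years):
    """Replacements per amino acid site per year: R = A / (T * B * 2)."""
    return avg_replacements / (divergence_years * n_residues * 2)


A = 54.7     # average replacements between Limulinae and Tachypleinae (text)
B = 175      # amino acid residues in coagulogen (text)
T = 13.5e7   # average divergence time in years (text)

R = kimura_rate(A, B, T)
R_peptide_c = 2.03e-9  # value quoted in the text for peptide C

print(f"R (coagulogen) = {R:.2e} per site per year")
print(f"ratio to peptide C = {R / R_peptide_c:.2f}")
```

With these inputs the rate rounds to the 1.16 x 10^-9 figure given above, and the ratio to the peptide C rate comes out somewhat above one half.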
Figure 3. Phylogenetic relationship of living horseshoe crab species based on mutation distances calculated from the amino acid sequences of coagulogens.
Table 2. Comparison of the mutation rate of various proteins
Recently, we have succeeded in resolving a crystal structure of T. tridentatus coagulogen and obtained new information not only about its conversion to coagulin gel, but also about its molecular evolution (46). The three-dimensional structure of coagulogen reveals an elongated molecule that embraces the helical peptide C region. Cleavage and removal of peptide C would expose an extended hydrophobic cove, which could interact with the hydrophobic edge of a second molecule, leading to a polymeric fiber. The most interesting finding is that the COOH-terminal half of the coagulogen molecule exhibits a striking topological similarity to the neurotrophin nerve growth factor (NGF), providing the first evidence for a neurotrophin fold in invertebrates. Sequence similarity between coagulogen and Spätzle, the Drosophila ligand of the receptor Toll, also suggests that the neurotrophin fold might be more ancient and widespread than previously realized (47). However, coagulogen shows no NGF activity, according to our own examinations (Iwanaga et al., unpublished results).
3.3. Molecular evolution of clotting zymogens and serpins
Recently, we reported the molecular mechanism of hemolymph coagulation in horseshoe crab and established a protease cascade system similar to that of the mammalian blood clotting system (4). As shown in figure 1, this cascade is based on three serine protease zymogens, factor C (34), factor B (53,54), and proclotting enzyme (55,56), and one clottable protein, coagulogen. In the presence of LPS, factor C is autocatalytically converted to an active form, factor C̄ (the overbar denotes the activated form). Factor B is then activated by factor C̄ and, in turn, its active form (factor B̄) converts proclotting enzyme to clotting enzyme. The resulting clotting enzyme cleaves two bonds in coagulogen to yield an insoluble coagulin gel (figure 2). The clotting cascade is also activated by (1,3)-β-D-glucan: factor G, which is sensitive to β-D-glucan, is autocatalytically converted to active factor Ḡ (37, 38), leading to activation of the proclotting enzyme (figure 1).
All four zymogens contain a serine protease domain at their COOH-terminal portions similar to those of mammalian clotting factors, except for factor G, which is a heterodimer composed of a 72-kDa subunit α and a 37-kDa subunit β (38). Subunit β is a serine protease zymogen, which is highly similar to factor B (40.5% amino acid sequence identity) and the proclotting enzyme (37.7% sequence identity). While the COOH-terminal serine protease domains of these four clotting factors resemble each other (57), their NH2-terminal portions display their own characteristic structures, indicating mosaic proteins probably derived from exon shuffling (figure 4). For example, the NH2-terminal region of factor C, which binds to LPS, contains five "sushi" (also called short consensus repeat, SCR, or complement control protein, CCP) domains, an epidermal growth factor (EGF)-like domain, and a C-type lectin-like domain (34). The fact that this initiator of the horseshoe crab clotting cascade contains "sushi" domains, which are found mainly in mammalian complement factors (58), leads us to speculate that the coagulation and complement systems may have evolved from a common origin.
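The order of events in the cascade can be summarized as a small lookup structure. This is only an illustrative sketch of the sequence described above, not a kinetic or mechanistic model: the dictionary `NEXT_STEP` and the function `trace_cascade` are hypothetical names, and zymogens and their activated forms are collapsed into single nodes for brevity.

```python
# Each key triggers or activates the component it maps to,
# following the order of events described in the text.
NEXT_STEP = {
    "LPS": "factor C",
    "(1,3)-beta-D-glucan": "factor G",
    "factor C": "factor B",
    "factor B": "proclotting enzyme",
    "factor G": "proclotting enzyme",
    "proclotting enzyme": "coagulogen",
    "coagulogen": "coagulin gel",
}


def trace_cascade(trigger):
    """Follow the cascade from a trigger to its final product."""
    path = [trigger]
    while path[-1] in NEXT_STEP:
        path.append(NEXT_STEP[path[-1]])
    return path


for trigger in ("LPS", "(1,3)-beta-D-glucan"):
    print(" -> ".join(trace_cascade(trigger)))
```

In this simplified view the two pathways converge at the proclotting enzyme: the LPS route passes through factor C and factor B, while the glucan route reaches it through factor G alone.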
Figure 4. Domain structures of horseshoe crab clotting factors. The arrowheads indicate cleavage sites of zymogen activation. The potential carbohydrate attachment sites are indicated by closed diamonds.
Factor B and proclotting enzyme are similar to each other in domain structure (54). In addition to the COOH-terminal protease domain, both clotting factors contain a "clip"-like domain (formerly called the "disulfide-knotted" domain) in the NH2-terminal light chain (56) as shown in figure 4, and this portion shows sequence similarity to the NH2-terminal light chain of two Drosophila proteins, the serine protease precursors easter and snake (47). Both easter and snake proteins are indispensable for normal embryogenesis in the flies. The presence of this type of domain in Drosophila strongly suggests the existence of a protease cascade similar to that in horseshoe crab (57).
The molecular phylogenetic tree for many serine proteases originating from various sources is shown in figure 5. The positions of horseshoe crab clotting factors (T. tridentatus) are far away from various members of the serine protease superfamily, including pancreatic trypsin/chymotrypsin, mast cell-derived tryptase/chymase, fibrinolytic plasmin/urokinase, vitamin K-dependent clotting factors, and bacterial serine proteases. Phylogenetically, horseshoe crab clotting factors except for factor C could well be placed near the positions of the snake and easter serine proteases mentioned above. Factor C, on the other hand, is classified into the branch of the family consisting of complement factors, C1r and C1s, in addition to vitamin K-dependent clotting factors, such as protein C and prothrombin. Thus, factor C, the initiator of the horseshoe crab clotting cascade, is a newly discovered type of serine protease zymogen, a "coagulation-complement factor," which may play a critical role both in hemostasis and in defense mechanisms (figure 1).
Figure 5. Phylogenetic tree of various serine proteases from vertebrate and invertebrate animals. D. Drosophila.
To date, three types of serpins (serine protease inhibitors) have been isolated from T. tridentatus hemocytes, called limulus intracellular coagulation inhibitors, LICI-1 (59), LICI-2 (60), and LICI-3 (61). All LICIs form stable complexes with target serine proteases, as do mammalian serpins. Of the three serpins, LICI-1 specifically inhibits activated factor C (factor C̄), whereas both LICI-2 and LICI-3 inhibit factor C̄, factor Ḡ, and the clotting enzyme activities. LICI-2 most strongly inhibits the clotting enzyme among the three proteases (60), and LICI-3 favours factor Ḡ (61). All inhibitors are stored in L-granules and are exocytosed upon activation of the cells. Thus, these inhibitors, like mammalian serpins, are likely to function as regulators of the clotting cascade and to prevent diffusion of the active clotting factors, which may cause unnecessary clot formation.
The phylogenetic tree for 14 serpins is shown in figure 6. The positions of the three LICIs in the tree are far away from mammalian plasma serpins, such as antithrombin III, α1-antitrypsin, and antichymotrypsin (60). Even the invertebrate serpins, antitrypsin and antichymotrypsin 1 from the larval hemolymph of Bombyx mori, are located far apart from LICIs. LICIs could well occupy the position of the branch containing ovalbumin and intracellular serpins, such as plasminogen activator inhibitor type 2, rather than the insect serpins (62). Nevertheless, the serpin superfamily arose from a primitive ancestral serine protease inhibitor, the Archeserpin, some 500 million years BP (63). Therefore, the sequence similarity between LICIs and mammalian serpins (approximately 32-34% identity) suggests that serpins are ancient and structurally and functionally conserved proteins.
Figure 6. Phylogenetic tree of 13 serpin sequences, including LICI-1, LICI-2, and LICI-3. Abbreviations for proteins are: A1AT_HUMAN, α1-antitrypsin human; A1AT_BOMMO, antitrypsin Bombyx mori; PAI1_HUMAN, PAI-1 human; ACH1_BOMMO, antichymotrypsin 1, Bombyx mori; LEI_HUMAN, HORSE; and porcine leukocyte elastase inhibitors; PTI, placental thrombin inhibitor; PAI-2, plasminogen activator inhibitor type 2; OVAL_CHICK, ovalbumin chicken.
3.4. Molecular phylogeny of big defensin and its derivatives
Hemocytes of the horseshoe crab T. tridentatus contain a family of arthropodous peptide antibiotics, termed the tachyplesin family (19-22), and an antibacterial protein, called anti-LPS factor (64-69), of which the former is located in S-granules and the latter in L-granules of the hemocytes (18). In our ongoing studies on granular components, we have recently identified a novel defensin-like substance present in both L- and S-granules (table 1). This substance strongly inhibits the growth of Gram-negative and Gram-positive bacteria, and fungi, such as Candida albicans (26). The isolated substance, termed "big defensin", consists of 79 amino acid residues, of which the COOH-terminal 37 residues have a sequence similar to those of mammalian neutrophil-derived defensins, especially rat defensin (26). Horseshoe crab big defensin, however, is distinct from the mammalian defensins in molecular size, the latter commonly consisting of 29-34 amino acid residues (70). It is noteworthy that the disulfide motif in big defensin is identical to that of β-defensins from bovine neutrophils but not to that of classical defensins including rat NP-2 defensin (71,72). Furthermore, the structural organization of big defensin differs markedly from those of insect defensins, not only in disulfide bridge locations, but also in molecular size (72). Insect defensins isolated from various species, such as Phormia terranovae (73) and Sarcophaga peregrina (74,75), are cationic peptides of 34 to 43 residues, all containing six cysteines that form three intramolecular disulfide bridges. Therefore, the overall structure of T. tridentatus big defensin is unique in consisting of a polypeptide with an NH2-terminal extension. A new isoform of defensin has also been found in bovine tracheal mucosa (77) and in Paneth cells of the human small intestine (78).
The size, basic charge and three intramolecular disulfide bonds of these mature defensins are similar to those isolated from mammalian circulating phagocytic cells (76).
Recently, we found that the folding pattern of the three disulfide bridges of the clip domain in both factor B and the clotting enzyme mentioned above is identical to that of big defensin (8). Since the clip domains constitute the hinge region susceptible to protease attack, these clip domains may be released through the activation of the zymogens to act as antimicrobial agents. If this is the case, the clotting cascade itself could produce antimicrobial substances during activation. The system may thus have dual actions, that is, coagulation and the killing of invaders, suggesting a close association of the clotting cascade with host defense.
3.5. Molecular phylogeny of horseshoe crab transglutaminase (TGase)
It is well known that in the mammalian clotting system, the fibrin clot generated via the cascade reaction is finally cross-linked to form a huge fibrin network as well as being cross-linked to other proteins, such as fibronectin, α2-macroglobulin, and α2-plasmin inhibitor (79). This final step is catalyzed by a plasma TGase, called factor XIIIa, and the cross-linking of fibrin with itself and with other proteins is essential for normal hemostasis and wound healing (80,82). Among various TGases, type I TGase is known to be a key enzyme in the keratinization and terminal differentiation of epidermal keratinocytes (81). On the other hand, type II TGase is thought to have many functions, including cell attachment, programmed cell death, and others (82).
Horseshoe crab TGase isolated from the hemocytes of T. tridentatus consists of 764 amino acid residues with a molecular mass of 87 kDa (83). Similar to guinea pig liver TGase, the horseshoe crab TGase does not exist as a zymogen analogous to plasma factor XIII (84). Its enzymatic properties, such as Ca2+ dependence and inhibitor spectrum, are also very similar to those of guinea pig liver TGase, indicating that horseshoe crab TGase belongs to the type II (cytosolic) TGases. However, there is a clear difference between the horseshoe crab TGase and guinea pig liver TGase, in that the former is sensitive to chloride ions, while the latter is insensitive. Furthermore, the horseshoe crab TGase activity is not inhibited by GTP, whereas the liver enzyme is sensitive to GTP (83). In invertebrates, the hemolymphs of horseshoe crab, lobster, sand crab, and sponge have all been reported to contain TGase activity (85), but to our knowledge, there is no report of the purification and characterization of an invertebrate plasma TGase, except for T. tridentatus TGase.
The amino acid sequence of horseshoe crab TGase shows significant similarity with members of the mammalian TGase family, as follows (84): guinea pig liver TGase (32.3%), human keratinocyte TGase (37.6%), human factor XIIIa subunit (34.7%), and human erythrocyte band 4.2 (23%). Horseshoe crab TGase has a unique NH2-terminal cationic extension of 60 residues with no similarity to the NH2-termini of mammalian TGases (84). As shown in figure 7, a phylogenetic tree representing evolutionary relationships among the family members was inferred by the neighbor-joining method. The phylogenetic tree reveals four distinct clusters consisting of the keratinocyte TGase group (type I), the endothelial/macrophage/liver TGase group including erythrocyte band 4.2 (type II), the factor XIII subunit a group (type III), and the invertebrate TGase group (type IV). The branching pattern of the type II TGase suggests a very recent divergence of endothelial TGase, macrophage TGase, and liver TGase.
Figure 7. Phylogenetic tree of the TGase family. This tree was inferred by the neighbor-joining method. Numbers at nodes "b" and "c" indicate bootstrap probabilities that two lineages are clustered at b and c, respectively. Branch lengths are proportional to the numbers of accumulated amino acid substitutions.