[Frontiers in Bioscience 3, d509-516, May 26, 1998]
NATURAL SELECTION AND THE EVOLUTIONARY HISTORY OF MAJOR HISTOCOMPATIBILITY COMPLEX LOCI

Austin L. Hughes and Meredith Yeager

Department of Biology and Institute of Molecular Evolutionary Genetics, The Pennsylvania State University, University Park PA 16802 USA

Received 5/11/98 Accepted 5/22/98

3. BALANCING SELECTION AT MHC LOCI

3.1. Explaining MHC Polymorphism

There are four independent lines of evidence that polymorphic MHC loci are subject to balancing selection: (i) The distribution of allelic frequencies does not fit the neutral expectation. (ii) The rate of nonsynonymous nucleotide substitution significantly exceeds the rate of synonymous substitution in the codons encoding the peptide-binding region of the molecule. (iii) Polymorphisms at certain MHC loci have been maintained for long periods of time, sometimes predating speciation events. (iv) Introns have been homogenized relative to exons over evolutionary time, as expected when balancing selection acts to maintain diversity in the latter but not the former. Here we briefly discuss each of these lines of evidence.

Under selective neutrality, polymorphism is a transient phenomenon in evolutionary terms (4). For neutral alleles that become fixed, the average time to fixation is 4Ne generations, where Ne is the effective population size (5). At most selectively neutral loci, we expect to see one allele at high frequency and one or a few others at lower frequency. This occurs because most alleles other than the most frequent allele represent new mutations that are still at a low frequency or older alleles that are on the way to extinction. In the case of the polymorphic MHC loci, however, there are a large number of alleles of intermediate frequency (6). Such an allelic distribution is inconsistent with selective neutrality but easily explained under balancing selection (6).
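The contrast between a skewed and an even allele-frequency distribution can be made concrete with a small sketch. It computes the sum of squared allele frequencies (often called expected homozygosity, F), which is high when one allele dominates and low when many alleles sit at intermediate frequency. The samples and function names below are invented for illustration and are not data from the studies cited above. A formal test would compare F with its neutral expectation for the given sample size and number of alleles, and that step is not implemented here.

```python
from collections import Counter


def allele_frequencies(alleles):
    """Return {allele: frequency} for a sample of allele labels."""
    counts = Counter(alleles)
    total = sum(counts.values())
    return {allele: n / total for allele, n in counts.items()}


def expected_homozygosity(freqs):
    """Sum of squared allele frequencies (F)."""
    return sum(p * p for p in freqs.values())


# Hypothetical samples of 20 gene copies with four alleles each.
# One common allele plus rare ones, as expected at many neutral loci:
skewed = ["a1"] * 17 + ["a2"] + ["a3"] + ["a4"]
# Four alleles at equal, intermediate frequency:
even = ["a1"] * 5 + ["a2"] * 5 + ["a3"] * 5 + ["a4"] * 5

f_skewed = expected_homozygosity(allele_frequencies(skewed))
f_even = expected_homozygosity(allele_frequencies(even))
print(f"skewed F = {f_skewed:.3f}, even F = {f_even:.3f}")
```

With the same number of alleles, the even sample gives a much lower F than the skewed one, which is the direction of departure described for polymorphic MHC loci.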
The polymorphism of MHC class I loci was initially discovered through these genes’ role in transplant rejection, and at first the function of the molecules they encode was unknown. Therefore, in proposing hypotheses to account for MHC polymorphism, biologists tried to account for the molecules’ function as well. Because the self-incompatibility genes of plants were the only other system known with such high polymorphism, Thomas (7) came up with the idea that the MHC might be a kind of self-incompatibility system for animals. This led to the ideas that MHC molecules might function in maternal-fetal interaction or in mate choice. Even though there has never been good experimental evidence for either of these ideas (8,9), they continue to attract adherents.

Ironically, right around the same time as Thomas’s exercise in mistaken analogy, Zinkernagel and Doherty (10) provided solid evidence for the MHC’s real function, namely presenting peptides to T cells (11). Soon afterward, Doherty and Zinkernagel (12) presented the first hypothesis to account for MHC polymorphism in terms of the molecules’ immune function, which they based on their observation that different MHC allelic products differ with respect to the antigens (i.e., peptides) they can bind and present to T cells. To re-express their argument in terms of our current understanding of MHC function, they proposed that since different allelic products bind different sets of peptides, a heterozygote at all or most MHC loci will have an advantage in a population exposed to more than one pathogen. Because a heterozygote can bind a wider array of peptides, it will be able to deal with a broader array of pathogens. These pathogens might be different species or antigenically distinct strains of the same species (13). Thus, Doherty and Zinkernagel proposed that MHC polymorphism is maintained by heterozygote advantage (overdominant selection).
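To see why heterozygote advantage maintains polymorphism, consider the standard one-locus, two-allele textbook model, sketched below. It is an illustration only and is not a model taken from the references above. The fitnesses (1 - s for A1A1, 1 for A1A2, 1 - t for A2A2) and the values of s and t are assumptions chosen for the example. Under this model the frequency of A1 moves toward the interior equilibrium t / (s + t) whether A1 starts rare or common, so neither allele is lost.

```python
def next_frequency(p, s, t):
    """One generation of selection on the frequency p of allele A1.

    Assumed fitnesses: A1A1 = 1 - s, A1A2 = 1, A2A2 = 1 - t (random mating).
    """
    q = 1.0 - p
    mean_fitness = p * p * (1 - s) + 2 * p * q + q * q * (1 - t)
    return (p * p * (1 - s) + p * q) / mean_fitness


def iterate(p0, s, t, generations):
    """Apply next_frequency repeatedly, starting from frequency p0."""
    p = p0
    for _ in range(generations):
        p = next_frequency(p, s, t)
    return p


# Hypothetical selection coefficients against the two homozygotes.
s, t = 0.1, 0.2
p_equilibrium = t / (s + t)

p_from_rare = iterate(0.01, s, t, 2000)
p_from_common = iterate(0.99, s, t, 2000)
print(p_equilibrium, p_from_rare, p_from_common)
```

Both starting points converge on the same intermediate frequency, in contrast to the neutral case, where one allele is eventually fixed by drift.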
By now, there is abundant evidence from direct sequencing of peptides bound by both class I and class II molecules that MHC allelic products do indeed differ strikingly with respect to their peptide-binding specificities (14). The hypothesis of overdominant selection itself, however, will be very difficult to test with population studies. In outbred species such as human and mouse, the vast majority of individuals are heterozygous at MHC loci. Thus, a conventional population approach to testing for heterozygote advantage, in which fitnesses of homozygotes and heterozygotes are compared in a natural population, is unlikely to be practical.

An alternative approach to testing for natural selection at MHC loci makes use of the evolutionary information in DNA sequence comparisons (9,15-17). In most genes, the number of synonymous (or silent) nucleotide substitutions per synonymous site (dS) exceeds the number of nonsynonymous (or replacement) substitutions per nonsynonymous site (dN). This occurs because most nonsynonymous mutations are deleterious to protein structure and thus are quickly eliminated by so-called purifying selection. Synonymous substitutions, because they do not change the amino acid, are likely to be selectively neutral or nearly so. On the other hand, if natural selection favors amino acid replacements in a certain protein region, dN can exceed dS. Overdominant selection is expected to produce such a pattern of nucleotide substitution because this type of selection accelerates the rate of amino acid replacement (18), and certain other types of balancing selection may have a similar effect. In the case of both class I and class II MHC, estimation of rates of nucleotide substitution revealed that dN is significantly greater than dS in the codons that encode the peptide-binding region (PBR) of the molecule, whereas in the remainder of the molecule dS exceeds dN, as is true of most genes (9,15-17).
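The distinction between synonymous and nonsynonymous differences can be illustrated with a minimal sketch that classifies codon differences between two aligned coding sequences using the standard genetic code. This is a simplification and not the estimation method used in the studies cited above: it returns raw counts, and it skips codons that differ at more than one position. Full estimators of dS and dN also divide by the numbers of synonymous and nonsynonymous sites and correct for multiple substitutions at the same site. The function name and the toy sequences are hypothetical.

```python
from itertools import product

# Standard genetic code, with codons enumerated in TCAG order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def count_codon_differences(seq1, seq2):
    """Count (synonymous, nonsynonymous) single-base codon differences.

    Both sequences must be aligned, in frame, gap-free, and the same length.
    Codons differing at two or three positions are skipped in this sketch.
    """
    if len(seq1) != len(seq2) or len(seq1) % 3 != 0:
        raise ValueError("sequences must be equal-length multiples of 3")
    synonymous = nonsynonymous = 0
    for i in range(0, len(seq1), 3):
        codon1, codon2 = seq1[i:i + 3], seq2[i:i + 3]
        n_diff = sum(a != b for a, b in zip(codon1, codon2))
        if n_diff != 1:
            continue  # identical codons, or multiple changes (not handled)
        if CODON_TABLE[codon1] == CODON_TABLE[codon2]:
            synonymous += 1
        else:
            nonsynonymous += 1
    return synonymous, nonsynonymous


# Toy example: GCT -> GCC keeps alanine; AAA -> AGA changes lysine to arginine.
result = count_codon_differences("ATGGCTAAA", "ATGGCCAGA")
print(result)
```

Applied separately to the PBR codons and to the rest of a gene, counts like these are the raw material for the dN versus dS comparison described above.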
Figures 1 and 2 illustrate this pattern of nucleotide substitution in the case of human class I and class II MHC genes. Such analyses provide strong evidence that MHC polymorphism is selectively maintained. They also demonstrate that the selection maintaining this polymorphism is focused on the PBR. Thus they support Doherty and Zinkernagel’s (12) hypothesis that the main force driving the selection is the advantage conferred by being able to bind a variety of peptides and thus to resist a variety of pathogens.

Figure 1. Mean numbers of nucleotide substitutions per synonymous (dS) and per nonsynonymous (dN) site (50,51) at the human class I MHC loci HLA-A (A), HLA-B (B), and HLA-C (C). Tests of the hypothesis that dS = dN: * P < 0.05; *** P < 0.001.

Figure 2. Mean numbers of nucleotide substitutions per synonymous (dS) and per nonsynonymous (dN) site (50,51) at the human class II MHC loci HLA-DRB1, HLA-DQB1, and HLA-DPB1. Tests of the hypothesis that dS = dN: *** P < 0.001.

3.2. Trans-species Polymorphism

One of the characteristics of MHC loci is the existence of so-called trans-species polymorphism (19-21). In other words, MHC polymorphisms have been maintained for long periods of time, often pre-dating speciation events. Figure 3 shows an example of this phenomenon in the case of the DQA1 locus of primates, which encodes a class II MHC chain. In this tree the human alleles HLA-DQA*0101, -DQA*0102, and -DQA*0103 cluster with certain chimpanzee alleles, and these alleles in turn cluster with alleles from gibbon and rhesus monkey (figure 3). Certain other human alleles, for example HLA-DQA*0501, cluster with other chimpanzee alleles (figure 3). The phylogenetic tree thus indicates that there are allelic lineages at the human DQA1 locus that have persisted since before human and chimpanzee diverged (5-7 million years ago) or even since before human and Old World monkeys diverged (about 25 million years ago).
Because selectively neutral polymorphisms are not expected to be maintained for long periods of time, the long persistence of MHC polymorphisms is evidence that they are selectively maintained (22).

Figure 3. Phylogenetic tree (52) of DQA1 alleles based on the proportion of amino acid difference (p) in the α1 domain. Species used are human (HLA-), chimpanzee (Patr-), gorilla (Gogo-), gibbon (Hyla-), rhesus monkey (Mamu-), crab-eating macaque (Mafa-), red-faced stump-tailed macaque (Maar-), bovine (Bota-), dog (Cafa-), and mouse (H-2). PIR database accession numbers are in parentheses. Numbers on branches represent percentages of 1000 bootstrap replicates (derived by pseudosampling sites with replacement) that supported a given branch; only values >50% are shown.

Computer simulations by Takahata and Nei (22) showed that polymorphisms could be maintained for long periods of time both by overdominant selection and by one model of frequency-dependent selection, which those authors called "minority advantage." In this model, it was assumed that a genotype has an advantage when it becomes rare in a population. Mathematically, this model is very similar to that of overdominant selection, but biologically it is rather different. Consider a parasite species that is well controlled by members of its host species having a particular MHC allele (A1) whose product can bind and present a peptide from the parasite. Under selective pressure from this parasite, the A1 allele will increase in frequency. Suppose a mutation occurs in the parasite so that the same peptide is no longer bound by the A1 allelic product, giving the parasite an advantage. Then suppose a mutation occurs in the host, giving rise to a new MHC allele (A2), which can bind and present another peptide from the same parasite. The A2 allele will now have an advantage and increase in frequency.
When the A2 allele becomes common and the A1 allele scarce, the minority advantage model requires that the A1 allele again have a selective advantage. But it is hard to see how this will happen in the case of the MHC. If the parasite mutates again so that the peptide bound by the A2 allelic product is no longer bound, what likelihood is there that it will mutate in such a way that the parasite now contains a peptide bound by the A1 allelic product? Thus, the minority advantage model does not seem applicable to the MHC. For this reason, it seems likely that selection at MHC loci is overdominant, as originally proposed by Doherty and Zinkernagel (12).

3.3. Introns of Class I Genes

Recently, sequences of introns from the human class I MHC loci HLA-A, -B, and -C have become available (23). Comparison of these introns with the adjoining exons reveals an additional line of evidence that polymorphisms at these loci are maintained by balancing selection. Figure 4 shows the proportion of nucleotide difference in a sliding window of 30 aligned base pairs along the sequence of alleles at the human class I loci HLA-A, -B, and -C in the region from intron 1 to exon 4. Exons 2 and 3 encode the α1 and α2 domains, respectively, of the protein. The α1 and α2 domains include the codons of the PBR, and most polymorphism is found in these exons; exon 4 encodes the conserved α3 domain. The exons, particularly exons 2 and 3, show substantially more sequence divergence than do the introns, particularly the long third intron.

Figure 4. Proportion of nucleotide difference (p) in a sliding window of 30 base pairs in all pairwise comparisons among HLA-A, -B, and -C alleles. Horizontal bars indicate the positions of exons 2, 3, and 4. From ref. 23.

This pattern of nucleotide divergence in exons and introns closely fits what is expected under balancing selection (24-26).
As first shown by Hughes and Nei (15), the balancing selection at these loci operates on nonsynonymous sites in the PBR codons. Being closely linked to these sites, other sites in exons 2 and 3 are expected to "hitch-hike" along with these selected sites. Thus the polymorphism in these exons will be relatively ancient. Introns, whose linkage to the PBR codons is less tight, will have less ancient polymorphism than exons because introns have been homogenized relative to the exons by recombination and subsequent genetic drift (23). The fact that the long third intron is the most homogenized among alleles (figure 4) provides support for this model because linkage between sites in this intron is expected to be less than in the shorter introns 1 and 2 (23).
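The sliding-window comparison behind Figure 4 can be sketched as follows. For each window position, the code computes the proportion of differing sites (p) for every pair of aligned sequences and averages over pairs. Averaging across pairs, the function names, and the toy alignment are assumptions made for this illustration, since the exact procedure of ref. 23 is not described here. Alignment gaps are not handled.

```python
from itertools import combinations


def window_p(seq1, seq2, start, window):
    """Proportion of differing sites between two sequences in one window."""
    a = seq1[start:start + window]
    b = seq2[start:start + window]
    return sum(x != y for x, y in zip(a, b)) / window


def sliding_window_p(seqs, window=30):
    """Mean pairwise p for each window start along an alignment.

    seqs: list of at least two aligned, gap-free sequences of equal length.
    """
    if len(seqs) < 2:
        raise ValueError("need at least two sequences")
    length = len(seqs[0])
    if any(len(s) != length for s in seqs):
        raise ValueError("sequences must be aligned to the same length")
    pairs = list(combinations(seqs, 2))
    profile = []
    for start in range(length - window + 1):
        values = [window_p(a, b, start, window) for a, b in pairs]
        profile.append(sum(values) / len(values))
    return profile


# Toy alignment (invented): a conserved stretch followed by a variable one.
toy = ["AAAAAAAA", "AAAAAATT", "AAAAAACC"]
profile = sliding_window_p(toy, window=4)
print(profile)
```

In a real alignment spanning intron 1 to exon 4, a profile like this would be expected to rise over exons 2 and 3 and fall over the introns, which is the pattern the text describes.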