[Frontiers in Bioscience 3, d408-418, March 27, 1998]
DNA INVERTED REPEATS AND HUMAN DISEASE
The Children's Hospital Research Foundation, Cincinnati, Ohio
7. ROLE OF INVERTED REPEATS IN MUTAGENESIS
7.1. Palindrome mediated mutagenesis
Regions of alternative DNA secondary structure pose a barrier to replication fidelity. Work in Escherichia coli has demonstrated that the extruded cruciform facilitates frameshift mutations by bringing the DNA slippage sites (direct repeats) into close proximity (45,44). This can lead to deletion as well as insertion mutations. Figure 3A illustrates duplex DNA depicted by a wide and a narrow line. The inverted repeat is illustrated by the opposing arrows. Replication of the strand is interrupted when the replication fork encounters the extruded cruciform (figure 3B). The extrusion of the stem-loop structure brings the distant DNA direct repeats into close proximity. In panels C and D the replication fork pauses, the 5' direct repeat sequence melts, and the nascent strand reanneals on the direct repeat 3' to the extruded sequence. Replication continues, resulting in a heteroduplex with one non-mutant strand and one deletion mutant strand. The deletion can be made permanent by another round of DNA replication. In figure 4, one mechanism for an insertion mutation is illustrated: the inverted repeat in the copied strand folds into a cruciform structure and the replicated 3' direct repeat comes into register over the 5' complementary sequence, allowing continued DNA synthesis that results in a duplication.
Figure 3: Deletion mutation of DNA mediated by an inverted repeat. A. Two complementary strands of DNA are depicted by two lines of differing thickness. The opposing arrows mark the site of the inverted repeat. A direct repeat sequence must be present at either end of the inverted repeat for slippage to occur. B. After the two strands are separated, the replication fork encounters the extruded stem-loop structure in the thick strand. C. The nascent strand and polymerase partially dissociate from the template strand. The nascent strand has copied the 5' direct repeat. D. The nascent strand slips forward over the stem-loop structure and comes into register with the 3' direct repeat. E. Continued polymerization excludes the sequence from the hairpin structure.
Figure 4: Insertion mutation of DNA mediated by an inverted repeat. A. Two complementary strands of DNA are depicted by two lines of differing thickness. The opposing arrows mark the site of the inverted repeat. A direct repeat sequence must be present at either end of the inverted repeat for slippage to occur. B. Replication passes through the inverted repeat and through the 3' direct repeat. C. The nascent strand and polymerase dissociate from the template strand and the hairpin structure forms in the nascent strand. The 3' direct repeat anneals to the complementary bases for the displaced 5' direct repeat. The template strand fails to form the stem-loop structure. D. Continued polymerization creates a duplication.
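The two slippage outcomes described for figures 3 and 4 can be sketched in a few lines of Python. This is a toy model under stated assumptions, not a simulation of replication: the template sequence, the direct repeat AGTCA, and the function names are hypothetical and chosen only for illustration. The model assumes the template contains exactly two copies of the direct repeat, one on each side of the inverted repeat.

```python
# Toy model of slipped mispairing between direct repeats that flank an
# inverted repeat. All sequences and names here are hypothetical.

COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def revcomp(seq):
    """Return the reverse complement of a DNA sequence."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def slippage_products(template, direct_repeat):
    """Return (deletion, duplication) for a template carrying two copies
    of direct_repeat."""
    first = template.find(direct_repeat)
    if first < 0:
        raise ValueError("direct repeat not found in template")
    second = template.find(direct_repeat, first + len(direct_repeat))
    if second < 0:
        raise ValueError("template needs two copies of the direct repeat")
    # One repeat copy plus everything between the two copies.
    unit = template[first:second]
    # Forward slip (figure 3): the unit is skipped, one repeat copy remains.
    deletion = template[:first] + template[second:]
    # Backward realignment (figure 4): the unit is copied a second time.
    duplication = template[:second] + unit + template[second:]
    return deletion, duplication


stem = "GGCCTC"
inverted_repeat = stem + "TTTT" + revcomp(stem)  # stem, loop, opposing stem
direct_repeat = "AGTCA"
template = "TT" + direct_repeat + inverted_repeat + direct_repeat + "AA"

deletion, duplication = slippage_products(template, direct_repeat)
print("template:   ", template)
print("deletion:   ", deletion)
print("duplication:", duplication)
```

In this sketch the deletion product retains a single copy of the direct repeat and loses the inverted repeat, while the duplication product carries the inverted repeat twice, each copy bounded by direct repeats.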
7.2. Quasipalindrome mediated mutagenesis
In theory, imperfect palindromes can mediate the same kinds of mutations as perfect palindromes and, in addition, can engage in intra- and interstrand switching, further increasing the potential mutation spectrum (46, 47).
Intrastrand switching requires the polymerase to replicate through the center of the palindrome and then, together with the nascent chain, dissociate from the template strand. The nascent chain anneals to itself to form a hairpin structure, and polymerization continues on this intramolecular template (48, 49). This situation is transient: following additional polymerization, the nascent strand and polymerase switch back to the original template strand. If the template is a quasipalindrome, intrastrand switching will convert the quasipalindrome into a perfect inverted repeat. The mutation can then be fixed by DNA repair or a second round of replication. Figure 5 illustrates such a process. Figure 5C is the first strand switch. The mutation arising from intramolecular templating is illustrated in blue. The second strand switch occurs in figure 5D.
Figure 5: Intrastrand switching. A. Two complementary strands of DNA are depicted by two lines of differing thickness. The opposing arrows mark the site of the imperfect inverted repeat. The blue dots illustrate the sites where the base pairs differ between the inverted repeats. B. The DNA is polymerized through the center of the inverted repeat. C. The polymerization machinery and nascent strand dissociate from the template strand and the nascent strand forms a stem-loop structure, creating an alternative template for DNA synthesis. D. Synthesis of the DNA occurs. The blue line represents the mutation arising from intrastrand templating. E. The nascent strand and polymerization machinery again dissociate and anneal back to the original template, and synthesis continues. The mismatch in blue may be removed by DNA repair, or may be fixed in the genome by a second round of replication, resulting in one original sequence and one mutant.
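The conversion of a quasipalindrome into a perfect inverted repeat by intrastrand switching can be illustrated with a minimal Python sketch. It rests on simplifying assumptions that are not part of the original description: the nascent strand is written 5' to 3' as first arm, loop, second arm; the whole second arm is templated from the first arm (in practice only part of it may be); and the two arms have equal length, so insertions and deletions are not modeled. The sequences and the function name intrastrand_switch are hypothetical.

```python
# Toy model of an intrastrand template switch on a quasipalindrome.
# All sequences and names here are hypothetical.

COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def revcomp(seq):
    """Return the reverse complement of a DNA sequence."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def intrastrand_switch(arm, loop, imperfect_arm):
    """Return the nascent strand after the second arm has been copied from
    the first arm, plus the (position, original, templated) differences."""
    if len(arm) != len(imperfect_arm):
        raise ValueError("this sketch only handles arms of equal length")
    # The folded-back nascent strand uses its own first arm as template,
    # so the second arm becomes the exact reverse complement of the first.
    templated_arm = revcomp(arm)
    changes = [
        (i, old, new)
        for i, (old, new) in enumerate(zip(imperfect_arm, templated_arm))
        if old != new
    ]
    return arm + loop + templated_arm, changes


arm = "GCATCG"
loop = "TTT"
imperfect_arm = "CGACGC"  # differs from revcomp(arm) at one position

product, changes = intrastrand_switch(arm, loop, imperfect_arm)
print("before switch:", arm + loop + imperfect_arm)
print("after switch: ", product)
print("changes:      ", changes)
```

The list of changes corresponds to the positions drawn in blue in figure 5: each is a base that differs between the original quasipalindrome and the perfectly complementary product.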
Interstrand switching is similar to intrastrand switching in that the nascent strand and polymerase separate from the template strand. In an interstrand switching event, however, the nascent strand switches to the other DNA strand being replicated (46). Figure 6 illustrates an interstrand switching event. Figure 6A again represents complementary strands including an inverted repeat. In figure 6B, replication occurs into one half of the inverted repeat. Figure 6C illustrates the first interstrand switch followed by polymerization in figure 6D, and then a second switch and continued polymerization in figure 6E. Both types of switching have been demonstrated to occur in vivo (46, 47).
Figure 6: Interstrand switching. A. Two complementary strands of DNA are depicted by two lines of differing thickness. The opposing arrows mark the site of the inverted repeat. B. The DNA is polymerized into the inverted repeat. Unlike intramolecular switching, in intermolecular switching the DNA synthesis need not be through the center of symmetry. C. The nascent strand switches its template strand to the complementary strand. D. The nascent strand elongates by DNA polymerization on the complementary strand. E. The nascent strand switches back to the original template strand and continues synthesis. Similar mutations can arise from both intra- and interstrand switch mechanisms.
7.3. Diseases Associated with Imperfect Inverted Repeats
7.3.1. Hereditary Angioneurotic Edema
There is strong evidence for imperfect inverted repeat mediated mutagenesis resulting in the human disease hereditary angioneurotic edema. This autosomal dominant disease is caused by mutations that reduce production of functional C1 inhibitor (50). C1 inhibitor is a pivotal regulatory protein in inflammation and regulates proteins involved in the complement, contact, and clotting cascades. When insufficient amounts of functional inhibitor are available for regulation, bradykinin (and possibly other kinins) is produced, leading to edema of the skin and mucosal surfaces. This causes symptoms ranging from temporary disfigurement to bowel obstruction or, far worse, asphyxiation. The frequency of this disease is estimated at 1:50,000 to 1:100,000. During mutation screening, we identified two separate kindreds with mutations (51) that appeared to be mediated by a mechanism involving an imperfect inverted repeat (figure 7). This quasipalindrome was examined in a prokaryotic system and was found to cause mutations of similar types and sizes (Bissler, et al, in preparation).
Figure 7: Quasipalindrome from the C1 Inhibitor gene. The blue sequence was deleted in one kindred and the sequence in the yellow background was tandemly duplicated in another kindred.
7.3.2. Triplet repeat mediated diseases.
So far, twelve human genetic diseases have been associated with variations in the length of triplet repeats. These diseases include myotonic dystrophy, fragile X syndrome, and Huntington's chorea (52-60). Clinically, these triplet repeat diseases are associated with a phenomenon called anticipation. Anticipation implies progressively earlier onset and worsening severity of the disease with successive generations (55, 58, 61). Anticipation is associated with expansion of the triplet repeat sequence in the disease associated gene. The mechanism of expansion is unclear, but many hypotheses involve abnormal replication or repair processes on a slipped strand structure within the triplet repeat region. These large triplet tracts are quasipalindromic, and Pearson et al. identified novel DNA secondary structures forming in (CTG)n.(CAG)n tracts from the DM and FRAXA genes. They identified non-B-DNA structures, termed S-DNA (62). Such S-DNA can be remarkably thermostable, and this stability has been hypothesized to be due to hairpin formation in the loop-out regions (60). An example of hairpin formation in (CCG)n is illustrated in figure 8.
Figure 8: Sequence from d(CGG)n.d(GCC)n repeat. A. A single strand of the triplet repeat. B. The same sequence in 8A configured into hairpin structure.
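Why a triplet repeat strand is quasipalindromic can be seen by folding it back on itself and counting Watson-Crick pairs. The following Python sketch does this for a short (CGG)n strand. It is a toy illustration under stated assumptions: the repeat number, the loop length, the simple end-to-end register, and the function name fold_hairpin are hypothetical, and the sketch does not claim to reproduce the register drawn in figure 8 or to estimate stability.

```python
# Toy model: fold a single strand into a hairpin and count base pairs.
# Repeat number, loop length, and names are hypothetical.

WATSON_CRICK = {("A", "T"), ("T", "A"), ("G", "C"), ("C", "G")}


def fold_hairpin(strand, loop_len):
    """Pair the 5' end of the strand with its 3' end, leaving at least
    loop_len unpaired bases in the middle.
    Returns (watson_crick_pairs, mismatched_pairs) for the stem."""
    stem_len = (len(strand) - loop_len) // 2
    pairs = [
        (strand[i], strand[len(strand) - 1 - i])
        for i in range(stem_len)
    ]
    paired = sum(1 for pair in pairs if pair in WATSON_CRICK)
    return paired, stem_len - paired


strand = "CGG" * 6
paired, mismatched = fold_hairpin(strand, loop_len=4)
print(strand, "-> Watson-Crick pairs:", paired, "mismatches:", mismatched)
```

In this register two of every three stem positions form C.G pairs and the third is a G.G mismatch, which is one way to see how a repeat tract can fold into an imperfect hairpin.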
7.3.3. Duchenne Muscular Dystrophy
Duchenne muscular dystrophy is an X-linked recessive disease. Approximately one out of 3,500 male newborns is affected and will suffer severe muscular wasting that shortens lifespan to approximately 20 years. This disease has been mapped to the dystrophin gene at Xp21. Most reported cases (65%) have large (kilobase) intragenic deletions. The others are partial duplications, small frameshifts, or point mutations. One of these other mutants is a microdeletion between nucleotides 6982-6998 (63). There are several ways an imperfect inverted repeat could contribute to mutagenesis in this gene (figure 9). For example, the polymerase machinery may simply slip on the AG direct repeat, or may resolve the hairpin structure up to the more G.C rich stem and then, during the resolution of the hairpin structure, slip-mispair on the direct repeats, resulting in the deletion. With this 17 basepair deletion, the proposed hairpin has significantly less basepairing and presumably would be a weaker structure (figure 9).
Figure 9: Genomic quasipalindrome from the Dystrophin gene. The lower case letters are intronic, the uppercase letters are exonic. The blue nucleotides are the 17 bases deleted. This could have happened by simple slipping on the AG direct repeat denoted by the brackets, or an interstrand switch mechanism.
7.3.4. Osteogenesis Imperfecta
The autosomal recessive disease Osteogenesis Imperfecta is manifested by remarkably brittle bones. This disease is associated with mutations in the type I procollagen gene. These mutations include deletions, insertions and point mutations, often involving the triple helical coding region. The point mutations often disrupt the GXY repeat motif, resulting in destabilization and delayed formation of the collagen triple helix. The kinetic alterations allow excessive post-translational hydroxylation and glycosylation of lysine. Bateman et al. report a mutation that results in a structural abnormality in the C-propeptide of the pro-α1(I) chain (64-66). Carboxyl terminal propeptides of pro-α1(I) and pro-α2(I) are thought to be critical in initiating the process of chain assembly and helical propagation in the assembly of collagen. Mutations in this site, therefore, also would result in a disease such as Osteogenesis Imperfecta. The mutation identified was a T-insertion that can easily be explained by an inter- or intrastrand switch phenomenon resulting in the conversion of an imperfect inverted repeat to a palindromic sequence (figure 10).
Figure 10: Sequence from the collagen I gene configured in a stem-loop structure. Using a strand switch mechanism (either intra- or interstrand) the stems of the inverted repeat are made perfectly complementary.
7.3.5. Antithrombin Deficiency
Mutations in the antithrombin gene can result in the development of venous thromboembolism at a young age (reviewed in 67). Mutation analysis of this inhibitor is consistent with the hypothesis that the reactive center coding region exhibits sequence directed mutagenesis involving inverted repeats. The P1 and P1' residues, on either side of the reactive center bond (involved in the inhibition reaction), are CpG dinucleotides. Twelve mutations have been reported involving these two codons. These mutations may best be explained by the methylation of the cytosine base followed by the spontaneous deamination resulting in a thymidine base substitution. In addition, mutations resulting in deficiency occur in two regions containing five codons on either side of the reactive center bond.
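The CpG mechanism mentioned above (methylation of cytosine followed by spontaneous deamination) predicts two substitutions at each CpG site: C to T when the event occurs on the strand as written, and G to A when it occurs on the complementary strand. A minimal Python sketch that enumerates these predicted variants follows. The input sequence is a hypothetical toy example, not the antithrombin reactive center sequence, and the function name is an assumption made for illustration.

```python
# Toy enumeration of substitutions expected from deamination of
# 5-methylcytosine at CpG sites. The sequence and names are hypothetical.


def cpg_deamination_variants(seq):
    """Return (position, change, mutant_sequence) for every CpG in seq.
    Positions are 0-based indices into seq."""
    variants = []
    for i in range(len(seq) - 1):
        if seq[i:i + 2] == "CG":
            # Deamination on the strand as written: C becomes T.
            variants.append((i, "C>T", seq[:i] + "T" + seq[i + 1:]))
            # Deamination on the complementary strand reads as G to A here.
            variants.append((i + 1, "G>A", seq[:i + 1] + "A" + seq[i + 2:]))
    return variants


toy_sequence = "GGACGATC"
variants = cpg_deamination_variants(toy_sequence)
for position, change, mutant in variants:
    print(position, change, mutant)
```

Substitutions at CpG codons that match one of these two patterns are consistent with the deamination mechanism; substitutions that do not match would need another explanation, such as the strand switch events discussed for the flanking codons.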
The coding region of the antithrombin III gene contains two mutable codons and is imperfectly palindromic. The sequence 5' to the reactive center coding region (nucleotides 13,800 to 13,831, accession number X68793) can be drawn into an imperfect cruciform structure containing the two codons in this region that mutate (figure 11A). This stem-loop structure also may explain many of the point mutations in which the codon GCA is converted to ACA. A plausible explanation may be that during replication, a strand switch event occurred, followed by a second round of replication that fixed this substitution in the genome and led to disease. A carboxyl region of the reactive center coding region (nucleotides 13,855 to 13,879) can be configured into a stem-loop structure as well (figure 11B). This region contains three codons that mutate and cause disease. It is tempting to suggest that these five codons on either side of the reactive center bond cluster mutations because these sequences can adopt a peculiar alternative secondary structure. No mutation has been found that actually deletes one of these potential stem-loop structures. Perhaps no deletions have been identified because these are reasonably weak cruciform structures, or because the sequence effects of contiguous nucleotides reduce the likelihood of deletion. The remnant effect may be on replication fidelity as the holoenzyme encounters the geometry and strength of such a structure.
Figure 11: Quasipalindrome and palindrome from the Antithrombin gene. A. Nucleotides 13,800 to 13,831 are quasipalindromic and are illustrated as a stem-loop structure. The arrow shows the strand switch product that converts the stems of the quasipalindrome to a perfectly complementary sequence. The blue 'A' could result from synthesis templating from the 3' 'T' that would be mispaired with the 'G' residue. B. The quasipalindrome from 13,855 to 13,879. Only the loop, which does not basepair in the hairpin configuration, is not palindromic. The sequences in bold identify the codons mutated.
7.3.6. Silent Serum Cholinesterase
Deficiency of human serum cholinesterase is characterized by absence of enzymatic activity. The autosomal recessive phenotypic manifestation of this disorder is of concern for anesthesiologists and intensivists alike because this enzyme degrades the paralyzing agent succinylcholine. This drug is used in operating room and intensive care settings. The homozygous frequency of this disorder is estimated to be approximately 1:100,000 in the Caucasian population, but it is as high as 1:100 in western Alaskan Eskimos and 1:50 in the population of Andhra Pradesh, India. Nogueira et al. found a patient who had the conversion of a quasipalindrome into a palindromic sequence that could be best explained by an intrastrand switch mechanism (68) (figure 12).
Figure 12: Sequence from the cholinesterase gene. This structure likely could not form without unpairing of the intrastrand A.T on either side or the 'C' residue at the top. The arrow indicates that through a strand switch reaction, the sequence is made more palindromic as seen by the loss of the 'T' residue on the left, and its replacement with the 'AG' illustrated in blue, leading to a frameshift.
7.3.7. Lesch-Nyhan Syndrome
Mutations that inactivate the human hypoxanthine phosphoribosyl transferase (HPRT) gene lead to the X-linked neurological disease Lesch-Nyhan Syndrome. This syndrome is associated with polyathetosis, spasticity, mental retardation, self-mutilation, and an elevated serum uric acid. Gibbs et al. identified a 13 bp deletion that could result from a slip 3' (figure 13). Following the first slip there is additional polymerization of a 'T' residue, then a second slip that would place the nascent 3' end in register with the complementary DNA. Both slips could be facilitated by a hairpin structure in the template strand, thereby eliminating the inverted repeat sequence. This mutation could then be fixed in the genome by another round of replication (69).
Figure 13: Sequence from the hypoxanthine-phosphoribosyl transferase gene. A. The top strand is the wild type coding strand from the hypoxanthine-phosphoribosyl transferase gene. The blue nucleotides depict the deleted sequence. The bottom sequence is the resulting deletion. B. The sequence is illustrated as duplex DNA. The melting of DNA is depicted by the 'Y' configuration and the green nucleotides represent the nascent strand. C. The nascent strand partially melts and the template strand forms a hairpin structure. The nascent strand comes into register so that the 3' end can continue polymerization. D. The polymerase adds a 'T' residue to the nascent strand. E. A second realignment between the template and nascent strands occurs, making the hairpin structure stronger, and the nascent strand continues polymerization.
7.3.8. Kearns-Sayre Syndrome
This syndrome is a constellation of physical findings including external ophthalmoplegia, retinal degeneration, and a cardiac conduction block. The genetic basis for this mitochondrial cytopathy resides in deletions in the mtDNA. Frequently, the deletions are flanked by a variable length direct repeat and are most easily explained by slip mispairing or errors in recombination. Remes et al. identified a 38 year old man who had a deletion breakpoint flanked by an imperfect inverted repeat in a highly conserved sequence block (CSB II) that had been implicated in cleavage of RNA primers during initiation of DNA replication by MRP ribonuclease (70). The deletion includes nucleotides 10170 through 13406. The 3' breakpoint of this deletion resides in the middle of a small inverted repeat configured into a hairpin structure (figure 14). The blue nucleotides represent the portion of the inverted repeat that is included in the deletion; the black nucleotides depict those remaining. The stem-loop structure is small and of low energy, and its role, if any, in mutagenesis will require confirmation.
Figure 14: Mitochondrial DNA with the breakpoint identified by the different color sequence. The blue nucleotides are included in the deletion, and the black nucleotides are retained.
7.3.9. Biotinase Deficiency
Biotinase Deficiency is an autosomal recessive disorder resulting in the inability to recycle biotin. This water-soluble vitamin is an essential cofactor for several carboxylases. Lack of functional biotinase can result in severe deficiency of biotin and manifest in clinical features including seizures, ataxia, developmental delay, acidosis, and coma (71). Pomponio et al. identified a complex deletion/insertion mutation that they postulated was mediated by a quasipalindrome that adopted a hairpin configuration in the template strand (72) (figure 15).
Figure 15: Quasipalindrome configured as a hairpin structure from the biotinase gene. Flawed resolution of this structure during DNA replication has been postulated to lead to deletion in the biotinase gene (72).
7.3.10. Familial Hypercholesterolemia
Familial hypercholesterolemia is an autosomal dominant disease affecting approximately 1 out of 500 people (73). While several different types of mutations can lead to this disease, Yamakawa-Kobayashi et al. identified a quasipalindrome that they postulated was responsible for a complex deletion/insertion mutation in exon 8 of the LDL receptor. The quasipalindrome is illustrated in figure 16. These authors postulated a model that involved a deletion and an intramolecular strand switch mechanism in two kindreds they felt were unrelated (74).
Figure 16: Quasipalindrome sequence from the LDL receptor gene configured into a hairpin structure. Yamakawa-Kobayashi et al. postulated that this structure may be involved in the mutagenic mechanism leading to familial hypercholesterolemia (74).