[Frontiers in Bioscience E4, 2464-2475, June 1, 2012]

A bivariate variance components model for mapping iQTLs underlying endosperm traits

Gengxin Li1, Cen Wu1, Cintia Coelho2, Rongling Wu3,4, Brian A. Larkins2, Yuehua Cui1

1Department of Statistics and Probability, Michigan State University, East Lansing, MI 48824, USA, 2Department of Plant Sciences, University of Arizona, Tucson, AZ 85721, USA, 3Center for Statistical Genetics, Pennsylvania State University, Hershey, PA 17033, USA, 4Center for Computational Biology, Beijing Forestry University, Beijing, People's Republic of China

TABLE OF CONTENTS

1. Abstract
2. Introduction
3. Statistical method
3.1. The genetic model and parent-specific allelic sharing
3.2. Parameter estimation
3.3. Hypothesis testing
4. Simulation
4.1. Simulation design
4.2. Simulation results
5. Real data analysis
6. Discussion
7. Acknowledgements
8. Appendix
9. References

1. ABSTRACT

Genomic imprinting plays a pivotal role in early stage development in plants. Linkage analysis has been proven to be useful in mapping imprinted quantitative trait loci (iQTLs) underlying imprinting phenotypic traits in natural populations or experimental crosses. For correlated traits, studies have shown that multivariate genetic linkage analysis can improve QTL mapping power and precision, especially when a QTL has a pleiotropic effect on several traits. In addition, the joint analysis of multiple traits can test a number of biologically interesting hypotheses, such as pleiotropic effects vs close linkage. Motivated by a triploid maize endosperm dataset, we extended the variance components linkage analysis model incorporating imprinting effect proposed by Li and Cui (2010) to a bivariate trait modeling framework, aimed to improve the mapping precision and to identify pleiotropic imprinting effects. We proposed to partition the genetic variance of a QTL into sex-specific allelic variance components, to model and test the imprinting effect of an iQTL on two traits. Both simulation studies and real data analysis show the power and utility of the method.

2. INTRODUCTION

With the availability of linkage map and molecular markers in many species, coupled with the development of statistical and computational methods, enormous progress has been made in the identification of novel genes or quantitative trait loci (QTL) underlying various complex traits of interest (e.g. 1). Recent advances in biotechnology have enabled the generation of high throughput genome-wide dense single nucleotide polymorphism (SNP) data. Even though there have been large successes in genome-wide association studies (GWAS), GWAS still cannot substitute QTL mapping due to high false positive and false negative rates compared to linkage mapping (2). In linkage mapping, when multiple correlated traits are available, a number of studies have shown that jointly modeling correlated traits can significantly improve the power and mapping precision to detect QTLs (3-5). For a gene with a pleiotropic effect, often it might code for a product which has a signaling function on various targets. This is practically important as the gene might belong to a signaling pathway and could be a potential target for further functional validation. This makes linkage mapping with multiple traits particularly attractive as it can test the pleiotropic effect of a QTL (6).

Genomic imprinting is an epigenetic phenomenon in which the expression of the same alleles could be different, owning to their parental origin (7). It plays a critical role in early stage development in many species, which makes it practically important to identify imprinted genes underlying various traits of interest (8). In addition to Mendelian traits, genomic imprinting has been routinely considered in genetic mapping, with the aim to identify genes with sex-specific expression due to epigenetic modification. So far, various statistical attempts have been made to map imprinted QTLs (iQTLs) underlying complex traits (e.g. 9-16).

In current applications, statistical methods for iQTL mapping are predominantly focused on single trait analysis, termed univariate iQTL mapping. In real application, it is frequent to observe multivariate traits which are potentially controlled by imprinted genes. For example, the percentage of endoreduplication and mean ploidy level in maize endosperm are two highly correlated traits (17). The two traits describe the level of endoreduplication in endosperm, which is thought to be genetically controlled by imprinted genes (18). We have previously developed a variance components statistical framework for mapping iQTLs underlying the single endosperm trait (see 16). Considering the advantage of joint analysis of multiple traits in QTL mapping, in this study we propose a multivariate variance components model in mapping iQTLs underlying multivariate imprinted traits, and further determine if there is a pleiotropic iQTL effect (6, 19).

Our work is based on a published endosperm mapping data set, and the biological application makes the study particularly attractive (17). Endosperm is a triploid tissue resulting from a unique double fertilization process in angiosperms. As a result, the endosperm genome carries two copies of chromosomes inherited from female parent and one copy from male parent. Maize endosperm cells undergoing endoreduplication are generally larger than other cells, which consequently results in larger fruits or seeds and is beneficial to human beings (20). It is thus particularly important from the breeding point of view to identify which genes control the endoreduplication process and where they are located in the genome. To our best knowledge, no study has been conducted to map iQTLs underlying the imprinting process with multivariate traits.

Variance components models have been shown to be powerful tools in multi-trait linkage analysis for an outbred or human population (e.g., 3, 4). Due to the special inbreeding structure and unique genetic make-up of the endosperm genome, the current multi-trait linkage analysis methods cannot be directly applied to the endosperm genome. We have previously shown that the variance components model can be applied to an inbreeding population to identify imprinted QTLs underlying a univariate endosperm trait (16). As an extension to our previous method, in this work we propose a bivariate iQTL mapping method to target iQTLs with potential pleiotropic effects. This study will fill the gap in genetic mapping iQTL underlying multiple endosperm traits by considering the imprinting property of a QTL.

3. STATISTICAL METHOD

3.1. The genetic model and parent-specific allelic sharing

We consider a backcross design initiated with two inbreeding parental lines with a large contrast in the phenotype of interest. Denote the genotype of two parental lines as and . We then use the F1 lines () as the maternal parents to backcross with both parental lines to generate two backcross segregation populations. In terms of the endosperm genotype, the backcross offspring is denoted as, and , where the subscript m or f implies that the corresponding allele is inherited from the maternal or paternal parent, respectively. Similarly, we can use the two parental lines as the maternal parent and backcross with the F1 line to generate two different backcross populations which contain the same sets of genotype as the other two crosses. For a detailed description of the backcross design, readers are referred to table 1 in Li and Cui (16). Note that the endosperm genotypes resulting from the backcross design contain two identical gene copies from the maternal parent and are different from a regular diploid mapping population.

Consider two phenotypic traits of interest. Let and be two vectors of observed trait values for trait 1 and 2 in the kth backcross family, where is the number of observations in family . We assume a multivariate normality for the joint distribution of and . Denote the genotype-specific cytoplasmic maternal effects as (,), additive genetic effects at a QTL as (,), polygenic additive effects as (,), and random environmental effects as (,) for the two traits in a bivariate model. To consider the imprinting effect of a QTL, the additive genetic effects are further partitioned into parent-of-origin effects due to the maternal alleles with respect to each trait (denoted as ,), and effects due to the paternal allele (denoted as ,). The genetic model underlying two endosperm traits can be expressed as

Equation (1)

where the coefficient of the maternal allele effect is set as 2 due to the two identical copies.

For the proposed backcross design, there are a total of three possible maternal genotypes, denoted as , and . Thus ( and ) denote mean parameters of two traits with respect to three maternal genotypes, i.e., = (, , ), = (, , ). The random effects corresponding to trait are , , and which are normally distributed, i.e., , , and , where and are the additive genetic variances at a QTL for the maternal and paternal alleles respectively; and are identical-by-descent (IBD) sharing matrices that are derived from the maternal and paternal alleles among sib-pairs, respectively; and are the additive polygenic and residual variances, respectively; is the expected proportion of alleles shared IBD; and is the identity matrix. The above model is similar to a bivariate variance components model described in Almasy et al. (4), except that here we incorporate the parent-specific allelic effects. For two correlated traits, the covariances of random effects are expressed as and for the additive genetic effects at a QTL; for the polygenic effects; and for the residual effects.

The above variance components model is built upon the basis of IBD sharing at a QTL. For a triploid inbreeding population, a unique decomposition of parent-specific allele sharing is illustrated in figure 1 of Li and Cui (16). Following the definition given in Li and Cui (16), the phenotypic variance-covariance corresponding to trait j (=1, 2) in family k can be expressed as: , where is the IBD sharing matrix due to cross-sharing of allele derived from different parents and is the variance term due to allele cross-sharing for an inbreeding population (16). Note that = 0 for an non-inbreeding population, but it could be non-zero for a partially inbreeding population (21). The covariance of two phenotypic traits is expressed as . With the above notation, the phenotypic variance-covariance matrix of two phenotypic traits within the kth backcross family can be expressed as

Equation (2)

The IBD sharing probability mentioned above is calculated assuming that a QTL is located at a marker position. Unless markers are dense enough, a QTL can be anywhere in the genome bracketed by two flanking markers. Here we assume a QTL can be anywhere in the genome and calculate the IBD sharing probability based on the recombination information between a putative QTL and two flanking markers. In a genome-wide linkage scan, we search a QTL every 1 or 2cM throughout the entire genome and the conditional probability of a QTL conditional on two flanking markers can be calculated (see 22). These conditional probabilities are then considered when calculating the IBD probability of a putative QTL at a given genome position (see 16 for more details).

3.2. Parameter estimation

Let . Assuming multivariate normality of and independence between different families, the log-likelihood function for families can be expressed as

Equation (3)

where and contains the genotype-specific maternal effects, and contains different random variance components. The parameters can be estimated with either maximum likelihood (ML) method or the restricted maximum likelihood (REML) method. In our previous investigation, we did a comprehensive comparison of the two methods in estimating variance components based on a diploid mapping population (15). The results indicated that the ML method is faster than the REML method, but the REML method gives less biased results, which is consistent with the work of Corbeil and Searle (23). The less biased results make the REML estimation method more attractive. In the following, we briefly outline the REML estimation procedure and more details about the REML algorithm can be found in the Appendix.

All data in families form one big vector denoted as y with dimension . The vector y can be further partitioned into three vectors , where phenotype corresponding to families with maternal genotypes , corresponding to families with maternal genotypes and corresponding to families with maternal genotypes in different backcross families. Similarly, can be expressed as ==. With this partition, the REML log-likelihood function can be expressed as

Equation (4)

where; ;

;; denotes the number of families generated from the backcross with maternal genotype (), () and (); 's are block diagonal matrices; and . Then the Fisher scoring algorithm can be derived for parameter estimation. The details are given in the Appendix.

3.3. Hypothesis testing

We propose to search for iQTLs across the genome by assuming a putative QTL every 1 or 2cM by partitioning the whole linkage map into small intervals. At each putative QTL position, we test if there is a significant QTL effect on the bivariate traits by formulating the following hypotheses

Hypothesis (5)

The significance of the above test is assessed through the likelihood ratio test (LRT). Let and be estimates of the unknown parameters under and , respectively, then the likelihood ratio statistic is evaluated by

Equation (6)

which, under the null hypothesis, is distributed as a mixture chi-square distribution with the form

(24). Due to correlations of tests across the genome, permutations can be applied to

determine the genome-wide significance threshold.

Once a QTL is identified at a genomic position, its imprinting property for both phenotypic traits is assessed by the following imprinting hypothesis

Hypothesis (7)

Again, the likelihood ratio test is applied and the test statistic (denoted as ) asymptotically follows a chi-square distribution with 3 degrees of freedom. If the null is rejected at the tested QTL position, the QTL is declared as an iQTL. We can further assess whether the imprinting effect is due to complete silence of the maternal allele by testing

Hypothesis (8)

or due to complete silence of the paternal allele by testing

The likelihood ratio test statistic (denoted as and ) corresponding to the above two tests follows a mixture chi-square distribution with (25).

We can also test whether a QTL controls trait 1 by testing

Hypothesis (9)

or controls trait 2 by testing

Hypothesis (10)

The likelihood ratio statistic corresponding to the above tests is denoted as () which under the null

asymptotically follows a mixture chi-square distribution with the form

where refers to the correlation between the variance terms and () which is calculated from the Fisher information matrix, and the conditional correlation is defined as . The detailed derivation can be found in Li and Cui (25).

Rejection of the null of the above two tests indicates the pleiotropic effect (i.e., one gene acts on two traits). But if two genes are closely linked at the detected (i)QTL (i.e., close linkage), the pleiotropic effect might be a false positive due to close linkage. Thus, it is essential to distinguish a pleiotropic effect vs close linkage. This is exactly the relative advantage of the multi-trait linkage analysis. To further distinguish close linkage against pleiotropic effect, we develop the following two tests

Hypothesis (11)

for testing pleiotropic effect and

Hypothesis (12)

for testing close linkage, where r .¢ s are genetic correlation measures for different variance components between two traits. The null hypothesis in test Hypothesis (11) indicates that the additive effects for two traits are perfectly correlated and two traits are possibly controlled by a single gene. On the contrary, the null hypothesis in test Hypothesis (12) indicates two closely linkage genes at one (i)QTL location. The likelihood ratio test is denoted by for test Hypothesis (11) and for test Hypothesis (12). The null distribution of has a mixture chi-square distribution (since 1 is a boundary point for correlation) with the form , while the null distribution of follows a regular chi-square distribution with 3 degrees of freedom, i.e., LRco-in ~since 0 is not a boundary point. Note that the assessment of pleiotropic effects vs close linkage only occurs at the genomic location where there is an (i)QTL being identified by the overall genetic test.

4. SIMULATION

4.1. Simulation design

We designed a simulation study to evaluate the performance of the joint analysis as well as the effect of different genetic designs on testing power and parameter estimation. Six equally-spaced markers (M1-M6) were simulated for one linkage group assuming a backcross design. This linkage group covers a length of 100cM. Haldane map function was used to convert map distance to recombination rate. Assume there was one QTL located at 48cM away from the first marker which had effects on two phenotypic traits. Phenotypic values of two traits were generated from a multivariate normal distribution with variance-covariance specified in Equation (2). Parameter values used for simulation are given in Table 2-3.

As described in Li and Cui (16), different combinations of family and offspring size could influence testing power and parameter estimation. To mimic the real data, we fixed the total sample size as 400 and considered two designs, i.e., 4 families with 100 offsprings each (denoted as 4�100) and 20 families with 20 offsprings each (denoted as 20�20). In each simulation scenario, the IBD value of any two siblings was calculated at every 2cM along the linkage group using the approach described in Li and Cui (16). The REML method was adopted to estimate unknown parameters of interests, and 100 simulation replicates were recorded.

4.2. Simulation results

Figure 1 plots the likelihood ratio statistic averaged over 100 simulation replicates across the entire linkage group obtained with the proposed multivariate analysis and single trait analysis. The bivariate analysis shows consistently high LR values across the linkage group with a clear peak corresponding to the simulated QTL position, which implies the potential power gain in QTL identification with the joint analysis. This is further confirmed by the power analysis in Table 1 (82% power for the bivariate analysis vs 70% and 60% for each single trait analysis) with data simulated assuming no imprinting effect (i.e., and ).

The parameter estimation, power of QTL identification and type I error of the imprinting test are summarized in Table 1. Overall, the joint analysis gives larger statistical power (82%), smaller type I error rate for the imprinting test (6%) and less biased variance components estimation compared to the results obtained with the single trait analysis. We also noticed that the joint analysis gives less bias and smaller standard error in QTL location estimation compared to the single trait analysis. In addition, the joint analysis greatly improved the false positive control with the 4�100 design. The simulation results indicate that the joint mapping by incorporating bivariate phenotypic information greatly improves the efficiency of QTL detection.

The above simulation results were based on a 4�100 design. Our previous investigation showed that this design is less powerful for single trait analysis compared to the 20�20 design when the sample size is fixed (16). Thus, we did additional simulation under the 20�20 design. The results showed improved power for QTL detection and also improved precision for parameter estimation (Table 2). In real application, our recommendation is to adopt a balanced design with a reasonably large family size.

Focusing on the 20�20 design, we did additional simulation to evaluate the performance of imprinting analysis for the joint model under different imprinting mechanisms. The results are tabulated in Table 3. Four imprinting mechanisms were considered: complete paternal imprinting (i.e., and ), complete maternal imprinting (i.e., and ), partial paternal imprinting (i.e., and ), and partial maternal imprinting (i.e., and ). The results showed that the complete maternal imprinting case has the lowest power (65%). This is due to the fact that a backcross endosperm genome carries two maternal copies. When there is no genetic effect for the maternal copies, we expect the power to be lower. We also realized the overall relatively low imprinting power. By increasing the sample (e.g., double the total size to 800), we did find largely improved imprinting power (data not shown). The simulation study implies that large samples are always desirable to obtain reasonably large imprinting power, which is feasible for an experimental cross, especially for plants. For the proposed backcross design study, more than 1000 samples are needed in order to achieve good imprinting power, preferably with a 40�25 design.

5. REAL DATA ANALYSIS

Endosperm is created from the fusion of two polar nuclei and a sperm cell, resulting in a triploid tissue with two identical chromosomes inherited from the maternal parent and one chromosome from the paternal parent. The endosperm supplies nutrition to the embryo during seed germination, and serves as a major source of food for human beings (26). The function of the endosperm is far more complicated than we understand, and is beyond simple nutrient delivery to the embryo. As mentioned in the introduction section, maize endosperm cells undergoing endoreduplication are generally larger than other cells (20). Recent empirical study revealed that endoreduplication in maize endosperm may be associated with genes showing parent-of-origin effects (18).

A study was recently launched to identify parent-of-origin effects associated with endoreduplication in maize endosperm (17). Two inbred lines (Sg18 and Mo17) were used to generate the F1 population. Four backcross segregation populations were then generated following a reciprocal backcross design, i.e., Sg18F1, Mo17F1, F1Sg18, and F1Mo17, with each one containing roughly around 90 offspring. Two endosperm traits, % of endoreduplication (denoted as Endo) and mean ploidy level (denoted as Ploid) were measured for each of the backcross offspring, and 10 linkage groups were constructed from the observed marker data. More details about the description of the data can be found in Coelho et al. (17). The data set was previously analyzed with a single trait variance components model and significant imprinting genes were identified (16).

The sample correlation between the two traits is 0.812 after combining the four backcross phenotype data together, which indicates the potential benefit of a joint analysis by using the proposed bivariate variance components model. To enhance the power of iQTL mapping and further identify the pleiotropic iQTL effect, we applied our newly derived method to this data set. Figure 2 shows the LR profile plot across the ten linkage groups. In addition to the results based on the joint analysis (solid curve), the single trait analysis results are also plotted (dotted line for trait Endo and dashed line for trait Ploid). The 5% genome-wide significance threshold corresponding to the joint analysis (horizontal dotted lines) was determined by permutations based on repeatedly shuffling the relationship between marker genotypes and phenotypes (27). The horizontal dotted line in G7 indicates the 5% chromosome-wide permutation threshold. Note that the single trait analysis was done without correcting for the neighborhood QTL effect, hence the LR curves obtained in Li and Cui are different from the ones presented in this paper (16).

As shown in the figure, five QTLs were detected at the 5% genome-wide significance level and are located in linkage groups G2, G4, G6, G9 and G10. One QTL in G7 only passes the chromosome-wide threshold and is a suggestive QTL. Compared to the single trait analysis results, more QTLs were detected by the joint analysis. Table 3 lists the estimates of the QTL position, variance components, and p-values for testing overall QTL effect and various imprinting and pleiotropic effects. With the joint analysis, the identified QTLs in G4 and G6 are imprinted with imprinting p-values 0.02, and 0.04 respectively. Further tests show that the QTL in G4 shows paternal expression (), and the QTL on G6 shows maternal expression (). The other QTLs do not show an imprinting effect. We further evaluated the pleiotropic vs close linkage effect of the six QTLs. The results indicated that two iQTLs in G4 and G6 show significant co-incident linkage (and ), while the QTL in G10 shows a pleiotropic effect (and ). The p-values for the other QTLs are all larger than 0.05, hence no conclusion about pleiotropic or close linkage effect can be made. This might be due to the issue of genetic design and small sample sizes. The LR profile plot in Figure 2 indicates the power of the joint analysis over the single trait analysis. In addition to the increased power for QTL identification, we were also able to test the pleiotropic effect of (i)QTLs. The (i)QTLs shown pleiotropic effects should be paid special attention for follow-up functional validation.

6. DISCUSSION

A number of studies have shown that for correlated traits, multivariate approaches for genetic linkage analysis can increase the power and precision to identify genetic effects, especially when a QTL has a pleiotropic effect on several traits (5, 6). Considering the importance of imprinted genes in endosperm development and the relative merit of multi-trait analysis, we developed a bivariate variance components model based on a reciprocal backcross design to identify iQTLs while incorporating the special genetic make-up of the triploid inbreeding population. Simulation studies showed the performance of the method under different sampling designs with finite sample size. Comparing the results of joint analysis with those of single trait analysis, the joint analysis greatly improves the performance in QTL position estimation, testing power, and type I error rate.

We applied the joint model to a real data set with two endosperm traits, e.g., % of endoreduplication and mean ploidy level. Six QTLs were detected on G2, G4, G6, G7, G9, and G10 across the maize endosperm genome. Among the six QTLs, five showed genome-wide significance, two are iQTLs with maternal imprinting (on G4) and paternal imprinting (on G6). Compared with the single trait analysis, more QTLs were mapped in the joint analysis. In maize, several paternal imprinting genes have been identified. For example, there is the gene in the regulation of anthocyanin, the seed storage protein regulatory gene dsrl, the MEA gene in seed development and some - tubulin genes (28-31). Study has shown that endoreduplication shows a maternally controlled parent-of-origin effect (18). Given that no specific gene has been reported to control endoreduplication, the identified iQTLs could serve to locate potential candidate genes for further functional validation.

In addition to mapping several QTLs, we also identified significant pleiotropic effects. In maize, some vital genes displaying pleiotropic effects have been reported. For example, maize zfl regulatory genes have pleiotropic effects on structure traits in branching and inflorescence formation (32). The tb1 gene and its intergenic sequences illustrates the pleiotropic effects on maize morphology (33). A maize gene GLOSSY1 (GL1) expresses its effect on trichome size and cutin structure during epidermis development (34). Given the high correlation between the two endoreduplication traits, the identification of genes with pleiotropic effects is practically important. Further functional verification is needed to confirm the findings of this investigation.

In the simulation study, the results indicate low power for imprinting detection with the 4�100 design. This result is consistent with the findings we found earlier for a single trait analysis (16). When we changed the design to 20�20 with a fixed total sample size, improved results were observed (Table 3). Even with the extremely unbalanced design (4�100), simulations also show reasonable false imprinting detection rate (6%). The real data setting is quite similar to the simulation design, thus the detected imprinting effects should have little chance to be false positives. Overall, the simulation studies provide practical guidance to real experimental designs: try to maintain a balanced sampling design and avoid extremely small families with large offspring and extremely large families with very small offspring.

The current method was derived for single iQTL mapping. Extension to multiple iQTL mapping is in fact not straightforward. Further investigation is needed in this context. On the other hand, even though the method was developed for experimental crosses, extension to human genetic mapping studies is straightforward. The only modification is the IBD sharing probability of sibpairs, where the calculation should take the family structure into account (e.g., 9). In fact, the IBD calculation is simpler in humans, because the cross sharing probability reduces to zero for a natural population with random mating. We hope our method will shed light on human genetic mapping studies to identify imprinting effects with variance components models while considering multiple traits.

7. ACKNOWLEDGEMENT

This work was supported by NSF grant DMS-0707031. We thank the two anonymous referees for their helpful comments that greatly improved the manuscript.

8. APPENDIX

Derivation of Fisher scoring algorithm for REML estimation.

Define the IBD sharing matrices corresponding to the three phenotypic vectors as

 with dimension for matrix 0.

 with dimension for matrix 0.

 with dimension for matrix 0.

The IBD sharing matrices for , , , and can be similarly defined and are denoted as , , and (; ). Let be the score vector of the first derivative of the log-likelihood function in (4) with respective to each variance component, i.e.,

Let be the Hessian matrix (The matrix is too large to be included and is omitted here). The Fisher information matrix, , in the REML procedure is obtained by taking the expectation of the negative Hessian matrix. The algorithm starts with . Given initial values , the iteration starts and stops until converges. Upon convergence, the REML estimator of is just the generalized least squares estimator, that is,

9. REFERENCES

1. C Li, A Zhou, T Sang: Rice domestication by reducing shattering. Science 311, 1936-1939 (2006).
http://dx.doi.org/10.1126/science.1123604

2. J Bergelson, F Roux: Towards identifying genes underlying ecologically relevant traits in Arabidopsis thaliana. Nat Rev Genet 11, 867-879 (2010).
http://dx.doi.org/10.1038/nrg2896

3. JT Williams, PV Eerdewegh, L Almasy, J Blangero: Joint multipoint linkage analysis of multivariate qualitative and quantitative traits. I. likelihood formulation and simulation results. Am J Hum Genet 65, 1134-1147 (1999).
http://dx.doi.org/10.1086/302570

4. L Almasy, J Blangero: Multipoint quantitative-trait linkage analysis in general pedigrees. Am J Hum Genet 62, 1198-1211 (1998).
http://dx.doi.org/10.1086/301844

5. DM Evans: The power of multivariate quantitative-trait loci linkage analysis is influenced by the correlation between the variables. Am J Hum Genet 70, 1599-1602 (2002).
http://dx.doi.org/10.1086/340850

6. C Jiang, Z-B Zeng: Multiple trait analysis of genetic mapping for quantitative trait loci. Genetics 140, 1111-1127 (1995).

7. K Pfeifer: Mechanisms of genomic imprinting. Am J Hum Genet 67, 777-787 (2000).
http://dx.doi.org/10.1086/303101

8. IM Morison, JP Ramsay, HG Spencer: A census of mammalian imprinting. Trends Genet 21, 457-465 (2005).
http://dx.doi.org/10.1016/j.tig.2005.06.008

9. RL Hanson, S Kobes, RS Lindsay, WC Kmowler: Assessment of parent-of-origin effects in linkage analysis of quantitative traits. Am J Hum Genet 68, 951-962 (2001).
http://dx.doi.org/10.1086/319508

10. D-J de Koning, H Bovenhuis, JAM van Arendonk: On the detection of imprinted quantitative trait loci in experimental crosses of outbred species. Genetics 161, 931-938 (2002).

11. Y Cui: A statistical framework for genome-wide scanning and testing imprinted quantitative trait loci. J Theo Biol 244, 115-126 (2007).
http://dx.doi.org/10.1016/j.jtbi.2006.07.009

12. Y Cui, JM Cheverud, R Wu: A statistical model for dissecting genomic imprinting through genetic mapping. Genetica 130, 227-239 (2007).
http://dx.doi.org/10.1007/s10709-006-9101-x

13. T Liu, RJ Todhunter, S Wu, W Hou, R Mateescu, Z Zhang, NI Burton-Wurster, GM Acland, G Lust, R Wu: A random model for mapping imprinted quantitative trait loci in a structured pedigree: An implication for mapping canine hip dysplasia. Genomics 90, 276-284 (2007).

14. Y Li, CM Coelho, T Liu, S Wu, J Wu, Y Zeng, Y Li, B Hunter, RA Dante, BA. Larkins, R Wu: A statistical strategy to estimate maternal-zygotic interactions and parent-of-origin effects of QTLs for seed development. PLoS One 3, e3131 (2008).
http://dx.doi.org/10.1371/journal.pone.0003131

15. G Li, Y Cui: A statistical variance components framework for mapping imprinted quantitative trait loci in experimental crosses. J Prob Stat vol. 2009, Article ID 689489 (2009).
http://dx.doi.org/10.1155/2009/689489

16. G Li, Y Cui: A general statistical framwork for dissecting parent-of-origin effects underlying endosperm traits in flowering plants. Ann App Stat 4, 1214-1233 (2010).
http://dx.doi.org/10.1214/09-AOAS323

17. CM Coelho, S Wu, Y Li, B Hunter, RA Dante, Y Cui, R Wu, BA Larkins: Identification of quantitative trait loci that affect endoreduplication in maize endosperm. Theor Appl Genet 115, 1147-1162 (2007).
http://dx.doi.org/10.1007/s00122-007-0640-z

18. BP Dilkes, RA Dante, C Coelho, BA Larkins: Genetic analysis of endoreduplication in Zea mays endosperm: evidence of sporophytic and zygotic maternal control. Genetics 160, 1163-1177 (2002).

19. B Mangin, P Thoquet, N Grimsley: Pleiotropic QTL analysis. Biometrics 54, 88-99 (1998).
http://dx.doi.org/10.2307/2533998

20. JP Grime, MA Mowforth: Variation in genome size: an ecological interpretation. Nature 299, 151-153 (1982).
http://dx.doi.org/10.1038/299151a0

21. M Abney, MS McPeek, C Ober: Estimation of variance components of quantitative traits in inbred populations. Am J Hum Genet 66, 629-650 (2000).
http://dx.doi.org/10.1086/302759

22. Y Cui, R Wu: A statistical model for characterizing epistatic control of triploid endosperm triggered by maternal and offspring QTL. Genet Res 86, 65-76 (2005).
http://dx.doi.org/10.1017/S0016672305007615

23. RR Corbeil, SR Searle: A comparison of variance component estimators. Biometrics 32, 779-791 (1976).
http://dx.doi.org/10.2307/2529264

24. SG Self, KY Liang: Asymptotic properties of maximum likelihood estimators and likelihood ratio tests under nonstandard conditions. J Am Stat Asso 82, 605-610 (1987).
http://dx.doi.org/10.2307/2289471

25. G Li, Y Cui: Assessing statistical significance in genetic linkage analysis with the variance components model. Manuscript (2011).

26. RA Brink, D Cooper: The endosperm in seed development. Bot Rev 13, 423-541 (1947).
http://dx.doi.org/10.1007/BF02861548

27. G Churchill, RW Doerge: Empirical threshold values for quantitative trait mapping. Genetics 138, 963-971 (1994).

28. JL Kermicle: Dependence of the R-mottled aleurone phenotype in maize on the modes of sexual transmission. Genetics 66, 69-85 (1970).

29. S Chaudhuri, J Messing: Allele-specific parental imprinting of dzrl, a post transcriptional regulator of zein accumulation. Proc Natl Acad Sci 91, 4867-4871 (1994).
http://dx.doi.org/10.1073/pnas.91.11.4867

30. T Kinoshita, R Yadegari, JJ Harada, RB Goldberg, RL Fishcher: Imprinting of the MEDEA polycomb gene in the Arabidopsis endosperm. Plant Cell 11, 1945-1952 (1999).
http://dx.doi.org/10.1105/tpc.11.10.1945

31. G Lund, J Messing, A Viotti: Endosperm-specific demethylation and activation of specific alleles of a-tubulin genes of Zea mays L. Mol Gen Genet 246, 716-722 (1995).
http://dx.doi.org/10.1007/BF00290717

32. K Bomblies, JF Doebley: Pleiotropic effects of the duplicate maize FLORICAULA/LEAFY genes zfl1 and zfl2 on traits under selection during maize domestication. Genetics 172, 519-531 (2006).
http://dx.doi.org/10.1534/genetics.105.048595

33. RM Clark, TN Wagler, P Quijada, J Doebley: A distant upstream enhancer at the maize domestication gene tb1 has pleiotropic effects on plant and inflorescent architecture. Nat Genet 38, 594-597 (2006).
http://dx.doi.org/10.1038/ng1784

34. M Sturaro, H Hartings, E Schmelzer, R Velasco, F Salamini, M Motto: Cloning and characterization of GLOSSY1, a maize gene involved in cuticle membrane and wax production. Plant Physiol 138, 478-489 (2005).
http://dx.doi.org/10.1104/pp.104.058164

Abbreviations: GWAS: genome-wide association studies, IBD: identical-by-descent, iQTL: imprinting quantitative trait loci, LR: likelihood ratio, QTL: quantitative trait loci, REML: restricted maximum likelihood, SNP: single nucleotide polymorphism

Key Words: Backcross, Close linkage, Likelihood ratio test, Maize, Maximum likelihood, Pleiotropic effect

Send correspondence to: Yuehua Cui, Department of Statistics and Probability, A-432 Wells Hall, Michigan State University, East Lansing, MI 48824, Tel: 517-432-7098, Fax: 517-432-1405, E-mail:cui@stt.msu.edu