Genomic Prediction of Grain Yield and Drought-Adaptation Capacity in Sorghum Is Enhanced by Multi-Trait Analysis
Velazco, Julio G.; Jordan, David R.; Mace, Emma S.; Hunt, Colleen H.; Malosetti, Marcos; Eeuwijk, Fred A. van - 2019
Frontiers in Plant Science 10 (2019). - ISSN 1664-462X
auxiliary trait - blended kinship matrix - BLUP - genomic prediction - grain yield - multi-trait analysis - sorghum - stay-green
Grain yield and the stay-green drought adaptation trait are important targets of selection in grain sorghum breeding for broad adaptation to a range of environments. Genomic prediction for these traits may be enhanced by joint multi-trait analysis. The objectives of this study were to assess the capacity of multi-trait models to improve genomic prediction of parental breeding values for grain yield and stay-green in sorghum by using information from correlated auxiliary traits, and to determine the combinations of traits that optimize predictive results in specific scenarios. The dataset included phenotypic performance of 2645 testcross hybrids across 26 environments as well as genomic and pedigree information on their female parental lines. The traits considered were grain yield (GY), stay-green (SG), plant height (PH), and flowering time (FT). We evaluated the improvement in predictive performance of multi-trait G-BLUP models relative to single-trait G-BLUP. The use of a blended kinship matrix exploiting pedigree and genomic information was also explored to optimize multi-trait predictions. Predictive ability for GY increased by up to 16% when PH information on the training population was exploited through multi-trait genomic analysis. For SG prediction, full advantage from multi-trait G-BLUP was obtained only when GY information was also available on the predicted lines per se, with predictive ability improvements of up to 19%. Predictive ability, unbiasedness and accuracy of predictions from conventional multi-trait G-BLUP were further optimized by using a combined pedigree-genomic relationship matrix. Results of this study suggest that multi-trait genomic evaluation combining routinely measured traits may be used to improve prediction of crop productivity and drought adaptability in grain sorghum.
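The abstract above mentions a blended kinship matrix that combines pedigree and genomic information but does not say how the blend is formed. A minimal sketch, assuming the blend is a simple weighted sum K = w*A + (1 - w)*G of a pedigree matrix A and a genomic matrix G (the function name `blend_kinship`, the weight `w`, and the toy matrices are all hypothetical and not taken from the paper):

```python
def blend_kinship(A, G, w):
    """Return w*A + (1 - w)*G for two square relationship matrices.

    A and G are lists of lists of equal size; w is the weight given to
    the pedigree matrix A. The weighted-sum form is an assumption.
    """
    if not 0.0 <= w <= 1.0:
        raise ValueError("w must lie in [0, 1]")
    n = len(A)
    if len(G) != n or any(len(row) != n for row in A + G):
        raise ValueError("A and G must be square matrices of the same size")
    return [[w * A[i][j] + (1.0 - w) * G[i][j] for j in range(n)]
            for i in range(n)]


# Toy example with two related lines (values are illustrative only).
A = [[1.0, 0.25], [0.25, 1.0]]    # pedigree-based relationships
G = [[1.02, 0.31], [0.31, 0.97]]  # marker-based relationships
K = blend_kinship(A, G, w=0.5)
print(K)
```

With `w = 0` the result is the purely genomic matrix and with `w = 1` the purely pedigree-based one, so the weight can be read as how much the pedigree is trusted relative to the markers.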
Improving accuracy of direct and maternal genetic effects in genomic evaluations using pooled boar semen: a simulation study
Maiorano, Amanda M.; Assen, Alula; Bijma, Piter; Chen, Ching Yi; Silva, Josineudson Augusto Ii Vasconcelos; Herring, William O.; Tsuruta, Shogo; Misztal, Ignacy; Lourenco, Daniela A.L. - 2019
Journal of Animal Science 97 (2019)8. - ISSN 0021-8812 - p. 3237 - 3245.
genomic prediction - maternal ability - multiple sire - prediction accuracy
Pooling semen of multiple boars is commonly used in swine production systems. Compared with single boar systems, this technique changes family structure, creating maternal half-sib families. The aim of this simulation study was to investigate how pooling semen affects the accuracy of estimating direct and maternal effects for individual piglet birth weight in purebred pigs. Different scenarios of pooling semen were simulated by allowing the same female to mate with 1 to 6 boars per insemination, whereas litter size was kept constant (N = 12). In each pooled boar scenario, genomic information was used either to construct the genomic relationship matrix (G) or to reconstruct pedigree in addition to G. Genotypes were generated for 60,000 SNPs evenly distributed across 18 autosomes. From the 5 simulated generations, only animals from generations 3 to 5 were genotyped (N = 36,000). Direct and maternal true breeding values (TBV) were computed as the sum of the effects of the 1,080 QTLs. Phenotypes were constructed as the sum of direct TBV, maternal TBV, an overall mean of 1.25 kg, and a residual effect. The simulated heritabilities for direct and maternal effects were 0.056 and 0.19, respectively, and the genetic correlation between both effects was -0.25. All simulations were replicated 5 times. Variance components and direct and maternal heritability were estimated using average information REML. Predictions were computed via pedigree-based BLUP and single-step genomic BLUP (ssGBLUP). Genotyped littermates in the last generation were used for validation. Prediction accuracies were calculated as correlations between EBV and TBV for direct (accdirect) and maternal (accmat) effects. When boars were known, accdirect was 0.21 (1 boar) and 0.26 (6 boars) for BLUP, whereas for ssGBLUP, it was 0.38 (1 boar) and 0.43 (6 boars). When boars were unknown, accdirect was lower in BLUP but similar in ssGBLUP. For the scenario with known boars, accmat was 0.58 and 0.63 for 1 and 6 boars, respectively, under ssGBLUP. For unknown boars, accmat was 0.63 for 2 boars and 0.62 for 6 boars in ssGBLUP. In general, accdirect and accmat were lower in the single-boar scenario compared with pooled semen scenarios, indicating that a half-sib structure is more adequate to estimate direct and maternal effects. Using pooled semen from multiple boars can help to improve the accuracy of predicting maternal and direct effects when maternal half-sib families are larger than 2.
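Two steps described in the abstract above can be sketched directly: building a phenotype as the sum of direct TBV, maternal TBV, the overall mean of 1.25 kg and a residual, and measuring accuracy as the correlation between EBV and TBV. The function names and the toy numbers below are hypothetical; the study's own simulation and software are not reproduced here.

```python
import math


def phenotype(tbv_direct, tbv_maternal, residual, mean=1.25):
    """Birth weight (kg) as mean + direct TBV + maternal TBV + residual."""
    return mean + tbv_direct + tbv_maternal + residual


def accuracy(ebv, tbv):
    """Pearson correlation between estimated and true breeding values."""
    n = len(ebv)
    if n != len(tbv) or n < 2:
        raise ValueError("need two equally long lists with at least 2 values")
    mean_e = sum(ebv) / n
    mean_t = sum(tbv) / n
    cov = sum((e - mean_e) * (t - mean_t) for e, t in zip(ebv, tbv))
    var_e = sum((e - mean_e) ** 2 for e in ebv)
    var_t = sum((t - mean_t) ** 2 for t in tbv)
    return cov / math.sqrt(var_e * var_t)


# Toy validation set of five piglets (illustrative values only).
tbv = [0.10, -0.20, 0.05, 0.30, -0.15]
ebv = [0.05, -0.10, 0.00, 0.20, -0.05]
acc_direct = accuracy(ebv, tbv)
print(round(acc_direct, 2))
```

The same `accuracy` call applies to maternal effects by passing maternal EBV and TBV instead of the direct ones.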
Incorporating genome-wide association into eco-physiological simulation to identify markers for improving rice yields
Kadam, Niteen N.; Jagadish, Krishna S.V.; Struik, Paul C.; Linden, Gerard C. van der; Yin, Xinyou - 2019
Journal of Experimental Botany 70 (2019)9. - ISSN 0022-0957 - p. 2575 - 2586.
Oryza sativa - Crop modelling - genomic prediction - genotype–phenotype relationships - GWAS - marker design
We explored the use of the eco-physiological crop model GECROS to identify markers for improved rice yield under well-watered (control) and water deficit conditions. Eight model parameters were measured from the control in one season for 267 indica genotypes. The model accounted for 58% of yield variation among genotypes under control and 40% under water deficit conditions. Using 213 randomly selected genotypes as the training set, 90 single nucleotide polymorphism (SNP) loci were identified using a genome-wide association study (GWAS), explaining 42-77% of crop model parameter variation. SNP-based parameter values estimated from the additive loci effects were fed into the model. For the training set, the SNP-based model accounted for 37% (control) and 29% (water deficit) of yield variation, less than the 78% explained by a statistical genomic prediction (GP) model for the control treatment. Both models failed in predicting yields of the 54 testing genotypes. However, compared with the GP model, the SNP-based crop model was advantageous when simulating yields under either control or water stress conditions in an independent season. Crop model sensitivity analysis ranked the SNP loci for their relative importance in accounting for yield variation, and the rank differed greatly between control and water deficit environments. Crop models have the potential to use single-environment information for predicting phenotypes under different environments.
Multibreed genomic prediction using multitrait genomic residual maximum likelihood and multitask Bayesian variable selection
Calus, M.P.L.; Goddard, M.E.; Wientjes, Y.C.J.; Bowman, P.J.; Hayes, B.J. - 2018
Journal of Dairy Science 101 (2018)5. - ISSN 0022-0302 - p. 4279 - 4294.
Bayesian variable selection - genomic prediction - multibreed
Genomic prediction is applicable to individuals of different breeds. Empirical results to date, however, show limited benefits in using information on multiple breeds in the context of genomic prediction. We investigated a multitask Bayesian model, presented previously by others, implemented in a Bayesian stochastic search variable selection (BSSVS) model. This model allowed for evidence of quantitative trait loci (QTL) to be accumulated across breeds or for both QTL that segregate across breeds and breed-specific QTL. In both cases, single nucleotide polymorphism effects were estimated with information from a single breed. Other models considered were a single-trait and multitrait genomic residual maximum likelihood (GREML) model, with breeds considered as different traits, and a single-trait BSSVS model. All single-trait models were applied to each of the 2 breeds separately and to the pooled data of both breeds. The data used included a training data set of 6,278 Holstein and 722 Jersey bulls, as well as 374 Jersey validation bulls. All animals had genotypes for 474,773 single nucleotide polymorphisms after editing and phenotypes for milk, fat, and protein yields. Using the same training data, BSSVS consistently outperformed GREML. The multitask BSSVS, however, did not outperform single-trait BSSVS, which used pooled Holstein and Jersey data for training. Thus, the rigorous assumption that the traits are the same in both breeds yielded a slightly better prediction than a model that had to estimate the correlation between the breeds from the data. Adding the Holstein data significantly increased the accuracy of the single-trait GREML and BSSVS in predicting the Jerseys for milk and protein, in line with estimated correlations between the breeds of 0.66 and 0.47 for milk and protein yields, whereas only the BSSVS model significantly improved the accuracy for fat yield with an estimated correlation between breeds of only 0.05. The relatively high genetic correlations for milk and protein yields, and the superiority of the pooling strategy, are likely the result of the observed admixture between both breeds in our data. The Bayesian model was able to detect several QTL in Holsteins, which likely enabled it to outperform GREML. The inability of the multitask Bayesian models to outperform a simple pooling strategy may be explained by the fact that the pooling strategy assumes equal effects in both breeds; furthermore, this assumption may be valid for moderate- to large-sized QTL, which are important for multibreed genomic prediction.
Data from: Across population genomic prediction scenarios in which Bayesian variable selection outperforms GBLUP
Berg, S. van den; Calus, M.P.L.; Meuwissen, T.H.E.; Wientjes, Y.C.J. - 2015
genomic prediction - across population - Bayesian variable selection - GBLUP - accuracy - number of independent chromosome segments
Background: The use of information across populations is an attractive approach to increase the accuracy of genomic prediction for numerically small populations. However, accuracies of across population genomic prediction, in which reference and selection individuals are from different populations, are currently disappointing. It has been shown for within population genomic prediction that Bayesian variable selection models outperform GBLUP models when the number of QTL underlying the trait is low. Therefore, our objective was to identify across population genomic prediction scenarios in which Bayesian variable selection models outperform GBLUP in terms of prediction accuracy. In this study, high density genotype information of 1033 Holstein Friesian, 105 Groningen White Headed, and 147 Meuse-Rhine-Yssel cows was used. Phenotypes were simulated using two changing variables: (1) the number of QTL underlying the trait (3000, 300, 30, 3), and (2) the correlation between allele substitution effects of QTL across populations, i.e. the genetic correlation of the simulated trait between the populations (1.0, 0.8, 0.4). Results: The accuracy obtained by the Bayesian variable selection model depended on the number of QTL underlying the trait, with a higher accuracy when the number of QTL was lower. This trend was more pronounced for across population genomic prediction than for within population genomic prediction. It was shown that Bayesian variable selection models have an advantage over GBLUP when the number of QTL underlying the simulated trait was small. This advantage disappeared when the number of QTL underlying the simulated trait was large. The point where the accuracy of Bayesian variable selection and GBLUP became similar was approximately the point where the number of QTL was equal to the number of independent chromosome segments (M_e) across the populations. Conclusion: Bayesian variable selection models outperform GBLUP when the number of QTL underlying the trait is smaller than M_e. Across populations, M_e is considerably larger than within populations. So, it is more likely to find a number of QTL underlying a trait smaller than M_e across populations than within population. Therefore Bayesian variable selection models can help to improve the accuracy of across population genomic prediction.
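The simulation design in the abstract above varies the number of QTL (3000, 300, 30, 3) and the correlation between allele substitution effects across populations (1.0, 0.8, 0.4). A minimal sketch of one way to draw such correlated effects, assuming standard normal effects and the construction b2 = r*b1 + sqrt(1 - r^2)*e (the abstract does not state the distribution or construction used, so both are assumptions, and all names are hypothetical):

```python
import math
import random


def simulate_qtl_effects(n_qtl, r, seed=0):
    """Draw QTL effects for two populations with expected correlation r."""
    if not -1.0 <= r <= 1.0:
        raise ValueError("r must lie in [-1, 1]")
    rng = random.Random(seed)
    pop1 = [rng.gauss(0.0, 1.0) for _ in range(n_qtl)]
    noise = [rng.gauss(0.0, 1.0) for _ in range(n_qtl)]
    scale = math.sqrt(1.0 - r * r)
    # Mix the population-1 effect with independent noise so that
    # corr(pop1, pop2) equals r in expectation.
    pop2 = [r * a + scale * e for a, e in zip(pop1, noise)]
    return pop1, pop2


def correlation(x, y):
    """Pearson correlation of two equally long lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


# Realized correlation for each scenario in the design grid.
for n_qtl in (3000, 300, 30, 3):
    for r in (1.0, 0.8, 0.4):
        b1, b2 = simulate_qtl_effects(n_qtl, r, seed=1)
        print(n_qtl, r, round(correlation(b1, b2), 2))
```

With only 3 or 30 QTL the realized correlation can differ noticeably from the target, which is worth keeping in mind when comparing scenarios with few QTL.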
Data from: An equation to predict the accuracy of genomic values by combining data from multiple traits, populations, or environments
Wientjes, Y.C.J.; Bijma, P.; Veerkamp, R.F.; Calus, M.P.L. - 2015
genomic prediction - multi-population - accuracy - prediction equation
Predicting the accuracy of estimated genomic values using genome-wide marker information is an important step in designing training populations. Currently, different deterministic equations are available to predict accuracy within populations, but not for multipopulation scenarios where data from multiple breeds, lines or environments are combined. Therefore, our objective was to develop and validate a deterministic equation to predict the accuracy of genomic values when different populations are combined in one training population. The input parameters of the derived prediction equation are the number of individuals and the heritability from each of the populations in the training population; the genetic correlations between the populations, i.e., the correlation between allele substitution effects of quantitative trait loci; the effective number of chromosome segments across predicted and training populations; and the proportion of the genetic variance in the predicted population captured by the markers in each of the training populations. Validation was performed based on real genotype information of 1033 Holstein–Friesian cows that were divided into three different populations by combining half-sib families in the same population. Phenotypes were simulated for multiple scenarios, differing in heritability within populations and in genetic correlations between the populations. Results showed that the derived equation can accurately predict the accuracy of estimating genomic values for different scenarios of multipopulation genomic prediction. Therefore, the derived equation can be used to investigate the potential accuracy of different multipopulation genomic prediction scenarios and to decide on the optimal design of training populations.
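The abstract above refers to existing deterministic equations for within-population accuracy without stating one. As an illustration only, the sketch below uses a widely cited single-population approximation, r = sqrt(N*h2 / (N*h2 + Me)), usually attributed to Daetwyler and colleagues. It is not the multipopulation equation derived in this work, and the function name and input values are hypothetical.

```python
import math


def expected_accuracy(n, h2, me):
    """Within-population approximation: sqrt(n*h2 / (n*h2 + me)).

    n  : number of individuals in the training population
    h2 : heritability of the trait (between 0 and 1)
    me : effective number of independent chromosome segments
    """
    if n <= 0 or me <= 0 or not 0.0 < h2 <= 1.0:
        raise ValueError("require n > 0, me > 0 and 0 < h2 <= 1")
    return math.sqrt(n * h2 / (n * h2 + me))


# Effect of training population size at fixed h2 and Me (toy values).
for n in (500, 1000, 5000):
    print(n, round(expected_accuracy(n, h2=0.3, me=1000), 3))
```

The form makes the trade-off visible: accuracy rises with the product of training size and heritability and falls as the number of independent segments to be estimated grows.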