Modelling strategies for assessing and increasing the effectiveness of new phenotyping techniques in plant breeding
Eeuwijk, Fred A. van; Bustos-Korts, Daniela ; Millet, Emilie J. ; Boer, Martin P. ; Kruijer, Willem ; Thompson, Addie ; Malosetti, Marcos ; Iwata, Hiroyoshi ; Quiroz, Roberto ; Kuppe, Christian ; Muller, Onno ; Blazakis, Konstantinos N. ; Yu, Kang ; Tardieu, Francois ; Chapman, Scott C. - \ 2019
Plant Science 282 (2019). - ISSN 0168-9452 - p. 23 - 39.
Crop growth model - Genomic prediction - Genotype-by-environment-interaction - Genotype-to-phenotype model - Mixed model - Multi-environment model - Multi-trait model - Phenotyping - Phenotyping platform - Physiology - Plant breeding - Prediction - Reaction norm - Response surface - Statistical genetics
New types of phenotyping tools generate large amounts of data on many aspects of plant physiology and morphology with high spatial and temporal resolution. These new phenotyping data are potentially useful to improve understanding and prediction of complex traits, like yield, that are characterized by strong environmental context dependencies, i.e., genotype by environment interactions. For an evaluation of the utility of new phenotyping information, we will look at how this information can be incorporated in different classes of genotype-to-phenotype (G2P) models. G2P models predict phenotypic traits as functions of genotypic and environmental inputs. In the last decade, access to high-density single nucleotide polymorphism markers (SNPs) and sequence information has boosted the development of a class of G2P models called genomic prediction models that predict phenotypes from genome-wide marker profiles. The challenge now is to build G2P models that simultaneously incorporate extensive genomic information alongside new phenotypic information. Beyond the modification of existing G2P models, new G2P paradigms are required. We present candidate G2P models for the integration of genomic and new phenotyping information and illustrate their use in examples. Special attention will be given to the modelling of genotype by environment interactions. The G2P models provide a framework for model-based phenotyping and the evaluation of the utility of phenotyping information in the context of breeding programs.
Effects of alleles in crossbred pigs estimated for genomic prediction depend on their breed-of-origin
Sevillano, Claudia A. ; Napel, Jan ten; Guimarães, Simone E.F. ; Silva, Fabyano F. ; Calus, Mario P.L. - \ 2018
BMC Genomics 19 (2018)1. - ISSN 1471-2164
Breed-of-origin - Crossbred - Genomic prediction - Pig
Background: This study investigated if the allele effect of a given single nucleotide polymorphism (SNP) for crossbred performance in pigs estimated in a genomic prediction model differs depending on its breed-of-origin, and how these are related to estimated effects for purebred performance. Results: SNP-allele substitution effects were estimated for a commonly used SNP panel using a genomic best linear unbiased prediction model with breed-specific partial relationship matrices. Estimated breeding values for purebred and crossbred performance were converted to SNP-allele effects by breed-of-origin. Differences between purebred and crossbred, and between breeds-of-origin were evaluated by comparing percentage of variance explained by genomic regions for back fat thickness (BF), average daily gain (ADG), and residual feed intake (RFI). From ten regions explaining most additive genetic variance for crossbred performance, 1 to 5 regions also appeared in the top ten for purebred performance. The proportion of genetic variance explained by a genomic region and the estimated effect of a haplotype in such a region were different depending upon the breed-of-origin. To illustrate underlying mechanisms, we evaluated the estimated effects across breeds-of-origin for haplotypes associated to the melanocortin 4 receptor (MC4R) gene, and for the MC4Rsnp itself which is a missense mutation with a known effect on BF and ADG. Although estimated allele substitution effects of the MC4Rsnp mutation were very similar across breeds, explained genetic variance of haplotypes associated to the MC4R gene using a SNP panel that does not include the mutation, was considerably lower in one of the breeds where the allele frequency of the mutation was the lowest. Conclusions: Similar regions explaining similar additive genetic variance were observed across purebred and crossbred performance. 
Moreover, there was some overlap across breeds-of-origin between regions that explained relatively large proportions of genetic variance for crossbred performance; albeit that the actual proportion of variance deviated across breeds-of-origin. Results based on a missense mutation in MC4R confirmed that even if a causal locus has similar effects across breeds-of-origin, estimated effects and explained variance in its region using a commonly used SNP panel can strongly depend on the allele frequency of the underlying causal mutation.
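The conversion of estimated breeding values to SNP-allele effects mentioned in the abstract above can be illustrated with a minimal sketch. It assumes a single population and a plain GBLUP model without fixed effects, not the breed-specific partial relationship matrices used in the study. The sizes, the variance ratio `lam` and the random phenotypes are hypothetical placeholders.

```python
import numpy as np

rng = np.random.default_rng(0)
n_animals, n_snps = 20, 50   # hypothetical sizes: more SNPs than animals
lam = 10.0                   # assumed ratio of residual to SNP-effect variance

# Centered genotype matrix Z (animals x SNPs), coded 0/1/2 before centering.
M = rng.integers(0, 3, size=(n_animals, n_snps)).astype(float)
Z = M - M.mean(axis=0)
y = rng.normal(size=n_animals)   # placeholder phenotypes
y = y - y.mean()

# SNP-BLUP (ridge regression): solve directly for the SNP effects.
snp_effects = np.linalg.solve(Z.T @ Z + lam * np.eye(n_snps), Z.T @ y)

# Equivalent GBLUP: solve for breeding values using K = ZZ'.
K = Z @ Z.T
gebv = K @ np.linalg.solve(K + lam * np.eye(n_animals), y)

# Back-solve SNP effects from the breeding values. K is singular after
# centering, so a pseudo-inverse with an explicit cutoff is used.
backsolved = Z.T @ (np.linalg.pinv(K, rcond=1e-10) @ gebv)
```

Under these assumptions `backsolved` matches `snp_effects`, and `Z @ snp_effects` reproduces `gebv`, which is why the two parameterisations can be used interchangeably.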
Genomic prediction using individual-level data and summary statistics from multiple populations
Vandenplas, Jeremie ; Calus, Mario P.L. ; Gorjanc, Gregor - \ 2018
Genetics 210 (2018)1. - ISSN 0016-6731 - p. 53 - 69.
Genomic prediction - GenPred - Meta-analysis - Quantitative trait - Shared data resources - Statistical method
This study presents a method for genomic prediction that uses individual-level data and summary statistics from multiple populations. Genome-wide markers are nowadays widely used to predict complex traits, and genomic prediction using multi-population data is an appealing approach to achieve higher prediction accuracies. However, sharing of individual-level data across populations is not always possible. We present a method that enables integration of summary statistics from separate analyses with the available individual-level data. The data can consist of individuals with either single or multiple (weighted) phenotype records per individual. We developed a method based on a hypothetical joint analysis model and absorption of population-specific information. We show that population-specific information is fully captured by estimated allele substitution effects and the accuracy of those estimates, i.e., the summary statistics. The method gives results identical to the joint analysis of all individual-level data when complete summary statistics are available. We provide a series of easy-to-use approximations that can be used when complete summary statistics are not available or are impractical to share. Simulations show that the approximations enable integration of different sources of information across a wide range of settings, yielding accurate predictions. The method can be readily extended to multiple traits. In summary, the developed method enables integration of genome-wide data in the form of individual-level data or summary statistics from multiple populations to obtain more accurate estimates of allele substitution effects and genomic predictions.
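The claim that complete summary statistics reproduce the joint analysis can be checked in a simplified setting. The sketch below assumes a ridge-regression (SNP-BLUP) model with no fixed effects and the same variance ratio `lam` in both populations. It illustrates the principle only and is not the method developed in the paper. Population 2 shares just its estimated effects `a2` and the coefficient matrix `C2` of its mixed model equations, whose inverse is proportional to the prediction error covariance of `a2`. All names and sizes are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(1)
n_snps, lam = 30, 5.0                       # hypothetical settings
true_effects = rng.normal(scale=0.1, size=n_snps)

def simulate(n):
    """Simulate genotypes (0/1/2) and phenotypes for n individuals."""
    Z = rng.integers(0, 3, size=(n, n_snps)).astype(float)
    y = Z @ true_effects + rng.normal(size=n)
    return Z, y

Z1, y1 = simulate(40)    # population 1: individual-level data available
Z2, y2 = simulate(60)    # population 2: only summary statistics are shared

# Separate analysis in population 2 yields the summary statistics.
C2 = Z2.T @ Z2 + lam * np.eye(n_snps)       # coefficient matrix
a2 = np.linalg.solve(C2, Z2.T @ y2)         # estimated SNP effects

# Population 1 absorbs the summary statistics: C2 @ a2 recovers Z2'y2.
combined = np.linalg.solve(Z1.T @ Z1 + C2, Z1.T @ y1 + C2 @ a2)

# Reference: joint analysis of all individual-level data.
joint = np.linalg.solve(
    Z1.T @ Z1 + Z2.T @ Z2 + lam * np.eye(n_snps),
    Z1.T @ y1 + Z2.T @ y2,
)
```

In this setting `combined` equals `joint` up to floating-point error, because `C2` and `a2` together carry the same information as `Z2.T @ Z2` and `Z2.T @ y2`.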
Genomic prediction within and across biparental families : Means and variances of prediction accuracy and usefulness of deterministic equations
Schopp, Pascal ; Müller, Dominik ; Wientjes, Yvonne C.J. ; Melchinger, Albrecht E. - \ 2017
G3 : Genes Genomes Genetics 7 (2017)11. - ISSN 2160-1836 - p. 3571 - 3586.
Biparental families - GBLUP deterministic accuracy - Genomic prediction - Genomic selection - Genpred shared data resources - Linkage disequilibrium - Plant breeding
A major application of genomic prediction (GP) in plant breeding is the identification of superior inbred lines within families derived from biparental crosses. When models for various traits were trained within related or unrelated biparental families (BPFs), experimental studies found substantial variation in prediction accuracy (PA), but little is known about the underlying factors. We used SNP marker genotypes of inbred lines from either elite germplasm or landraces of maize (Zea mays L.) as parents to generate in silico 300 BPFs of doubled-haploid lines. We analyzed PA within each BPF for 50 simulated polygenic traits, using genomic best linear unbiased prediction (GBLUP) models trained with individuals from either full-sib (FSF), half-sib (HSF), or unrelated families (URF) for various sizes (Ntrain) of the training set and different heritabilities (h2). In addition, we modified two deterministic equations for forecasting PA to account for inbreeding and genetic variance unexplained by the training set. Averaged across traits, PA was high within FSF (0.41-0.97), with large variation only for Ntrain < 50 and h2 < 0.6. For HSF and URF, PA was on average ~40-60% lower and varied substantially among different combinations of BPFs used for model training and prediction as well as different traits. As exemplified by HSF results, PA of across-family GP can be very low if causal variants not segregating in the training set account for a sizeable proportion of the genetic variance among predicted individuals. Deterministic equations accurately forecast the PA expected over many traits, yet cannot capture trait-specific deviations. We conclude that model training within BPFs generally yields stable PA, whereas a high level of uncertainty is encountered in across-family GP. Our study shows the extent of variation in PA that must be at least reckoned with in practice and offers a starting point for the design of training sets composed of multiple BPFs.
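For orientation, one widely cited deterministic equation of this kind forecasts accuracy as sqrt(N h2 / (N h2 + Me)), where Me is the number of independent chromosome segments. The sketch below implements that unmodified textbook form. It does not include the adjustments for inbreeding or for unexplained genetic variance described in the abstract, and the function name and example values are hypothetical.

```python
import math

def expected_accuracy(n_train: int, h2: float, m_e: float) -> float:
    """Forecast prediction accuracy as sqrt(N*h2 / (N*h2 + Me)).

    n_train: size of the training set (N)
    h2:      heritability of the trait
    m_e:     assumed number of independent chromosome segments (Me)
    """
    if n_train < 0 or not 0.0 <= h2 <= 1.0 or m_e <= 0:
        raise ValueError("need n_train >= 0, 0 <= h2 <= 1 and m_e > 0")
    information = n_train * h2
    return math.sqrt(information / (information + m_e))

# Hypothetical scenarios: accuracy rises with training set size.
scenarios = {n: expected_accuracy(n, h2=0.5, m_e=50.0) for n in (25, 50, 100, 200)}
```

The formula makes the trade-off explicit: doubling Me requires doubling N * h2 to keep the same expected accuracy.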
Accuracies of breeding values for dry matter intake using nongenotyped animals and predictor traits in different lactations
Manzanilla-Pech, C.I.V. ; Veerkamp, R.F. ; Haas, Y. de; Calus, M.P.L. ; Napel, J. ten - \ 2017
Journal of Dairy Science 100 (2017)11. - ISSN 0022-0302 - p. 9103 - 9114.
Fat- and protein-corrected milk - Feed intake - Genomic prediction - Live weight
Given the interest in including dry matter intake (DMI) in the breeding goal, accurate estimated breeding values (EBV) for DMI are needed, preferably for separate lactations. Due to the limited amount of records available on DMI, 2 main approaches have been suggested to compute those EBV: (1) the inclusion of predictor traits, such as fat- and protein-corrected milk (FPCM) and live weight (LW), and (2) the addition of genomic information of animals using what is called genomic prediction. Recently, several methodologies to estimate EBV utilizing genomic information (GEBV) have become available. In this study, a new method known as single-step ridge-regression BLUP (SSRR-BLUP) is suggested. The SSRR-BLUP method does not have an imposed limit on the number of genotyped animals, as the commonly used methods do. The objective of this study was to estimate genetic parameters using a relatively large data set with DMI records, as well as compare the accuracies of the EBV for DMI. These accuracies were obtained using 4 different methods: BLUP (using pedigree for all animals with phenotypes), genomic BLUP (GBLUP; only for genotyped animals), single-step GBLUP (SS-GBLUP), and SSRR-BLUP (for genotyped and nongenotyped animals). Records from different lactations, with or without predictor traits (FPCM and LW), were used in the model. Accuracies of EBV for DMI (defined as the correlation between the EBV and pre-adjusted DMI phenotypes divided by the average accuracy of those phenotypes) ranged between 0.21 and 0.38 across methods and scenarios. Accuracies of EBV for DMI using BLUP were the lowest accuracies obtained across methods. Meanwhile, accuracies of EBV for DMI were similar in SS-GBLUP and SSRR-BLUP, and lower for the GBLUP method. Hence, SSRR-BLUP could be used when the number of genotyped animals is large, avoiding the construction of the inverse genomic relationship matrix. 
Adding information on DMI from different lactations in the reference population gave higher accuracies than when only lactation 1 was included. Finally, no benefit was obtained by adding information on predictor traits to the reference population when DMI was already included. However, in the absence of DMI records, having records on FPCM and LW from different lactations is a good way to obtain EBV with relatively good accuracy.
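The accuracy measure defined in the abstract above (correlation between EBV and pre-adjusted phenotypes, divided by the average accuracy of those phenotypes) can be written as a short function. This is a minimal sketch: the function name, the toy numbers and the assumed phenotype accuracy of 0.8 are hypothetical and not taken from the study.

```python
import numpy as np

def ebv_accuracy(ebv, adjusted_phenotypes, mean_phenotype_accuracy):
    """Correlation of EBV with pre-adjusted phenotypes, scaled by the
    average accuracy of those phenotypes as a proxy for breeding value."""
    if not 0.0 < mean_phenotype_accuracy <= 1.0:
        raise ValueError("mean_phenotype_accuracy must be in (0, 1]")
    ebv = np.asarray(ebv, dtype=float)
    adjusted_phenotypes = np.asarray(adjusted_phenotypes, dtype=float)
    correlation = np.corrcoef(ebv, adjusted_phenotypes)[0, 1]
    return correlation / mean_phenotype_accuracy

# Toy validation set of four animals (hypothetical values).
toy_accuracy = ebv_accuracy([1.0, 2.0, 3.0, 4.0], [1.0, 4.0, 2.0, 3.0], 0.8)
```

Dividing by the phenotype accuracy corrects for the fact that the phenotype is only a noisy stand-in for the true breeding value, so the raw correlation understates the accuracy of the EBV.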
Across population genomic prediction scenarios in which Bayesian variable selection outperforms GBLUP
Berg, S. van den; Calus, M.P.L. ; Meuwissen, T.H.E. ; Wientjes, Y.C.J. - \ 2015
BMC Genetics 16 (2015)1. - ISSN 1471-2156 - 12 p.
Accuracy - Across population - Bayesian variable selection - GBLUP - Genomic prediction - Number of independent chromosome segments
Background: The use of information across populations is an attractive approach to increase the accuracy of genomic prediction for numerically small populations. However, accuracies of across population genomic prediction, in which reference and selection individuals are from different populations, are currently disappointing. It has been shown for within population genomic prediction that Bayesian variable selection models outperform GBLUP models when the number of QTL underlying the trait is low. Therefore, our objective was to identify across population genomic prediction scenarios in which Bayesian variable selection models outperform GBLUP in terms of prediction accuracy. In this study, high density genotype information of 1033 Holstein Friesian, 105 Groningen White Headed, and 147 Meuse-Rhine-Yssel cows was used. Phenotypes were simulated using two changing variables: (1) the number of QTL underlying the trait (3000, 300, 30, 3), and (2) the correlation between allele substitution effects of QTL across populations, i.e. the genetic correlation of the simulated trait between the populations (1.0, 0.8, 0.4). Results: The accuracy obtained by the Bayesian variable selection model depended on the number of QTL underlying the trait, with a higher accuracy when the number of QTL was lower. This trend was more pronounced for across population genomic prediction than for within population genomic prediction. It was shown that Bayesian variable selection models have an advantage over GBLUP when the number of QTL underlying the simulated trait was small. This advantage disappeared when the number of QTL underlying the simulated trait was large. The point where the accuracy of Bayesian variable selection and GBLUP became similar was approximately the point where the number of QTL was equal to the number of independent chromosome segments (Me) across the populations. 
Conclusion: Bayesian variable selection models outperform GBLUP when the number of QTL underlying the trait is smaller than Me. Across populations, Me is considerably larger than within populations. So, it is more likely to find a number of QTL underlying a trait smaller than Me across populations than within a population. Therefore, Bayesian variable selection models can help to improve the accuracy of across population genomic prediction.
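The two simulation variables described above (number of QTL and the correlation between allele substitution effects across populations) can be sketched as a draw from a bivariate normal distribution. This is an illustration under assumed unit variances, not the simulation code of the study. The function name and seed are hypothetical.

```python
import numpy as np

def simulate_qtl_effects(n_qtl, genetic_correlation, seed=0):
    """Draw allele substitution effects for two populations.

    Returns an array of shape (n_qtl, 2): column 0 holds the effects in
    population A and column 1 the effects in population B, with unit
    variances and the requested correlation between columns.
    """
    rng = np.random.default_rng(seed)
    covariance = np.array([[1.0, genetic_correlation],
                           [genetic_correlation, 1.0]])
    return rng.multivariate_normal(np.zeros(2), covariance, size=n_qtl)

# One scenario from the grid in the abstract: 3000 QTL, correlation 0.8.
effects = simulate_qtl_effects(n_qtl=3000, genetic_correlation=0.8)
realised_correlation = np.corrcoef(effects[:, 0], effects[:, 1])[0, 1]
```

With few QTL (for example 3 or 30) the realised correlation can deviate strongly from the target value, which is worth keeping in mind when interpreting scenarios with a small number of QTL.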