Prioritization of candidate genes in QTL regions based on associations between traits and biological processes
Bargsten, J.W. ; Nap, J.P.H. ; Sanchez Perez, G.F. ; Dijk, A.D.J. van - 2014
BMC Plant Biology 14 (2014). - ISSN 1471-2229
genome-wide association - protein function prediction - arabidopsis-thaliana - nucleotide polymorphisms - enrichment analysis - flowering time - complex traits - oryza-sativa - rice - architecture
Background: Elucidation of genotype-to-phenotype relationships is a major challenge in biology. In plants, it is the basis for molecular breeding. Quantitative Trait Locus (QTL) mapping makes it possible to link variation at the trait level to variation at the genomic level. However, QTL regions typically contain tens to hundreds of genes. To prioritize such candidate genes, we show that potentially causal genes for a trait can be identified based on overrepresentation of biological processes (gene functions) among the candidate genes in the QTL regions of that trait.
Results: The prioritization method was applied to rice QTL data, using gene functions predicted on the basis of sequence and expression information. On average, the number of candidate genes was reduced more than ten-fold. Comparison with various types of experimental datasets (including QTL fine-mapping and Genome Wide Association Study results) indicated both statistical significance and biological relevance of the obtained connections between genes and traits. A detailed analysis of flowering time QTLs illustrates that genes with completely unknown function are likely to play a role in this important trait.
Conclusions: Our approach can guide further experimentation and validation of causal genes for quantitative traits. In this way, it capitalizes on QTL data to uncover how individual genes influence trait variation.
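
The abstract does not state which statistical test is used to detect overrepresentation. As an illustration only, the idea can be sketched with a one-sided hypergeometric test, which is a common choice for enrichment analysis but is an assumption here. The function names, the toy gene identifiers, and the 0.05 threshold are all hypothetical, and the sketch applies no multiple-testing correction, which a real analysis would need.

```python
from math import comb


def hypergeom_sf(k, M, n, N):
    """P(X >= k) when N genes are drawn from M, of which n carry the annotation."""
    upper = min(n, N)
    hits = sum(comb(n, i) * comb(M - n, N - i) for i in range(k, upper + 1))
    return hits / comb(M, N)


def prioritize(qtl_genes, annotations, alpha=0.05):
    """Keep QTL genes annotated with a process overrepresented in the QTL regions.

    annotations maps every gene in the genome to a set of process terms
    (hypothetical input format; alpha is an uncorrected threshold).
    """
    qtl_genes = set(qtl_genes) & set(annotations)
    M, N = len(annotations), len(qtl_genes)
    genome_counts, qtl_counts = {}, {}
    # Count each process genome-wide and inside the QTL regions.
    for gene, terms in annotations.items():
        for term in terms:
            genome_counts[term] = genome_counts.get(term, 0) + 1
            if gene in qtl_genes:
                qtl_counts[term] = qtl_counts.get(term, 0) + 1
    enriched = {
        term for term, k in qtl_counts.items()
        if hypergeom_sf(k, M, genome_counts[term], N) < alpha
    }
    prioritized = sorted(g for g in qtl_genes if annotations[g] & enriched)
    return prioritized, enriched


# Toy genome of 20 genes; identifiers and annotations are made up.
annotations = {f"g{i}": set() for i in range(20)}
for i in range(4):
    annotations[f"g{i}"].add("flowering")
for i in range(4, 14):
    annotations[f"g{i}"].add("transport")

qtl_genes = ["g0", "g1", "g2", "g3" if False else "g4", "g3"][:4] + ["g4"]
qtl_genes = ["g0", "g1", "g2", "g3", "g4"][:3] + ["g4", "g14"]
prioritized, enriched = prioritize(qtl_genes, annotations)
```

In this toy case the five QTL genes are reduced to the three that share the enriched process, mirroring on a small scale the reduction of the candidate list described in the abstract.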