Exploiting whole genome sequence variants in cattle breeding : Unraveling the distribution of genetic variants and role of rare variants in genomic evaluation
Zhang, Qianqian - \ 2017
Wageningen University. Promotor(en): H. Bovenhuis; M.S. Lund, co-promotor(en): G. Sahana; M. Calus; B. Guldbrandtsen. - Wageningen : Wageningen University - ISBN 9788793643147 - 249
cattle - genomes - genetic variation - inbreeding - homozygosity - longevity - quantitative traits - animal breeding - animal genetics - rundvee - genomen - genetische variatie - inteelt - homozygotie - gebruiksduur - kwantitatieve kenmerken - dierveredeling - diergenetica
The availability of whole genome sequence data enables better exploration of the genetic mechanisms underlying the quantitative traits that are targeted in animal breeding. This thesis presents different strategies and perspectives on the utilization of whole genome sequence variants in cattle breeding. Using whole genome sequence variants, I show the genetic variation, recent and ancient inbreeding, and genome-wide pattern of introgression across the demographic and breeding history of different cattle populations. Using the latest genomic tools, I demonstrate that recent inbreeding can be estimated accurately from runs of homozygosity (ROH). This can further be utilized to control inbreeding in breeding programs. In chapters 2 and 4, by in-depth genomic analysis of whole genome sequence data, I demonstrate that the distribution of functional genetic variants in ROH regions and introgressed haplotypes was shaped by recent selective breeding in cattle populations. The contribution of whole genome sequence variants to the phenotypic variation partly depends on their allele frequencies. Common variants associated with different traits have been identified and explain a considerable proportion of the genetic variance. For example, common variants from whole genome sequence associated with longevity have been identified in chapter 5. However, the identified common variants cannot explain the full genetic variance, and rare variants might play an important role here. Rare variants may account for a large proportion of the whole genome sequence variants, but are often ignored in genomic evaluation, partly because of the difficulty of identifying associations between rare variants and phenotypes. Using a simulation study, I compared the power of different gene-based association mapping methods that combine the rare variants within a gene.
Those gene-based methods had a higher power for mapping rare variants than mixed linear models applying single marker tests, which are commonly used for common variants. Moreover, I explored the role of rare and low-frequency variants in the variation of different complex traits and their impact on genomic prediction reliability. Rare and low-frequency variants contributed relatively more to variation for health-related traits than for production traits, reflecting the potential of improving prediction reliability using rare and low-frequency variants for health-related traits. However, in practice, only a marginal improvement in the reliability of genomic prediction for fertility, longevity and health traits was observed when selected rare and low-frequency variants were combined with 50k SNP genotype data. A simulation study did show that the reliability of genomic prediction could be improved provided that the causal rare and low-frequency variants affecting a trait are known.
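The idea of estimating recent inbreeding from runs of homozygosity can be illustrated with a minimal sketch. It is not the procedure used in the thesis: the function names (`find_roh`, `f_roh`), the 0/1/2 genotype coding and the thresholds are assumptions made for illustration, and practical ROH callers additionally tolerate a few heterozygous or missing calls within a run.

```python
from typing import List, Sequence, Tuple


def find_roh(genotypes: Sequence[int],
             positions: Sequence[int],
             min_snps: int = 50,
             min_length_bp: int = 1_000_000) -> List[Tuple[int, int]]:
    """Return (start_bp, end_bp) of uninterrupted homozygous runs.

    Assumption: genotypes are coded 0/1/2 (alternative allele count),
    so 1 marks a heterozygous call, and positions are sorted base-pair
    coordinates on a single chromosome.
    """
    runs = []
    start = None
    # A sentinel heterozygous call at the end closes the last open run.
    for i, g in enumerate(list(genotypes) + [1]):
        if g != 1 and start is None:
            start = i                      # a new homozygous run begins
        elif g == 1 and start is not None:
            end = i - 1                    # the run ended at the previous SNP
            long_enough = positions[end] - positions[start] >= min_length_bp
            if end - start + 1 >= min_snps and long_enough:
                runs.append((positions[start], positions[end]))
            start = None
    return runs


def f_roh(runs: List[Tuple[int, int]], genome_length_bp: int) -> float:
    """Inbreeding coefficient: fraction of the covered genome inside ROH."""
    return sum(end - start for start, end in runs) / genome_length_bp
```

Longer runs point to more recent common ancestors, which is why a length threshold separates recent from ancient inbreeding in this kind of analysis.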
Comparative genomics and trait evolution in Cleomaceae, a model family for ancient polyploidy
Bergh, Erik van den - \ 2017
Wageningen University. Promotor(en): M.E. Schranz; Y. van de Peer. - Wageningen : Wageningen University - ISBN 9789463431705 - 106
capparaceae - genomics - polyploidy - evolution - genomes - reproductive traits - flowers - colour - glucosinolates - genetic variation - biosystematics - taxonomy - identification - capparaceae - genomica - polyploïdie - evolutie - genomen - voortplantingskenmerken - bloemen - kleur - glucosinolaten - genetische variatie - biosystematiek - taxonomie - identificatie
As more and more species have been sequenced, evidence has been piling up for a fascinating phenomenon that seems to occur in all plant lineages: paleopolyploidy. Polyploidy has historically been a much observed and studied trait, but until recently it was assumed that polyploids were evolutionary dead-ends due to their sterility. However, many studies since the 1990s have challenged this notion by finding evidence for ancient genome duplications in the genomes of many extant species. This led to the observation that all seed plants share at least one ancestral polyploidy event. Another polyploidy event has been proven to lie at the base of all angiosperms, further supporting the notion that ancient polyploidy is widespread and common. These findings have led to questions regarding the apparent disadvantages that can be observed in a first generation polyploid. If these disadvantages can be overcome, however, duplication of a genome also presents an enormous potential for evolutionary novelty. Duplicated copies of genes are able to acquire changes that can lead to specialization of the duplicated pair into two functions (subfunctionalization) or the development of one copy towards an entirely new function (neofunctionalization).
Currently, most research towards polyploidy has focused on the economically and scientifically important Brassicaceae family containing the model plant Arabidopsis thaliana and many crops such as cabbage, rapeseed, broccoli and turnip. In this thesis, I lay the foundations for the expansion of this scope to the Cleomaceae, a widespread cosmopolitan plant family and a sister family of Brassicaceae. The species within Cleomaceae are diverse and exhibit many scientifically interesting traits. They are also in a perfect position phylogenetically to draw comparisons with the much more studied Brassicaceae. I describe the Cleomaceae and their relevance to polyploid research in more detail in the Introduction. I then describe the important first step towards setting up the genetic framework of this family with the sequencing of Tarenaya hassleriana in Chapter 1.
In Chapter 2, I have studied the effects of polyploidy on the development of C4 photosynthesis by comparing the transcriptome of the C3 species Tarenaya hassleriana with that of the C4 species Gynandropsis gynandra. C4 photosynthesis is an elaboration of the more common C3 form of photosynthesis that concentrates CO2 in specific cells, leading to decreased photorespiration by RuBisCO and higher photosynthetic efficiency in low CO2 environments. I find that polyploidy has not led to sub- or neofunctionalization towards the development of this trait, but instead find evidence for another important phenomenon in postpolyploid evolution: the dosage balance hypothesis. This hypothesis states that genes which are dependent on specific dosage levels of their products will be maintained in duplicate; any change in their function would lead to dosage imbalance, which would have deleterious effects on their pathway. We show that most genes involved in photosynthesis have returned to single copy in G. gynandra and that the changes leading to C4 have mostly taken place at the expression level, confirming current assumptions on the development of this trait.
In Chapter 3, I have studied the effects of polyploidy on an important class of plant defence compounds: glucosinolates. These compounds, sometimes referred to as ‘mustard oils’, play an important role in the defence against herbivores and have radiated widely in Brassicaceae to form many different ‘flavors’ to deter specific herbivores. I show that in Cleomaceae many genes responsible for these compounds have benefited from the three rounds of polyploidy that T. hassleriana has undergone and that many duplicated genes have been retained. We also show that more than 75% of them are actively expressed in the plant, proving that the majority of these duplicates have an active function in the plant.
Finally, in Chapter 4 I investigate a simple observation made during experiments with T. hassleriana in the greenhouse regarding the variation in flower colour between different individuals: some had pink flowers and some purple. Using LC-PDA mass spectrometry we find that the two colours are caused by different levels of two anthocyanin pigments, with cyanidin dominating in the purple flowers and pelargonidin being more abundant in pink flowers. Through sequence comparison and synteny analysis between A. thaliana and T. hassleriana we find the orthologs of the genes involved in this pathway. Using a Genotyping by Sequencing method on a cross between these two flower colours, we produce a collection of SNP markers on the reference genome. With these SNPs, we find two significant binary trait loci, one of which corresponds to the location of the F3’H ortholog which performs the conversion of a pelargonidin precursor to a cyanidin precursor.
In the General Conclusion, I combine all findings of the previous chapters and explain how they establish part of a larger species framework to study ancient polyploidy in angiosperms. I then put forth what these findings can mean for possible future research and the directions that are worth exploring further.
From species to trait evolution in Aethionema (Brassicaceae)
Mohammadin, Setareh - \ 2017
Wageningen University. Promotor(en): M.E. Schranz. - Wageningen : Wageningen University - ISBN 9789463431385 - 125
brassicaceae - evolution - rna - genomes - genetic diversity - phytogeography - glucosinolates - quantitative trait loci - next generation sequencing - brassicaceae - evolutie - rna - genomen - genetische diversiteit - plantengeografie - glucosinolaten - loci voor kwantitatief kenmerk - next generation sequencing
The plant family Brassicaceae (or crucifers) is an economically important group that includes many food crops (e.g. cabbages and radishes), horticultural species (e.g. Draba, Iberis, Lunaria), and model plant species (particularly Arabidopsis thaliana). Because of the fundamental importance of A. thaliana to plant biology, the Brassicaceae are an ideal system for comparative genomics and for testing wider evolutionary, ecological and speciation hypotheses. One such hypothesis is the ‘Whole Genome Duplication Radiation Lag Time’ (WGD-RLT) model for the role of polyploidy in the evolution of important plant families such as the Brassicaceae. The WGD-RLT model indicates a higher rate of diversification of a core group compared to its sister group, due to a lag time after a whole genome duplication event that made it possible for novel traits or geological or ecological events to increase the core group's diversification rate.
Aethionema is the species-poor sister genus of the core Brassicaceae and hence is at an important comparative position to analyse trait and genomic evolution of the species-rich core group. Aethionema species occur mainly in the western Irano-Turanian region, which is also the biodiversity hotspot of the Brassicaceae family. Moreover, comparing Aethionema to the Brassicaceae core group can help us to understand and test the ‘WGD-RLT’ model. However, to be able to do so we first need to know more about Aethionema. In this thesis, I investigated various levels of evolutionary change (from macro, to micro, to trait evolution) within the genus Aethionema, with a major focus on the emerging model species Aethionema arabicum.
Next generation sequencing has made it possible to use the genomes of many species in a comparative framework. However, the formation of proteins and enzymes, and in the end the phenotype of the whole plant, relies on transcription from particular regions of the genome including genes. Hence, the transcriptome makes it possible to assess the functional parts of the genome. However, the functional part of the genome not only relies on the protein coding genes. Gene regulatory elements like promoters and long non-coding RNAs function as regulators of gene expression and hence are involved in increasing or decreasing transcription. In Chapter 2 I used the transcriptome of four different Aethionema species to understand the lineage specificity of these long non-coding RNAs. Moreover in a comparison with the Brassicaceae core group and Brassicaceae’s sister family the Cleomaceae I show that although the position of long non-coding RNAs can be conserved, their sequences do not have to be.
Most of the Aethionema species occur in the Irano-Turanian region, a politically unstable region, making it hard for scientists to collect there. However, the natural history collections made throughout the last centuries are a great resource. Combining these collections with the newest sequencing techniques, e.g. next generation sequencing, has allowed me to infer the phylogeny of ~75% of the known Aethionema species in a time-calibrated and historical biogeographical framework. Hence, I was able to establish that Aethionema species likely originated from the Anatolian Diagonal and that major geological events like the uplift of the Turkish and Iranian plateau have had a hand in their speciation (Chapter 3).
To examine species-level processes I sequenced and analysed transcriptomes of eight Ae. arabicum accessions coming from Cyprus, Iran and Turkey to investigate population structure, genetic diversity and local adaptation (Chapter 4). The most prominent finding was a ploidy difference between the Iranian and Turkish/Cypriotic lines, whereby the former were (allo)tetraploid and the latter diploid. The tetraploid Iranian lines seem to have one set of alleles from the Turkish/Cypriotic gene-pool. However we do not know where the other alleles come from. In addition to the differences in ploidy level there are also differences in glucosinolate defence compounds between these two populations (Iranian vs Turkish/Cypriotic), with the Iranian lines lacking the diversity and concentration of indolic glucosinolates that the Turkish/Cypriotic lines have. This chapter serves as a good resource and starting point for future research in the region, maybe by using the natural history collections that are at hand.
Glucosinolates (i.e. mustard oils) are mainly made by Brassicales species, with their highest structural diversity in the Brassicaceae. In Chapter 5, I examined two Ae. arabicum lines (CYP and TUR) and their recombinant inbred lines to assess glucosinolate composition in different tissues and throughout the plant's development. The levels of glucosinolates in the leaves changed when Ae. arabicum went from a vegetative to a reproductive state. Moreover, a major difference in glucosinolate content (up to 10-fold) between CYP and TUR indicates a likely regulatory pathway outside of the main glucosinolate biosynthesis pathway. Multi-trait and multi-environment QTL analyses based on leaves, reproductive tissues and seeds identified a single major QTL. Fine mapping this region reduced the interval to only fifteen protein coding genes, including the two most intriguing candidates: FLOWERING LOCUS C (FLC) and the sulphate transporter SULTR2;1. These findings show an interesting correlation between development and defence.
Finally, Chapter 6 gives a final discussion of this thesis and its results. It brings the different topics together, puts them in a bigger picture and looks forward to new research possibilities.
The transcriptome as early marker of diet-related health : evidence in energy restriction studies in humans
Bussel, Inge P.G. van - \ 2017
Wageningen University. Promotor(en): Sander Kersten, co-promotor(en): Lydia Afman. - Wageningen : Wageningen University - ISBN 9789463430678 - 194
energy restricted diets - energy intake - gene expression - genomes - proteins - endurance - food composition - human nutrition research - energiearme diëten - energieopname - genexpressie - genomen - eiwitten - uithoudingsvermogen - voedselsamenstelling - voedingsonderzoek bij de mens
Background: Nutrition research is facing several challenges with respect to finding diet-related health effects. The effects of nutrition on health are subtle, show high interindividual variation in response, and can take a long time to become visible. Recently, health has been redefined as an organism’s ability to adapt to challenges, and this definition can be extended to metabolic health. In the metabolic context the ability to adapt has been named ‘phenotypic flexibility’. A potential new tool to magnify the effects of diet on health is the application of challenge tests. Combined with a comprehensive tool such as transcriptomics, the study of challenge tests before and after an intervention might be able to test a change in phenotypic flexibility. A dietary intervention well known to improve health through weight loss is energy restriction (ER). ER can be used as a model to examine the potential of challenge tests in combination with transcriptomics to magnify diet-induced effects on health. As opposed to ER, caloric restriction (CR) is a reduction in energy intake aimed at improving health and life span in non-obese subjects and not directly aimed at weight loss. In this thesis, we aimed to investigate the use of the transcriptome as an early and sensitive marker of diet-related health.
Methods: First we studied the consequences of age on the effects of CR on the peripheral blood mononuclear cells (PBMCs) transcriptome. For that purpose, we compared the changes in gene expression in PBMCs from old men with the changes in gene expression in PBMCs from young men upon three weeks of 30% CR. To study the effect of a change in dietary composition during ER, we compared the changes in gene expression upon a 12 weeks high protein 25% ER diet with the changes in gene expression upon a 12 weeks normal protein 25% ER diet in white adipose tissue (WAT). Next, we investigated the added value of measuring the PBMC transcriptome during challenge tests compared to measuring the PBMC transcriptome in the fasted state to magnify the effects of ER on health. This was investigated by measuring the changes in gene expression upon an oral glucose tolerance test (OGTT) and upon a mixed meal test (MMT), both before and after 12 weeks of 20% ER. Finally, we determined the differences between a challenge test consisting of glucose alone, the OGTT, or consisting of glucose plus other macronutrients, the MMT, on the PBMC transcriptome in diet-related health.
Results: We observed that the transcriptome of PBMCs of healthy young men had a higher responsiveness in immune response pathways compared to the transcriptome of PBMCs of aged men upon CR (chapter 2). Also, we showed that upon a normal protein-ER diet the transcriptome of WAT showed a decrease in pathways involved in immune response and inflammasome, whereas no such effect was found upon a high protein-ER diet. These effects were observed while parameters such as weight loss, glucose, and waist circumference did not change due to the different protein quantities (chapter 3). Twelve weeks of 20% ER was shown to increase phenotypic flexibility, as reflected by a faster and more pronounced downregulation of OXPHOS, cell adhesion, and DNA replication during the OGTT compared to the control diet (chapter 4). Finally, two challenge tests consisting of either glucose (OGTT) or glucose plus fat and protein (MMT) were shown to result in a larger overlap than difference in the changes in gene expression of PBMCs (chapter 5).
Conclusions: Based on the differential changes in gene expression upon CR at different ages, we concluded that age is an important modulator in the response to CR. As the transcriptional changes induced by a high protein ER diet seemed to reflect less beneficial health effects than those induced by a normal protein ER diet, we concluded that the diet composition is important in the health effect of ER as measured by the transcriptome. Based on the faster changes in PBMC gene expression during an OGTT upon 12 weeks of 20% ER, we concluded that the PBMC transcriptome combined with a challenge test can reflect changes in phenotypic flexibility. This makes challenge tests a suitable tool to study diet-related health effects. Finally, based on the changes in gene expression of the MMT and OGTT, we conclude that glucose in a challenge test is the main determinant of the postprandial changes in gene expression in the first two hours. Overall, these results lead to the conclusion that the transcriptome, especially in combination with challenge tests, can be used as an early marker of diet-related health. The direct relation to health still needs to be investigated, but the possibility to use the transcriptome as an early marker of diet-related health gives rise to a better understanding of the effects of nutrition on health.
Selection for pure- and crossbred performance in Charolais
Vallée-Dassonneville, Amélie - \ 2017
Wageningen University. Promotor(en): Johan van Arendonk; Henk Bovenhuis. - Wageningen : Wageningen University - ISBN 9789463430180 - 151
charolais - cattle - animal breeding - crossbreeding - crossbreds - selection - beef cattle - genomes - genetic parameters - charolais - rundvee - dierveredeling - kruisingsfokkerij - kruising - selectie - vleesvee - genomen - genetische parameters
Two categories of beef production exist; i.e. (i) purebred animals from a beef sire and a beef dam and (ii) crossbred animals from a beef sire and a dairy dam.
For purebred beef production, there is a growing interest in including behavior and type traits in the breeding goal. Heritabilities for behavior traits, estimated using subjective data scored by farmers, range from 0.02 to 0.19. Heritabilities for type traits range from 0.02 to 0.35. Results show that there are good opportunities to implement selection for behavior traits, using a simple on-farm recording system that allows collection of large data sets, and for type traits in Charolais. A genome-wide association study detected 16 genomic regions with small effect on behavior and type traits. This suggests that behavior and type traits are influenced by many genes each explaining a small part of the genetic variance.
The two main dairy breeds mated to Charolais sires for crossbred beef production in France are Montbéliard and Holstein. The genetic correlation between the same trait measured on Montbéliard x Charolais and on Holstein x Charolais was 0.99 for muscular development, 0.96 for birth weight, 0.91 for calving difficulty, 0.80 for height, and 0.70 for bone thinness. Thus, for these last three traits, results show evidence for re-ranking of Charolais sires depending on whether they are mated to Montbéliard or Holstein cows. When using genomic prediction, the Montbéliard x Charolais and Holstein x Charolais populations could be combined into a single reference population to increase the size of the reference population and the accuracy of genomic prediction. Results indicate that the higher the genetic correlation between the two crossbred populations, the higher the gain in accuracy achieved when combining the two populations into a single reference.
The selection of Charolais sires to produce purebred or crossbred animals is made through distinct breeding programs. An alternative could be to combine selection into one breeding program. Decision for combining or keeping breeding programs separate is determined by the correlation between the breeding objectives, the selection intensity, the difference in level of genetic merit, the accuracy of selection, and the recent implementation of genomic evaluation. Considering all parameters and based on estimations for selection on birth weight, I recommend combining both breeding programs because this will lead to higher genetic gain, and might simplify operating organization and reduce associated costs.
Utilization of complete chloroplast genomes for phylogenetic studies
Ramlee, Shairul Izan Binti - \ 2016
Wageningen University. Promotor(en): Richard Visser, co-promotor(en): Rene Smulders; Theo Borm. - Wageningen : Wageningen University - ISBN 9789462579354 - 186
phylogenetics - genomes - chloroplasts - models - solanum - orchidaceae - phylogenomics - dna sequencing - fylogenetica - genomen - chloroplasten - modellen - solanum - orchidaceae - phylogenomica - dna-sequencing
Chloroplast DNA sequence polymorphisms are a primary source of data in many plant phylogenetic studies. The chloroplast genome is relatively conserved in its evolution making it an ideal molecule to retain phylogenetic signals. The chloroplast genome is also largely, but not completely, free from other evolutionary processes such as gene duplication, concerted evolution, pseudogene formation and genome rearrangements. The conservation of the chloroplast genome sequence allows designing primers targeting regions conserved well beyond species boundaries, and amplification of these targets.
The small size of chloroplast genomes, together with their high copy number in leaf cells, greatly facilitates chloroplast genome sequencing. In this thesis, chloroplast phylogenomics was conducted using complete chloroplast DNA genomes obtained by a newly developed method of de novo assembly. The method is not only cost-effective but also has the potential to extract a wealth of useful information on thousands of chloroplast genomes from Whole Genome Shotgun (WGS) data. We used k-mer frequency tables to identify and extract the chloroplast reads from the WGS reads and assembled these using a highly integrated and automated custom pipeline. This pipeline includes steps aimed at optimizing assemblies and filling gaps that are left due to coverage variation in the WGS dataset. The pipeline enabled successful de novo assembly across a range of nuclear genome sizes, from Solanum lycopersicon (tomato, 0.9 Gb) to Paphiopedilum henryanum (slipper orchid, 35 Gb).
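The read-extraction step can be illustrated with a small sketch. It rests on an assumption that is not spelled out in the abstract: because chloroplast genomes are present in many copies per cell, their k-mers should be far more frequent in WGS data than nuclear k-mers, so reads whose k-mers are consistently abundant are likely chloroplast-derived. The function names and the median-count threshold are hypothetical, and reverse complements are ignored for brevity.

```python
from collections import Counter
from statistics import median
from typing import Iterable, List


def kmers(seq: str, k: int) -> List[str]:
    """All overlapping substrings of length k in seq."""
    return [seq[i:i + k] for i in range(len(seq) - k + 1)]


def build_kmer_table(reads: Iterable[str], k: int) -> Counter:
    """Frequency table of every k-mer observed across all reads."""
    table = Counter()
    for read in reads:
        table.update(kmers(read, k))
    return table


def select_high_coverage_reads(reads: List[str], k: int,
                               min_median_count: int) -> List[str]:
    """Keep reads whose median k-mer frequency reaches the threshold.

    Assumption: high-copy (e.g. chloroplast) reads consist mostly of
    abundant k-mers, while nuclear reads consist mostly of rare ones.
    """
    table = build_kmer_table(reads, k)
    selected = []
    for read in reads:
        read_kmers = kmers(read, k)
        if read_kmers and median(table[x] for x in read_kmers) >= min_median_count:
            selected.append(read)
    return selected
```

Using the median rather than the mean keeps a single repetitive k-mer from pulling a nuclear read over the threshold; in real data the threshold would be chosen from the observed k-mer frequency distribution.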
The pipeline is suitable for studying structural variation in the chloroplast genome, as opposed to the common procedure of read mapping against a reference genome. To support the putative rearrangements, a flexible assembly quality comparison tool was created that combines and visualizes read mapping and alignment results in a two-dimensional plot. We evaluated this tool using the de novo assemblies of S. lycopersicon and P. henryanum chloroplasts. The results show that we can not only immediately select the better of two options, but also determine the location of specific artefacts.
In order to explore and evaluate the utility of complete chloroplast phylogenomics, tomato and Paphiopedilum spp were used to conduct phylogenetic inferences based on the complete chloroplast genome. In total 84 tomato chloroplast genomes within the section Lycopersicon were assembled and phylogenetic trees produced. The analyses revealed that next to the chloroplast regions and spacers traditionally used for phylogenetics, additional regions of protein coding and non-coding DNA may be exploited for intraspecific phylogenetic studies. In particular, more than 50% of all phylogenetically relevant information could be included by just using four genes (ycf1, ndhF, ndhA, and ndhH), of which 34% in ycf1 alone. The topology of the phylogenetic tree inferred from ycf1 was the same as that of trees based on all other protein coding genes, although with lower bootstrap values. The phylogenetic analyses based on 32 complete Paphiopedilum spp. chloroplast genomes confirmed the division of the genus into three subgenera Parvisepalum, Brachypetalum and Paphiopedilum. The division of five sections of subgenus Paphiopedilum was also recovered. The de novo assemblies revealed several structural rearrangements including gene loss and inversion. In addition, the chloroplast genome of Paphiopedilum has experienced extreme IR expansion that has included part of or the entire SSC region, resulting in larger IR regions than commonly observed among monocots.
In conclusion, WGS data offer opportunities to generate partial or entire chloroplast genomes for phylogenetic studies. Species discrimination can already be achieved with partial data (subsets of genes), but evolutionarily young lineages may require more informative characters. Therefore, it is expected that many complete chloroplast genomes will be produced in the years to come. When generating these genomes, de novo assembly is strongly preferable to mapping against reference genomes, because it also uncovers structural rearrangements in the chloroplast genome.
Antibodies and longevity of dairy cattle : genetic analysis
Klerk, B. de - \ 2016
Wageningen University. Promotor(en): Johan van Arendonk, co-promotor(en): Jan van der Poel; Bart Ducro. - Wageningen : Wageningen University - ISBN 9789462577589 - 134
dairy cattle - dairy cows - antibodies - longevity - genetic analysis - breeding value - genomes - genetic improvement - animal genetics - melkvee - melkkoeien - antilichamen - gebruiksduur - genetische analyse - fokwaarde - genomen - genetische verbetering - diergenetica
The dairy sector has a big impact on food production for the growing world population and contributes substantially to the world economy. In order to produce food in a sustainable way, dairy cows need to be able to produce milk without problems and for as long as possible. Therefore, breeding programs focus on improvement of important traits for dairy cows. In order to improve desirable traits and obtain genetic gain, there is a constant need to optimize breeding programs and to search for useful parameters to include in them. Over the last several decades, breeding in dairy cattle mainly focused on production and fertility traits, with less emphasis on health traits. Health problems, however, can cause substantial economic losses to the dairy industry. The economic losses, together with the rising awareness of animal welfare, increased herd size, and less attention for individual animals, have led to an increased need to focus more on health traits. Longevity is strongly related to disease resistance, since a healthier cow will have a longer productive life. The identification of biomarkers and the detection of genes controlling health and longevity would not only greatly enhance the understanding of such traits but also offer the opportunity to improve breeding schemes. The objectives of this thesis therefore were 1) to find an easily measurable biomarker related to disease resistance in dairy cows, 2) to identify the relation between antibodies and longevity, and 3) to identify genomic regions that are involved in antibody production/expression. In this thesis antibodies are investigated as a parameter for longevity. Antibodies might be a novel parameter that enables selection of cows with an improved ability to stay healthy and to remain productive over a longer period of time. In this thesis antibodies binding the naive antigen keyhole limpet hemocyanin (KLH) were assumed to be natural antibodies.
Antibodies binding the bacteria-derived antigens lipoteichoic acid (LTA), lipopolysaccharide (LPS) and peptidoglycan (PGN) were assumed to be specific antibodies. In chapter 2 it was shown that levels of antibodies are heritable (up to h2 = 0.23). Additionally, antibody levels measured in milk and blood are genetically highly correlated (± 0.80) for the two studied isotypes (IgG and IgM). On the other hand, phenotypically, natural antibodies (of both the IgG and IgM isotype) measured in milk and blood cannot be interpreted as the same trait (phenotypic correlation = ± 0.40). In chapters 3 and 4 it was shown that levels of antibodies (both natural and specific antibodies) showed a negative relation with longevity: first lactation cows with low IgM or IgG levels were found to have a longer productive life. When using estimated breeding values for longevity, a significant relation was found only between natural antibody level (IgM binding KLH) and longevity. Lastly, chapter 5 reports on a genome-wide association study (GWAS) to detect genes contributing to genetic variation in natural antibody level. For natural antibody isotype IgG, genomic regions with a significant association were found on chromosome 21 (BTA). These regions included genes that have an impact on isotype class switching (from IgM to IgG). The gained knowledge on relations between antibodies and longevity and the gained insight into genes responsible for natural antibody level make antibodies potentially interesting biomarkers for longevity.
Conservation genetics of the frankincense tree
Bekele, A.A. - \ 2016
Wageningen University. Promotor(en): Frans Bongers, co-promotor(en): Rene Smulders; K. Tesfaye Geletu. - Wageningen : Wageningen University - ISBN 9789462576865 - 158
boswellia - genomes - dna sequencing - tropical forests - genetic diversity - genetic variation - genetics - forest management - plant breeding - boswellia - genomen - dna-sequencing - tropische bossen - genetische diversiteit - genetische variatie - genetica - bosbedrijfsvoering - plantenveredeling
Boswellia papyrifera is an important tree species of the extensive Combretum-Terminalia dry tropical forests and woodlands in Africa. The species produces a frankincense which is internationally traded because of its value as an ingredient in cosmetic, detergent, food flavor and perfume production, and because of its extensive use as incense during religious and cultural ceremonies in many parts of the world. The forests in which B. papyrifera grows are increasingly overexploited at the expense of the economic benefit and the wealth of ecological services they provide. Populations of B. papyrifera have declined in size and are increasingly fragmented. Regeneration has been blocked for the last 50 years in most areas and adult productive trees are dying. Projections showed a 90% loss of B. papyrifera trees in the coming 50 years and a 50% loss of frankincense production in 15 years' time.
This study addressed the conservation genetics of B. papyrifera. Forty-six microsatellite (SSR) markers were developed for this species, and these genetic markers were applied to characterize the genetic diversity pattern of 12 B. papyrifera populations in Ethiopia. In addition, the generational change in genetic diversity and the within-population fine-scale spatial genetic structure (FSGS) of two cohort groups (adults and seedlings) were studied in two populations from Western Ethiopia. In these populations seedlings and saplings were found and natural regeneration still takes place, a discovery that is important for the conservation of the species.
Despite the threats the populations are experiencing, ample genetic variation was present in the adult trees of the populations, including the most degraded populations. Low levels of population differentiation and isolation-by-distance patterns were detected. Populations could be grouped into four genetic clusters: the North eastern (NE), Western (W), North western (NW) and Northern (N) part of Ethiopia. The clusters corresponded to environmentally different conditions in terms of temperature, rainfall and soil conditions. We detected a low FSGS and found that individuals are significantly related up to a distance of 60-130 m.
Conservation of the B. papyrifera populations is urgently needed. The regeneration bottlenecks in most existing populations are an urgent prevailing problem that needs to be solved to ensure the continuity of the genetic diversity, species survival and sustainable production of frankincense. Local communities living in and around the forests should be involved in the use and management of the forests. In situ conservation activities will promote gene flow among fragmented populations and scattered remnant trees, so that the existing level of genetic diversity may be preserved. Geographical distance among populations is the main factor to be considered in sampling for ex situ conservation. A minimum of four conservation sites for B. papyrifera is recommended, representing each of the genetic clusters. Based on the findings of FSGS analyses, seed collection for ex situ conservation and plantation programmes should come from trees at least 100 m, but preferably 150 m apart.
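The spacing recommendation above can be illustrated with a minimal sketch. It assumes hypothetical tree coordinates in metres and a simple greedy rule (keep a tree only if it lies at least the minimum distance from every tree already kept); the function name, the coordinates and the greedy strategy are illustrative assumptions, not the sampling procedure used in the thesis.

```python
import math

def select_spaced_trees(trees, min_distance_m=100.0):
    """Greedily keep trees that are at least min_distance_m from all kept trees.

    trees: list of (name, x, y) tuples with x and y in metres (hypothetical layout).
    """
    selected = []
    for name, x, y in trees:
        far_enough = all(
            math.hypot(x - sx, y - sy) >= min_distance_m
            for _, sx, sy in selected
        )
        if far_enough:
            selected.append((name, x, y))
    return selected

# Hypothetical positions of five candidate seed trees.
trees = [("T1", 0, 0), ("T2", 40, 30), ("T3", 120, 0), ("T4", 120, 95), ("T5", 300, 0)]

chosen_100 = [name for name, _, _ in select_spaced_trees(trees, 100.0)]
chosen_150 = [name for name, _, _ in select_spaced_trees(trees, 150.0)]
print(chosen_100, chosen_150)
```

A greedy pass depends on the order in which trees are visited, so it may not return the largest possible set of well-spaced trees; it only guarantees that every pair in the result respects the minimum distance.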
Mining microbiota signatures in human intestinal tract metagenomes
Tims, S. - \ 2016
Wageningen University. Promotor(en): Michiel Kleerebezem; Willem de Vos, co-promotor(en): Erwin Zoetendal. - Wageningen : Wageningen University - ISBN 9789462576933 - 264
gastrointestinal microbiota - intestines - genomes - man - hosts - host guest relations - dna microarrays - gastrointestinal diseases - inflammatory bowel diseases - irritable colon - prebiotics - body mass index - oligosaccharides - microbiota van het spijsverteringskanaal - darmen - genomen - mens - gastheren (dieren, mensen, planten) - relaties tussen gastheer en gast - dna microarrays - maagdarmziekten - chronische darmontstekingen - prikkelbaar colon - prebiotica - quetelet index - oligosacchariden
Genetic diversity and evolution in Lactuca L. (Asteraceae) : from phylogeny to molecular breeding
Wei, Z. - \ 2016
Wageningen University. Promotor(en): Eric Schranz. - Wageningen : Wageningen University - ISBN 9789462576148 - 210
lactuca sativa - leafy vegetables - phylogeny - genetic diversity - domestication - molecular breeding - genomes - dna - quantitative trait loci - evolution - lactuca sativa - bladgroenten - fylogenie - genetische diversiteit - domesticatie - moleculaire veredeling - genomen - dna - loci voor kwantitatief kenmerk - evolutie
Cultivated lettuce (Lactuca sativa L.) is an important leafy vegetable worldwide. However, the phylogenetic relationships between domesticated lettuce and its wild relatives are still not clear. In this thesis, I focus on the phylogenetic relationships within Lactuca L., including an analysis of the wild Lactuca species that are endemic to Africa for the first time. The genetic variation of responses to salinity in a recombinant inbred line population, derived from a cross between the lettuce crop (L. sativa ‘Salinas’) and wild species (L. serriola), was investigated and the candidate gene in the identified QTL regions was further studied.
In Chapter 1, I introduce and discuss topics related to genetic diversity and evolution in Lactuca, including an overview of lettuce cultivars and uses, its hypothesized domestication history, the taxonomic position of Lactuca, current status of molecular breeding in lettuce and mechanisms of salinity tolerance in plants, especially the High-affinity K+ Transporter (HKT) gene family.
In Chapter 2, the most extensive molecular phylogenetic analysis of Lactuca was constructed based on two chloroplast genes (ndhF and trnL-F), including endemic African species for the first time. This taxon sampling covers nearly 40% of the total Lactuca species endemic to Africa and 34% of all Lactuca species. DNA sequences from all the subfamilies of Asteraceae in GenBank and those generated from Lactuca herbarium samples were used to elucidate the monophyly of Lactuca and the affiliation of Lactuca within Asteraceae. Based on the subfamily tree, 33 ndhF sequences from 30 species and 79 trnL-F sequences from 48 species were selected to infer phylogenetic relationships within Lactuca using Randomized Axelerated Maximum Likelihood (RAxML) and Bayesian Inference (BI) analyses. In addition, biogeographical, chromosomal and morphological character states were analysed based on the Bayesian tree topology. The results showed that Lactuca contains two distinct phylogenetic clades - the crop clade and the Pterocypsela clade. Other North American, Asian and widespread species either form smaller clades or mix with the Melanoseris species in an unresolved polytomy. The newly sampled African endemic species probably should be excluded from Lactuca and treated as a new genus.
In Chapter 3, twenty-seven wild Lactuca species and four outgroup species were sequenced using next generation sequencing (NGS) technology. The sampling covers 36% of total Lactuca species and all the important geographical groups in the genus. Thirty chloroplast genomes, including one complete (partial) large single copy region (LSC), one small single copy region (SSC), one inverted repeat (IR) region, and twenty-nine nuclear ribosomal DNA sequences (containing the internal transcribed spacer region) were successfully assembled and analysed. A methodology paper on the sequencing pipeline, of which I am a co-author but which is not included in this thesis, was published: ‘Herbarium genomics: plastome sequence assembly from a range of herbarium specimens using an Iterative Organelle Genome Assembly (IOGA) pipeline’. These NGS data helped resolve deeper nodes in the phylogeny within Lactuca and resolved the polytomy from Chapter 2. The results showed that there are at least four main groups within Lactuca: the crop group, the Pterocypsela group, the North American group and the group containing widely-distributed species. I also confirmed that the endemic African species should be removed and treated as a new genus.
In Chapter 4, quantitative trait loci (QTLs) related to salt-induced changes in Root System Architecture (RSA) and ion accumulation were determined using a recombinant inbred line population derived from a cross between cultivated lettuce and wild lettuce. I measured the components of RSA in replicated lettuce seedlings grown on vertical agar plates with different NaCl concentrations in a controlled growth chamber environment. I also quantified the concentration of sodium and potassium in replicates of greenhouse-grown plants watered with 100 mM NaCl. A total of fourteen QTLs were identified using multi-trait linkage analysis, including three major QTLs associated with general root development (qRC9.1), root growth in salt stress condition (qRS2.1), and ion accumulation (qLS7.2).
In Chapter 5, one of the identified QTL regions (qLS7.2) reported in Chapter 4 was found to contain a homolog of HKT1 from Arabidopsis thaliana. I performed a phylogenetic analysis of Lactuca HKT1-like protein sequences with other published HKT protein sequences and determined transmembrane and pore segments of lettuce HKT1;1 alleles, according to the model proposed for AtHKT1;1. The gene expression pattern and level of LsaHKT1;1 (L. sativa ‘Salinas’) and LseHKT1;1 (L. serriola) in root and shoot were investigated in plants growing hydroponically over a time-course. Na+ and K+ contents were measured in samples taken at the same time as those used for the gene expression test. In addition, I examined the 5’ promoter regions of the two genotypes. The results showed low expression levels of both HKT1;1 alleles in Lactuca root and relatively higher expression in shoot, probably due to the negative cis-regulatory elements of HKT1 alleles found in Lactuca promoter regions. Significant allelic differences were found in HKT1;1 expression in early stage (0-24 hours) shoots and in late stage (2-6 days) roots. The ratio of shoot HKT1;1 expression to root HKT1;1 expression was generally consistent with the ratios of Na+/K+ balance in the relevant tissues (shoot Na+/K+ divided by root Na+/K+).
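The two ratios compared at the end of this chapter summary can be written out as a short sketch. All numbers below are invented for illustration only; the function names and the units are assumptions and do not come from the thesis.

```python
def na_k_ratio(na_content, k_content):
    """Na+/K+ ratio of one tissue; both contents must share the same unit."""
    if k_content <= 0:
        raise ValueError("K+ content must be positive")
    return na_content / k_content

def shoot_over_root(shoot_value, root_value):
    """Ratio of a shoot measurement to the matching root measurement."""
    if root_value <= 0:
        raise ValueError("root value must be positive")
    return shoot_value / root_value

# Hypothetical ion contents (arbitrary units) and relative expression levels.
shoot_na_k = na_k_ratio(30.0, 60.0)   # shoot Na+/K+
root_na_k = na_k_ratio(20.0, 80.0)    # root Na+/K+
ion_balance = shoot_over_root(shoot_na_k, root_na_k)   # shoot Na+/K+ divided by root Na+/K+
expression_ratio = shoot_over_root(4.0, 2.0)           # shoot / root HKT1;1 expression

print(ion_balance, expression_ratio)
```

With these made-up inputs both ratios equal 2.0, which is the kind of agreement the abstract describes as "generally consistent".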
In Chapter 6, I briefly summarize and discuss the results from the previous chapters. The implications of Chapters 2 and 3 for Lactuca phylogenetics are discussed, including some key characters for the diagnosis of species within Lactuca, the use of herbarium DNA for NGS technology, and perspectives on Lactuca phylogeny. Future perspectives of genome-wide association mapping for lettuce breeding are also discussed. Lastly, I propose to integrate phylogenetic approaches into investigations of allelic differences in lettuce, not just those associated with salinity stress but also those associated with other stress-related and beneficial characters, both within and between species.
Natural genetic variation in Arabidopsis thaliana photosynthesis
Flood, P.J. - \ 2015
Wageningen University. Promotor(en): Maarten Koornneef, co-promotor(en): Mark Aarts; Jeremy Harbinson. - Wageningen : Wageningen University - ISBN 9789462575004 - 278
arabidopsis thaliana - genetische variatie - fotosynthese - genomen - chlorofyl - fenotypen - arabidopsis thaliana - genetic variation - photosynthesis - genomes - chlorophyll - phenotypes
Oxygenic photosynthesis is the gateway of the sun’s energy into the biosphere; it is where light becomes life. Genetic variation is the fuel of evolution; without it natural selection is powerless and adaptation impossible. In this thesis I have set out to study a relatively unexplored field which sits at the intersection of these two topics, namely natural genetic variation in plant photosynthesis. To begin, I reviewed the available literature (Chapter 2); from this it became clear that the main bottleneck restricting progress was the lack of high-throughput phenotyping platforms for photosynthesis. To address this, an automated high-throughput chlorophyll fluorescence phenotyping system was developed, which could measure 1440 plants in less than an hour for ΦPSII, a measure of photosynthetic efficiency (Chapter 3). Using this phenotyping platform I screened five populations of Arabidopsis thaliana. Three of these populations resulted from bi-parental crosses and segregated for only two genomes; using these I conducted family mapping (Chapter 4). The final two populations were composed of natural, field-collected accessions and were analysed using a genome wide association approach (Chapter 5). The family mapping approach had greater statistical power due to within-population replication, and the genome wide association approach had higher mapping resolution due to historical recombination. Both approaches were used to identify genomic regions (loci) which were responsible for some of the variation in photosynthesis observed. The number and average effect of these loci were used to infer the genetic architecture of photosynthesis as a highly complex polygenic trait for which there are many loci of very small effect. In addition to screening these large populations, a smaller subset of 18 lines was assayed for natural variation in phosphorylation of photosystem II (PSII) proteins in response to changing light (Chapter 6).
This exploratory study indicated that this process shows considerable variation and may be important for adaptation of the photosynthetic apparatus to photosynthetic extremes. The genetic mapping studies just described focus exclusively on genetic variation in the nuclear genome. Whilst this contains the majority of the plant’s genetic information, there is also a store of genetic information in the chloroplast and mitochondria. These genetic repositories contain genes which are essential for photosynthesis and energy metabolism. Any variation in these genes could have a large impact on photosynthesis. To study natural variation in these genomes I developed a new population of reciprocal nuclear-organellar hybrids (cybrids) which could be used to study the effect of genetic variation in organelles whilst controlling for nuclear genetic variation (Chapter 7). Preliminary results indicate that this resource will be of great use in disentangling natural genetic variation in nucleo-organelle interactions. Finally, I looked at one chloroplast-encoded photosynthetic mutation in more detail (Chapter 8). This mutation had evolved in response to herbicide application and had spread along British railways. When studying this population of resistant plants I found empirical evidence for organelle-mediated nuclear genetic hitchhiking. This is a previously undescribed evolutionary phenomenon and is likely to be quite common. In conclusion, there is an abundance of genetic variation in photosynthesis which can be used to improve the trait for agriculture and provide insights into novel evolutionary phenomena in the field.
Using natural variation to unravel the dynamic regulation of plant performance in diverse environments
Molenaar, J.A. - \ 2015
Wageningen University. Promotor(en): Harro Bouwmeester; Joost Keurentjes, co-promotor(en): Dick Vreugdenhil. - Wageningen : Wageningen University - ISBN 9789462573444 - 186
planten - genomen - loci voor kwantitatief kenmerk - warmtestress - genetische kartering - groei - droogte - plantengenetica - plantenfysiologie - plants - genomes - quantitative trait loci - heat stress - genetic mapping - growth - drought - plant genetics - plant physiology
All plants are able to respond to changes in their environment by adjusting their morphology and metabolism, but large differences are observed in the effectiveness of these responses in the light of plant fitness. Between and within species large differences are observed in plant responses to drought, heat and other abiotic stresses. This natural variation is partly due to variation in the genetic composition of individuals. Within-species variation can be used to identify and study genes involved in the genetic regulation of plant performance.
Growth of the world population will, in the coming years, lead to an increased demand for food, feed and other natural products. In addition, extreme weather conditions with, amongst others, more and prolonged periods of drought and heat are expected to occur due to climate change. Therefore breeders are challenged to produce stress tolerant cultivars with improved yield under sub-optimal conditions. Knowledge about the mechanisms and genes that underlie tolerance to drought, heat and other abiotic stresses will ease this challenge.
The aim of this thesis was to identify and study the role of genes that are underlying natural variation in plant performance under drought, salt and heat stress. To reach this goal a genome wide association (GWA) mapping approach was taken in the model species Arabidopsis thaliana. A population of 350 natural accessions of Arabidopsis, genotyped with 215k SNPs, was grown under control and several stress conditions and plant performance was evaluated by phenotyping one or several plant traits per environment. Genes located in the genomic regions that were significantly associated with plant performance, were studied in more detail.
Plant performance was first evaluated upon osmotic stress (Chapter 2). This treatment resulted not only in a reduced plant size, but also caused the colour of the rosette leaves to change from green to purple-red due to anthocyanin accumulation. The latter was visually quantified and subsequent GWA mapping revealed that a large part of the variation in anthocyanin accumulation could be explained by a small genomic region on chromosome 1. The analysis of re-sequence data allowed us to associate the second most frequent allele of MYB90 with higher anthocyanin accumulation and to identify the causal SNP. Interestingly MYB75, a close relative of MYB90, was not identified by GWA mapping, although causal sequence variation of this gene for anthocyanin accumulation was identified in the Cvi x Ler and Ler x Eri-1 RIL populations. Re-sequence data revealed that one allele of MYB75 was dominating the population and that the MYB75 alleles of Cvi and Ler were both rare, explaining the lack of association at this locus in GWA mapping. For MYB90, two alleles were present in a substantial part of the population, suggesting balancing selection between them.
Next, the natural population was exposed to short-term heat stress during flowering (Chapter 3). This short-term stress has a large impact on seed set, while it hardly affects the vegetative tissues. Natural variation for tolerance against the effect of heat on seed set was evaluated by measuring the length of all siliques along the inflorescence in both heat-treated and control plants. Because the flower that opened during the treatment was tagged, we could analyse the heat response for several developmental stages separately. GWA mapping revealed that the heat response before and after anthesis involved different genes. For the heat response before anthesis strong evidence was gained that FLC, a flowering time regulator and QUL2, a gene suggested to play a role in vascular tissue development, were causal for two strong associations.
Furthermore, the impact of moderate drought on plant performance was evaluated in the plant phenotyping platform PHENOPSIS. Homogeneous drought was assured by tight regulation of climate cell conditions and the robotic weighing and watering of the pots twice a day. Because plant growth is a dynamic trait it was monitored over time by top-view imaging under both moderate drought and control conditions (Chapter 4 and 5). To characterise growth it was modelled with an exponential function. GWA mapping of temporal growth data resulted in the detection of time-dependent QTLs whereas mapping of model parameters resulted in another set of QTLs related to the entire growth period. Most of these QTLs would not have been identified if plant size had only been determined on a single day. For the QTLs detected under control conditions eight candidate genes with a growth-related mutant or overexpression phenotype were identified (Chapter 4). Genes in the support window of the drought-QTLs were prioritized based on previously reported gene expression data (Chapter 5). Additional validation experiments are needed to confirm causality of the candidate genes.
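Modelling growth with an exponential function, as described above, can be sketched as follows. The sketch assumes the form area(t) = a0 * exp(r * t) and fits it by ordinary least squares on the logarithm of the measured area; the function name, the synthetic data and the log-linear fitting choice are illustrative assumptions, since the abstract does not specify how the model was fitted.

```python
import math

def fit_exponential_growth(days, areas):
    """Fit area(t) = a0 * exp(r * t) by least squares on log(area).

    Returns (a0, r): initial size and relative growth rate per day.
    """
    if len(days) != len(areas) or len(days) < 2:
        raise ValueError("need at least two (day, area) pairs of equal length")
    logs = [math.log(a) for a in areas]  # requires strictly positive areas
    n = len(days)
    mean_t = sum(days) / n
    mean_y = sum(logs) / n
    sxx = sum((t - mean_t) ** 2 for t in days)
    if sxx == 0:
        raise ValueError("time points must not all be identical")
    sxy = sum((t - mean_t) * (y - mean_y) for t, y in zip(days, logs))
    r = sxy / sxx                        # slope of log(area) against time
    a0 = math.exp(mean_y - r * mean_t)   # back-transformed intercept
    return a0, r

# Synthetic, noise-free rosette areas generated from known parameters.
days = [0, 2, 4, 6, 8]
areas = [2.0 * math.exp(0.25 * t) for t in days]
a0, r = fit_exponential_growth(days, areas)
print(a0, r)
```

The fitted parameters (a0, r) summarise the whole growth period in two numbers per plant, which is the sense in which model parameters, rather than sizes on individual days, can be used as mapping traits.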
Next, to search for genes that determine plant size across many environments, biomass accumulation in the natural population was determined in 25 different environments (Chapter 6). Joint analysis of these data by multi-environment GWA mapping resulted in the detection of 106 strongly associated SNPs with significant effects in 7 to 16 environments. Several genes involved in starch metabolism, leaf size control and flowering time determination were located in close proximity of the associated SNPs. Two genes, RPM1 and ACD6, were located in close proximity of SNPs with significant GxE effects. For both genes, alleles have been identified that increase resistance to bacterial infection, but that reduce biomass accumulation. The sign of the allelic effect is therefore dependent on the environmental conditions. Whole genome predictions revealed that most of the GxE interactions observed at the phenotypic level were not the consequence of strong associations with strong QxE effects, but of moderate and weak associations with weak QxE effects.
Finally, in Chapter 7 I discuss the usefulness of GWA mapping in the identification of genes underlying natural variation in plant performance under drought, heat stress and a number of other environments. Strong associations were observed for both environment-specific as well as common plant performance regulators. Some choices in phenotyping and experimental design were crucial for our success, like evaluation of plant performance over time and simplification of the quantification of the phenotype. It is suggested that follow-up work should focus on the functional characterization of the causal genes, because such analyses would be helpful to identify pathways in which the causal genes are involved and to understand why sequence variation results in changes at the phenotype level. Although translation of the findings to applications in crops is challenging, this thesis contributes to the understanding of the genetic regulation of stress response and therefore will likely contribute to the development of stress tolerant and stable yielding crops.
Breeding program for indigenous chicken in Kenya
Ngeno, K. - \ 2015
Wageningen University. Promotor(en): Johan van Arendonk, co-promotor(en): Liesbeth van der Waaij; A.K. Kahi. - Wageningen : Wageningen University - ISBN 9789462572775 - 154
kippen - pluimvee - inheems vee - dierveredeling - veredelingsprogramma's - genetische diversiteit - ecotypen - genomen - genetische verbetering - kenya - fowls - poultry - native livestock - animal breeding - breeding programmes - genetic diversity - ecotypes - genomes - genetic improvement - kenya
Ngeno, K. (2015). Breeding program for indigenous chicken in Kenya. Analysis of diversity in indigenous chicken populations. PhD thesis, Wageningen University, the Netherlands
The objective of this research was to generate knowledge required for the development of an indigenous chicken (IC) breeding program for enhanced productivity and improved human livelihood in Kenya. The initial step was to review five questions: what, why and how should we conserve IC in an effective and sustainable way, who are the stakeholders and what are their roles in the IC breeding program. The next step of the research focused on detecting distinctive IC ecotypes through morphological and genomic characterization. Indigenous chicken ecotypes were found to be populations with huge variability in morphological features. Molecular characterization was carried out using microsatellite markers and whole genome re-sequenced data. The studied IC ecotypes are genetically distinct groups. The MHC-linked microsatellite markers divided the eight IC ecotypes studied into three mixed clusters, composed of individuals from the different ecotypes, whereas non-MHC markers grouped ICs into two groups. Analysis revealed high genetic variation within the ecotypes, with highly diverse MHC-linked alleles which are known to be involved in disease resistance. Whole genome re-sequencing revealed genomic variability, regions affected by selection, candidate genes and mutations that can partially explain the phenotypic divergence between IC and commercial layers. Unlike commercial chickens, IC preserved a high genomic variability that may be important in addressing present and future challenges associated with environmental adaptation and farmers’ breeding goals. Lastly, this study showed that there is an opportunity to improve IC through selection within the population. Genetic improvement utilizing within-IC selection requires setting up a breeding program. The study described the systematic and logical steps in designing a breeding program by focusing on farmers’ needs, how to improve IC to fit the farming conditions, and management regimes.
The hybrid nature of pig genomes : unraveling the mosaic haplotype structure in wild and commercial Sus scrofa populations
Bosse, M. - \ 2015
Wageningen University. Promotor(en): Martien Groenen, co-promotor(en): Hendrik-Jan Megens; Ole Madsen. - Wageningen : Wageningen University - ISBN 9789462573000 - 253
dieren - varkens - dierveredeling - genomen - hybridisatie - sus scrofa - haplotypen - genomica - populaties - genetische variatie - animals - pigs - animal breeding - genomes - hybridization - sus scrofa - haplotypes - genomics - populations - genetic variation - cum laude
cum laude graduation
Genomics 4.0 : syntenic gene and genome duplication drives diversification of plant secondary metabolism and innate immunity in flowering plants : advanced pattern analytics in duplicate genomes
Hofberger, J.A. - \ 2015
Wageningen University. Promotor(en): Eric Schranz. - Wageningen : Wageningen University - ISBN 9789462573147 - 142
genomica - planten - metabolisme - bloeiende planten - genomen - genen - next generation sequencing - genomics - plants - metabolism - flowering plants - genomes - genes - next generation sequencing
Genomics 4.0 - Syntenic Gene and Genome Duplication Drives Diversification of Plant Secondary Metabolism and Innate Immunity in Flowering Plants
Johannes A. Hofberger1, 2, 3
1 Biosystematics Group, Wageningen University & Research Center, Droevendaalsesteeg 1, 6708 PB Wageningen, The Netherlands (August 2012 – December 2013)
2 Institute for Biodiversity and Ecosystem Dynamics, University of Amsterdam, Science Park 904, 1098 XH Amsterdam, The Netherlands (December 2010 – July 2012)
3 Chinese Academy of Sciences/Max Planck Partner Institute for Computational Biology, 320 Yueyang Road, Shanghai 200031, PR China (January 2014 – December 2014)
TWO-SENTENCE SUMMARY
Large-scale comparative analysis of Big Data from next generation sequencing provides powerful means to exploit the potential of nature in the context of plant breeding and biotechnology. In this thesis, we combine various computational methods for genome-wide identification of gene families involved in (a) plant innate immunity and (b) biosynthesis of defense-related plant secondary metabolites across 21 species, assess dynamics that affected evolution of underlying traits during 250 million years of flowering plant radiation and provide data on more than 4500 loci that can underpin crop improvement for future food and life quality.
As sessile organisms, plants are permanently exposed to a plethora of potentially harmful microbes and other pests. The surprising resilience to infections observed in successful lineages is due to a complex defense network fighting off invading pathogens. Within this network, a sophisticated plant innate immune system is accompanied by a multitude of specialized biosynthetic pathways that generate more than 200,000 secondary metabolites with ecological, agricultural, energy and medicinal importance. The rapid diversification of associated genes was accompanied by a series of duplication events in virtually all plant species, including local duplication of short sequences as well as multiplication of all chromosomes due to meiotic errors (plant polyploidy). In a comparative genomics approach, we combined several bioinformatics techniques for large-scale identification of multi-domain and multi-gene families that are involved in plant innate immunity or defense-related secondary metabolite pathways across 21 representative flowering plant genomes. We introduced a framework to trace back duplicate gene copies to distinct ancient duplication events, thereby unravelling a differential impact of gene and genome duplication on the molecular evolution of target genes. Comparing the genomic context among homologs within and between species in a phylogenomics perspective, we discovered orthologs conserved within genomic regions that remained structurally immobile during flowering plant radiation. In summary, we described a complex interplay of gene and genome duplication that increased genetic versatility of disease resistance and secondary metabolite pathways, thereby expanding the playground for functional diversification and thus plant trait innovation and success. Our findings give fascinating insights into evolution across lineages and can underpin crop improvement for food, fiber and biofuels production.
Structural variations in pig genomes
Paudel, Y. - \ 2015
Wageningen University. Promotor(en): Martien Groenen, co-promotor(en): Ole Madsen; Hendrik-Jan Megens. - Wageningen : Wageningen University - ISBN 9789462572171 - 204
varkens - dierveredeling - genomen - genomica - single nucleotide polymorphism - dna-sequencing - fenotypische variatie - chromosoomafwijkingen - evolutie - soortvorming - pigs - animal breeding - genomes - genomics - single nucleotide polymorphism - dna sequencing - phenotypic variation - chromosome aberrations - evolution - speciation
Paudel, Y. (2015). Structural variations in pig genomes. PhD thesis, Wageningen University, the Netherlands
Structural variations are chromosomal rearrangements such as insertions-deletions (INDELs), duplications, inversions, translocations, and copy number variations (CNVs). It has been shown that structural variations are as important as single nucleotide polymorphisms (SNPs) with regard to phenotypic variation. The general aim of this thesis was to use next generation sequencing data to improve our understanding of the evolution of structural variations such as CNVs and INDELs in pigs. We found that: 1) the frequency of copy number variable regions did not change during pig domestication but rather reflected the demographic history of pigs. 2) CNV of olfactory receptor genes seems to play a role in the on-going speciation of the genus Sus. 3) Variation in copy number of olfactory receptor genes (ORs) in pigs (Sus scrofa) seems to be shaped by a combination of selection and genetic drift, where the clustering of ORs in the genome is the major source of variation in copy number. 4) Analysis of short INDELs in the pig genome shows that the level of purifying selection of INDELs positively correlates with the functional importance of a genomic region, i.e. the strongest purifying selection was observed in gene coding regions. This thesis provides a highly valuable resource for copy number variable regions, INDELs, and SNPs, for future pig genetics and breeding research. Furthermore, this thesis discusses the limitations and improvements of the available tools to conduct structural variation analysis and gives insights into future trends in the detection of structural variations.
Sociable swine: prospects of indirect genetic effects for the improvement of productivity, welfare and quality
Duijvesteijn, N. - \ 2014
Wageningen University. Promotor(en): Johan van Arendonk, co-promotor(en): Piter Bijma; E.F. Knol. - Wageningen : Wageningen University - ISBN 9789462571525 - 202
varkens - sociaal gedrag - groepsinteractie - genetische effecten - prestatieniveau - dierenwelzijn - androsteron - genomen - genetische correlatie - varkensfokkerij - stakeholders - beoordeling - dierlijke productie - pigs - social behaviour - group interaction - genetic effects - performance - animal welfare - androsterone - genomes - genetic correlation - pig breeding - stakeholders - assessment - animal production
Towards Healthy Diets for parents: effectiveness of a counselling intervention
Eveline J.C. Hooft van Huysduynen
Introduction and Objective: As parents’ modelling of dietary behaviour is one of the factors influencing children’s diets, improving parents’ diets is expected to result in improved dietary intake of their children. This thesis describes research that was conducted to develop and evaluate a counselling intervention to improve parental adherence to the Dutch dietary guidelines.
Methods: A counselling intervention was developed, which was underpinned by the theory of planned behaviour and the transtheoretical model. In 20 weeks, five face-to-face counselling sessions were provided by a registered dietician who used motivational interviewing to improve parental adherence to the Dutch dietary guidelines. In addition, parents received three individually tailored email messages. During the counselling, the dietary guidelines and additional eating behaviours that were hypothesized to affect diet quality were addressed. The intervention was evaluated in a randomised controlled trial with 92 parents receiving the counselling and 94 parents as controls. Effects on dietary intake, biomarkers, intermediate markers of health and children’s dietary intake were evaluated. With mediation analyses, it was investigated whether changes in dietary intake were established via changes in behavioural determinants. It was also examined whether spot urine samples could be used to replace 24 h urine samples for evaluating changes in sodium and potassium intake.
Results: The intervention group increased their adherence to the dietary guidelines, as assessed with the Dutch Healthy Diet-index (ranging from 0 to 100 points), by 6.7 points more than the control group did. This improvement was achieved by small increases in the scores of seven out of ten index components. The most substantial changes were shown in fruit and fish intakes of which increases in fish intake were reflected in changes in fatty acid profiles derived from blood plasma. Also a small decrease in waist circumference was observed. Based on parental reports, the children in the intervention group increased their intakes of fruit, vegetables and fish more than the children in the control group. Improvements in parental fruit intake were mediated by changes in the behavioural determinants attitude and habit strength. Decreases in snack intake were mediated by changes in self-identity as a healthy eater. Although the results of a study in young Caucasian women showed that spot urine can be used to rank individuals for their ratios of sodium to potassium, no intervention effects on these ratios were observed.
Conclusion: This thesis provides empirical knowledge on potentially effective elements of counselling interventions that aim to improve the dietary pattern of parents as a whole, and provides knowledge on methods to evaluate changes in dietary intake.
Filling the gap between sequence and function: a bioinformatics approach
Bargsten, J.W. - \ 2014
Wageningen University. Promotor(en): Richard Visser, co-promotor(en): Jan-Peter (Jp) Nap. - Wageningen : Wageningen University - ISBN 9789462570764 - 170
bio-informatica - planten - genomica - nucleotidenvolgordes - functionele genomica - vergelijkende genomica - vergelijkende genetische kartering - genomen - genetische kartering - plantenveredeling - methodologie - bioinformatics - plants - genomics - nucleotide sequences - functional genomics - comparative genomics - comparative mapping - genomes - genetic mapping - plant breeding - methodology
The research presented in this thesis focuses on deriving function from sequence information, with the emphasis on plant sequence data. Unravelling the impact of genomic elements, in most cases genes, on the phenotype of an organism is a major challenge in biological research and modern plant breeding. An important part of this challenge is the (functional) annotation of such genomic elements. Currently, wet lab experiments may provide high-quality annotation, but they are laborious and costly. With the advent of next generation sequencing platforms, vast amounts of sequence data are generated. In this thesis, these data are used in connection with the available experimental data to derive function from a bioinformatics perspective.
The connection between sequence information and function was approached on the level of chromosome structure (chapter 2) and of gene families (chapter 3) using combinations of existing bioinformatics tools. The applicability of using interaction networks for function prediction was demonstrated by first markedly improving an existing method (chapter 4) and by exploring the role of network topology in function prediction (chapter 5). Taken together, the combination of methods and results presented indicate the potential as well as the current state-of-the-art of function prediction in (plant) bioinformatics.
Chapter 1 introduces the basis for the approaches used and developed in this thesis. This includes the concepts of genome annotation, comparative genomics, gene function prediction and the analysis of network topology for gene function prediction. A requirement for the study of any new organism is the sequencing and annotation of its genome. Current genome annotation is divided into structural identification and functional categorization of genomic elements. The de facto standard for categorizing functional annotation is provided by the Gene Ontology. The Gene Ontology is divided into three domains, molecular function, biological process and cellular component. Approaches to predict molecular function and biological process are outlined. Accurate function prediction generally relies on existing input data, often of experimental origin, that can be transferred to unannotated genomic elements. Plants often lack such input data, which poses a big challenge for current function prediction algorithms. In unravelling the function of genomic elements, comparative genomics is an important approach. Via the comparison of multiple genomes it gives insights into evolution, function as well as genomic structure and variation. Comparative genomics has become an essential toolkit for the analysis of newly sequenced organisms. Often bioinformatics methods need to be adapted to the specific needs of plant genome research. With a focus on the commercially important crop plants tomato and potato, specific requirements of plant bioinformatics, such as the high amount of repetitive elements and the lack of experimental data, are outlined.
In chapter 2, the structural homology of the long arm of chromosome 2 (2L) of tomato, potato and pepper is analyzed. Molecular organization and collinear junctions are delineated using multi-color BAC FISH analysis and comparative sequence alignment. We identify several large-scale rearrangements including inversions and segmental translocations that were not reported in previous comparative studies. Some of the structural rearrangements are specific for the tomato clade, and differentiate tomato from potato, pepper and other solanaceous species. There are many small-scale synteny perturbations, but local gene vicinity is largely preserved. The data suggests that long distance intra-chromosomal rearrangements and local gene rearrangements have evolved frequently during speciation in the Solanum genus, and that small changes are more prevalent than large-scale differences. The occurrence of transposable elements and other repeats near or at junction breaks may indicate repeat-mediated rearrangements. The ancestral 2L topology is reconstructed and the evolutionary events leading to the current topology are discussed.
In chapter 3, we analyze the Snf2 gene family. As part of large protein complexes, Snf2 family ATPases are responsible for energy supply during chromatin remodeling, but the precise mechanism of action of many of these proteins is largely unknown. They influence many processes in plants, such as the response to environmental stress. The analysis is the first comprehensive study of Snf2 family ATPases in plants. Some subfamilies of the Snf2 gene family are remarkably stable in number of genes per genome, whereas others show expansion and contraction in several plants. One of these subfamilies, the plant-specific DRD1 subfamily, is non-existent in lower eukaryote genomes, yet it developed into the largest Snf2 subfamily in plant genomes. It shows the occurrence of a complex series of evolutionary events. Its expansion, notably in tomato, suggests novel functionality in processes connected to chromatin remodeling. The results underpin and extend the Snf2 subfamily classification, which could help to determine the various functional roles of Snf2 ATPases and to target environmental stress tolerance and yield in future breeding with these genes.
In chapter 4, a new approach to improve the prediction of protein function in terms of biological processes is developed that is particularly attractive for sparsely annotated plant genomes. The combination of the network-based prediction method Bayesian Markov Random Field (BMRF) with the sequence-based prediction method Argot2 shows significantly improved performance compared to each of the methods separately, as well as compared to Blast2GO. The approach was applied to predict biological processes for the proteomes of rice, barrel clover, poplar, soybean and tomato. Analysis of the relationships between sequence similarity and predicted function similarity identifies numerous cases of divergence of biological processes in which proteins are involved, in spite of sequence similarity. Examples of potential divergence are identified for various biological processes, notably for processes related to cell development, regulation, and response to chemical stimulus. Such divergence in biological process annotation for proteins with similar sequences should be taken into account when analyzing plant gene and genome evolution. This way, the integration of network-based and sequence-based function prediction will strengthen the analysis of evolutionary relationships of plant genomes.
In chapter 5 the influence of network topology on network-based function prediction algorithms is investigated. The analysis of biological networks using algorithms such as Bayesian Markov Random Field (BMRF) is a valuable predictor of the biological processes that proteins are involved in. The topological properties and constraints that determine prediction performance in such networks are however largely unknown. This chapter presents analyses based on network centrality measures, such as node degree, to evaluate the performance of BMRF upon progressive removal of highly connected hub nodes (pruning). Three different protein-protein interaction networks with data from Arabidopsis, human and yeast were analyzed. All three show that the average prediction performance can improve significantly. The chapter paves the way for further improvement of network-based function prediction methods based on node pruning.
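The pruning step described for chapter 5 can be illustrated with a minimal sketch. It only shows the removal of the highest-degree nodes from an undirected network; it is not the BMRF method and does not reproduce the thesis's analysis. The function names (`node_degrees`, `prune_hubs`) and the toy network are hypothetical and chosen purely for illustration.

```python
from typing import Dict, Set

# An undirected network stored as an adjacency mapping: node -> set of neighbours.
Graph = Dict[str, Set[str]]


def node_degrees(graph: Graph) -> Dict[str, int]:
    """Degree centrality in its simplest form: the number of neighbours per node."""
    return {node: len(neighbours) for node, neighbours in graph.items()}


def prune_hubs(graph: Graph, n_remove: int) -> Graph:
    """Return a copy of the network without the n_remove highest-degree nodes.

    Ties are broken alphabetically so the result is deterministic
    (an assumption of this sketch, not something stated in the abstract).
    """
    degrees = node_degrees(graph)
    ranked = sorted(degrees, key=lambda node: (-degrees[node], node))
    hubs = set(ranked[:n_remove])
    return {
        node: {m for m in neighbours if m not in hubs}
        for node, neighbours in graph.items()
        if node not in hubs
    }


# Hypothetical toy network: "A" is the hub.
toy_network: Graph = {
    "A": {"B", "C", "D", "E"},
    "B": {"A", "C"},
    "C": {"A", "B"},
    "D": {"A"},
    "E": {"A"},
}

pruned = prune_hubs(toy_network, n_remove=1)
```

In a study of the kind summarised above, a prediction method would be re-run on networks pruned to increasing depths and its performance compared across them; that evaluation step is outside the scope of this sketch.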
Chapter 6 discusses the results and methods developed in this thesis in the context of the vast amount of generated sequencing data. Sequencing or re-sequencing a (plant) genome has become fairly straightforward and affordable, but the interpretation for subsequent use of this sequence data is far from trivial. The topics addressed in this thesis, annotation of function, analysis of genome structure and identifying genomic variation, focus on this main bottleneck of biological research. Issues discussed in connection with this work and its future are data accuracy, error propagation, possible improvements and future implications for biological research in crop plants. In particular the shift of costs from sequencing to downstream analyses, with functional genome annotation as essential step, is covered. One of the biggest challenges biology and bioinformatics will face is the integration of results from such downstream analyses and other sources into a complete picture. Only this will allow understanding of complex biological systems.
Natural variation in memory formation among Nasonia parasitic wasps : from genes to behaviour
Hoedjes, K.M. - \ 2014
Wageningen University. Promotor(en): Louise Vet; Marcel Dicke, co-promotor(en): Hans Smid. - Wageningen : Wageningen University - ISBN 9789461739483 - 191
nasonia - hymenoptera - geheugen - leervermogen - genetische factoren - dierecologie - diergedrag - genomen - nasonia - hymenoptera - memory - learning ability - genetic factors - animal ecology - animal behaviour - genomes
The ability to learn and form memory has been demonstrated in various animal species, ranging from relatively simple invertebrates, such as snails and insects, to more complex vertebrate species, including birds and mammals. The opportunity to acquire new skills or to adapt behaviour through learning is an obvious benefit. However, memory formation is also costly: it can be maladaptive when unreliable associations are formed and the process of memory formation can be energetically costly. The balance between costs and benefits determines if learning and memory formation are beneficial to an animal or not. Variation in learning abilities and memory formation between species is thought to reflect species-specific differences in ecology.
This thesis focused on variation in the number of trials required to form long-term memory (LTM). LTM is considered the most stable and durable type of memory, but also the most costly, because it requires protein synthesis. Many animal species require multiple learning experiences, which are spaced in time, to form LTM. This allows re-evaluation of information before an animal invests in costly LTM. There is, however, variation in the number of trials that animal species require to induce LTM formation. A number of insect species, including a number of parasitic wasp species, form LTM after only a single learning experience. Parasitic wasps can learn odours that guide them towards suitable hosts for their offspring, so-called oviposition learning. Substantial differences in LTM formation are observed among closely related species of parasitic wasps, which provides excellent opportunities for comparative studies. Both ecological and genetic factors involved in variation in LTM formation have been studied in this project. A multidisciplinary approach is essential to understand the evolution of variation in LTM formation, because the interaction between genes and environment shapes learning and memory formation.
LTM formation was studied in closely related species of the genus Nasonia. These small parasitic wasps (~2 mm in length) lay their eggs in various species of fly pupae and differences in the ecology of the four known species of this genus have been described. A high-throughput method for olfactory conditioning was developed in which the wasps associated an odour, either chocolate or vanilla, with the reward of a host. A T-maze olfactometer was designed for high-throughput testing of memory retention. Using these methods, variation in memory retention was observed between three Nasonia species. Both N. vitripennis and N. longicornis form a long-lasting memory after a single conditioning trial, which lasts at least 5 days. Nasonia giraulti, on the other hand, lost its memory after 1 to 2 days after a single conditioning trial. Further studies focused on the difference between N. vitripennis and N. giraulti, which was most pronounced. By inhibiting LTM with transcription and translation inhibitors, it was confirmed that N. vitripennis forms this type of memory after a single conditioning trial. LTM is visible 4 days after conditioning in N. vitripennis. Nasonia giraulti does not form LTM after a single conditioning trial. Long-lasting memory is only formed after two trials, with a 4-hour interval between them. This difference in LTM formation makes N. vitripennis and N. giraulti excellent model species to study both ecological and genetic factors involved in this difference.
Ecological factors such as the value of the reward and the reliability of the learned association have been shown to affect memory formation in a number of animal species. A recent study on oviposition learning in two parasitic wasp species demonstrated that LTM formation depends on the host species, i.e. the reward offered during conditioning. LTM was formed when a host with a higher quality was offered, but not when a host of lower quality was offered. The effect of host quality on memory retention of N. vitripennis and N. giraulti was tested. Either a large host, Calliphora vomitoria, a medium-sized host, Lucilia sericata, or a small host, Musca domestica, was offered during conditioning. These hosts were observed to differ significantly in their quality, i.e. in the number of parasitoid offspring that emerged and the size of the offspring. There was, however, no effect of host species on memory retention in either Nasonia species. These results suggest that host quality is not important for LTM formation in N. vitripennis and N. giraulti. This observation shows that ecological factors that are important for memory formation in one species may not be important for another species.
The genetic basis of memory formation is highly conserved among distant animal phyla. A large number of genes involved in LTM formation have been identified in genetic model organisms, including fruit flies, honeybees, the California sea hare, mice and rats, and the zebra finch. Genetic factors responsible for natural variation in LTM formation between species are currently unknown, however. Two approaches were used to study genetic factors responsible for the difference in LTM formation between N. vitripennis and N. giraulti. The first approach took advantage of the unique possibility to interbreed Nasonia species. Hybrid offspring of N. vitripennis and N. giraulti did not form LTM after a single conditioning trial, similar to N. giraulti. The dominant LTM phenotype of N. giraulti was then backcrossed into the genetic background of N. vitripennis for up to 5 generations. Using a genotyping microarray analysis and subsequent confirmation experiments, we detected two genomic regions (quantitative trait loci, QTLs) that both reduce long-lasting memory but do not completely remove it. These results indicate that multiple QTLs regulate the difference in LTM formation between the two Nasonia species. In conclusion, our approach has provided insights into the genomic basis of a naturally occurring difference in LTM formation between two species. Excellent opportunities for fine-scale QTL mapping are available for the genus Nasonia. This will allow identification of decisive regulatory mechanisms involved in LTM formation that are located in the two genomic regions detected in this study.
The second approach took advantage of next-generation sequencing techniques that allow transcriptome-wide studies of gene expression levels. RNA from heads of N. vitripennis and N. giraulti was collected before conditioning and immediately, 4 hours, or 24 hours after conditioning. This RNA was sequenced strand-specifically using HiSeq technology, which allows detection of sense and antisense transcripts. Various genes, from a number of different signalling pathways known to be involved in LTM formation, were uniquely differentially expressed after conditioning in N. vitripennis. These genes are likely involved in the ongoing process of LTM formation in this species. A number of other genes with a known role in LTM formation, including genes involved in dopamine synthesis and in the Ras-MAPK and PI3K signalling pathways, were uniquely differentially expressed in N. giraulti. These genes may have a role in an LTM-inhibitory mechanism in this species. Antisense transcripts were detected for a number of known memory genes, which may indicate a role in regulation of transcription, alternative splicing, or translation. This study is the first to compare gene expression patterns after conditioning between two species that differ in LTM formation. The results provide promising candidate genes for future studies in which the regulation of these genes, the function of specific splice variants, and spatial expression patterns in the brain should be studied to understand how these genes are involved in the regulation of LTM formation.
Learning and memory formation have an important role in animal and human behaviour. Novel and valuable insights into both ecological and genetic factors responsible for variation in LTM formation have been revealed by the research presented in this thesis. Integrating ecological factors and genetic factors is essential, as genes are the level on which ecological factors can drive the evolution of variation in learning and memory formation. The genus Nasonia has offered excellent opportunities for ecological research as well as unique opportunities for studies on genomic and genetic factors, which were addressed by comparing closely related species that differ in memory formation. This thesis provides the basis for the identification of genomic differences responsible for the difference in memory formation between Nasonia species, and it also characterizes the consequences of these genomic differences for gene expression. The genetic basis of learning and memory formation is highly conserved among distant animal species, and insights from this thesis are likely applicable to other animal species and humans as well. Altogether, these small parasitic wasps allow us to understand and value differences in memory formation.
Comparative genomics of Dothideomycete fungi
Burgt, A. van der - \ 2014
Wageningen University. Promotor(en): Pierre de Wit, co-promotor(en): Jerome Collemare. - Wageningen : Wageningen University - ISBN 9789461739056 - 176
dothideomycetes - plantenziekteverwekkende schimmels - passalora fulva - dothistroma - genomen - vergelijkende genomica - dothideomycetes - plant pathogenic fungi - passalora fulva - dothistroma - genomes - comparative genomics
Fungi are a diverse group of eukaryotic micro-organisms particularly suited for comparative genomics analyses. Fungi are important to industry and fundamental science, and many of them are notorious pathogens of crops, thereby endangering global food supply. Dozens of fungi have been sequenced in the last decade and, with the advances of next generation sequencing, thousands of new genome sequences will become available in coming years. In this thesis I have used bioinformatics tools to study different biological and evolutionary processes in various genomes, with a focus on the genomes of the Dothideomycete fungi Cladosporium fulvum, Dothistroma septosporum and Zymoseptoria tritici.
Chapter 1 introduces the scientific disciplines of mycology and bioinformatics from a historical perspective. It exemplifies a typical whole-genome sequence analysis of a fungal genome, and focusses in particular on structural gene annotation and detection of transposable elements. In addition, it briefly reviews the microRNA pathway as known in animals and plants in the context of the putative existence of similar yet subtly different small RNA pathways in other branches of the eukaryotic tree of life.
Chapter 2 addresses the newly sequenced genomes of the closely related Dothideomycete plant pathogenic fungi Cladosporium fulvum and Dothistroma septosporum. Remarkably, it revealed a surprisingly high similarity at the protein level combined with striking differences at the DNA level, in gene repertoire and in gene expression. Most noticeably, the genome of C. fulvum appears to be at least twice as large, which is solely attributable to a much larger content of repetitive sequences.
Chapter 3 describes a novel alignment-based fungal gene prediction method (ABFGP) that is particularly suitable for plastic genomes like those of fungi. It shows excellent performance benchmarked on a dataset of 7,000 unigene-supported gene models from ten different fungi. Applicability of the method was shown by revisiting the annotations of C. fulvum and D. septosporum and of various other fungal genomes from the first-generation sequencing era. Thousands of gene models were revised in each of the gene catalogues, indeed revealing a correlation to the quality of the genome assembly, and to sequencing strategies used in the sequencing centres, highlighting different types of errors in different annotation pipelines.
Chapter 4 focusses on the unexpectedly high number of gene models identified by ABFGP that align nicely to informant genes, but only upon toleration of frame shifts and in-frame stop-codons. These discordances could represent sequence errors (SEs) and/or disruptive mutations (DMs) that caused these truncated and erroneous gene models. We revisited the same fungal gene catalogues as in chapter 3, confirmed SEs by resequencing and successively removed those, yielding a large, high-confidence dataset of nearly 1,000 pseudogenes caused by DMs. This dataset of fungal pseudogenes, containing genes listed as bona fide genes in current gene catalogues, does not correspond to various observations previously made on fungal pseudogenes. Moreover, the degree of pseudogenization, which shows up to a ten-fold variation between the least and the most affected species, is generally higher in species that reproduce asexually than in those that also reproduce sexually.
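As a rough illustration of the two kinds of discordance mentioned for chapter 4, the sketch below scans a coding sequence for in-frame stop codons and checks whether its length is compatible with an intact reading frame. It is not ABFGP and does not use alignments to informant genes; the function names are hypothetical, the standard genetic code (stop codons TAA, TAG, TGA) is assumed, and a length that is not a multiple of three is only a crude proxy for a frame shift.

```python
from typing import List

# Stop codons of the standard genetic code (assumed to apply here).
STOP_CODONS = {"TAA", "TAG", "TGA"}


def in_frame_stops(cds: str) -> List[int]:
    """Return 0-based start positions of premature in-frame stop codons.

    The final complete codon is skipped, because a terminal stop codon
    is expected in an intact coding sequence.
    """
    cds = cds.upper()
    last_full = len(cds) - len(cds) % 3  # end of the last complete codon
    positions = []
    for start in range(0, last_full - 3, 3):
        if cds[start:start + 3] in STOP_CODONS:
            positions.append(start)
    return positions


def length_breaks_frame(cds: str) -> bool:
    """Crude frame-shift indicator: length is not a multiple of three."""
    return len(cds) % 3 != 0


# Hypothetical toy sequences, not taken from any real gene catalogue.
intact = "ATGAAACCCTAA"         # ATG AAA CCC TAA
with_stop = "ATGAAATAGCCCTAA"   # ATG AAA TAG CCC TAA
shifted = "ATGAAACCTAA"         # one base missing

premature = in_frame_stops(with_stop)
```

A check like this cannot distinguish a sequencing error from a genuine disruptive mutation; as the abstract notes, that distinction was made by resequencing.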
Chapter 5 describes explorative genomics and comparative genomics analyses revealing the presence of introner-like elements (ILEs) in various Dothideomycete fungi, including Zymoseptoria tritici, in which they had not yet been identified although its genome sequence has been publicly available for several years. ILEs combine hallmark intron properties with the apparent capability of multiplying themselves as repetitive sequence. ILEs strongly associate with events of intron gain, thereby delivering in silico proof of their mobility. Phylogenetic analyses at the intra- and inter-species level showed that most ILEs are related and likely share common ancestry.
Chapter 6 provides additional evidence that ILE multiplication strongly dominates over other types of intron duplication in fungi. The observed high rate of ILE multiplication followed by rapid sequence degeneration led us to hypothesize that multiplication of ILEs has been the major cause and mechanism of intron gain in fungi, and we speculate that this could be generalized to all eukaryotes.
Chapter 7 describes a new strategy for miRNA hairpin prediction using statistical distributions of observed biological variation of properties (descriptors) of known miRNA hairpins. We show that the method outperforms miRNA prediction by previous, conventional methods that usually apply threshold filtering. Using this method, several novel candidate miRNAs were assigned in the genomes of Caenorhabditis elegans and two human viruses. Although the method is not applied to fungi in this chapter, the study does provide a flexible method to find evidence for the existence of a putative miRNA-like pathway in fungi.
Chapter 8 provides a general discussion on the advent of bioinformatics in mycological research and its implications. It highlights the necessity of a priori planning and integration of functional analysis and bioinformatics in order to achieve scientific excellence, and describes possible scenarios for the near future of fungal (comparative) genomics research. Moreover, it discusses the intrinsic error rate in large-scale, automatically inferred datasets and the implications of using and comparing those.
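The contrast between distribution-based scoring and threshold filtering described for chapter 7 can be sketched as follows. This is not the method of the thesis: the abstract does not say which distributions or descriptors were used, so the sketch assumes independent Gaussian distributions, two made-up descriptors (`length` and `mfe`, for hairpin length and minimum free energy), and invented toy values. All function names are hypothetical.

```python
import math
from statistics import mean, stdev
from typing import Dict, List, Tuple

Descriptors = Dict[str, float]
Model = Dict[str, Tuple[float, float]]  # descriptor name -> (mean, standard deviation)


def fit_descriptors(known: List[Descriptors]) -> Model:
    """Estimate a Gaussian (mean, sd) per descriptor from known hairpins."""
    model: Model = {}
    for name in known[0]:
        values = [hairpin[name] for hairpin in known]
        model[name] = (mean(values), stdev(values))
    return model


def log_score(candidate: Descriptors, model: Model) -> float:
    """Sum of Gaussian log-densities over descriptors (independence assumed)."""
    total = 0.0
    for name, (mu, sigma) in model.items():
        z = (candidate[name] - mu) / sigma
        total += -0.5 * z * z - math.log(sigma * math.sqrt(2.0 * math.pi))
    return total


def passes_thresholds(candidate: Descriptors, limits: Dict[str, Tuple[float, float]]) -> bool:
    """Conventional hard filter: every descriptor must fall inside fixed limits."""
    return all(low <= candidate[name] <= high for name, (low, high) in limits.items())


# Invented toy values for illustration only; not real miRNA measurements.
known_hairpins = [
    {"length": 80.0, "mfe": -35.0},
    {"length": 85.0, "mfe": -38.0},
    {"length": 90.0, "mfe": -40.0},
    {"length": 95.0, "mfe": -43.0},
]
model = fit_descriptors(known_hairpins)

typical = {"length": 88.0, "mfe": -39.0}
atypical = {"length": 150.0, "mfe": -10.0}
typical_score = log_score(typical, model)
atypical_score = log_score(atypical, model)
```

The design point the sketch tries to convey is that a score derived from distributions ranks candidates continuously, whereas `passes_thresholds` only accepts or rejects them, so a candidate just outside one limit is discarded regardless of how typical its other descriptors are.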