PhD theses

All Wageningen University PhD theses


    Wageningen PhD theses

    This database contains bibliographic descriptions of all Wageningen University PhD theses from 1920 onwards. It is updated on a daily basis by WUR Library.

    Author abstracts and/or summaries are added to all descriptions. A link to the full-text dissertation is added to the bibliographic description. In a few cases no electronic version is available, mostly because of copyright issues.

    Hard copies of all theses are available for loan at WUR Library. To request one, click the "Request this publication" link in the full record presentation. This is a fee-based service.

    WUR Library, 9 July 2012


Record number 2249386
Title A FAIR approach to genomics
Author(s) Koehorst, Jasper Jan (dissertant)
Publisher Wageningen : Wageningen University
Publication year 2019
Description 244 pages figures, diagrams
Description 1 online resource (PDF, 244 pages) figures, diagrams
Notes Includes bibliographical references. - With summary in English
ISBN 9789463433693; 9463433694
Tutors Martins dos Santos, Prof. dr. V.A.P. ; Schaap, Dr. P.J. ; Saccenti, Dr. E.
Graduation date 2019-01-25
Dissertation no. 7141
Author abstract

The aim of this thesis was to increase our understanding of how genome information leads to function and phenotype. To address this question, I developed a semantic systems biology framework capable of extracting knowledge, biological concepts and emergent system properties from a vast array of publicly available genome information. In chapter 2, Empusa is described as an infrastructure that bridges the gap between the intended and actual content of a database. This infrastructure was used in chapters 3 and 4 to develop the framework. Chapter 3 describes the development of the Genome Biology Ontology Language (GBOL) and the GBOL stack of supporting tools, which enforce consistency within and between the GBOL definitions in the ontology (OWL) and the Shape Expressions (ShEx) language describing the graph structure. A practical implementation of a semantic systems biology framework for FAIR (de novo) genome annotation is provided in chapter 4. The semantic framework and genome annotation tool described in this chapter have been used throughout this thesis to consistently annotate, structurally and functionally, and to mine the microbial genomes used in chapters 5-10. In chapter 5, we introduced the use of protein domains and their corresponding architectures in comparative functional genomics as a fast, efficient and scalable alternative to sequence-based methods. This allowed us to effectively compare and identify functional variations between hundreds to thousands of genomes. In chapter 6, we used 432 available complete Pseudomonas genomes to study the relationship between domain essentiality and persistence. In this chapter the focus was mainly on domains involved in metabolic functions. The metabolic domain space was explored for domain essentiality and persistence through the integration of heterogeneous data sources, including six published metabolic models, a vast gene expression repository and transposon data.
In chapter 7, the correlation between the expected and observed genotypes was explored using 16S-rRNA phylogeny and protein domain class content as input. In this chapter it was shown that domain class content yields a higher resolution than 16S-rRNA when analysing evolutionary distances. Using protein domain classes, we were also able to identify signifying domains, which may have important roles in shaping a species. To demonstrate the use of semantic systems biology workflows in a biotechnological setting, we expanded the resource with more than 80,000 bacterial genomes. The genomic information of this resource was mined using a top-down approach to identify strains having the trait for 1,3-propanediol production. This resulted in the molecular identification of 49 new species. In addition, we experimentally verified that 4 species were capable of producing 1,3-propanediol.

As discussed in chapter 10, the semantic systems biology workflows developed here were successfully applied in the discovery of key elements in symbiotic relationships, in improving functional genome annotation and in comparative genomics studies. Wet/dry-lab collaboration often formed the basis of the obtained results.

The success of the collaboration between the wet and dry fields prompted me to develop an undergraduate course in which the concept of the “Moist” workflow was introduced (chapter 9).

Publication type PhD thesis
Language English