Staff Publications

Staff Publications

  • external user (warningwarning)
  • Log in as
  • language uk
  • About

    'Staff publications' is the digital repository of Wageningen University & Research

    'Staff publications' contains references to publications authored by Wageningen University staff from 1976 onward.

    Publications authored by the staff of the Research Institutes are available from 1995 onwards.

    Full text documents are added when available. The database is updated daily and currently holds about 240,000 items, of which 72,000 in open access.

    We have a manual that explains all the features 

Record number 535644
Title Interoperable genome annotation with GBOL, an extendable infrastructure for functional data mining
Author(s) Dam, J.C.J. van; Koehorst, J.J.; Vik, Jon Olav; Schaap, P.J.; Suarez-Diez, M.
Source bioRxiv (2017). - 9 p.
DOI http://dx.doi.org/10.1101/184747
Department(s) Systems and Synthetic Biology
VLAG
Publication type Non-refereed article in scientific journal
Publication year 2017
Abstract Background: A standard structured format is used by the public sequence databases to present genome annotations. A prerequisite for a direct functional comparison is consistent annotation of the genetic elements with evidence statements. However, the current format provides limited support for data mining, hampering comparative analyses at large scale. Results: The provenance of a genome annotation describes the contextual details and derivation history of the process that resulted in the annotation. To enable interoperability of genome annotations, we have developed the Genome Biology Ontology Language (GBOL) and associated infrastructure (GBOL stack). GBOL is provenance aware and thus provides a consistent representation of functional genome annotations linked to the provenance. GBOL is modular in design, extendible and linked to existing ontologies. The GBOL stack of supporting tools enforces consistency within and between the GBOL definitions in the ontology (OWL) and the Shape Expressions (ShEx) language describing the graph structure. Modules have been developed to serialize the linked data (RDF) and to generate a plain text format files. Conclusion: The main rationale for applying formalized information models is to improve the exchange of information. GBOL uses and extends current ontologies to provide a formal representation of genomic entities, along with their properties and relations. The deliberate integration of data provenance in the ontology enables review of automatically obtained genome annotations at a large scale. The GBOL stack facilitates consistent usage of the ontology.
Comments
There are no comments yet. You can post the first one!
Post a comment
 
Please log in to use this service. Login as Wageningen University & Research user or guest user in upper right hand corner of this page.