Staff Publications

Staff Publications

  • external user (warningwarning)
  • Log in as
  • language uk
  • About

    'Staff publications' is the digital repository of Wageningen University & Research

    'Staff publications' contains references to publications authored by Wageningen University staff from 1976 onward.

    Publications authored by the staff of the Research Institutes are available from 1995 onwards.

    Full text documents are added when available. The database is updated daily and currently holds about 240,000 items, of which 72,000 in open access.

    We have a manual that explains all the features 

Record number 558560
Title A semantic approach for timeseries data fusion
Author(s) Samourkasidis, Argyrios; Athanasiadis, Ioannis N.
Source Computers and Electronics in Agriculture 169 (2020). - ISSN 0168-1699
DOI https://doi.org/10.1016/j.compag.2019.105171
Department(s) Information Technology
WASS
Publication type Refereed Article in a scientific journal
Publication year 2020
Keyword(s) AgMIP - APSIM - Data reuse - DSSAT - Environmental timeseries - FAIR data - Internet of Things - Interoperability - Legacy data - Reasoning - Semantic heterogeneity - Templates - WOFOST
Abstract

The data deluge following the rise of Internet of Things contributes towards the creation of non-reusable data silos. Especially in the environmental sciences domain, syntactic and semantic heterogeneity hinders data re-usability as most times manual labour and domain expertise is required. Both the different syntaxes under which environmental timeseries are formatted and the implicit semantics which are used to describe them contribute to this end. Usually, the real meaning of data is obscured in a combination of short data labels, titles and various value codes, that require domain or institutional knowledge to decipher. The FAIR data principles for scientific data sharing are stewardship offer a framework based on community-adopted metadata. In this work, we present the Environmental Data Acquisition Module (EDAM) which focuses on data interoperability and reuse, and deals with syntactic and semantic heterogeneity using a template approach. Data curators draft templates to describe in an abstract fashion the syntax of the timeseries datasets they want to acquire or disseminate. They complement each template with a metadata file, which is used to annotate observables and their properties (including physical quantities and units of measurement) with terms from an ontology. EDAM employs a reasoner to infer compatibility among syntactically and semantically heterogeneous datasets, and enables timeseries, format and units of measurement transformation on-the-fly. Our approach utilizes a local ontology to store metadata about datasets, which enables EDAM to acquire and transform datasets which were originally stored with different semantics and syntaxes. We demonstrate EDAM in a case study where we transform meteorological input files of four agricultural models. Our approach, allows to cut across environmental data silos and facilitate timeseries reusability, as it enables users to (a) discover datasets in other formats, (b) transform them and (c) reuse them in their scientific workflows. This directly contributes to the toolshed for FAIR data management in environmental sciences. EDAM implementation has been released under an open-source license.

Comments
There are no comments yet. You can post the first one!
Post a comment
 
Please log in to use this service. Login as Wageningen University & Research user or guest user in upper right hand corner of this page.