Staff Publications


  • About

    'Staff publications' is the digital repository of Wageningen University & Research.

    'Staff publications' contains references to publications authored by Wageningen University staff from 1976 onward.

    Publications authored by the staff of the Research Institutes are available from 1995 onwards.

    Full-text documents are added when available. The database is updated daily and currently holds about 240,000 items, of which 72,000 are open access.


Record number 418448
Title On the increase of predictive performance with high-level data fusion
Author(s) Doeswijk, T.G.; Smilde, A.K.; Hageman, J.A.; Westerhuis, J.A.; Eeuwijk, F.A. van
Source Analytica Chimica Acta 705 (2011) 1-2. - ISSN 0003-2670 - p. 41-47.
DOI http://dx.doi.org/10.1016/j.aca.2011.03.025
Department(s) Biometris (WU MAT)
PE&RC
Publication type Refereed Article in a scientific journal
Publication year 2011
Abstract The combination of different data sources for classification purposes, also called data fusion, can be done at different levels: low-level, i.e. concatenating data matrices; medium-level, i.e. concatenating data matrices after feature selection; and high-level, i.e. combining model outputs. In this paper the predictive performance of high-level data fusion is investigated. Partial least squares is used on each of the data sets, and dummy variables representing the classes are used as response variables. Based on the estimated responses for data set j and class k, a Gaussian distribution is fitted. A simulation study is performed that shows the theoretical performance of high-level data fusion for two classes and two data sets. Within-group correlations of the predicted responses of the two models and differences between the predictive ability of each of the separate models and the fused models are studied. Results show that the error rate is always less than or equal to that of the best performing subset and can theoretically approach zero. Negative within-group correlations always improve the predictive performance. However, if the data sets have a joint basis, as with metabolomics data, this is not likely to happen. For equally performing individual classifiers the best results are expected for small within-group correlations. Fusion of a non-predictive classifier with a classifier that exhibits discriminative ability leads to increased predictive performance if the within-group correlations are strong. An example with real-life data shows the applicability of the simulation results.
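The qualitative claims in the abstract can be illustrated with a small worked calculation. The sketch below is not the authors' procedure: it assumes two equally likely classes, predicted responses from the two models that are bivariate Gaussian within each class, unit variances, and the same within-group correlation rho in both classes. Under those assumptions the optimal error rate is Phi(-Delta/2), where Delta is the Mahalanobis distance between the class means. The function names and the parameters d1, d2 (class separation of each individual model) are hypothetical, chosen only for illustration.

```python
import math


def normal_cdf(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def single_error(d):
    """Theoretical error rate of one model whose class means lie d
    standard deviations apart (assumed: equal priors, unit variance)."""
    return normal_cdf(-abs(d) / 2.0)


def fused_error(d1, d2, rho):
    """Theoretical error rate when the two model outputs are combined.

    Assumed setting (not taken from the paper): outputs are bivariate
    Gaussian within each class, with unit variances and a common
    within-group correlation rho, and the classes are equally likely.
    """
    if not -1.0 < rho < 1.0:
        raise ValueError("rho must lie strictly between -1 and 1")
    # Squared Mahalanobis distance between the two class means.
    delta_sq = (d1 * d1 - 2.0 * rho * d1 * d2 + d2 * d2) / (1.0 - rho * rho)
    return normal_cdf(-math.sqrt(delta_sq) / 2.0)


if __name__ == "__main__":
    # Two equally performing models at several within-group correlations.
    for rho in (-0.5, 0.0, 0.5):
        print("rho = %+.1f  fused error = %.4f" % (rho, fused_error(2.0, 2.0, rho)))
    # A predictive model (d1 = 2) fused with a non-predictive one (d2 = 0).
    for rho in (0.0, 0.9):
        print("rho = %+.1f  fused error = %.4f" % (rho, fused_error(2.0, 0.0, rho)))
    print("best single model error = %.4f" % single_error(2.0))
```

In this simplified setting the squared distance Delta^2 exceeds d1^2 by (rho*d1 - d2)^2 / (1 - rho^2), which is never negative, so the fused error cannot exceed that of the better individual model. The same formula shows why a non-predictive model (d2 = 0) helps only when rho is far from zero.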