Staff Publications

  • About

    'Staff publications' is the digital repository of Wageningen University & Research.

    'Staff publications' contains references to publications authored by Wageningen University staff from 1976 onward.

    Publications authored by the staff of the Research Institutes are available from 1995 onwards.

    Full text documents are added when available. The database is updated daily and currently holds about 240,000 items, of which 72,000 are open access.


Records 1 - 20 / 66

Query: keywords==regression
Bayesian analysis of energy balance data from growing cattle using parametric and non-parametric modelling
Moraes, L.E. ; Kebreab, E. ; Strathe, A.B. ; France, J. ; Dijkstra, J. ; Casper, D. ; Fadel, J.G. - 2014
Animal Production Science 54 (2014)12. - ISSN 1836-0939 - p. 2068 - 2081.
lactating dairy-cows - metabolizable energy - net energy - penalized splines - dynamic-model - mixed models - efficiency - growth - regression - gain
Linear and non-linear models have been extensively utilised for the estimation of net and metabolisable energy requirements and for the estimation of the efficiencies of utilising dietary energy for maintenance and tissue gain. In growing animals, biological principles imply that energy retention rate is non-linearly related to the energy intake level because successive increments in energy intake above maintenance result in diminishing returns for tissue energy accretion. Heat production in growing cattle has been traditionally described by logarithmic regression and exponential models. The objective of the present study was to develop Bayesian models of energy retention and heat production in growing cattle using parametric and non-parametric techniques. Parametric models were used to represent models traditionally employed to describe energy use in growing steers and heifers whereas the non-parametric approach was introduced to describe energy utilisation while accounting for non-linearities without specifying a particular functional form. The Bayesian framework was used to incorporate prior knowledge of bioenergetics on tissue retention and heat production and to estimate net and metabolisable energy requirements (NEM and MEM, respectively), and the partial efficiencies of utilising dietary metabolisable energy for maintenance (km) and tissue energy gain (kg). The database used for the study consisted of 719 records of indirect calorimetry on steers and non-pregnant, non-lactating heifers. The NEM was substantially larger in energy retention models (ranged from 0.40 to 0.50 MJ/kg) than were NEM estimates from heat-production models (ranged from 0.29 to 0.49 MJ/kg). Similarly, km was also larger in energy retention models than in heat production models. These differences are explained by the nature of y-intercepts (NEM) in these two models.
Energy retention models estimate fasting catabolism as the y-intercept, while heat production models estimate fasting heat production. Conversely, MEM was virtually identical in all models and approximately equal to 0.53 MJ/kg in this database.
A Unimodal Species Response Model Relating Traits to Environment with Application to Phytoplankton Communities.
Jamil, T. ; Kruk, C. ; Braak, C.J.F. ter - 2014
PLoS One 9 (2014)5. - ISSN 1932-6203 - 14 p.
bayesian variable selection - climate-change - ecology - lake - variability - strategies - diversity - habitat - classification - regression
In this paper we attempt to explain observed niche differences among species (i.e. differences in their distribution along environmental gradients) by differences in trait values (e.g. volume) in phytoplankton communities. For this, we propose the trait-modulated Gaussian logistic model in which the niche parameters (optimum, tolerance and maximum) are made linearly dependent on species traits. The model is fitted to data in the Bayesian framework using OpenBUGS (Bayesian inference Using Gibbs Sampling) to identify according to which environmental variables there is niche differentiation among species and traits. We illustrate the method with phytoplankton community data of 203 lakes located within four climate zones and associated measurements on 11 environmental variables and six morphological species traits of 60 species. Temperature and chlorophyll-a (with opposite signs) described well the niche structure of all species. Results showed that about 25% of the variance in the niche centres with respect to chlorophyll-a was accounted for by traits, whereas niche width and maximum could not be predicted by traits. Volume, mucilage, flagella and siliceous exoskeleton are found to be the most important traits to explain the niche centres. Species were clustered in two groups with different niche structures: group 1, high temperature-low chlorophyll-a species, and group 2, low temperature-high chlorophyll-a species. Compared to group 2, species in group 1 had larger volume but lower surface area, had more often flagella but neither mucilage nor siliceous exoskeleton. These results might help in understanding the effect of environmental changes on phytoplankton communities. The proposed method, therefore, can also apply to other aquatic or terrestrial communities for which individual traits and environmental conditioning factors are available.
Observational Support for the Stability Dependence of the Bulk Richardson Number across the Stable Boundary Layer
Basu, S. ; Holtslag, A.A.M. ; Caporaso, L. ; Riccio, A. ; Steeneveld, G.J. - 2014
Boundary-Layer Meteorology 150 (2014)3. - ISSN 0006-8314 - p. 515 - 523.
self-correlation - resistance laws - surface fluxes - least-squares - model - height - regression - formulations - parameter - breakdown
The bulk Richardson number (Ri_Bh; defined over the entire stable boundary layer) is commonly utilized in observational and modelling studies for the estimation of the boundary-layer height. Traditionally, Ri_Bh is assumed to be a quasi-universal constant. Recently, based on large-eddy simulation and wind-tunnel data, a stability-dependent relationship has been proposed for Ri_Bh. In this study, we analyze extensive observational data from several field campaigns and provide further support for this newly proposed relationship.
Mapping a priori defined plant associations using remotely sensed vegetation characteristics
Roelofsen, H.D. ; Kooistra, L. ; Bodegom, P.M. van ; Verrelst, J. ; Krol, J. ; Witte, J.P.M. - 2014
Remote Sensing of Environment 140 (2014). - ISSN 0034-4257 - p. 639 - 651.
ellenberg indicator values - continuous floristic gradients - hyperspectral imagery - imaging spectroscopy - endmember selection - tropical forests - aviris data - classification - regression - moisture
Incorporation of a priori defined plant associations into remote sensing products is a major challenge that has only recently been confronted by the remote sensing community. We present an approach to map the spatial distribution of such associations by using plant indicator values (IVs) for salinity, moisture and nutrients as an intermediate between spectral reflectance and association occurrences. For a 12 km² study site in the Netherlands, the relations between observed IVs at local vegetation plots and visible and near-infrared (VNIR) and short-wave infrared (SWIR) airborne reflectance data were modelled using Gaussian Process Regression (GPR) (R² = 0.73, 0.64 and 0.76 for salinity, moisture and nutrients, respectively). These relations were applied to map IVs for the complete study site. Association occurrence probabilities were modelled as a function of IVs using a large database of vegetation plots with known association and IVs. Using the mapped IVs, we calculated occurrence probabilities of 19 associations for each pixel, resulting in both a crisp association map with the most likely occurring association per pixel, as well as occurrence probability maps per association. Association occurrence predictions were assessed by a local vegetation expert, which revealed that the occurrences of associations situated at frequently predicted indicator value combinations were overpredicted. This seems primarily due to biases in the GPR predicted IVs, resulting in associations with envelopes located in extreme ends of IVs being scarcely predicted. Although the results of this particular study were not fully satisfactory, the method potentially offers several advantages compared to current vegetation classification techniques, like site-independent calibration of association probabilities, site-independent selection of associations and the provision of IV maps and occurrence probabilities per association.
If the prediction of IVs can be improved, this method may thus provide a viable roadmap to bring a priori defined plant associations into the domain of remote sensing.
The influence of spatiotemporal variability and adaptations to hypoxia on empirical relationships between soil acidity and vegetation
Cirkel, D.G. ; Witte, J.P.M. ; Bodegom, P.M. van ; Nijp, J.J. ; Zee, S.E.A.T.M. van der - 2014
Ecohydrology 7 (2014)1. - ISSN 1936-0584 - p. 21 - 32.
bodemchemie - bodemaciditeit - vegetatietypen - bodem-plant relaties - soortensamenstelling - plantenfysiologie - rizosfeer - wetlands - heterogeniteit - ecohydrologie - ruimtelijke variatie - soil chemistry - soil acidity - vegetation types - soil plant relationships - species composition - plant physiology - rhizosphere - heterogeneity - ecohydrology - spatial variation - ellenberg indicator values - field-measurements - plant ecology - ph changes - iron - regression - diversity - diffusion - oxidation
Soil acidity is well known to affect the species composition of natural vegetation. The physiological adaptations of plants to soil acidity and related toxicity effects and nutrient deficiencies are, however, complex, manifold and hard to measure. Therefore, generally applicable quantifications of mechanistic plant responses to soil acidity are still not available. An alternative is the semi-quantitative and integrated response variable ‘indicator value for soil acidity’ (Rm). Although relationships between measured soil pH and Rm from various studies are usually strong, they often show systematic bias and still contain high residual variances. On the basis of a well-documented national dataset consisting of 91 vegetation plots and a dataset with detailed, within-plot, pH measurements taken at three periods during the growing season, it is shown that strong spatiotemporal variation of soil pH can be a critical source of systematic errors and statistical noise. The larger part of variation, however, could be explained by the moisture status of plots. For instance, Spearman's rho decreased from 93% for dry plots and 87% for moist plots to 59% for wet plots. The loss of relation between soil pH and Rm in the moderately acid to alkaline range at increasingly wetter plots is probably due to the establishment of aerenchyma-containing species, which are able to control their rhizosphere acidity. Adaptation to one site factor (oxygen deficit) apparently may induce indifference for other environmental factors (Fe²⁺, soil pH). For predictions of vegetation response to soil acidity, it is thus important to take the wetness of plots into account.
Do intensive care data on respiratory infections reflect influenza epidemics?
Koetsier, A. ; Asten, L. van ; Dijkstra, F. ; Hoek, W. van der ; Snijders, B.E. ; Wijngaard, C.C. van den ; Boshuizen, H.C. ; Donker, G.A. ; Lange, D.W. de ; Keizer, N.F. de - 2013
PLoS One 8 (2013)12. - ISSN 1932-6203
regression - severity - models
Objectives: Severe influenza can lead to Intensive Care Unit (ICU) admission. We explored whether ICU data reflect influenza like illness (ILI) activity in the general population, and whether ICU respiratory infections can predict influenza epidemics. Methods: We calculated the time lag and correlation between ILI incidence (from ILI sentinel surveillance, based on general practitioners (GP) consultations) and percentages of ICU admissions with a respiratory infection (from the Dutch National Intensive Care Registry) over the years 2003–2011. In addition, ICU data of the first three years was used to build three regression models to predict the start and end of influenza epidemics in the years thereafter, one to three weeks ahead. The predicted start and end of influenza epidemics were compared with observed start and end of such epidemics according to the incidence of ILI. Results: Peaks in respiratory ICU admissions lasted longer than peaks in ILI incidence rates. Increases in ICU admissions occurred on average two days earlier compared to ILI. Predicting influenza epidemics one, two, or three weeks ahead yielded positive predictive values ranging from 0.52 to 0.78, and sensitivities from 0.34 to 0.51. Conclusions: ICU data was associated with ILI activity, with increases in ICU data often occurring earlier and for a longer time period. However, in the Netherlands, predicting influenza epidemics in the general population using ICU data was imprecise, with low positive predictive values and sensitivities.
Convergence of European wheat yields
Powell, J.P. ; Rutten, M.M. - 2013
Renewable and Sustainable Energy Reviews 28 (2013). - ISSN 1364-0321 - p. 53 - 70.
agricultural land-use - panel-data - model - productivity - estimators - regression - scenarios - emissions - gas
The paper makes several contributions to the study of wheat yield changes in Europe and the resulting economic consequences in the near to medium term future. In particular, it addresses the issue of the effects of yield changes on land use. The transition and growth of yields are estimated using a combination of convergence, time-series and dynamic panel models. Scenarios are then run using estimated yields as input into a computable general equilibrium (CGE) model. The CGE model provides a narrative framework through which the total economic impact of changes in yields can be analyzed. Together, the complementary approaches of econometrics and general equilibrium models allow a more complete economic analysis of the consequences of yield changes for this important biofuels crop to emerge. Although there is no evidence of a common rate of yield convergence across Europe, there is evidence of absolute convergence. Standard time series and panel forecasting methods indicate the potential for only very modest yearly yield increases across most of Europe given optimistic assumptions; although potential yearly increases in newer European states could, in some cases, be substantially higher. However, the total amount of land released as a result of potential yield increases in the wheat sector is only modest because of an increase in demand for land by sectors other than wheat. The overall question of whether significant yield increases will necessarily lead to large increases in land available to produce bio-energy crops is rejected. Land freed by wheat yield increases will go to the production of a wide range of agricultural products that value it as an input. The same reasoning which links yields and land use applies to all agricultural products when there are well functioning markets.
Cancer gene prioritization by integrative analysis of mRNA expression and DNA copy number data: a comparative review
Lahti, L.M. ; Schäfer, M. ; Klein, H.U. ; Bicciato, S. ; Dugas, M. - 2013
Briefings in Bioinformatics 14 (2013)1. - ISSN 1467-5463 - p. 27 - 35.
canonical correlation-analysis - acute lymphoblastic-leukemia - breast-cancer - r-package - genomic data - microarray - aberrations - regression - pathways - impact
A variety of genome-wide profiling techniques are available to investigate complementary aspects of genome structure and function. Integrative analysis of heterogeneous data sources can reveal higher level interactions that cannot be detected based on individual observations. A standard integration task in cancer studies is to identify altered genomic regions that induce changes in the expression of the associated genes based on joint analysis of genome-wide gene expression and copy number profiling measurements. In this review, we highlight common approaches to genomic data integration and provide a transparent benchmarking procedure to quantitatively compare method performances in cancer gene prioritization. Algorithms, data sets and benchmarking results are available at
Combining cow and bull reference populations to increase accuracy of genomic prediction and genome-wide association studies
Calus, M.P.L. ; Haas, Y. de ; Veerkamp, R.F. - 2013
Journal of Dairy Science 96 (2013)10. - ISSN 0022-0302 - p. 6703 - 6715.
nucleotide polymorphism information - experimental research herds - quantitative trait loci - milk-production traits - 4 european countries - friesian dairy-cows - selection-strategies - holstein cattle - missing-data - regression
Genomic selection holds the promise to be particularly beneficial for traits that are difficult or expensive to measure, such that access to phenotypes on large daughter groups of bulls is limited. Instead, cow reference populations can be generated, potentially supplemented with existing information from the same or (highly) correlated traits available on bull reference populations. The objective of this study, therefore, was to develop a model to perform genomic predictions and genome-wide association studies based on a combined cow and bull reference data set, with the accuracy of the phenotypes differing between the cow and bull genomic selection reference populations. The developed bivariate Bayesian stochastic search variable selection model allowed for an unbalanced design by imputing residuals in the residual updating scheme for all missing records. The performance of this model is demonstrated on a real data example, where the analyzed trait, being milk fat or protein yield, was either measured only on a cow or a bull reference population, or recorded on both. Our results were that the developed bivariate Bayesian stochastic search variable selection model was able to analyze 2 traits, even though animals had measurements on only 1 of 2 traits. The Bayesian stochastic search variable selection model yielded consistently higher accuracy for fat yield compared with a model without variable selection, both for the univariate and bivariate analyses, whereas the accuracy of both models was very similar for protein yield. The bivariate model identified several additional quantitative trait loci peaks compared with the single-trait models on either trait. In addition, the bivariate models showed a marginal increase in accuracy of genomic predictions for the cow traits (0.01–0.05), although a greater increase in accuracy is expected as the size of the bull population increases. 
Our results emphasize that the chosen values of priors in Bayesian genomic prediction models are especially important in small data sets.
Patterns of covariance between airborne laser scanning metrics and Lorenz curve descriptors of tree size inequality
Valbuena, R. ; Maltamo, M. ; Martín-Fernández, S. ; Packalén, P. ; Pascual, C. ; Nabuurs, G.J. - 2013
Canadian Journal of Remote Sensing 39 (2013)Suppl. 1. - ISSN 1712-7971 - p. S18 - S31.
nearest-neighbor imputation - partial least-squares - forest structure - lidar data - stand - regression - inventory - northwest - selection - canopies
The Lorenz curve, as a descriptor of tree size inequality within a stand, has been suggested as a reliable means for characterizing forest structure and distinguishing even from uneven-sized areas. The aim of this study was to achieve a thorough understanding of the relations between airborne laser scanning (ALS) metrics and indicators based on Lorenz curve ordering: Gini coefficient (GC) and Lorenz asymmetry (S). Exploratory multivariate analysis was carried out using correlation tests, partial least squares (PLS), and an information-theoretic approach for multimodel inference (MMI). The best subset linear model was selected for GC and S prediction, as variable transformations yielded no improvement in the relation of ALS with the given response. Relative variable importance based on the MMI model showed that GC is best predicted by ALS metrics expressing canopy coverage, return dispersion, and low-high percentile combinations. Although ALS metrics showed no correlation with S, they did so against its constituting components: the proportions of basal area (Mg) and stem density (xg) stocked above the mean quadratic diameter. The study of PLS loading vectors illustrated how ALS metrics explain variance in opposing directions for each of these components, so that their effects cancel each other out in the overall S. Cross-validation showed that only marginal differences are nevertheless found between predicting S directly or as the sum of Mg and xg estimations. The differing relation of diverse ALS metrics was therefore observed for Mg and xg. The conclusions obtained by this research may assist in selecting relevant Lorenz curve descriptors for forest structure characterization, as well as in variable reduction strategies for their wall-to-wall prediction by means of ALS metrics.
Detection of hydrocarbons in clay soils: A laboratory experiment using spectroscopy in the mid- and thermal infrared
Meijde, M. van der ; Knox, N. ; Cundill, S.L. ; Noomen, M.F. ; Werff, H.M.A. van der ; Hecker, C. - 2013
International Journal of Applied Earth Observation and Geoinformation 23 (2013). - ISSN 0303-2434 - p. 384 - 388.
pipeline leakage - natural-gas - reflectance - airborne - spectrometer - regression
Remote sensing has been used for direct and indirect detection of hydrocarbons. Most studies so far have focused on indirect detection in vegetated areas. In this research, we investigated the possibility of detecting hydrocarbons in bare soil through spectral analysis of laboratory samples in the short wave and thermal infrared regions. Soil/oil mixtures were spectrally measured in the laboratory. Analysis of spectra showed development of hydrocarbon absorption features as soils became progressively more contaminated. Future airborne application of these results seems to be a challenge, as present and future sensors cover the diagnostic regions only to a limited extent.
Characterizing regional soil mineral composition using spectroscopy and geostatistics
Mulder, V.L. ; Bruin, S. de ; Weyermann, J. ; Kokaly, R.F. ; Schaepman, M.E. - 2013
Remote Sensing of Environment 139 (2013). - ISSN 0034-4257 - p. 415 - 429.
spatial prediction - usgs tetracorder - regression - vegetation - carbon - model - area - attributes - variograms - variables
This work aims at improving the mapping of major mineral variability at regional scale using scale-dependent spatial variability observed in remote sensing data. Advanced Spaceborne Thermal Emission and Reflection Radiometer (ASTER) data and statistical methods were combined with laboratory-based mineral characterization of field samples to create maps of the distributions of clay, mica and carbonate minerals and their abundances. The Material Identification and Characterization Algorithm (MICA) was used to identify the spectrally-dominant minerals in field samples; these results were combined with ASTER data using multinomial logistic regression to map mineral distributions. X-ray diffraction (XRD) was used to quantify mineral composition in field samples. XRD results were combined with ASTER data using multiple linear regression to map mineral abundances. We tested whether smoothing of the ASTER data to match the scale of variability of the target sample would improve model correlations. Smoothing was done with Fixed Rank Kriging (FRK) to represent the medium and long-range spatial variability in the ASTER data. Stronger correlations resulted using the smoothed data compared to results obtained with the original data. Highest model accuracies came from using both medium and long-range scaled ASTER data as input to the statistical models. High correlation coefficients were obtained for the abundances of calcite and mica (R² = 0.71 and 0.70, respectively). Moderately-high correlation coefficients were found for smectite and kaolinite (R² = 0.57 and 0.45, respectively). Maps of mineral distributions, obtained by relating ASTER data to MICA analysis of field samples, were found to characterize major soil mineral variability (overall accuracies for mica, smectite and kaolinite were 76%, 89% and 86% respectively).
The results of this study suggest that the distributions of minerals and their abundances derived using FRK-smoothed ASTER data more closely match the spatial variability of soil and environmental properties at regional scale.
Multi-trait and multi-environment QTL analyses of yield and a set of physiological traits in pepper
Alimi, N.A. ; Bink, M.C.A.M. ; Dieleman, J.A. ; Magán, J.J. ; Wubs, A.M. ; Palloix, A. ; Eeuwijk, F.A. van - 2013
Theoretical and Applied Genetics 126 (2013)10. - ISSN 0040-5752 - p. 2597 - 2625.
mixed-model approach - capsicum-annuum - complex traits - fruit size - loci - populations - barley - maize - covariables - regression
For many agronomic crops, yield is measured simultaneously with other traits across multiple environments. The study of yield can benefit from joint analysis with other traits and relations between yield and other traits can be exploited to develop indirect selection strategies. We compare the performance of three multi-response QTL approaches based on mixed models: a multi-trait approach (MT), a multi-environment approach (ME), and a multi-trait multi-environment approach (MTME). The data come from a multi-environment experiment in pepper, for which 15 traits were measured in four environments. The approaches were compared in terms of number of QTLs detected for each trait, the explained variance, and the accuracy of prediction for the final QTL model. For the four environments together, the superior MTME approach delivered a total of 47 regions containing putative QTLs. Many of these QTLs were pleiotropic and showed quantitative QTL by environment interaction. MTME was superior to ME and MT in the number of QTLs, the explained variance and accuracy of predictions. The large number of model parameters in the MTME approach was challenging and we propose several guidelines to help obtain a stable final QTL model. The results confirmed the feasibility and strengths of novel mixed model QTL methodology to study the architecture of complex traits.
Changes in plant defense chemistry (pyrrolizidine alkaloids) revealed through high-resolution spectroscopy
Almeida De Carvalho, S. ; Macel, M. ; Schlerf, M. ; Moghaddam, F.E. ; Mulder, P.P.J. ; Skidmore, A.K. ; Putten, W.H. van der - 2013
ISPRS Journal of Photogrammetry and Remote Sensing 80 (2013). - ISSN 0924-2716 - p. 51 - 60.
near-infrared spectroscopy - senecio-jacobaea - red edge - nitrogen - leaf - reflectance - forest - regression - vegetation - prediction
Plant toxic biochemicals play an important role in defense against natural enemies and often are toxic to humans and livestock. Hyperspectral reflectance is an established method for primary chemical detection and could be further used to determine plant toxicity in the field. In order to make a first step for pyrrolizidine alkaloids detection (toxic defense compound against mammals and many insects) we studied how such spectral data can estimate plant defense chemistry under controlled conditions. In a greenhouse, we grew three related plant species that defend against generalist herbivores through pyrrolizidine alkaloids: Jacobaea vulgaris, Jacobaea erucifolia and Senecio inaequidens, and analyzed the relation between spectral measurements and chemical concentrations using multivariate statistics. Nutrient addition enhanced tertiary-amine pyrrolizidine alkaloids contents of J. vulgaris and J. erucifolia and decreased N-oxide contents in S. inaequidens and J. vulgaris. Pyrrolizidine alkaloids could be predicted with a moderate accuracy. Pyrrolizidine alkaloid forms tertiary-amines and epoxides were predicted with 63% and 56% of the variation explained, respectively. The most relevant spectral regions selected for prediction were associated with electron transitions and CH, OH, and NH bonds in the 1530 and 2100 nm regions. Given the relatively low pyrrolizidine alkaloid concentration (in the order of mg g⁻¹) and resultant predictions, it is promising that pyrrolizidine alkaloids interact with incident light. Further studies should be considered to determine if such a non-destructive method may predict changes in PA concentration in relation to plant natural enemies. Spectroscopy may be used to study plant defenses in intact plant tissues, and may provide managers of toxic plants, food industry and multitrophic-interaction researchers with faster and larger monitoring possibilities.
Gene Ontology consistent protein function prediction: the FALCON algorithm applied to six eukaryotic genomes
Kourmpetis, Y.A.I. ; Dijk, A.D.J. van ; Braak, C.J.F. ter - 2013
Algorithms for Molecular Biology 8 (2013)1. - ISSN 1748-7188
arabidopsis-thaliana - integration - annotation - regression - network - classification - association - terms - tool
Gene Ontology (GO) is a hierarchical vocabulary for the description of biological functions and locations, often employed by computational methods for protein function prediction. Due to the structure of GO, function predictions can be self-contradictory. For example, a protein may be predicted to belong to a detailed functional class, but not in a broader class that, due to the vocabulary structure, includes the predicted one. We present a novel discrete optimization algorithm called Functional Annotation with Labeling CONsistency (FALCON) that resolves such contradictions. The GO is modeled as a discrete Bayesian Network. For any given input of GO term membership probabilities, the algorithm returns the most probable GO term assignments that are in accordance with the Gene Ontology structure. The optimization is done using the Differential Evolution algorithm. Performance is evaluated on simulated and also real data from Arabidopsis thaliana showing improvement compared to related approaches. We finally applied the FALCON algorithm to obtain genome-wide function predictions for six eukaryotic species based on data provided by the CAFA (Critical Assessment of Function Annotation) project.
Simultaneous estimation of quantile curves using quantile sheets
Schnabel, S.K. ; Eilers, P.H.C. - 2013
AStA Advances in Statistical Analysis 97 (2013)1. - ISSN 1863-8171 - p. 77 - 87.
absolute deviations - regression - constraints - splines
The results of quantile smoothing often show crossing curves, in particular, for small data sets. We define a surface, called a quantile sheet, on the domain of the independent variable and the probability. Any desired quantile curve is obtained by evaluating the sheet for a fixed probability. This sheet is modeled by $P$-splines in the form of tensor products of $B$-splines with difference penalties on the array of coefficients. The amount of smoothing is optimized by cross-validation. An application for reference growth curves for children is presented.
Implications of using alternative methods of vessel monitoring system (VMS) data analysis to describe fishing activities and impacts
Lambert, G.I. ; Jennings, S. ; Hiddink, J.G. ; Hintzen, N.T. ; Hinz, H. ; Kaiser, M.J. ; Murray, L.G. - 2012
ICES Journal of Marine Science 69 (2012)4. - ISSN 1054-3139 - p. 682 - 693.
trawl disturbance - benthic communities - different habitats - scale - sea - regression - abundance - patterns - biomass - size
Understanding the spatial distribution and intensity of fishing activity is a prerequisite for estimating fishing impacts on seabed biota and habitats. Vessel monitoring system data provide information on fishing activity at large spatial scales. However, successive position records can be too infrequent to describe the complex movements fishing vessels make. High-frequency position data were collected to evaluate how polling frequency and the method of analysis influenced the estimates of fishing impact on the seabed and associated epifaunal communities. Comparisons of known positions with predictions from track interpolation revealed that the performance of interpolation depended on fleet behaviour. Descriptions and indicators of fishing intensity were influenced significantly by the analytical methods (track reconstruction, density of position records) and grid-cell resolution used for the analysis. These factors can lead to an underestimation of fishing impact on epifaunal communities. It is necessary to correct for such errors to quantify the effects of fishing on various ecosystem components and hence to inform ecosystem-based management. Polling at intervals of 30 min would provide a desirable compromise between achieving precise estimates of fishing impacts on the seabed and minimizing the cost of data collection and handling.
Efficiency comparison of conventional and digital soil mapping for updating soil maps
Kempen, B. ; Brus, D.J. ; Stoorvogel, J.J. ; Heuvelink, G.B.M. ; Vries, F. de - \ 2012
Soil Science Society of America Journal 76 (2012)6. - ISSN 0361-5995 - p. 2097 - 2115.
model-based geostatistics - spatial interpolation - peat soils - information - prediction - uncertainty - variables - knowledge - regression - science
This study compared the efficiency of geostatistical digital soil mapping (DSM) with conventional soil mapping (CSM) for updating soil class and property maps of a cultivated peatland in the Netherlands. For digital soil class mapping, the generalized linear geostatistical model was used. Digital mapping of the soil organic matter (SOM) content and peat thickness was done by universal kriging. The conventional soil class map was created by free survey, while the property maps were created with the representative profile description (RPD) and map unit means (MUM) methods. For each method, we computed the effort invested in the mapping in terms of the sampling and cost densities. The accuracies of the created soil maps were estimated from independent probability sample data. The results showed that for DSM, the cost density could be reduced by a factor of three compared with CSM without compromising accuracy. The map purity of both maps was around 55%. For conventional soil property mapping, the MUM maps were more accurate than the RPD maps. For SOM, CSM-MUM (RMSE 7.5%) performed better than DSM (RMSE 12.1%), although accuracy differences were not significant. For peat thickness, DSM (RMSE 23.3 cm) performed slightly better than CSM-MUM (RMSE 24.9 cm). Despite the differences in accuracy being small, the digital soil property maps were produced more efficiently. The cost density was a factor of 3.5 smaller. We conclude that for updating conventional soil maps in the Dutch peatlands, geostatistical DSM can be more efficient, although not necessarily more accurate, than CSM.
Soil type mapping using the generalised linear geostatistical model: A case study in a Dutch cultivated peatland
Kempen, B. ; Brus, D.J. ; Heuvelink, G.B.M. - \ 2012
Geoderma 189-190 (2012). - ISSN 0016-7061 - p. 540 - 553.
markov random-fields - spatial prediction - categorical variables - information - classification - regression - uncertainty - knowledge - trend - maps
We present the generalised linear geostatistical model (GLGM) for soil type mapping and investigate if spatial prediction with this model results in a soil map of greater accuracy than a map obtained using a non-spatial model, i.e. a model that ignores spatial dependence in the soil type variable. The GLGM is central to the framework of model-based geostatistics. We adopted a pragmatic approach in which the five soil types in a cultivated peatland were separately modelled with a binomial logit-linear GLGM. Prediction with soil type-specific GLGMs resulted in five binomial probabilities at each prediction location, which were standardised to multinomial probabilities by selecting the soil type with maximal probability. A soil map was created from the predicted probabilities. In addition, two non-spatial models were used to map soil type. These were the multinomial logit model and the generalised linear model for Bernoulli-distributed data. Validation with independent probability sample data showed that use of a spatial model for digital soil type mapping did not result in more accurate predictions than those with the non-spatial models.
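As a rough illustration of turning separately modelled binomial probabilities into a single soil type prediction, the sketch below rescales the probabilities at one location so that they sum to one and then picks the most probable type. The abstract does not spell out the standardisation, so dividing by the sum is an assumption, and the soil type names and values are made up.

```python
def to_multinomial(binomial_probs):
    """Rescale separately estimated probabilities so they sum to one.

    Assumption: standardisation is done by dividing by the total;
    the abstract does not state the exact procedure.
    """
    total = sum(binomial_probs.values())
    if total <= 0:
        raise ValueError("probabilities must have a positive sum")
    return {soil: p / total for soil, p in binomial_probs.items()}


def most_probable(probs):
    """Return the soil type with maximal probability."""
    return max(probs, key=probs.get)


# Hypothetical output of five soil type-specific models at one location.
separate = {"type_a": 0.6, "type_b": 0.3, "type_c": 0.3, "type_d": 0.2, "type_e": 0.1}
scaled = to_multinomial(separate)
mapped_type = most_probable(scaled)
```

Rescaling by a positive constant does not change which type has the maximal probability, so the mapped soil type is the same before and after standardisation.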
Using a genetic algorithm as an optimal band selector in the mid and thermal infrared (2.5-14 µm) to discriminate vegetation species
Ullah, S. ; Groen, T.A. ; Schlerf, M. ; Skidmore, A.K. ; Nieuwenhuis, W. ; Vaiphasa, C. - \ 2012
Sensors 12 (2012)7. - ISSN 1424-8220 - p. 8755 - 8769.
spectral discrimination - reflectance - spectroscopy - emissivity - imagery - leaves - identification - spectrometry - regression - plants
Genetic variation between various plant species determines differences in their physio-chemical makeup and ultimately in their hyperspectral emissivity signatures. The hyperspectral emissivity signatures, on the one hand, account for the subtle physio-chemical changes in the vegetation, but on the other hand, highlight the problem of high dimensionality. The aim of this paper is to investigate the performance of genetic algorithms coupled with the spectral angle mapper (SAM) to identify a meaningful subset of wavebands sensitive enough to discriminate thirteen broadleaved vegetation species from laboratory-measured hyperspectral emissivities. The performance was evaluated using the overall classification accuracy and the Jeffries-Matusita (J-M) distance. For the multiple plant species, the targeted bands based on genetic algorithms resulted in a high overall classification accuracy (90%). Concentrating on the pairwise comparison results, the selected wavebands based on genetic algorithms resulted in higher J-M distances than randomly selected wavebands did. This study concludes that targeted wavebands from leaf emissivity spectra are able to discriminate vegetation species.
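The spectral angle mapper mentioned in this abstract compares two spectra by the angle between them when treated as vectors. A minimal sketch of that measure follows; the function names and the toy spectra are illustrative assumptions, and the genetic-algorithm band selection from the paper is not reproduced here.

```python
import math


def spectral_angle(a, b):
    """Angle in radians between two spectra treated as vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    # Clamp to [-1, 1] to guard against floating-point rounding.
    cosine = max(-1.0, min(1.0, dot / (norm_a * norm_b)))
    return math.acos(cosine)


def classify(spectrum, references):
    """Assign the label of the reference spectrum with the smallest angle."""
    return min(references, key=lambda label: spectral_angle(spectrum, references[label]))


# Hypothetical three-band reference spectra for two species.
references = {"species_1": [0.90, 0.95, 0.97], "species_2": [0.97, 0.92, 0.88]}
label = classify([0.45, 0.48, 0.49], references)
```

Because the angle ignores vector length, a spectrum that is a scaled copy of a reference has an angle close to zero to that reference.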