|Title||Sampling for digital soil mapping : A tutorial supported by R scripts|
|Source||Geoderma (2018). - ISSN 0016-7061|
Biometris (WU MAT)
|Publication type||Refereed Article in a scientific journal|
|Availability||Full text available from 2020-08-19|
|Keyword(s)||K-means sampling - Kriging - Latin hypercube sampling - Model-based sampling - Spatial coverage sampling - Spatial simulated annealing - Variogram|
In the past decade, substantial progress has been made in model-based optimization of sampling designs for mapping. This paper is an update of the overview of sampling designs for mapping presented by de Gruijter et al. (2006). For model-based estimation of values at unobserved points (mapping), probability sampling is not required, which opens up the possibility of optimized non-probability sampling. Non-probability sampling designs for mapping are regular grid sampling, spatial coverage sampling, k-means sampling, conditioned Latin hypercube sampling, response surface sampling, Kennard-Stone sampling and model-based sampling. In model-based sampling, a preliminary model of the spatial variation of the soil variable of interest is used for optimizing the sample size and/or the spatial coordinates of the sampling locations. Kriging requires knowledge of the variogram. Sampling designs for variogram estimation are nested sampling, independent random sampling of pairs of points, and model-based designs in which either the uncertainty about the variogram parameters or the uncertainty about the kriging variance is minimized. Various minimization criteria have been proposed for designing a single sample that is suitable both for estimating the variogram and for mapping. For map validation, additional probability sampling is recommended, so that unbiased estimates of map quality indices and their standard errors can be obtained. For all sampling designs, R scripts are available in the supplement. Further research is recommended on sampling designs for mapping with machine learning techniques, designs that are robust against deviations from modeling assumptions, designs tailored to mapping multiple soil variables of interest and soil classes or fuzzy memberships, and probability sampling designs that are efficient both for design-based estimation of population means and for model-based mapping.
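As an illustration of one of the designs named in the abstract, spatial coverage sampling can be sketched as k-means clustering of the coordinates of a fine grid that discretizes the study area, with one sampling location taken per cluster. The sketch below is written in Python and is not taken from the R scripts in the paper's supplement; the function name `spatial_coverage_sample`, the square 10 x 10 study area, and the choice to snap each centroid to its nearest grid node are all assumptions made for this example.

```python
import random


def sqdist(p, q):
    """Squared Euclidean distance between two (x, y) points."""
    return (p[0] - q[0]) ** 2 + (p[1] - q[1]) ** 2


def spatial_coverage_sample(points, n, iters=100, seed=0):
    """Select n sampling locations by k-means on grid coordinates.

    points: list of (x, y) grid nodes discretizing the study area
    (hypothetical input format). Returns a list of n grid nodes.
    """
    rng = random.Random(seed)
    # Initial centroids: n distinct grid nodes chosen at random.
    centroids = rng.sample(points, n)
    for _ in range(iters):
        # Assignment step: attach each grid node to its nearest centroid.
        clusters = [[] for _ in range(n)]
        for p in points:
            j = min(range(n), key=lambda k: sqdist(p, centroids[k]))
            clusters[j].append(p)
        # Update step: move each centroid to the mean of its cluster;
        # an empty cluster keeps its previous centroid.
        updated = []
        for c, members in zip(centroids, clusters):
            if members:
                mx = sum(x for x, _ in members) / len(members)
                my = sum(y for _, y in members) / len(members)
                updated.append((mx, my))
            else:
                updated.append(c)
        if updated == centroids:
            break  # no centroid moved: the clustering has converged
        centroids = updated
    # Snap each centroid to the nearest grid node so that every
    # sampling location is an actual node of the study-area grid.
    return [min(points, key=lambda p: sqdist(p, c)) for c in centroids]


# Hypothetical study area: a 10 x 10 square discretized at cell centres.
grid = [(x + 0.5, y + 0.5) for x in range(10) for y in range(10)]
sample = spatial_coverage_sample(grid, n=4)
print(sample)
```

Because k-means converges to a local optimum, a practical run would typically repeat the procedure with several random starts and keep the solution with the smallest mean squared distance from grid nodes to their nearest sampling location.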