|Title||How to compare sampling designs for mapping?|
|Author(s)||Wadoux, Alexandre M.J.C.; Brus, Dick J.|
|Source||European Journal of Soil Science (2020). - ISSN 1351-0754|
Mathematical and Statistical Methods - Biometris
|Publication type||Refereed Article in a scientific journal|
|Keyword(s)||Kriging - machine learning - pedometrics - random forest - soil sampling - validation|
If a map is constructed through prediction with a statistical or non-statistical model, the sampling design used for selecting the sample on which the model is fitted plays a key role in the final map accuracy. Several sampling designs are available for selecting these calibration samples. Commonly, sampling designs for mapping are compared in real-world case studies by selecting just one sample for each of the sampling designs under study. In this study, we show that sampling designs for mapping are better compared on the basis of the distribution of the map quality indices over repeated selection of the calibration sample. In practice, this is only feasible by subsampling a large dataset representing the population of interest, or by selecting calibration samples from a map depicting the study variable. This is illustrated with two real-world case studies. In the first case study, a quantitative variable, soil organic carbon, is mapped by kriging with an external drift in France, whereas in the second case study, a categorical variable, land cover, is mapped by random forest in a region in France. The performance of two sampling designs for mapping is compared at various sample sizes: simple random sampling and conditioned Latin hypercube sampling. We show that in both case studies the sampling distributions of map quality indices obtained with the two sampling design types, for a given sample size, show large variation and largely overlap. This shows that when comparing sampling designs for mapping on the basis of a single sample selected per design, there is a serious risk of an incidental result.
Highlights: We provide a method to compare sampling designs for mapping. Random designs for selecting calibration samples should be compared on the basis of the sampling distribution of the map quality indices.
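The comparison described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation and rests on several assumptions: the population is synthetic, an ordinary least-squares line stands in for kriging with an external drift or random forest, and a one-dimensional covariate-stratified design is used as a rough stand-in for conditioned Latin hypercube sampling. All function names are hypothetical. The sketch repeatedly selects a calibration sample per design, fits the model, and records a map quality index (here the RMSE over the whole population), so that the two sampling distributions can be compared.

```python
import math
import random
import statistics


def make_population(size=2000, seed=1):
    """Synthetic population of (covariate, study variable) pairs (assumed data)."""
    rng = random.Random(seed)
    population = []
    for _ in range(size):
        x = rng.uniform(0.0, 10.0)
        y = 2.0 + 0.5 * x + rng.gauss(0.0, 1.0)
        population.append((x, y))
    return population


def simple_random_sample(population, n, rng):
    """Simple random sampling without replacement."""
    return rng.sample(population, n)


def covariate_stratified_sample(population, n, rng):
    """Stand-in for cLHS (NOT the real algorithm): sort units by the covariate,
    cut them into n strata of equal size, and draw one unit per stratum."""
    ordered = sorted(population, key=lambda unit: unit[0])
    width = len(ordered) / n
    return [
        ordered[rng.randrange(int(i * width), int((i + 1) * width))]
        for i in range(n)
    ]


def fit_line(sample):
    """Ordinary least-squares fit of y on x; returns (intercept, slope)."""
    mean_x = statistics.mean(x for x, _ in sample)
    mean_y = statistics.mean(y for _, y in sample)
    sxx = sum((x - mean_x) ** 2 for x, _ in sample)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in sample)
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def population_rmse(population, intercept, slope):
    """Map quality index: RMSE of the predictions over the whole population."""
    squared = [(y - (intercept + slope * x)) ** 2 for x, y in population]
    return math.sqrt(statistics.mean(squared))


def rmse_distribution(design, population, n, repetitions, seed):
    """Sampling distribution of the RMSE over repeated calibration samples."""
    rng = random.Random(seed)
    values = []
    for _ in range(repetitions):
        sample = design(population, n, rng)
        intercept, slope = fit_line(sample)
        values.append(population_rmse(population, intercept, slope))
    return values


population = make_population()
designs = [
    ("simple random", simple_random_sample),
    ("covariate-stratified", covariate_stratified_sample),
]
for name, design in designs:
    rmses = sorted(
        rmse_distribution(design, population, n=20, repetitions=200, seed=42)
    )
    # Mean and approximate 5th and 95th percentiles of the 200 RMSE values.
    print(name, round(statistics.mean(rmses), 3),
          round(rmses[10], 3), round(rmses[189], 3))
```

Comparing the printed summaries, or histograms of the two lists, shows how far the distributions overlap; a comparison based on one sample per design corresponds to drawing a single value from each list.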