Feature filtering and selection for dry matter estimation on perennial ryegrass: A case study of vegetation indices
Alckmin, G.T. ; Kooistra, L. ; Lucieer, A. ; Rawnsley, R. - 2019
In: ISPRS Geospatial Week 2019, 10–14 June 2019, Enschede, The Netherlands. - ISPRS (International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences - ISPRS Archives) - p. 1827 - 1831.
Biomass - Collinearity - Dry Matter - Feature Selection - Machine Learning - Pasture - Perennial Ryegrass - Vegetation Indices
Vegetation indices (VIs) have been extensively employed as features for dry matter (DM) estimation. During the past five decades, more than a hundred vegetation indices have been proposed. Inevitably, the selection of the optimal index or subset of indices is neither trivial nor obvious. This study, performed on a year-round observation of perennial ryegrass (n = 900), indicates that for this response variable (i.e. kg.DM.ha−1), more than 80% of indices present a high degree of collinearity (|correlation| > 0.8). Additionally, the absence of an established workflow for feature selection and modelling is a handicap when trying to establish meaningful relations between spectral data and biophysical/biochemical features. Within this case study, an unsupervised and supervised filtering process is applied to an initial dataset of 97 VIs. This research analyses the effects of the proposed filtering and feature selection process on the overall stability of the final models. Consequently, this analysis provides a straightforward framework to filter and select VIs. This approach was able to provide a reduced feature set for a robust model and to quantify trade-offs between optimal models (i.e. lowest root mean square error, RMSE = 412.27 kg.DM.ha−1) and tolerable models (with a smaller number of features: 4 VIs, and within 10% of the lowest RMSE).
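The collinearity criterion in the abstract (absolute correlation above 0.8) can be illustrated with a minimal sketch of an unsupervised correlation filter. This is not the authors' workflow, which the abstract does not specify. The greedy keep-first strategy, the function name `correlation_filter`, the feature names `vi_1` to `vi_4`, and the synthetic data are all assumptions made for illustration. Only the 0.8 threshold and the sample size of 900 come from the abstract.

```python
import numpy as np


def correlation_filter(X, names, threshold=0.8):
    """Greedy unsupervised filter (illustrative assumption, not the paper's method).

    A feature is kept only if its absolute Pearson correlation with every
    feature kept so far does not exceed `threshold`.
    """
    # Absolute pairwise correlations between columns (features).
    corr = np.abs(np.corrcoef(X, rowvar=False))
    kept = []
    for j in range(X.shape[1]):
        if all(corr[j, k] <= threshold for k in kept):
            kept.append(j)
    return [names[j] for j in kept]


# Synthetic stand-in for a table of vegetation indices (hypothetical data).
rng = np.random.default_rng(0)
n = 900  # sample size reported in the abstract
base_a = rng.normal(size=n)
base_b = rng.normal(size=n)
X = np.column_stack([
    base_a,
    base_a + 0.05 * rng.normal(size=n),  # near-duplicate of vi_1
    base_b,
    base_b + 0.05 * rng.normal(size=n),  # near-duplicate of vi_3
])
names = ["vi_1", "vi_2", "vi_3", "vi_4"]

selected = correlation_filter(X, names, threshold=0.8)
print(selected)
```

With this greedy strategy the result depends on column order: the first member of each collinear group is retained. A supervised step, such as ranking the surviving indices against measured dry matter, would follow in a full workflow and is not shown here.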