- Miguel D. Mahecha (1)
- Ethan Deyle (1)
- Clark Glymour (1)
- Egbert H. van Nes (1)
- Marlene Kretschmer (1)
- José Moreno (2)
- Jordi Muñoz-Marí (3)
- Juan Pablo Rivera (2)
- Jonas Peters (1)
- Rick Quax (1)
- Markus Reichstein (1)
- Jakob Runge (1)
- Marten Scheffer (1)
- Bernhard Schölkopf (1)
- Peter Spirtes (1)
- George Sugihara (1)
- Jie Sun (1)
- Frank Veroustraete (2)
- Jochem Verrelst (older publications) (1)
- Jochem Verrelst (1)
- Kun Zhang (1)
- Jakob Zscheischler (1)
Inferring causation from time series in Earth system sciences
Runge, Jakob ; Bathiany, Sebastian ; Bollt, Erik ; Camps-Valls, Gustau ; Coumou, Dim ; Deyle, Ethan ; Glymour, Clark ; Kretschmer, Marlene ; Mahecha, Miguel D. ; Muñoz-Marí, Jordi ; Nes, Egbert H. van ; Peters, Jonas ; Quax, Rick ; Reichstein, Markus ; Scheffer, Marten ; Schölkopf, Bernhard ; Spirtes, Peter ; Sugihara, George ; Sun, Jie ; Zhang, Kun ; Zscheischler, Jakob - 2019
Nature Communications 10 (2019)1. - ISSN 2041-1723
The heart of the scientific enterprise is a rational effort to understand the causes behind the phenomena we observe. In large-scale complex dynamical systems such as the Earth system, real experiments are rarely feasible. However, a rapidly increasing amount of observational and simulated data opens up the use of novel data-driven causal methods beyond the commonly adopted correlation techniques. Here, we give an overview of causal inference frameworks and identify promising generic application cases common in Earth system sciences and beyond. We discuss challenges and initiate the benchmark platform causeme.net to close the gap between method users and developers.
Experimental Sentinel-2 LAI estimation using parametric, non-parametric and physical retrieval methods - A comparison
Verrelst, Jochem ; Rivera, Juan Pablo ; Veroustraete, Frank ; Muñoz-Marí, Jordi ; Clevers, J.G.P.W. ; Camps-Valls, Gustau ; Moreno, José - 2015
ISPRS Journal of Photogrammetry and Remote Sensing 108 (2015). - ISSN 0924-2716 - p. 260 - 272.
Biophysical variables - Machine learning - Non-parametric - Parametric - Physically-based RTM inversion - Sentinel-2
Given the forthcoming availability of Sentinel-2 (S2) images, this paper provides a systematic comparison of the retrieval accuracy and processing speed of a multitude of parametric, non-parametric and physically-based retrieval methods using simulated S2 data. An experimental field dataset (SPARC), collected at the agricultural site of Barrax (Spain), was used to evaluate the ability of the different retrieval methods to estimate leaf area index (LAI). With regard to parametric methods, all possible band combinations for several two-band and three-band index formulations and a linear regression fitting function were evaluated. From a set of over ten thousand indices evaluated, the best performing one was an optimized three-band combination, (ρ560 - ρ1610 - ρ2190)/(ρ560 + ρ1610 + ρ2190), with a 10-fold cross-validation R²CV of 0.82 (RMSECV: 0.62). This family of methods excels in processing speed, e.g., 0.05 s to calibrate and validate the regression function and 3.8 s to map a simulated S2 image. With regard to non-parametric methods, 11 machine learning regression algorithms (MLRAs) were evaluated. This methodological family has the advantage of making use of the full optical spectrum as well as flexible, nonlinear fitting. Kernel-based MLRAs in particular lead to excellent results, with variational heteroscedastic (VH) Gaussian Processes regression (GPR) as the best performing method, with an R²CV of 0.90 (RMSECV: 0.44). Additionally, the model is trained and validated relatively fast (1.70 s), and the processed image (taking 73.88 s) includes associated uncertainty estimates. More challenging is the inversion of a PROSAIL-based radiative transfer model (RTM). After the generation of a look-up table (LUT), a multitude of cost functions and regularization options were evaluated. The best performing cost function is Pearson's χ-square, which led to an R² of 0.74 (RMSE: 0.80) against the validation dataset. While its validation was fast (0.33 s), image processing took considerably more time (01:01:47) because the LUT is solved per pixel using a cost function. In summary, when it comes to accurate and sufficiently fast processing of imagery to generate vegetation attributes, this paper concludes that the family of kernel-based MLRAs (e.g. GPR) is the most promising processing approach.
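As a rough illustration of the parametric approach described in the abstract above, the following minimal Python sketch computes the reported three-band index from reflectances at 560, 1610 and 2190 nm and fits a straight line relating the index to LAI. The function names, the reflectance values and the LAI values are all hypothetical and chosen only for illustration; this is not the authors' implementation and the numbers are not data from the paper.

```python
def three_band_index(r560, r1610, r2190):
    """Index form from the abstract: (r560 - r1610 - r2190) / (r560 + r1610 + r2190)."""
    denom = r560 + r1610 + r2190
    if denom == 0:
        raise ValueError("sum of reflectances must be non-zero")
    return (r560 - r1610 - r2190) / denom


def fit_line(xs, ys):
    """Ordinary least-squares fit of y = a * x + b; returns (a, b)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("x values must not all be identical")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    a = sxy / sxx
    b = mean_y - a * mean_x
    return a, b


# Hypothetical per-pixel reflectances (560, 1610, 2190 nm) and field LAI values.
pixels = [(0.08, 0.25, 0.15), (0.06, 0.22, 0.10), (0.05, 0.15, 0.06)]
lai = [1.0, 2.5, 4.0]

indices = [three_band_index(*p) for p in pixels]
slope, intercept = fit_line(indices, lai)

# Apply the fitted line to estimate LAI for another (hypothetical) pixel.
estimate = slope * three_band_index(0.07, 0.20, 0.12) + intercept
print(estimate)
```

In the study, such a fit would be assessed with 10-fold cross-validation over the field dataset; that step is omitted here for brevity.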
Optical remote sensing and the retrieval of terrestrial vegetation bio-geophysical properties - A review
Verrelst, Jochem ; Camps-Valls, Gustau ; Muñoz-Marí, Jordi ; Rivera, Juan Pablo ; Veroustraete, Frank ; Clevers, J.G.P.W. ; Moreno, José - 2015
ISPRS Journal of Photogrammetry and Remote Sensing 108 (2015). - ISSN 0924-2716 - p. 273 - 290.
Bio-geophysical variables - Hybrid - Machine learning - Non-parametric - Operational variable retrieval - Parametric - Physical
Forthcoming superspectral satellite missions dedicated to land monitoring, as well as planned imaging spectrometers, will unleash an unprecedented data stream. Handling such large data streams requires processing techniques that enable the spatio-temporally explicit quantification of vegetation properties. Typically, retrieval must be accurate, robust and fast. Hence, there is a strict requirement to identify next-generation bio-geophysical variable retrieval algorithms which can be molded into an operational processing chain. This paper offers a review of state-of-the-art retrieval methods for quantitative terrestrial bio-geophysical variable extraction using optical remote sensing imagery. We categorize these methods into (1) parametric regression, (2) non-parametric regression, (3) physically-based and (4) hybrid methods. Hybrid methods combine the generic capabilities of physically-based methods with flexible and computationally efficient methods, typically non-parametric regression methods. A review of the theoretical basis of all these methods is given first, followed by published applications. This paper focuses on: (1) retrievability of bio-geophysical variables, (2) ability to generate multiple outputs, (3) possibilities for describing model transparency, (4) mapping speed, and (5) possibilities for uncertainty retrieval. Finally, the prospects of implementing these methods into future processing chains for operational retrieval of vegetation properties are presented and discussed.