Simple parametric tests for trait–environment association
Braak, Cajo J.F. ter; Peres-Neto, Pedro R.; Dray, Stéphane - 2018
Journal of Vegetation Science 29 (2018)5. - ISSN 1100-9233 - p. 801 - 811.
community ecology - community-level test - CWM of traits - environmental gradients - fourth-corner - functional traits - modified test - species niche centroid - species-level test - statistical ecology - trait–environment relationship
Question: The CWM approach is an easy way of analysing trait–environment association by regressing (or correlating) the mean trait per plot against an environmental variable and assessing the statistical significance of the slope or the associated correlation coefficient. However, the CWM approach does not yield valid tests, as random traits (or random indicator values) are far too often judged significantly related to the environmental variable, even when the trait and environmental variable are extrinsic to (not derived from) the community data. Existing solutions are the ZS-modified test (Zelený & Schaffers,) and the max (or sequential) test based on the fourth-corner correlation. Both tests are based on permutations, which become cumbersome when many tests need to be carried out and many permutations are required, as in methods that correct for multiple testing. The main goal of this study was to compare these existing permutation-based solutions and to develop a quick and easy parametric test that can replace them. Methods: This study decomposes the fourth-corner correlation in two ways, which suggests a simple parametric approach consisting of assessing the significances of two linear regressions, one plot-level test as in the CWM approach and one species-level test, the reverse of the CWM approach, that regresses the environmental mean per species (i.e. the species niche centroid) onto the trait. The tests are combined by taking the maximum p-value. The type I error rates and power of this parametric max test are examined by simulation of one- and two-dimensional Gaussian models and log-linear models. Results: The ZS-modified test and the fourth-corner max test are conservative in different scenarios, the ZS-modified test being even more conservative than the fourth-corner. The new parametric max test is shown to control the type I error and has equal or even higher power than permutation tests based on the fourth-corner, the ZS-modified test and variants thereof. A weighted version of the new test showed inflated type I error. Conclusion: The combination of two simple regressions is a good alternative to the fourth-corner and the ZS-modified test. This combination is also applicable when multiple trait measurements are made per plot.
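The parametric max test summarised in this abstract can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the names `two_sided_t_p`, `slope_p_value` and `parametric_max_test` are hypothetical, the abundance table `L` (sites by species) is assumed to have no empty sites or species, both regressions are unweighted, and the toy data are invented for demonstration.

```python
import math


def two_sided_t_p(t_stat, df, steps=2000):
    """Two-sided p-value of a Student t statistic (Simpson integration of the density)."""
    x = abs(t_stat)
    log_c = math.lgamma((df + 1) / 2) - math.lgamma(df / 2) - 0.5 * math.log(df * math.pi)

    def pdf(u):
        return math.exp(log_c - (df + 1) / 2 * math.log1p(u * u / df))

    h = x / steps  # steps must be even for Simpson's rule
    area = pdf(0.0) + pdf(x)
    for k in range(1, steps):
        area += (4 if k % 2 else 2) * pdf(k * h)
    area *= h / 3  # integral of the density over [0, |t|]
    return min(1.0, max(0.0, 1.0 - 2.0 * area))


def slope_p_value(x, y):
    """p-value for the slope of a simple linear regression of y on x (needs len(x) >= 3)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    r = sxy / math.sqrt(sxx * syy)
    if abs(r) >= 1.0:  # perfect fit: the t statistic is infinite
        return 0.0
    t_stat = r * math.sqrt((n - 2) / (1.0 - r * r))
    return two_sided_t_p(t_stat, n - 2)


def parametric_max_test(L, e, t):
    """Return (p_sites, p_species, p_max) for abundance table L, environment e, trait t."""
    n, m = len(L), len(L[0])
    # Community-weighted mean trait per site (plot-level response).
    cwm = [sum(L[i][j] * t[j] for j in range(m)) / sum(L[i]) for i in range(n)]
    # Species niche centroid: abundance-weighted mean environment per species.
    snc = [sum(L[i][j] * e[i] for i in range(n)) / sum(L[i][j] for i in range(n))
           for j in range(m)]
    p_sites = slope_p_value(e, cwm)    # plot-level test: CWM regressed on environment
    p_species = slope_p_value(t, snc)  # species-level test: SNC regressed on trait
    return p_sites, p_species, max(p_sites, p_species)


# Invented toy data: 5 sites, 4 species.
L = [[5, 3, 0, 0], [4, 4, 1, 0], [2, 3, 3, 1], [0, 2, 4, 3], [0, 0, 3, 5]]
e = [1.0, 2.0, 3.0, 4.0, 5.0]
t = [0.5, 1.0, 2.0, 3.5]
print(parametric_max_test(L, e, t))
```

Taking the larger of the two p-values means the association is declared significant only when both the plot-level and the species-level regression are significant.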
Comparison of dimensionality assessment methods in Principal Component Analysis based on permutation tests
Vitale, Raffaele; Saccenti, Edoardo - 2018
Chemometrics and Intelligent Laboratory Systems 181 (2018). - ISSN 0169-7439 - p. 79 - 94.
Deflation - Eigenanalysis - Monte Carlo simulations - Parallel Analysis (PA) - Rank approximation
We compare the performance of several data permutation methods for assessing dimensionality in Principal Component Analysis. We consider the classical Horn's Parallel Analysis, Dray's approach based on the similarity between the data matrix under study and its lower rank approximation and Vitale et al.’s method based on sequential deflation and rank reduction. Their potential is assessed on a large array of simulated data sets accounting for different data correlation structures, data distributions and homo- and heteroscedastic noise, and on 15 experimental data sets from different disciplines, such as metabolomics, proteomics, chemometrics and sensory analysis. In both the simulated and real life case-studies we report differential behaviours of the concerned techniques for which we propose theoretical explanations. The paper also discusses their limits of applicability and some guidelines are offered to practitioners.
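As an illustration of the permutation idea behind parallel analysis, the sketch below compares the eigenvalues of the observed correlation matrix with those obtained after permuting each column independently. It is a simplified variant written for this listing, not the procedure evaluated in the paper: the function name `parallel_analysis`, the 95% quantile threshold, the use of the correlation matrix and the simulated data are all assumptions, and NumPy is assumed to be available.

```python
import numpy as np


def parallel_analysis(X, n_perm=200, quantile=0.95, seed=0):
    """Number of leading components whose eigenvalue exceeds its permutation threshold."""
    rng = np.random.default_rng(seed)
    X = np.asarray(X, dtype=float)
    # Eigenvalues of the observed correlation matrix, largest first.
    observed = np.linalg.eigvalsh(np.corrcoef(X, rowvar=False))[::-1]
    null = np.empty((n_perm, X.shape[1]))
    for b in range(n_perm):
        # Permuting each column separately destroys the correlation between variables.
        Xp = np.column_stack([rng.permutation(col) for col in X.T])
        null[b] = np.linalg.eigvalsh(np.corrcoef(Xp, rowvar=False))[::-1]
    threshold = np.quantile(null, quantile, axis=0)
    exceeds = observed > threshold
    # Count the leading run of components above their threshold.
    n_comp = X.shape[1] if exceeds.all() else int(np.argmin(exceeds))
    return n_comp, observed, threshold


# Simulated data: two latent factors, four noisy indicator variables each.
rng = np.random.default_rng(1)
F = rng.normal(size=(200, 2))
X = np.hstack([F[:, [0]].repeat(4, axis=1), F[:, [1]].repeat(4, axis=1)])
X = X + 0.5 * rng.normal(size=X.shape)
n_comp, observed, threshold = parallel_analysis(X, n_perm=100)
print(n_comp)
```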
Algorithms and biplots for double constrained correspondence analysis
Braak, Cajo J.F. ter; Šmilauer, Petr; Dray, Stéphane - 2018
Environmental and Ecological Statistics 25 (2018)2. - ISSN 1352-8505 - p. 171 - 197.
Biplot - Canonical correlation analysis - Canonical correspondence analysis - Community ecology - Fourth-corner correlation - Multivariate analysis - Trait-environment relations
Correspondence analysis with linear external constraints on both the rows and the columns has been mentioned in the ecological literature, but lacks full mathematical treatment and easily available algorithms and software. This paper fills this gap by defining the method as maximizing the fourth-corner correlation between linear combinations, by providing novel algorithms, which demonstrate relationships with related methods, and by making a detailed study of possible biplots and associated approximations. The method is illustrated using ecological data on the abundances of species in sites and where the species are characterized by traits and sites by environmental variables. The trait data and environment data form the external constraints and the question is which traits and environmental variables are associated, how these associations drive species abundances and how they can be displayed in biplots. With microbiome data becoming widely available, these and related multivariate methods deserve more study as they might be routinely used in the future.
Linking trait variation to the environment : Critical issues with community-weighted mean correlation resolved by the fourth-corner approach
Peres-Neto, Pedro R.; Dray, Stéphane; Braak, Cajo ter - 2017
Ecography 40 (2017)7. - ISSN 0906-7590 - p. 806 - 816.
Establishing trait-environment relationships has become routine in community ecology. Here, we demonstrate that the community weighted means correlation (CWM) and its parallel approach in linking trait variation to the environment, the species niche centroid correlation (SNC), have important shortcomings, arguing against their continuing application. Using mathematical derivations and simulations, we show that the two major issues are inconsistent parameter estimation and unacceptable significance rates when only the environment or only traits are structuring species distributions, but they themselves are not linked. We show how both CWM and SNC are related to the fourth-corner correlation and propose to replace all by the Chessel fourth-corner correlation, which is the fourth-corner correlation divided by its maximum attainable value. We propose an appropriate hypothesis testing procedure that is not only unbiased but also has much greater statistical power in detecting trait-environmental relationships. We derive an additive framework in which trait variation is partitioned among and within communities, which can be then modeled against the environment. We finish by presenting a contrast between methods and an application of our proposed framework across 85 lake-fish metacommunities.
A critical issue in model-based inference for studying trait-based community assembly and a solution
Braak, Cajo J.F. ter; Peres-Neto, Pedro; Dray, Stéphane - 2017
PeerJ 5 (2017). - ISSN 2167-8359
Community composition - Compositional count data - Fourth-corner problem - Generalized linear models - Log-linear model - Negative-binomial response - Poisson regression - Trait-environment association
Statistical testing of trait-environment association from data is a challenge as there is no common unit of observation: the trait is observed on species, the environment on sites and the mediating abundance on species-site combinations. A number of correlation-based methods, such as the community weighted trait means method (CWM), the fourth-corner correlation method and the multivariate method RLQ, have been proposed to estimate such trait-environment associations. In these methods, valid statistical testing proceeds by performing two separate resampling tests, one site-based and the other species-based, and by assessing significance by the largest of the two p-values (the pmax test). Recently, regression-based methods using generalized linear models (GLM) have been proposed as a promising alternative with statistical inference via site-based resampling. We investigated the performance of this new approach along with approaches that mimicked the pmax test using GLM instead of fourth-corner. By simulation using models with additional random variation in the species response to the environment, the site-based resampling tests using GLM are shown to have severely inflated type I error, of up to 90%, when the nominal level is set as 5%. In addition, predictive modelling of such data using site-based cross-validation very often identified trait-environment interactions that had no predictive value. The problem that we identify is not an "omitted variable bias" problem as it occurs even when the additional random variation is independent of the observed trait and environment data. Instead, it is a problem of ignoring a random effect. In the same simulations, the GLM-based pmax test controlled the type I error in all models proposed so far in this context, but still gave slightly inflated error in more complex models that included both missing (but important) traits and missing (but important) environmental variables. For screening the importance of single trait-environment combinations, the fourth-corner test is shown to give almost the same results as the GLM-based tests in far less computing time.
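The pmax procedure described in this abstract (one site-based and one species-based resampling test, combined by the larger p-value) can be sketched with the fourth-corner correlation as the test statistic. This is an illustrative sketch only: the names `fourth_corner_r` and `pmax_test` are hypothetical, the toy data are invented, and permuting the environmental values among sites (or the trait values among species) is assumed here to stand in for permuting whole rows (or columns) of the abundance table.

```python
import math
import random


def fourth_corner_r(L, e, t):
    """Abundance-weighted correlation between site variable e and species trait t."""
    n, m = len(L), len(L[0])
    total = sum(sum(row) for row in L)
    row_w = [sum(L[i]) / total for i in range(n)]                       # site weights
    col_w = [sum(L[i][j] for i in range(n)) / total for j in range(m)]  # species weights
    e_bar = sum(w * x for w, x in zip(row_w, e))
    t_bar = sum(w * x for w, x in zip(col_w, t))
    e_sd = math.sqrt(sum(w * (x - e_bar) ** 2 for w, x in zip(row_w, e)))
    t_sd = math.sqrt(sum(w * (x - t_bar) ** 2 for w, x in zip(col_w, t)))
    cov = sum(L[i][j] * (e[i] - e_bar) * (t[j] - t_bar)
              for i in range(n) for j in range(m)) / total
    return cov / (e_sd * t_sd)


def pmax_test(L, e, t, n_perm=999, seed=0):
    """Return (r, p_sites, p_species, p_max) from two permutation tests."""
    rng = random.Random(seed)
    r_obs = fourth_corner_r(L, e, t)

    def perm_p(shuffle_sites):
        hits = 1  # the observed statistic counts as one permutation
        for _ in range(n_perm):
            if shuffle_sites:
                r = fourth_corner_r(L, rng.sample(e, len(e)), t)
            else:
                r = fourth_corner_r(L, e, rng.sample(t, len(t)))
            if abs(r) >= abs(r_obs) - 1e-12:
                hits += 1
        return hits / (n_perm + 1)

    p_sites = perm_p(True)     # site-based resampling
    p_species = perm_p(False)  # species-based resampling
    return r_obs, p_sites, p_species, max(p_sites, p_species)


# Invented toy data: 5 sites, 4 species.
L = [[5, 3, 0, 0], [4, 4, 1, 0], [2, 3, 3, 1], [0, 2, 4, 3], [0, 0, 3, 5]]
e = [1.0, 2.0, 3.0, 4.0, 5.0]
t = [0.5, 1.0, 2.0, 3.5]
print(pmax_test(L, e, t, n_perm=199))
```

With only four species there are just 24 possible trait permutations, so the species-based p-value cannot become very small in this toy example; real analyses need enough sites and species for both tests to be informative.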
Combining the fourth-corner and the RLQ methods for assessing trait responses to environmental variation
Dray, S.; Choler, P.; Dolédec, S.; Peres-Neto, P.R.; Thuiller, W.; Pavoine, S.; Braak, C.J.F. ter - 2014
Ecology 95 (2014). - ISSN 0012-9658 - p. 14 - 21.
co-inertia analysis - species traits - community ecology - plant - variables - linking
Assessing trait responses to environmental gradients requires the simultaneous analysis of the information contained in three tables: L (species distribution across samples), R (environmental characteristics of samples) and Q (species traits). Among the available methods, the so-called fourth-corner and RLQ methods are two appealing alternatives that provide a direct way to test and estimate trait-environment relationships. Both methods are based on the analysis of the fourth-corner matrix which crosses traits and environmental variables weighted by species abundances. However, they greatly differ in their outputs: RLQ is a multivariate technique that provides ordination scores to summarize the joint structure among the three tables, whereas the fourth-corner method mainly tests for individual trait-environment relationships (i.e. one trait and one environmental variable at a time). Here, we illustrate how the complementarity between these two methods can be exploited to promote new ecological knowledge and to improve the study of trait-environment relationships. After a short description of each method, we apply them to real ecological data to present their different outputs and provide hints about the gain resulting from their combined use. Read More: http://www.esajournals.org/doi/abs/10.1890/13-0196.1
Improved testing of species traits-environment relationships in the fourth corner problem
Braak, C.J.F. ter; Cormont, A.; Dray, S. - 2012
Ecology 93 (2012)7. - ISSN 0012-9658 - p. 1525 - 1526.
The fourth corner problem entails estimation and statistical testing of the relationship between species traits and environmental variables from the analysis of three data tables. Dray and Legendre (2008, Ecology, 89, 3400-34) proposed and evaluated five permutation methods for statistical significance testing, including a new two-step testing procedure. However, none of these attained the correct type I error in all cases of interest. We solve this problem by showing that a small modification of their two-step procedure controls the type I error in all cases. The modification consists of adjusting the significance level from √α to α or, equivalently, of reporting the maximum of the individual P-values as the final one. The test is also applicable to the three-table ordination method RLQ.
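The equivalence stated in this abstract can be written out as follows. This is a sketch under the assumption that each of the two steps is tested at the stated level and that the overall test rejects only when both steps reject; the symbols p_1, p_2 (the individual P-values) and p_max are notation introduced here for illustration.

```latex
\begin{aligned}
\text{original two-step rule:}\quad
  & \text{reject } H_0 \iff p_1 \le \sqrt{\alpha} \ \text{ and } \ p_2 \le \sqrt{\alpha}, \\
\text{modified rule:}\quad
  & \text{reject } H_0 \iff p_1 \le \alpha \ \text{ and } \ p_2 \le \alpha \\
  & \phantom{\text{reject } H_0} \iff p_{\max} = \max(p_1, p_2) \le \alpha .
\end{aligned}
```

For example, with α = 0.05 the original threshold is √0.05 ≈ 0.224, so two P-values of 0.20 would lead to rejection under the original rule but not under the modified one.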
Incorporation of fungal cellulases in bacterial minicellulosomes yields viable, synergistically acting cellulolytic complexes
Mingardon, F.; Chanal, A.; Lopez Contreras, A.M.; Dray, C.; Bayer, E.A.; Fierobe, H.P. - 2007
Applied and Environmental Microbiology 73 (2007)12. - ISSN 0099-2240 - p. 3822 - 3832.
clostridium-cellulolyticum - crystalline cellulose - cellobiohydrolase-ii - dockerin domain - piromyces-equi - trichoderma-reesei - endoglucanase - thermocellum - substrate - xylanase
Artificial designer minicellulosomes comprise a chimeric scaffoldin that displays an optional cellulose-binding module (CBM) and bacterial cohesins from divergent species which bind strongly to enzymes engineered to bear complementary dockerins. Incorporation of cellulosomal cellulases from Clostridium cellulolyticum into minicellulosomes leads to artificial complexes with enhanced activity on crystalline cellulose, due to enzyme proximity and substrate targeting induced by the scaffoldin-borne CBM. In the present study, a bacterial dockerin was appended to the family 6 fungal cellulase Cel6A, produced by Neocallimastix patriciarum, for subsequent incorporation into minicellulosomes in combination with various cellulosomal cellulases from C. cellulolyticum. The binding of the fungal Cel6A with a bacterial family 5 endoglucanase onto chimeric miniscaffoldins had no impact on their activity toward crystalline cellulose. Replacement of the bacterial family 5 enzyme with homologous endoglucanase Cel5D from N. patriciarum bearing a clostridial dockerin gave similar results. In contrast, enzyme pairs comprising the fungal Cel6A and bacterial family 9 endoglucanases were substantially stimulated (up to 2.6-fold) by complexation on chimeric scaffoldins, compared to the free-enzyme system. Incorporation of enzyme pairs including Cel6A and a processive bacterial cellulase generally induced lower stimulation levels. Enhanced activity on crystalline cellulose appeared to result from either proximity or CBM effects alone but never from both simultaneously, unlike minicellulosomes composed exclusively of bacterial cellulases. The present study is the first demonstration that viable designer minicellulosomes can be produced that include (i) free (noncellulosomal) enzymes, (ii) fungal enzymes combined with bacterial enzymes, and (iii) a type (family 6) of cellulase never known to occur in natural cellulosomes.