Functional Ecology. 2022;00:1–17. wileyonlinelibrary.com/journal/fec | 1 Received: 5 January 2022  | Accepted: 22 March 2022 DOI: 10.1111/1365-2435.14047 R E S E A R C H A R T I C L E Metacommunity dynamics and the detection of species associations in co- occurrence analyses: Why patch disturbance matters Vincent Calcagno1  | Nik J. Cunniffe2  | Frédéric M. Hamelin3 Authors are listed in alphabetical order and share equal co- authorship. 1Institut Sophia Agrobiotech, Université Côte d'Azur, INRAE, CNRS, Sophia- Antipolis, France 2Department of Plant Sciences, University of Cambridge, Cambridge, United Kingdom 3Institut Agro, Univ Rennes, INRAE, IGEPP, Rennes, France Correspondence Nik J. Cunniffe Email: njc1001@cam.ac.uk Handling Editor: Raul Costa- Pereira Abstract 1. Many statistical methods attempt to detect species associations— and so infer interspecific interactions— from species co- occurrence patterns. Habitat het- erogeneity and out- of- equilibrium colonization histories are well recognized as potentially causing species associations, even when interactions are absent. The potential for patch disturbance, a classical component of metacommunity dy- namics, to also drive spurious species associations has however been overlooked. 2. Using a new general metacommunity model, we derive mathematical predic- tions regarding how patch disturbance would affect the patterns of species as- sociations detected in ‘null’ co- occurrence matrices. We also conduct numerical simulations to test our predictions and to compare the performance of several widespread statistical methods, including direct tests of pairwise independence, matrix permutation approaches and joint species distribution modelling. 3. We show how classical metacommunity dynamics can produce statistical as- sociations, both positive and negative, even when species do not interact, when there is no habitat heterogeneity, and at equilibrium. This occurs as soon as there is some rate of patch disturbance (i.e. simultaneous extinction of several species in a patch) and/or a finite life span of patches, a common feature of a broad range of plant, animal or microbial systems. 4. Patch disturbance can compromise species co- occurrence analyses and cause the artefactual detection of species associations if not taken into account. Including patch age (i.e. the time since the last patch disturbance event) as a covariate in a joint species distribution model can resolve the artefact. However, this requires additional data that often are not available in practice. We argue that the consequences of patch disturbance should not be underestimated when analysing species distribution patterns in metacommunity- like systems. K E Y W O R D S Hmsc, independence, interactions, JSDM, metapopulation, permutations, Sim2, Sim9 This is an open access article under the terms of the Creative Commons Attribution License, which permits use, distribution and reproduction in any medium, provided the original work is properly cited. © 2022 The Authors. Functional Ecology published by John Wiley & Sons Ltd on behalf of British Ecological Society. 2  |   Functional Ecology CALCAGNO et AL. 1  |  INTRODUC TION Community ecology has long been concerned with inferring in- teractions between species (e.g. competition or facilitation) from co- occurrence data (Caswell, 1976; Cohen, 1970; Connor & Simberloff, 1979; Diamond, 1975; Forbes, 1907). Recently, metage- nomics approaches have renewed this use of co- occurrence patterns as a means to infer species interaction networks, in environments ranging from soils to the human microbiome (Barberán et al., 2012; Faust & Raes, 2012). If species do not interact, intuition suggests the proportion of patches (or hosts) where species co- occur should be the product of the proportions of patches occupied by each species. Testing for pairwise independence, that is, looking for sta- tistical patterns of association, consequently underpins most meth- ods for inferring species interactions from presence– absence data (Gotelli, 2000; Gotelli & Ulrich, 2012; Ovaskainen & Abrego, 2020). Positive associations (excess of co- occurrences) may indicate facil- itation, whereas negative associations (deficit of co- occurrences) may indicate competition. However, it is well known that several pitfalls can compromise this approach, since correlation is not equivalent to interaction (Barner et al., 2018; Blanchet et al., 2020; Molina & Stone, 2020). As a simple example, patch heterogeneity and differential habitat preferences among species (i.e. environmental filtering) can cause species associations, even in the absence of any interaction between the pairs of species involved. For instance, if some patch types host more species than others because of their larger size or more favour- able conditions, the occurrence of a species in a patch increases the odds of occurrence of other species, simply because it makes it more likely the patch is of higher quality. This would generate positive as- sociation signals if not controlled for. Succession or non- equilibrium dynamics, that is, species expanding through space, concomitantly or differentially, are also expected to cause similar patterns (D'Amen et al., 2018). More recent statistical and logical frameworks incor- porate and test for alternative mechanisms to species interactions (Blois et al., 2014; D'Amen et al., 2018; Ulrich et al., 2017). However, it is commonly expected that in the absence of any of the above con- founding factors and others (such as dispersal limitation), or after having appropriately controlled for them, classical extinction/recol- onization dynamics (metacommunity dynamics; Leibold et al., 2004) would not introduce particular species associations, unless ecological interactions are actually occurring (Opedal, von Numers, et al., 2020). Here we show that a specific component of metacommunity dynamics, patch disturbance, has consequences that have been overlooked so far. Patch disturbance here means the stochastic occurrence of patch extinctions (Hastings, 1980, 2003). Patch dis- turbance events denote extreme but possibly frequent events, such as fires, droughts, floods and others, causing the local extinction of all, or perhaps some subset of all, species from a patch (Leibold et al., 2004; Sousa, 1984). This can also correspond to the actual destruction and disappearance of a patch, for instance in ecosys- tems managed by humans, such as agricultural fields or forest stands (Ovaskainen et al., 2017), or in systems for which ‘patches’ are actually hosts with a finite life span, such as parasite or microbiote communities (Friedman & Alm, 2012; Hamelin et al., 2019). Patch disturbances are ubiquitous in natural and anthropogenic systems (Jentsch & White, 2019; Pickett & White, 2013). Although naturally regarded as a part of metapopulation and metacommunity frame- works (e.g. Calcagno et al., 2011; Hastings, 1980, 2003; Leibold et al., 2004; Levin & Paine, 1974), patch disturbance is often omitted (e.g. Slatkin, 1974) or confounded with species- specific extinctions (e.g. Thompson et al., 2020). Although this may seem a benign math- ematical simplification, we will show here that this omission can have non- trivial implications. We first introduce a general metacommunity model of patch dy- namics, that allows for patch disturbance to occur at some, poten- tially time- varying, rate, and in which patches can have a finite life span. The model generalizes many existing particular models, and is ‘null’ in the sense that it lacks the ingredients regarded as causing species statistical associations: species do not interact, the habitat is completely homogeneous and all species disperse uniformly with no patch preferences. We use this model to generate null sample co- occurrence matrices, drawn from the steady- state occupancies predicted from the model. We then apply different approaches as often are used to detect species associations on these null synthetic data. The methods considered include direct statistical tests of in- dependence (e.g. Veech, 2013), different permutation- based matrix analyses (e.g. Gotelli, 2000) and joint species distribution modelling (e.g. Ovaskainen & Abrego, 2020). In each case, we first derive math- ematical predictions concerning the expected performance of the method in question, and then test these predictions with numerical simulations. We show that the occurrence of patch disturbance will yield characteristic patterns of spurious species associations, for all methods that do not explicitly model the effect of patch age (time since last disturbance event) as a covariate. We show that a quan- tity we term species ‘fastness’, that captures how quickly species occurrence probability recovers after disturbance under the effect of recolonization/extinction dynamics, is a key determinant of the expected patterns of spurious associations. We show that the use of joint species distribution modelling with patch age as an envi- ronmental covariate (e.g. Ovaskainen et al., 2017) is the only way to achieve satisfactory performance (absence of spurious signals). Since patch age would often be unknown, we explore whether patch richness (defined as the number of species in a patch) could suffice as a proxy for patch age, but we find it does not. We suggest the implications of patch disturbance as a constituent of metacommu- nity dynamics should not be underestimated, and that explicit data on patch age would generally be required for accurate species co- occurrence analysis in metacommunity- like systems. 2  |  MATERIAL S AND METHODS This study is based entirely upon theoretical and numerical in- vestigation of a mathematical model and did not generate or use     |  3Functional EcologyCALCAGNO et AL. experimental data. Therefore, neither ethical approval nor site per- mission were required. 2.1  |  A general metacommunity model We consider a metacommunity model describing the extinction– recolonization dynamics of s non- interacting species over a large num- ber of identical patches (Caswell, 1976). Each species (i ) has its own colonization rate ci per occupied patch, as well as a constant rate of immigration from outside the metacommunity, mi (Hastings, 1987). Any species can go extinct in a patch, at some species- dependent rate ei. In addition, patches can undergo catastrophic disturbances, after which all species in the patch immediately go extinct, irrespec- tive of species composition (Calcagno et al., 2011; Hastings, 1980; Leibold et al., 2004). Such catastrophes occur at rate 휇x, where x is the ‘age’ of a patch, that is, the time since its appearance or since it experienced the last catastrophic disturbance event (Hastings, 2003; Levin & Paine, 1974). Finally, there can be some maximal patch age X after which a catastrophic disturbance systematically occurs (Olivieri et al., 1995). The model is illustrated in Figure 1. Let pi,x,t be the fraction of patches that have age x and are oc- cupied by species i at time t (Hastings, 1991). The fraction of patches occupied by species i is pi,∙,t, and the fraction of patches with age x at time t is p∙,x,t. The fraction of patches with age x that are not yet occupied by species i is therefore p∙,x,t − pi,x,t. We refer to Appendix S1 (Section S1- 1.1) for more technical definitions. The general metacommunity model can be written as, for all i = 1, 2,…, s, This framework generalizes classical metacommunity models (e.g. Bell, 2001; Cohen, 1970; Hanski, 1982; Hanski & Gyllenberg, 1997; Hastings, 1980, 1987; Slatkin, 1974; Tilman, 1994; Tilman et al., 1994), deriving from MacArthur and Wilson (1963)'s ‘mainland- island’ model (ci = 0), and from Levins (1969)'s metapopulation model (mi = 0). Such metacommunity models, when they consider patch catastrophic extinctions at all, assume a constant disturbance rate 휇x = 휇, but this need not be the case. The rate of disturbance might be increasing with patch age, producing an accelerating failure time model, for instance if a patch is a host that ages and suffers higher mortality with aging; this is termed as Type 1 survivorship curve in Begon and Townsend (2020). Conversely, if patches have some variability in their risk of disturbance, younger patches would go extinct at a relatively high rate, whereas older patches are relatively more dis- turbance resistant, causing a decline of 휇x with patch age (Type 3 survivorship curve). Another common situation is when patches have a finite lifetime and systematically get destroyed after some (1) 휕pi,x,t 휕x + 휕pi,x,t 휕t = − ( 휇x + ei ) pi,x,t + ( cipi,∙,t + mi )( p∙,x,t − pi,x,t ) . F I G U R E 1  Graphical summary of our model and approach. Top: null metacommunity model, its assumptions and the processes it describes. Note that patch disturbances may occur at a certain rate, possibly dependent on the time since the last disturbance. Alternatively, they may occur after some prescribed amount of time (introducing a maximum life span of patches). Bottom: from the metacommunity model, using specific parameter values, we can generate sample co- occurrence matrices of arbitrary size. We can then subject these matrices to standard analytical methods to search for signals of species association. Considering that our model assumes no species interaction, no habitat heterogeneity or differences in habitat selection among species, we expect no significant association for any pair of species. Any such association would be a spurious signal (type I error, i.e. false positive) 4  |   Functional Ecology CALCAGNO et AL. prescribed duration X (Pickett & White, 2013), as may happen for instance in agricultural settings where crops are harvested after some fixed time. Our framework encompasses all such situations. As special cases, the classical ‘mainland- island’ and ‘Levins’ models correspond to ci = 0 and mi = 0, respectively (Gotelli, 1991; Hanski & Gyllenberg, 1993), together with X →∞ and 휇x = 휇, a constant (see Appendix S1, Section S1- 2). 2.2  |  Steady- state occupancies and co- occurrence patterns At steady state, we can drop the t subscripts. The overall occupancy of species i , pi,∙, can be derived explicitly in the special cases of the mainland- island or Levins models, for which simple expressions exist (see Appendix S1, Section S1- 2.1). For instance, pi,∙ = 1 − (ei + 휇)∕ci in the Levins model. In general, no such solution exists when the rate of patch disturbance depends on age, but pi,∙ can be computed numerically, for example, using a recursive algorithm outlined in Appendix S1, Section S1- 1.4. It would be possible to derive a separate equation for the dynam- ics of every possible patch state, that is, the fraction of patches that are occupied by any given combination of species, and solve these (as done in Appendix S1, Section S1- 2.2, for two species). However, this is computationally inefficient. To obtain random co- occurrence matrices drawn from the above model, we instead use a Monte- Carlo procedure that generates the desired number of patches and draws their species composition. We provide a comprehensive set of R functions that make it easy to compute the steady- state occupancy of each species and generate random species co- occurrence matri- ces from the general model (1), for any number of species, patches and any shape of the disturbance function 휇x (see Appendix S2). We will use this method to generate null co- occurrence matrices that involve no species interactions, just metacommunity dynamics (Figure 1). 2.3  |  Introducing relative distribution profiles and ‘fastness’ As illustrated in Figure 2a, at steady state, there exists a stable distri- bution of patch ages, and species occupancies vary as a function of patch age. We here derive a quantity 휋i∕x representing how species i is distributed over patch ages. We term this quantity the relative distribution profile of a species (Figure 2b). Values smaller (larger) than one imply the species i is rarer (more frequent) in patches of age x, relative to its overall occupancy pi,∙. We first define the fraction of patches of age x that are occupied by species i as pi|x = pi,x∕p∙,x, for which we obtain an explicit expres- sion from model 1 at steady state (Appendix S1, Section S1- 1.3). The relative distribution profile of species i is then derived as the frac- tion of patches of age x that are occupied by species i , relative to its overall probability of occupancy pi,∙: As shown in Figure 2b, the mean value of the profile, for any species, is equal to one by definition (Appendix S1, Section S1- 1.5). To facilitate interpretation, we will discriminate species based on their ‘fastness’: fast species are those which have very flat rela- tive distribution profiles that quickly jump from zero and saturate at a value close to one (Figure 2b). Slow species, on the contrary, are those which have relative distribution profiles that slowly increase from zero and reach much larger values as the patch age gets larger. Formally, we will quantify the fastness of a relative distribution profile as its variance over patch ages, Var(휋i∕x). A maximally fast spe- cies has 휋i∕x = 1 for all x, that is, zero variance (Figure 2). Conversely, a very slow species has a relative distribution that gradually climbs to very large values, implying a huge variance. It can be shown that the variance of a species is also strictly decreasing with the initial slope of its relative distribution profile (Appendix S1, Section S1- 1.7). The variance of a species profile is thus a metric of ‘slowness’, and its in- verse is a metric of fastness. Importantly, the fastness of a species is not directly related to its overall occupancy in the metacommunity. 2.4  |  Methods to test for species associations in co- occurrence matrices Methods to infer species associations from co- occurrence patterns are all based on the probability of co- occurrence, that is, the fraction of patches in which both of the two species are found. Various meth- ods exist to determine whether the observed probability deviates from statistical independence, under some appropriate ‘null’ model. The probability of co- occurrence of two species i and j, that is, the overall fraction of patches in which the two species are found, will be denoted qi,j,∙ (at steady state). A straightforward null expectation is that of species indepen- dence: if species are not interacting, they should be distributed independently across patches, and thus the value of qi,j,∙ should be compared to the null value pi,∙pj,∙ using a standard test for contin- gency tables (Veech, 2013). In this article, we will use pairwise Fisher tests for independence as an exemplar of this type of method. The prediction of species independence is straightforward and intuitive, but is known to suffer from several limitations. To circum- vent these, more sophisticated null models are often preferred. A widespread approach consists of using permutation schemes in- tended to break species associations in co- occurrence matrices while retaining important species and patch differences (Gotelli, 2000). In this context, the C- score (Stone & Roberts, 1990) is the metric most commonly used to quantify the tendency of species to be segre- gated or aggregated in co- occurrence matrices. The partial C- score between two species, Ci,j, can be expressed as with N the total number of sites (patches) in the matrix. (2)휋i∕x = pi|x pi,∙ = 1 pi,∙ cipi,∙ + mi cipi,∙ + mi + ei ( 1 − exp ( − (cipi,∙ + mi + ei)x )) . (3)Ci,j = N 2(pi,∙ − qi,j,∙)(pj,∙ − qi,j,∙) ,     |  5Functional EcologyCALCAGNO et AL. Values of Ci,j larger than expected indicate segregation (e.g. com- petition) between the two species, whereas values smaller than expected indicate preferential association (e.g. facilitative interac- tions). Several null models have been proposed to determine the expected Ci,j value and test for deviations from it. The most classical methods are the so- called fixed- equiprobable (or sim2) and fixed- fixed (or sim9) permutation schemes that reshuffle matrix values while keeping the row sums fixed, or both the row and column sums fixed, respectively (Gotelli & Ulrich, 2012; Münkemüller et al., 2020). We will consider both approaches. More recently, flexible methods relying on the joint statistical modelling of species occurrences have gained a lot of popularity. These models are sufficiently flexible to allow several fixed or ran- dom predictors to be incorporated in modelling species occurrences, allowing different aspects of habitat variability or species trait dif- ferences to be controlled against. In such a framework, one can infer species associations from their residual covariances, that is, the correlation between their residual probabilities of occurrence. One commonly used example of this approach are hierarchical mod- els of species composition, as implemented in the Hmsc r package (Ovaskainen & Abrego, 2020; Tikhonov et al., 2020). This is the method we will use in this article. In this framework, we will use four different model specifica- tions. First, a null model (Hmsc M0) that includes no latent factor or covariate, and therefore is equivalent to the above simplest methods (Fisher's tests or fixed- equiprobable permutation schemes). Second, differences across patches will be accounted for by including patch richness (i.e. the number of species in a patch) as a latent factor (Hmsc M1). Then, patch age (x) will be explicitly included as an en- vironmental covariate in the model (Hmsc M2). As a variant of M2, patch richness will be also used as a proxy for patch age (Hmsc M2'). In other words, the difference between models Hmsc M1 and Hmsc M2' is that patch richness is a latent factor in the former, but an en- vironmental covariate in the latter. 2.5  |  Parameter values and numerical simulations In the following, we will analyse how null co- occurrence matri- ces generated from our general metacommunity model behave, in terms of producing significant species associations, when sub- jected to the different methods described above (Figure 1). We will first derive mathematical predictions and test them with numerical simulations. For all numerical simulations, we will use a synthetic community of 31 species, constructed to exhibit variation in their occupancies and extinction/colonization parameters (Figure 4a; see Appendix S2 for details). Although the parameterization does not correspond to a particular set of real species, the range of variation they present in colonization rates is similar to that found in natural metacommuni- ties, for instance in Caribbean pond snails (Dubart et al., 2019) and grassland plant communities (Tilman, 1994). We assumed the most classical metacommunity setup, that is, no external immigration (mi = 0) and a constant rate of patch dis- turbance. The value of the disturbance rate was set to 휇(x) = 0.1 or 휇(x) = 0.2 per unit of time. The maximum life span of patches was set to X = 20 time units. These values correspond to low or interme- diate levels of patch disturbance, considering the average species- specific extinction rate was 0.5 and the average colonization rate was 1.5. This means that extinctions caused by patch disturbance were always several times less frequent than other extinction/recol- onization events. We chose these values in a conservative mindset, knowing that in metacommunities more dominated by patch distur- bance, and/or with more variable species traits, the patterns we re- port could be more pronounced. F I G U R E 2  Example steady- state solutions of Equation (1), obtained by numerical simulation. Three species differing in fastness are used for illustration. (a) In this example, the stable patch- age distribution was exponential with mean 10 units of time (top curve). For any patch age x, each species has some patch occupancy pi,x (bottom curves), leaving a certain fraction of patches empty. Note that in this example the fastest species are more abundant overall, but this need not be the case. (b) Relative distribution profiles (휋i∕x). The faster species has steeper initial slope and attains lower asymptotic values. As a species gets faster and faster, its profile converges to the horizontal dashed line (i.e. a value of 1 for all patch ages). As shown in the inset, faster species have a smaller variance in 휋i∕x values, and as a species gets faster and faster, the distribution converges to a Dirac delta function at value 1 (vertical dashed line). Parameters: mi = 0.01, 휇 = 0.1, X = ∞ and ei = (0.05, 0.1, 0.2) and ci = (0.2, 0.3, 1) for slow, intermediate and fast species, respectively (b) (a) All patches Empty patches Occupied patches Occupied by: Faster species Intermediate species Slower species R el at iv e pr of ile ( ) Patch age ( ) Pr ob ab ilit y de ns ity Pr ob ab ilit y de ns ity values 6  |   Functional Ecology CALCAGNO et AL. We varied the sample size (number of patches in the co- occurrence matrices) between 300 and 1,500, with intermediate values 500 and 1,000 (for recent studies using similar sample sizes, see Dubart et al., 2019; Facon et al., 2021; Opedal, Ovaskainen, et al., 2020). Finally, we also considered the possibility that not all species might be affected by patch disturbance: some species might be immune to disturbance (‘immune species’). To address this possi- bility, we used the same set of 31 species, but rendered six of them, evenly spaced along trait values, invulnerable to patch disturbance (휇(x) = 0 for them), for each parameter combination. This corresponded to a total of 2 × 4 × 2 = 16 parameter combi- nations. For each combination, we generated random co- occurrence matrices from the null metacommunity model (Equation 1). Each time, we applied the different methods introduced above to test for sig- nificant species associations, with standard settings recommended in the literature (Gotelli, 2000; Ovaskainen & Abrego, 2020). We computed, for each parameter combination and statistical method, the percent of spurious positive associations (fraction of species pairs declared positively associated) and the percent of spurious negative associations (fraction of species pairs declared negatively associated). All simulations and figures reported can be reproduced using the accompanying R markdown (Appendix S2). 3  |  RESULTS 3.1  |  Mathematical predictions In this section, we analyse mathematically the action that different methods have if applied on community matrices generated from our general model (1) and what conclusions would follow in terms of the species associations detected. We derive predictions under the limit of large community matrices (many patches and many species). Readers who are not mathematically inclined might decide to skip this section, and focus on the predictions we obtained, as illustrated in Figure 3c– f. 3.1.1  |  Patch disturbance generates spurious positive associations with methods that do not account for patch age Direct independence tests (e.g. Fisher's tests) In the absence of any interaction, species are independently distrib- uted among patches within any particular age class (Appendix S1, Section S1- 3). This implies that the fraction of patches occupied by species i and j for all patch ages (qi,j) will be determined by the steady- state probabilities of patches of age x co- occupied by species i (pi|x) and species j (pj|x): which can be equivalently expressed as: given that the mean of the profile is one (Appendix S1, Section S1- 1.5). Since 휋i∕x and 휋j∕x are increasing functions of x (Equation 2), from Harris' inequality, we have Cov(휋i∕x ,휋 j∕x) ≥ 0 (Appendix S1, Section S1- 1.5). From (5), we therefore have Equality occurs only if at least one of the two species is distrib- uted uniformly over all patch ages, that is, 휋k∕x = 1 for all x so that Cov(휋i∕x ,휋 j∕x) = 0. From Equation 2, this is not the case in general. This only occurs if there is no patch disturbance at all (휇x = 0 and X →∞), in which case all patches are effectively infinitely old so that 휋i∕x and 휋j∕x both tend to one, and Cov(휋i∕x ,휋 j∕x) tends to zero. But in all other cases, species are not distributed independently: they are positively associated, that is, they co- occur more often than expected from the ‘null’ product of their respective overall occupancies (see Equation 6). As a consequence, methods based on the test of pairwise indepen- dence of species, such as direct Fisher's tests or more recently proposed methods (e.g. Veech, 2013) would consistently detect spurious positive associations (aggregation) among species, even in the absence of any interaction between species. Specifically, positive associations will be strongest, and thus more likely to be statistically detected, for species pairs with a large covariance of their relative profiles (Equation 5), that is, for pairs of species that are both slow. In contrast, species pairs including at least one fast species are more likely to be declared independent. Fixed- equiprobable permutations (Sim2) The same conclusion holds for more elaborate methods that do not control for patch age, such as the ‘fixed- equiprobable’ permutation algorithm, also known as ‘Sim2’ (Gotelli, 2000). To show this, let us consider the commonly used partial C- score Ci,j as our metric for spe- cies association (Equation 3). We will denote with an asterisk the value of quantities after the permutation algorithm has been applied to a co- occurrence matrix. A fixed- equiprobable matrix permutation algorithm preserves the number of occurrences per row (i.e. overall occupancies of species) but reshuffles species across all patches equiprobably. Therefore, after permutations, species overall occupancies are unchanged (p∗ i,∙ = pi,∙ for all i ), but all species become equiprobable over the different age classes, on average. Therefore, their relative distribution profiles be- come flat: 휋∗ k∕x = 1 for all x. From Equation 5, it follows This indicates that the algorithm effectively breaks the positive association generated by patch disturbance. From Equation 6, this im- plies q∗ i,j,∙ ≤ qi,j,∙. Permuted matrices will on average have a deficit of co- occurrences, for all species pairs, compared to the original matrix. (4)qi,j,∙ = ∫ X 0 pi|xpj|xp∙,xdx , (5)qi,j,∙ = pi,∙pj,∙ ∫ X 0 p∙,x휋i∕x휋j∕xdx = pi,∙pj,∙(1 + Cov(휋i∕x ,휋 j∕x)) , (6)qi,j,∙ ≥ pi,∙pj,∙. (7)q∗i,j,∙ = p ∗ i,∙ p∗ j,∙ = pi,∙pj,∙.     |  7Functional EcologyCALCAGNO et AL. Since pi,∙ and pj,∙ are unaffected by permutation, we can directly con- clude that C∗ i,j > Ci,j, as soon as patch disturbances occur. Using the fixed- equiprobable permutation algorithm will therefore consistently yield spurious facilitation signals between virtually all pairs of species, exactly as direct tests for species independence do (Figure 3c). Joint species distribution modelling (Hmsc M0) Joint species distribution models that do not account for patch differences (such as our Hmsc M0), by assuming species to be independently distributed across patches, will suffer from the same caveat. Hmsc models predict the probability of presence of some species i in a patch of age x, p∗ i|x, where the asterisk stands for ‘predicted value’. The model predicts the probability that species i is present in a patch regardless of the patch age, that is, p∗ i|x = pi,∙ . F I G U R E 3  Mathematical predictions. (a) The set of synthetic species. Occupancy as a function of patch age x (pi|x) is shown for each of the 31 species. Species traits were drawn randomly to have variable overall occupancies and fastnesses. (b) Pairwise covariances of relative profiles for all species pairs. Species are ranked according to fastness: species 1 is the fastest (lowest variance), and species 31 is the slowest (highest variance). The variance of the average profile in the community (휋̂x) was subtracted from covariances to help distinguish lower/higher than average values (colour bar). See Equations (4) and (7). (c) Mathematical predictions associated with methods testing for pairwise species independence (Fisher's test, fixed- equiprobable - Sim2- permutations and the simplest joint species distribution model Hmsc M0). (d– e) Same as (c) based on methods that control for patch richness, that is, the fixed- fixed permutation (Sim9) algorithm (d), and joint species distribution model Hmsc M1 (e). (f) Same as (c) based on joint species distribution model Hmsc M2, a method that controls for patch age. Species are ordered as in (b). The bullets indicate the predictions relative to an example species (species 27) being immune to patch disturbance (a) Fisher's tests, Fixed-equiprobable permutations, Hmsc M0 Fixed-fixed permutations Hmsc M1 Hmsc M2 Species S pe ci es S pe ci es 1 6 11 16 21 26 31 1 6 11 16 21 26 31 + 1 6 11 16 21 26 31 1 6 11 16 21 26 31 + - 1 6 11 16 21 26 31 1 6 11 16 21 26 31 + + - - S pe ci es Species 1 6 11 16 21 26 31 1 6 11 16 21 26 31 0 (c) (e) (d) E (f) Immune Set of 31 species O cc up an cy in p at ch es o f a ge x Patch age (x) (b) Faster Slower Relative pairwise covariance Species Species S pe ci es Species 8  |   Functional Ecology CALCAGNO et AL. It follows that the residual association (covariance) of two species i, j is which is equal to, dividing by pi,∙pj,∙, Since all species have relative distribution profiles smaller than one for small x, and greater than one for large x, it follows that the integrand is positive for most x values, and thus the two species will have positive residual association. Note that if at least one of the two species is infinitely fast, then the corresponding parenthesis is null for all x: the two spe- cies will have no residual association (i.e. will be independently distributed). Hmsc M0 models will thus report positive residual associations for all species pairs. The ability to detect those associations will how- ever depend on statistical power, especially for pairs of fast species, as in previous methods (independence tests and fixed- equiprobable permutations). 3.1.2  |  Patch disturbance yields a characteristic signature of spurious associations even with methods that control for patch richness We now turn to methods that can control for systematic differences among patches, in this case emerging from heterogeneity in the time since the last disturbance occurred (patch age). We here distinguish two groups of approaches. We first treat the popular fixed- fixed permutation algorithm, also known as Sim9 or swap (Gotelli, 2000). We then consider joint species distribution models that include patch richness as a factor (as our Hmsc M1). Fixed- fixed permutation approaches (Sim9) By fixing the row totals, this method effectively maintains the num- ber of species per patch unchanged, and can thus deal with the posi- tive association of species that causes issues with the previous set of approaches. Similar to the permutation algorithm discussed in the previous sec- tion (Sim2), preservation of row totals implies p∗ i,∙ = pi,∙. Therefore, we only need to study changes in qi,j,∙ to understand changes in C- scores. Preservation of column totals further means that the number of occur- rences will remain the same in every age class, which can be stated as ∑s k=1 p∗ k,x = ∑s k=1 pk,x for all x. Species are otherwise reshuffled indiffer- ently across all patch ages. Let wi be the relative occupancy of species i in the matrix, that remains unchanged after permutations: A fixed- fixed permutation algorithm yields: leading to, for all i = 1, 2,…, s, This indicates that species relative distribution profiles are replaced with their weighted- average over all species, with weights equal to species relative overall occupancies. Let us denote this weighted- average by 휋̂x. We note that 휋̂x can also be expressed as that is, ̂휋x is the mean species richness in patches age of x relative to the mean richness over all patch ages. We thus obtain: Comparing with Equation 5, we see that the term correcting for non- independence, Cov(휋i∕x ,휋 j∕x), is here replaced with Var(휋̂). The difference qi,j,∙ − q∗i,j,∙ has the same sign as Cov(휋i∕x ,휋 j∕x) − Var(휋̂). The permutation algorithm is thus biased, except if Cov(휋i∕x ,휋 j∕x) = Var(휋̂) for all species pairs. This would occur if all species had identical relative distribution profiles, that is, when all species are ‘similar’, in the sense pi,x = 휅pj,x for all x, with some positive 휅. This trivially occurs if there is no patch disturbance at all, or if all species have identical extinction and colonization parameters. Beyond that, species may differ in their parameters and overall occupancies and still be similar in that sense, but this requires quite constrained parametric conditions (Appendix S1, Section S1- 2.3). Except for these restricted scenarios, the permutation algorithm will yield spurious associations in both directions. Species pairs whose relative distribution profiles have covariance smaller than the variance of the average community profile (Cov(𝜋i∕x ,𝜋 j∕x) < Var(�𝜋) ) will appear to have a spurious negative association. Conversely, species pairs for which Cov(𝜋i∕x ,𝜋 j∕x) > Var(�𝜋) will appear to have a spurious positive association. Only species pairs for which Cov(휋i∕x ,휋 j∕x) = Var(휋̂) would appear as independently distributed. In practical terms, species pairs containing at least one species that is ‘faster’ than average (i.e. whose relative profile is flatter) will have low relative covariance, and thus will be declared nega- tively associated. Indeed, if the fast species (say i ) is fast enough (regardless of how slow the other is), then 휋i∕x ≈ 1 for all x thus and Cov(휋i∕x ,휋 j∕x) ≈ 0 = Var(휋i∕x), which is less than the variance of the average. In contrast, species pairs that are both ‘slower’ than aver- age (i.e. whose relative profiles are steeper) will have large relative ∫ X 0 p∙,x(pi|x − pi,∙)(pj|x − pj,∙)dx (8)pi,∙pj,∙ ∫ X 0 p∙,x(휋i∕x − 1)(휋 j∕x − 1)dx. wi = pi,∙∑s k=1 pk,∙ . p∗ i,x = ( s∑ k=1 pk,x ) wi , 휋∗ i∕x = p∗ i,x p∙,xpi,∙ = ∑s k=1 pk,x p∙,x ∑s k=1 pk,∙ = ∑s k=1 pk,∙휋k∕x∑s k=1 pk,∙ = �s k=1 wk휋k∕x = 휋̂x . (9)휋̂x = ∑s k=1 pk�x∑s k=1 pk,∙ , (10) q∗ i,j,∙ = pi,∙pj,∙ ∫ X 0 p∙,x휋 ∗ i∕x 휋∗ j∕x dx = pi,∙pj,∙ ∫ X 0 p∙,x 휋̂ 2 x dx = pi,∙pj,∙(1 + Var(휋̂)).     |  9Functional EcologyCALCAGNO et AL. covariance, and will be declared positively associated. Pairs of spe- cies that are both close to average fastness may yield no particular signal. If species are ordered according to fastness (i.e. variance), this produces a characteristic signature of spurious association signals. The pairwise covariance of all species pairs used in reference set of species is shown in Figure 3b. The corresponding mathematical predictions for fixed- fixed permutation algorithms are illustrated in Figure 3d (note the vertical and horizontal blue lines for species 27 are because that species is assumed to be immune to patch distur- bance as an illustration of the potential effect of that phenomenon; see below). Joint species distribution modelling (Hmsc M1) In the context of joint species distribution modelling, it has been suggested that an equivalent to the fixed- fixed permutation scheme discussed above would be to include patch richness as a latent factor in the model of species occurrence (our Hmsc M1). This was hypoth- esized in Ovaskainen and Abrego (2020, p. 335; see Appendix S2 for details). We can similarly formulate mathematical predictions regarding the behaviour such an approach would have under our null meta- community model. In the limit of large co- occurrence matrices, the number of species in a patch ( ∑ kpk�x) tends to become tightly associ- ated with patch age x. In other words, one may neglect the variance in patch age for a given species richness. A statistical model that describes the probability of occurrence, while including patch rich- ness as a latent factor, would therefore predict species i to occur in some particular patch of age x with probability p∗ i|x = pi,∙휋̂x, since 휋̂x is the relative richness of a patch of age x (Equation 9). The expected residual for this species in this sort of patch would be pi,∙(휋i,x − 휋̂x), and the residual covariance between two species i and j, over all patches, would be, from Equation (8), From the above equation, we can immediately conclude that if at least one of the two species has average fastness, in the sense defined above, then 휋i,x ≈ 휋̂x for all x, yielding a residual association of zero (independence). Otherwise, if the two species are faster (slower) than average, then both parentheses in the integral will be first positive (negative), and then negative (positive), as x increases. It follows that the overall residual association will be found positive. By the same argument, if one species is faster than average and the other is slower, the two parentheses will have opposite signs, and the overall residual association will be negative. As for the above permutation scheme, we expect a mixture of positive and negative associations, unless all species have the same fastness. The average fastness again plays a critical role in deter- mining where the switch from negative to positive associations, or vice versa, occurs. However, the resulting pattern of species associations is markedly different. We predict a block- like pattern, with faster species positively associated together, slower species also positively associated, and the two groups of species mutually negatively associated. We thus predict a four- block pattern, remi- niscent of differential specialization on two habitats, as illustrated in Figure 3e. In any case, controlling for differences across patches is therefore insufficient to handle the metacommunity dynamical consequences of patch disturbance, even though it brings some improvement. First, correct results are obtained if all species are ‘similar’ (i.e. have identical fastness). Second, the overall number of spurious associa- tions obtained will likely be reduced, since some species pairs will likely be correctly declared as independently distributed. However, this comes at the price of generating spurious associations of both kinds (positive and negative), not only positive associations. 3.1.3  |  Patch disturbance requires species- specific modelling of patch age Efficiently dealing with patch disturbance in co- occurrence analyses therefore requires both patch age and species traits to be taken into account, to model appropriately the differential effect that patch age has on species with different fastnesses. In other words, there must be an interaction between the effect of patch age and the iden- tity of species. This cannot be done with direct independent tests or simple null model approaches, but can be in the context of joint species distri- bution models, such as our model Hmsc M2. The model predicts the probability that species i is present in a patch as an increasing function of patch age, that is, it will yield p∗ i|x ≈ pi|x where the approximation denotes the fact that the func- tional form of the regression curve may not exactly match the shape of the pi∕x function. It follows that residuals will be close to zero, and so will be the re- sidual associations, for all species pairs. This is illustrated in Figure 3f. 3.1.4  |  What if some species are immune to patch disturbance? So far, we assumed all species in the metacommunity were affected by patch disturbances. In practice, in large sets of species, it may be that only a subset of species are susceptible to disturbance events, while some others are not (‘immune’ species). Based on our math- ematical results, this possibility is straightforward to deal with: the immune species would behave as a species with infinite fastness (Var(휋i∕x) = 0), regardless of its actual colonization/extinction rates (Appendix S1, Section S1- 4). As a consequence, any species pair including at least one immune species is expected to return no overall association, if using pairwise independence tests or equivalent methods treated in Section 2.1; (11)pi,∙pj,∙ ∫ X 0 p∙,x(휋i,x − 휋̂x)(휋 j,x − 휋̂x)dx. 10  |   Functional Ecology CALCAGNO et AL. see Figure 3c). Therefore, fewer spurious associations would be de- tected with such methods. On the contrary, by increasing the heterogeneity of fastness among species in the community, the presence of immune spe- cies would degrade the performance of methods that control for patch richness. More specifically, when using a fixed- fixed per- mutation method, any species pair including one immune species will systematically appear as negatively associated with the oth- ers, since Cov(휋i∕x ,휋 j∕x) = 0. For the same reason, if using a joint species distribution model with patch richness as a latent fac- tor, immune species will tend to cluster with the fastest species, and to be declared negatively associated with the slower species (Figure 3d,e). All the mathematical predictions derived in this section are sum- marized in Figure 3. We will now proceed to test them with numer- ical simulations. 3.2  |  Numerical simulations 3.2.1  |  Patch disturbance generates spurious positive associations with methods that do not account for patch age Consistent with our mathematical predictions, species were posi- tively associated overall in the simulated co- occurrence matrices, even under our ‘null’ model (Equation 1). Using the reference pa- rameter set (휇 = 0.1, N = 1, 000 and no immune species), the odds of species co- occurrence were on average 75% higher than ex- pected under independence (Table 1). About 10% of all species pairs had odds of co- occurrence more than 2.5 times as large as expected. Results for methods not controlling for patch richness or patch age (pairwise Fisher's tests, fixed- equiprobable permutations and species distribution model Hmsc M0) are shown in Figure 4. As predicted, all methods detected significant positive associations for the vast majority of species pairs, except for some of the spe- cies pairs involving the fastest species. Direct pairwise Fisher's tests and fixed- fixed permutations yielded very similar results, with about 75% of species pairs declared as positively associated (Figure 4a,c). Hmsc M0, which uses a quite different statistical ap- proach, happened to be more sensitive, and thus more error prone, declaring more than 90% of species pairs positively associated (Figure 4e). Again confirming predictions, the inclusion of species immune to patch disturbance reduced the number of positive associations detected, as species pairs including at least one immune species were reported as independently distributed, with a few exceptions (Figure 4b,d,f). The main difference between the three methods was that Hmsc M0 could occasionally go as far as declaring one im- mune species as negatively associated with the non- immune species (Figure 4f). 3.2.2  |  Patch disturbance yields a characteristic signature of spurious associations even with methods that control for patch richness Simulation results applying fixed- fixed permutation schemes (Sim9) and the species distribution model Hmsc M1 are presented in Figure 5. As expected from our mathematical analyses, we observe the detection of spurious associations of both kinds (positive and negative). The patterns of species associations closely resemble those of mathematical predictions, be it for fixed- fixed permutations (see Figure 3d) or Hmsc M0 (see Figure 3e). The main difference from predictions is the absence of some predicted associations, due to lack of statistical power. This is particularly the case for species pairs with covariance close to average, for a fixed- fixed permutation algorithm (Figure 5a). The total number of species pairs for which a spurious associa- tion is reported is slightly smaller than it was without controlling for patch richness. It is 56% with a fixed- fixed permutation scheme ver- sus 73% with a fixed- equiprobable scheme. Similarly, it is 12% with Hmsc M1 versus more than 90% with Hmsc M0. However, both positive and negative spurious associations are detected. There is a slight majority of negative associations with fixed- fixed permutations, but, in contrast, a majority of positive as- sociations with Hmsc M1. Unlike what was observed for the previous set of methods, Hmsc modelling is here less sensitive, and thus less error prone, than the comparable permutation algorithm. Therefore, fixed- fixed permutations and Hmsc M1 models differ not only in the qualitative type of characteristic signature they produce, but also in their quantitative performance (power and ratio of negative versus positive associations). Introducing immune species also yielded results conforming to mathematical predictions (Figure 5b,d). Interestingly, the presence of immune species degraded the overall performance of the two methods, but not to the same extent: the percentage of spurious as- sociations detected increased from 12% to 28% in the case of Hmsc M1 and, more modestly, from 57% to 59% in the case of fixed- fixed permutations. 3.2.3  |  Patch disturbance requires species- specific modelling of patch age Species distribution models can describe how the probability of occurrence increases, for each species, as a function of patch age. This can be done rather straightforwardly by providing patch age as an environmental covariate, over which species occupancies are regressed, as in our model Hmsc M2. As predicted, such an approach can restore satisfying re- sults, that is, no association was detected. Examples are shown in Figure 6a,b. Using the same set of species and co- occurrence matrix as before, the use of model Hmsc M2 suppressed almost all spurious association signals, with or without immune species, as desired. The     |  11Functional EcologyCALCAGNO et AL. TA B LE 1   Pe rf or m an ce (% o f e rr or s) f or e ac h of t he f iv e m et ho ds t es te d (F is he r, Fi xe d- fi xe d, H m sc M 0, H m sc M 1, H m sc M 2) , u nd er t he 1 6 pa ra m et ri c sc en ar io s (s am pl e si ze N × d is tu rb an ce ra te 휇 × N o Im m un e sp ec ie s/ W it h im m un e sp ec ie s) . N ot e th at t he ‘f ix ed - f ix ed ’ a lg or it hm s is a ls o kn ow n as ‘S im 9’ . G re y sh ad in g co rr es po nd s to t he d ef au lt pa ra m et er is at io n us ed t hr ou gh ou t th e pa pe r Sc en ar io Fi sh er Fi xe d- fix ed H m sc M 0 H m sc M 1 H m sc M 2 % + % − % to t % + % − % to t % + % − % to t % + % − % to t % + % − % to t N o im m un e sp ec ie s 휇 = 0 .2 N = 3 0 0 38 .3 0 38 .3 11 .8 8. 8 20 .6 75 0 75 0 0 0 0 0 0 N = 5 0 0 58 .3 0 58 .3 20 .2 19 .8 40 78 .3 0 78 .3 0. 6 3. 9 4. 5 0 0. 4 0. 4 a N = 1 ,0 0 0 73 .8 0 73 .8 29 .7 27 .3 57 98 .9 0 98 .9 4. 1 8 12 .1 0. 9 2. 8 3. 7 N = 1 ,5 0 0 79 .6 0 79 .6 35 28 .6 63 .6 98 .5 0 98 .5 6. 9 13 .8 20 .6 1. 3 1. 9 3. 4 휇 = 0 .1 N = 3 0 0 19 .6 0 19 .6 5 3. 6 8. 6 69 .9 0 69 .9 0 0. 2 0. 2 0 0 0 N = 5 0 0 45 .2 0 45 .2 11 .8 10 .5 22 .4 70 .1 0 70 .1 0. 4 1. 1 1. 5 0 0. 6 0. 6 N = 1 ,0 0 0 69 .4 0 69 .4 19 .1 18 .5 37 .6 98 .5 0 98 .5 0. 9 2. 5 3. 4 0 0 0 N = 1 ,5 0 0 75 0 75 25 .6 22 .6 48 .2 87 .3 0 87 .3 2 6. 4 8. 4 0 0. 2 0. 2 W it h im m un e sp ec ie s 휇 = 0 .2 N = 3 0 0 22 .6 0 22 .6 4. 7 6 10 .7 50 0 50 0 0 0 0 0 0 N = 5 0 0 37 .8 0 37 .8 20 .4 17 .8 38 .3 45 .2 0 45 .2 1. 1 7. 3 8. 4 0 0 0 N = 1 ,0 0 0 48 .4 0 48 .4 35 .3 23 .4 58 .7 64 .1 4. 7 68 .8 9. 9 18 .5 28 .4 0 0. 4 0. 4 N = 1 ,5 0 0 51 .8 0 51 .8 41 .3 27 68 .4 59 .3 0 59 .3 5. 6 20 25 .6 0 0 0 휇 = 0 .1 N = 3 0 0 7. 5 0 7. 5 1. 9 3. 6 5. 6 45 .2 0 45 .2 0 0 0 0 0 0 N = 5 0 0 28 .2 0 28 .2 10 .1 10 .3 20 .4 45 .2 0 45 .2 3 5. 8 8. 8 0 0 0 N = 1 ,0 0 0 44 .7 0 44 .7 20 .4 19 .8 40 .2 64 .5 3. 9 68 .4 3. 2 7. 3 10 .5 0 0 0 N = 1 ,5 0 0 49 .2 0 49 .2 27 .7 24 .9 52 .7 64 .5 3. 9 68 .4 3. 2 7. 3 10 .5 0 0 0 G ra nd a ve ra ge 46 .8 0 46 .8 20 17 37 69 .6 0. 78 70 .4 2. 5 6. 3 8. 9 0. 17 0. 23 0. 33 a T hi s is t he s ce na ri o us ed in a ll fi gu re s. 12  |   Functional Ecology CALCAGNO et AL. percentage of spurious associations detected was not strictly speak- ing zero, but remained under a tolerable error rate of 5%. By regressing the probability of occurrence on patch age, for every species individually, such a model appropriately captures both the overall increase in occupancy with patch age (Figure 6c) and the differential response of the different species, depending on their fastness (Figure 6d). Note however that this approach requires additional data, namely the age of patches, on top of the species co- occurrence matrix. This would not always be possible. Considering the fair association be- tween patch age and patch richness (Figure 6c), one could consider using patch richness as a surrogate for patch age. As patch richness is already encoded in the species co- occurrence matrix, this approach, like all previous sets of methods, would not require additional data. However, if using patch richness as a proxy of patch age (model Hmsc M2'), the method fails to provide correct results (see Figure S1). The association presented in Figure 6c appears not to be strong enough for the model to appropriately capture age- related patch differences. Arguably, in datasets containing even more spe- cies (e.g. 50– 100), the correlation might well be stronger, but in many situations, patch richness would also vary because of several factors beyond patch age. Therefore, its capacity to serve as a proxy of the F I G U R E 4  Numerical simulations: methods testing for pairwise species independence. (a) Species pairs significantly associated according to Fisher's pairwise tests. (b) Same as (a), with a subset of species immune to patch disturbance (shown as bullets). (c) Species pairs significantly associated according to the fixed- equiprobable (Sim2) permutation algorithm. (d) Same as (c), with a subset of immune species. (e) Species pairs significantly associated according to the simplest joint species distribution model (Hmsc M0). (f) Same as (e), with a subset of immune species. Species are ordered as in Figure 3b. The sample co- occurrence matrix was generated from Equation (1) with N = 1, 000 sites (patches) and the illustrative parameter set (a) (b) Faster Slower Species S pe ci es Pairwise associations (Hmsc M0) Immune (e) (f) Pairwise associations (Sim2) Pairwise associations (c) (d) with 6 immune species (Sim2) Faster Slower Fisher's test with 6 immune speciesFisher's test of association Pairwise associations with 6 immune species (Hmsc M0) SpeciesSpecies Species S pe ci es S pe ci es S pe ci es S pe ci es S pe ci es SpeciesSpecies     |  13Functional EcologyCALCAGNO et AL. latter would be compromised anyway. Consequently, using explicit independent data on patch age is the only generally recommendable strategy. 3.2.4  |  Effect of changing parameter values and sample size The performances of the main methods, under the different pa- rameter scenarios, are summarized in Table 1. Confirming the pre- vious results, only Hmsc M2 yielded satisfying results in all cases. All other methods generally detected spurious associations, often at quite high rates, sometimes exceeding 70%. Obviously, reducing sample size down to N = 300 reduced the general statistical power, and therefore the rate of spurious detections. In the case of species distribution model Hmsc M1, this sometimes made the percentage of errors fall to acceptable levels. However, this is just indicative of a lack of power, and would also come at the expense of higher type II errors in actual datasets containing genuine species associations. Decreasing the rate of patch disturbance 휇, as expected, also decreases the overall amount of spurious associations detected (Table 1). This is simply because this weakens the metacommunity- dynamics- driven patterns, in the absence of which all methods com- pared here perform correctly. Interestingly, the effect of introducing some species immune to patch disturbance systematically reduces the rate of spurious de- tections for methods that do not control for patch richness (Fisher's tests or Hmsc M0) but can have variable consequences for methods that do (fixed- fixed permutations and Hmsc M1). The consequences depend strongly on sample size. At low sample sizes, the main con- sequence is to reduce the average rate of perturbation, and thus to decrease the number of spurious detections, as statistical power is limiting. At large sizes, on the contrary, the presence of immune spe- cies increases the number of spurious detections, as it increases the heterogeneity in fastness among species, to which these methods are sensitive. Our main overall conclusions are synthesized in Table 2. 4  |  DISCUSSION Metacommunity dynamics matter. We demonstrated here how sim- ple metacommunity dynamics can cause the detection of spurious species associations, even in null communities with no interactions. This occurs as soon as patch disturbances exist, that is, catastrophic events that cause all species (or at least a subset of them) to be lo- cally wiped out. The latter is an arguably common element in natural and exploited ecosystems (Pickett & White, 2013), and one often, F I G U R E 5  Numerical simulations: methods controlling for patch richness. (a) Species pairs significantly associated according to the fixed- fixed (Sim9) permutation algorithm. (b) Same as (a), with a subset of species immune to patch disturbance (shown as bullets). (c) Species pairs significantly associated according to a joint species distribution model (Hmsc M1) that models patch richness as a single latent factor. (d) Same as (c), with a subset of immune species. Same species and sample co- occurrence matrix as in previous figures Immune (a) (b) (c) (d) Pairwise associations with 6 immune species (Sim9) Pairwise associations (Sim9) SpeciesSpecies SpeciesSpecies S pe ci es S pe ci es S pe ci es S pe ci es Pairwise associations (Hmsc M1) Pairwise associations with 6 immune species (Hmsc M1) 14  |   Functional Ecology CALCAGNO et AL. though not always, included in metapopulation and metacommunity models (Leibold et al., 2004). It is classically assumed that in the absence of any ecological in- teractions, habitat heterogeneity or species differences in dispersal patterns, species would be distributed independently across patches. This ‘null’ assumption underlies many interpretations of species co- occurrence patterns. Our analysis shows that this intuition is true for any given patch age (time since last disturbance), but it is not true over- all. Patch age acts as a hidden structuring variable, the implications of which for co- occurrence patterns have been overlooked so far. Indeed, among the list of processes typically invoked as drivers which might cause departures from species independence, patch age does not nor- mally appear (Gotelli, 2000; Ovaskainen & Abrego, 2020). This result may seem counter- intuitive at first glance. Indeed, if species are initially independently distributed in a metacommunity, patch disturbances do not per se cause any association to appear: as they occur regardless of species composition, they preserve species independence, unlike interaction- driven extinctions would. Rather, it is the outcome of extinction/recolonization dynamics (Equation 2) that inevitably causes, at steady state, species to be more frequent in older patches. Since all species are more likely to occupy older patches, all appear positively associated overall. We found that this signal was detectable even with realistic sample sizes (number of patches), under relatively low levels of patch disturbance, and for variations among species fastness similar to what has been ob- served in actual metacommunity systems. Obviously, the amount F I G U R E 6  Numerical simulations: modelling the effect of patch age. (a) A joint species distribution model that incorporates patch age as an explicit environmental covariate (Hmsc M2) yields correct results, even with patch disturbance. The few spurious species associations detected are within the type I error rate. Note that log(patch age) was used as the covariate, since patch age is distributed exponentially. (b) Same as (a), with a subset of species immune to patch disturbance (shown as bullets). (c) Predicted species richness as a function of patch age. Grey dots represent the actual data in the sample co- occurrence matrix. (d) Predicted probability of occurrence as a function of patch age, for the fastest species and the slowest species. Same species and sample co- occurrence matrix as in previous figures (d) (a) Patch age (x) Sp ec ie s ric hn es s Fit of species richness against patch age (Hmsc M2) (c) Pr ob ab ilit y of oc cu rre nc e Patch age (x) Fits of species occupancies for 2 species (Hmsc M2) slowest species (31) fastest species (1) (b) Immune Pairwise associations with 6 immune species (Hmsc M2) Pairwise associations (Hmsc M2) S pe ci es S pe ci es SpeciesSpecies Type of approach Example methods Robust? Type of bias Direct independence tests Fisher's tests; Sim2; Hmsc M0 Noa Positive Controlling for patch richness Sim9; Hmsc M1; Hmsc M2' Nob Positive/ negative Modelling effect of patch age Hmsc M2 Yes None aWorks only if no patch disturbance. bWorks only if no patch disturbance, or all species similar (same fastness). TA B L E 2  Are detection methods robust to patch disturbance? Summary of main findings. For each type of approach, we only mention methods tested in this study, but conclusions apply to other similar methods as well. Note that ‘Sim2’ and ‘Sim9’ denote the ‘fixed- equiprobable’ and ‘fixed- fixed’ algorithms, respectively     |  15Functional EcologyCALCAGNO et AL. of spurious species associations detected generally declined with sample size (Table 1), but so would the power to detect genuine species association signals. Moreover, the range of sample sizes we used may quickly become conservative, as more and more ecological studies now have datasets with several hundreds or thousands of patches sampled (Dubart et al., 2019; Facon et al., 2021; Opedal, Ovaskainen, et al., 2020). The development of metagenomics and microbiome studies will undoubtedly contribute to this upward trend in sample sizes, and so the phenomenon reported here could become more and more prevalent. The consequences of patch disturbance cannot simply be dealt with by controlling for patch differences in richness. If doing so, spu- rious associations, both negative and positive, also appear, as soon as species have different relative distribution profiles (fastnesses). These spurious associations have a different cause: they stem from the fact that slow and fast species are concentrated in different parts of the patch- age spectrum, which can be thought of as a form of niche differentiation. This differential distribution with reference to patch age causes spurious negative association signals to appear. But the signature we report is more subtle than that (see Figure 5). The fastest species (or species immune to patch disturbance, if any) also appear to be segregated from each other, as well as from all the other species. Reciprocally, the slowest species, being more con- centrated than others in the older parts of the patch age spectrum, appear positively associated (and segregated from the others). The generation of spurious associations (Type I error) is in itself problematic. But beyond that, in more realistic communities with actual species interactions, these spurious patterns would interfere with and confound the actual signals. For instance, actual signals of species interactions, for example, an increase in species extinction rates with species number, that is, a ‘neutral’ form of competition (Bell, 2001; Hastings, 1987), could be hidden in the presence of patch disturbance: the negative association patterns would be cancelled, at least among pairs of slow species, by the positive signal caused by metacommunity dynamics alone. In such situations, false negatives (loss of power) will result. Furthermore, the power of detection is not homogeneously distributed across all species pairs: signals of com- petition would be weakened for slow species pairs, but reinforced for fast species pairs. Therefore, a diffuse type of competitive inter- action with no discernible pattern (all species competitively identi- cal) could turn into a structured pattern, with nonzero nestedness. A possible solution to these issues consists of using species dis- tribution models that explicitly incorporate patch age as a covariate (Figure 6). However, this requires having independent data on patch ages, which is far from being universal in co- occurrence analyses. While feasible for some types of systems (Ovaskainen et al., 2017), typically parasite or microbiome communities for which patches are hosts of known age, in many cases such data would be difficult or impossible to obtain. It may sometimes be possible to approximate patch age from measurable environmental variables, when patch disturbance causes environmental modifications and patches fol- low a succession- like cycle after them. Depending on the type of biological system considered, indicator species known to reflect successional stage, or the dosage of chemical compounds indicative of disturbance (e.g. fire), may correlate with patch age. In the case of parasites, host size or other host attributes may also be used to infer host age. However, this approach would not feasible for all sys- tems, particularly since a sufficiently strong correlation is required. Indeed, in our examples, using patch richness as a proxy of patch age failed to remove the spurious associations, even though the two variables were reasonably well correlated (Figure 6). In general, effectively controlling for patch age would almost certainly require genuine temporal data on patch life cycles (Ovaskainen et al., 2017, Figure 5c). In general, our work here underscores once more how inferring species interactions from snapshots of co- occurrence patterns can be as hazardous as it is appealing, and requires careful thought about null expectations (Blanchet et al., 2020; Molina & Stone, 2020; Münkemüller et al., 2020). We join the recent burst of cautionary articles in highlighting these concerns, and in any case, we encour- age investigators to evaluate whether detected species association patterns could match those reported here, and so could stem from pure metacommunity dynamical effects. ACKNOWLEDG EMENTS The authors thank Maxime Dubart for guidance on using Hmsc and for sharing data on Caribbean snail metacommunity parameter es- timates, and the anonymous peer reviewers for helpful suggestions. Colin Guétemme and Héloïse Villesèche participated in preliminary research on the topic. CONFLIC T OF INTERE S T The authors declare no conflict of interest. AUTHORS' CONTRIBUTIONS Authors contributed similarly to the manuscript preparation; V.C. and F.M.H. contributed most to the mathematical derivations; N.J.C. contributed most to the stochastic simulations and R coding. DATA AVAIL ABILIT Y S TATEMENT All the codes used to produce the results in this article are available via the doi https://doi.org/10.5281/zenodo.6400509 (Cunniffe, 2002). ORCID Vincent Calcagno https://orcid.org/0000-0002-5781-967X Nik J. Cunniffe https://orcid.org/0000-0002-3533-8672 Frédéric M. Hamelin https://orcid.org/0000-0003-2653-699X R E FE R E N C E S Barberán, A., Bates, S. T., Casamayor, E. O., & Fierer, N. (2012). Using net- work analysis to explore co- occurrence patterns in soil microbial communities. The ISME Journal, 6(2), 343– 351. Barner, A. K., Coblentz, K. E., Hacker, S. D., & Menge, B. A. (2018). Fundamental contradictions among observational and experimen- tal estimates of non- trophic species interactions. Ecology, 99(3), 557– 566. 16  |   Functional Ecology CALCAGNO et AL. Begon, M., & Townsend, C. R. (2020). Ecology: From individuals to ecosys- tems. John Wiley & Sons. Bell, G. (2001). Neutral macroecology. Science, 293(5539), 2413– 2418. Blanchet, F. G., Cazelles, K., & Gravel, D. (2020). Co- occurrence is not ev- idence of ecological interactions. Ecology Letters, 23(7), 1050– 1063. https://doi.org/10.1111/ele.13525 Blois, J. L., Gotelli, N. J., Behrensmeyer, A. K., Faith, J. T., Lyons, S. K., Williams, J. W., Amatangelo, K. L., Bercovici, A., Du, A., Eronen, J. T., Graves, G. R., Jud, N., Labandeira, C., Looy, C. V., McGill, B., Patterson, D., Potts, R., Riddle, B., Terry, R., … Wing, S. (2014). A framework for evaluating the influence of climate, dispersal limita- tion, and biotic interactions using fossil pollen associations across the late quaternary. Ecography, 37(11), 1095– 1108. Calcagno, V., Massol, F., Mouquet, N., Jarne, P., & David, P. (2011). Constraints on food chain length arising from regional metacom- munity dynamics. Proceedings of the Royal Society B: Biological Sciences, 278(1721), 3042– 3049. Caswell, H. (1976). Community structure: A neutral model analysis. Ecological Monographs, 46(3), 327– 354. Cohen, J. E. (1970). A markov contingency- table model for replicated lotka- volterra systems near equilibrium. The American Naturalist, 104(940), 547– 560. Connor, E. F., & Simberloff, D. (1979). The assembly of species communi- ties: Chance or competition? Ecology, 60(6), 1132– 1140. https://doi. org/10.2307/1936961 Cunniffe, N. J. (2002). Code to support "Metacommunity dynamics and the detection of species associations in co- occurrence analyses: Why patch disturbance matters". Zenodo, https://doi.org/10.5281/ zenodo.6400509 D'Amen, M., Mod, H. K., Gotelli, N. J., & Guisan, A. (2018). Disentangling biotic interactions, environmental filters, and dis- persal limitation as drivers of species co- occurrence. Ecography, 41(8), 1233– 1244. Diamond, J. M. (1975). Assembly of species communities. In J. M. Diamond, & M. L. Cody (Eds.), Ecology and evolution of communities (pp. 342– 444). Harvard University Press. Dubart, M., Pantel, J. H., Pointier, J.- P., Jarne, P., & David, P. (2019). Modeling competition, niche, and coexistence between an inva- sive and a native species in a two- species metapopulation. Ecology, 100(6), e02700. Facon, B., Hafsi, A., Charlery de la Masselière, M., Robin, S., Massol, F., Dubart, M., Chiquet, J., Frago, E., Chiroleu, F., Duyck, P.- F., & Ravigné, V. (2021). Joint species distributions reveal the combined effects of host plants, abiotic factors and species competition as drivers of species abundances in fruit flies. Ecology Letters, 24(9), 1905– 1916. https://doi.org/10.1111/ele.13825 Faust, K., & Raes, J. (2012). Microbial interactions: From networks to models. Nature Reviews Microbiology, 10(8), 538– 550. Forbes, S. A. (1907). On the local distribution of certain Illinois fishes: An essay in statistical ecology (Vol. 7). Illinois State Laboratory of Natural History. Friedman, J., & Alm, E. J. (2012). Inferring correlation networks from ge- nomic survey data. PLoS Computational Biology, 8(9), e1002687. Gotelli, N. J. (1991). Metapopulation models: The rescue effect, the propagule rain, and the core- satellite hypothesis. The American Naturalist, 138(3), 768– 776. Gotelli, N. J. (2000). Null model analysis of species co- occurrence pat- terns. Ecology, 81(9), 2606– 2621. Gotelli, N. J., & Ulrich, W. (2012). Statistical challenges in null model anal- ysis. Oikos, 121(2), 171– 180. Hamelin, F. M., Allen, L. J., Bokil, V. A., Gross, L. J., Hilker, F. M., Jeger, M. J., Manore, C. A., Power, A. G., Rúa, M. A., & Cunniffe, N. J. (2019). Coinfections by noninteracting pathogens are not in- dependent and require new tests of interaction. PLoS Biology, 17(12), e3000551. Hanski, I. (1982). Dynamics of regional distribution: The core and sat- ellite species hypothesis. Oikos, 38(2), 210– 221. https://doi. org/10.2307/3544021 Hanski, I., & Gyllenberg, M. (1993). Two general metapopulation models and the core- satellite species hypothesis. The American Naturalist, 142(1), 17– 41. Hanski, I., & Gyllenberg, M. (1997). Uniting two general patterns in the distribution of species. Science, 275(5298), 397– 400. Hastings, A. (1980). Disturbance, coexistence, history, and competition for space. Theoretical Population Biology, 18(3), 363– 373. Hastings, A. (1987). Can competition be detected using species co- occurrence data? Ecology, 68(1), 117– 123. https://doi. org/10.2307/1938811 Hastings, A. (1991). Structured models of metapopulation dynamics. Biological Journal of the Linnean Society, 42(1– 2), 57– 71. Hastings, A. (2003). Metapopulation persistence with age- dependent disturbance or succession. Science, 301(5639), 1525– 1526. Jentsch, A., & White, P. (2019). A theory of pulse dynamics and distur- bance in ecology. Ecology, 100(7), e02734. Leibold, M. A., Holyoak, M., Mouquet, N., Amarasekare, P., Chase, J. M., Hoopes, M. F., Holt, R. D., Shurin, J. B., Law, R., Tilman, D., Loreau, M., & Gonzalez, A. (2004). The metacommunity concept: A frame- work for multi- scale community ecology. Ecology Letters, 7(7), 601– 613. Levin, S. A., & Paine, R. T. (1974). Disturbance, patch formation, and com- munity structure. Proceedings of the National Academy of Sciences of the United States of America, 71(7), 2744– 2747. Levins, R. (1969). Some demographic and genetic consequences of environmental heterogeneity for biological control. American Entomologist, 15(3), 237– 240. MacArthur, R. H., & Wilson, E. O. (1963). An equilibrium theory of insular zoogeography. Evolution, 17, 373– 387. Molina, C., & Stone, L. (2020). Difficulties in benchmarking ecological null models: An assessment of current methods. Ecology, 101(3), e02945. Münkemüller, T., Gallien, L., Pollock, L. J., Barros, C., Carboni, M., Chalmandrier, L., Mazel, F., Mokany, K., Roquet, C., Smyčka, J., Talluto, M. V., & Thuiller, W. (2020). Dos and don'ts when infer- ring assembly rules from diversity patterns. Global Ecology and Biogeography, 29(7), 1212– 1229. Olivieri, I., Michalakis, Y., & Gouyon, P.- H. (1995). Metapopulation genet- ics and the evolution of dispersal. The American Naturalist, 146(2), 202– 228. Opedal, Ø. H., Ovaskainen, O., Saastamoinen, M., Laine, A.- L., & van Nouhuys, S. (2020). Host- plant availability drives the spatiotempo- ral dynamics of interacting metapopulations across a fragmented landscape. Ecology, 101(12), e03186. https://doi.org/10.1002/ ecy.3186 Opedal, Ø. H., von Numers, M., Tikhonov, G., & Ovaskainen, O. (2020). Refining predictions of metacommunity dynamics by modeling species non- independence. Ecology, 101(8), e03067. https://doi. org/10.1002/ecy.3067 Ovaskainen, O., & Abrego, N. (2020). Joint species distribution modelling: With applications in R. Cambridge University Press. Ovaskainen, O., Tikhonov, G., Norberg, A., Guillaume Blanchet, F., Duan, L., Dunson, D., Roslin, T., & Abrego, N. (2017). How to make more out of community data? A conceptual framework and its implemen- tation as models and software. Ecology Letters, 20(5), 561– 576. Pickett, S. T., & White, P. S. (2013). The ecology of natural disturbance and patch dynamics. Elsevier. Slatkin, M. (1974). Competition and regional coexistence. Ecology, 55(1), 128– 134. Sousa, W. P. (1984). The role of disturbance in natural communities. Annual Review of Ecology and Systematics, 15(1), 353– 391. Stone, L., & Roberts, A. (1990). The checkerboard score and species dis- tributions. Oecologia, 85(1), 74– 79.     |  17Functional EcologyCALCAGNO et AL. Thompson, P. L., Guzman, L. M., De Meester, L., Horváth, Z., Ptacnik, R., Vanschoenwinkel, B., Viana, D. S., & Chase, J. M. (2020). A process- based metacommunity framework linking local and regional scale community ecology. Ecology Letters, 23(9), 1314– 1329. Tikhonov, G., Opedal, Ø. H., Abrego, N., Lehikoinen, A., de Jonge, M. M., Oksanen, J., & Ovaskainen, O. (2020). Joint species distribu- tion modelling with the R- package Hmsc. Methods in Ecology and Evolution, 11(3), 442– 447. Tilman, D. (1994). Competition and biodiversity in spatially structured habitats. Ecology, 75(1), 2– 16. Tilman, D., May, R. M., Lehman, C. L., & Nowak, M. A. (1994). Habitat destruction and the extinction debt. Nature, 371(6492), 65– 66. Ulrich, W., Kryszewski, W., Sewerniak, P., Puchałka, R., Strona, G., & Gotelli, N. J. (2017). A comprehensive framework for the study of species co- occurrences, nestedness and turnover. Oikos, 126(11), 1607– 1616. Veech, J. A. (2013). A probabilistic model for analysing species co- occurrence. Global Ecology and Biogeography, 22(2), 252– 260. SUPPORTING INFORMATION Additional supporting information may be found in the online version of the article at the publisher’s website. How to cite this article: Calcagno, V., Cunniffe, N. J., & Hamelin, F. M. (2022). Metacommunity dynamics and the detection of species associations in co- occurrence analyses: Why patch disturbance matters. Functional Ecology, 00, 1– 17. https://doi.org/10.1111/1365- 2435.14047