Article Canalisation and plasticity on the developmental manifold of Caenorhabditis elegans David J Jordan* & Eric A Miska Abstract How do the same mechanisms that faithfully regenerate complex developmental programmes in spite of environmental and genetic perturbations also allow responsiveness to environmental signals, adaptation and genetic evolution? Using the nematode Caenorhab- ditis elegans as a model, we explore the phenotypic space of growth and development in various genetic and environmental contexts. Our data are growth curves and developmental parame- ters obtained by automated microscopy. Using these, we show that among the traits that make up the developmental space, cor- relations within a particular context are predictive of correlations among different contexts. Furthermore, we find that the develop- mental variability of this animal can be captured on a relatively low-dimensional phenotypic manifold and that on this manifold, genetic and environmental contributions to plasticity can be deconvolved independently. Our perspective offers a new way of understanding the relationship between robustness and flexibility in complex systems, suggesting that projection and concentration of dimension can naturally align these forces as complementary rather than competing. Keywords canalisation; dimensionality reduction; phenotypic manifold; plasticity Subject Category Development DOI 10.15252/msb.202311835 | Received 21 June 2023 | Revised 26 September 2023 | Accepted 5 October 2023 Mol Syst Biol. (2023) e11835 Introduction Biological systems are remarkable for their ability to generate repro- ducible macroscopic dynamics from the complex interactions of large numbers of microscopic components. For example, in animals, the development of an entire organism from a single cell proceeds faithfully each generation even in the presence of environmental fluctuations and molecular noise. Such robustness arises at many spatial and temporal scales, for example, gene expression patterns give rise to reproducible cell differentiation (Rulands & Simons, 2017) neural and muscular activity generate locomotion (Stephens et al, 2008) and interactions between individuals of different species give rise to surprisingly reproducible ecological dynamics (Frentz et al, 2015). This robustness is called canalisation (Waddington, 1942), and dynamics that are canalised are said to be homeorhetic (Waddington, 1957; Chuang et al, 2019), terms introduced by CH Waddington in his series of foundational works on the development. Although robust, canalised processes nevertheless allow for important variability; stem cell populations generate diverse tissue types, behaviours respond to stimuli and environmental cues, and populations adapt to changing environments. However, the struc- ture of this macroscopic variability is often much more constrained than the intrinsic variability of the microscopic processes that underlie it. Although gene expression determines the dynamics of the cell cycle, variations in the cell cycle timing are largely insensi- tive to stochastic fluctuations in the levels of thousands of proteins. Thus, the ways in which these macroscopic phenotypes can vary are relatively “low-dimensional” compared with the ways in which the individual components can vary. A low-dimensional manifold is a locally Euclidean subspace of a higher-dimensional space. We call this higher-dimensional space the “ambient” space. For example, the surface of a sphere is a two- dimensional manifold embedded in a three-dimensional ambient space. In the ambient space, the points on the sphere could be iden- tified by their three coordinates (x, y, z; with the added condition that ffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffi x2 þ y2 þ z2ð Þp ¼ r). However, we can also describe coordi- nates on the surface of the sphere using a 2d coordinate system, such as their latitude ϕ and longitude γ. These coordinates are given by the nonlinear projection map ϕ ¼ sin�1 z r � � and λ ¼ atan2 y x � � . In this procedure, variations in elevation will be lost, as the projection map admits only a fixed r; in this sense, the projection reduces vari- ation. An example of a projection in a biological system is the gene regulatory dynamics that map the high-dimensional ambient space of gene expression onto the subspace of cell identities. In this work, we introduce the term concentration of dimension to differentiate this phenomenon from numerical and computation methods which extract low-dimensional representations, collectively referred to as dimensionality reduction techniques. We propose that projection of variations into a lower- dimensional space may provide a way for biological systems to be both robust and flexible. Robustness arises because most variations manifest as excitations onto relatively few phenotypic modes, whereas flexibility is permitted along these modes. As an example, Department of Biochemistry, University of Cambridge, Cambridge, UK *Corresponding author. Tel: +44 7484 872451; E-mail: dj333@cam.ac.uk ª 2023 The Authors. Published under the terms of the CC BY 4.0 license Molecular Systems Biology e11835 | 2023 1 of 16 https://orcid.org/0000-0002-3506-5440 https://orcid.org/0000-0002-3506-5440 https://orcid.org/0000-0002-3506-5440 https://orcid.org/0000-0002-4450-576X https://orcid.org/0000-0002-4450-576X https://orcid.org/0000-0002-4450-576X consider how facial diversity can be generated by the combination of relatively few eigenfaces (Turk & Pentland, 1991). Here, the eigenfaces are the modes in this system, and varying the weights of each mode can capture many diverse faces. However, even large variations in these weights are unlikely to produce nonface images. Low-dimensional representations of phenotypic diversity are ubiqui- tous in living systems. Variations in phenotypes that tend to a stable state during development have been successfully represented in low-dimensional phenotypic spaces called “morpho-spaces,” for example, morphological traits such as the arrangement of flowers in a plant (Prusinkiewicz et al, 2007), the shapes of finch beaks (Camp�as et al, 2010), of coiled sea shells (Raup, 1966), and even the shape of the influenza antigen (Smith et al, 2004). Furthermore, while the dimensionality of dynamic, time-varying and responsive phenotypes is more difficult to define rigorously (Bialek, 2022), it has been shown in some cases to be low-dimensional. Some recent examples include the crawling behaviour of C. elegans (Costa et al, 2023) and the neuronal dynamics that underlie it (Brennan & Proekt, 2019), the swimming behaviour of both eukaryotic (Tetrahy- mena thermophila; Jordan et al, 2013) and prokaryotic (Escherichia coli) single-celled organisms (Pleška et al, 2021), and the transcrip- tional trajectories of cells during fate determination (Huang et al, 2005). Concentration of dimension may be an intrinsic prop- erty of systems where robustness and control of the macroscopic observables are required (Eckmann & Tlusty, 2021) and may arise from constraints imposed by steady-state growth (Klumpp & Hwa, 2014; Kaneko & Furusawa, 2021), or trade-offs between phe- notypic archetypes suited for different tasks (Shoval et al, 2012). Here, we propose that the process of canalisation may be super- seded by a more general process of phenotypic concentration of dimension. That is, genetic networks evolve such that most varia- tions will be projected onto a low-dimensional manifold, and this in turn provides both canalisation and phenotypic plasticity. The con- centration of dimensions manifests itself as the covariation of traits in a high-dimensional multiphenotype space. This structure of covariation is called phenotypic integration (Pigliucci, 2003), and its geometric properties can be captured by a phenotypic manifold, that is, a lower-dimensional space that faithfully captures most of the variation observed in a higher-dimensional space. The process of finding such a manifold is called dimensionality reduction. The determination of this space and how it is measured can be essential to finding the dominant modes of phenotypic variation and thus to assessing the degree of canalisation. A recent example comes from the developmental programme of the wing of the fruit fly Drosophila melanogaster (Alba et al, 2021), which used landmark free morpho- metrics to uncover a dominant mode of variation that was not apparent from traditional landmark-based techniques. In this work, we use a custom-built automated imaging system to map the phenotypic space of growth and development of C. elegans. Using this system, we recorded 673 individual growth curves during the �70 h of their development from eggs to reproductive adults and manually recorded the timing of egg hatching and reproductive maturation in single animals. To construct as complete a manifold as possible, we sampled extant variations in different genetic and environmental contexts, including both natural genetic variation and single-gene mutants. Although the most unbiased approach to sampling genetic variation would be random mutagenesis, due to the immense sampling space, efficient sampling cannot be done. For this reason, we chose to sample “wild isolates” of C. elegans, that is, strains collected from nature and whose collections of genetic changes have been subjected to natural selection. For environmental diversity, we chose to alter the animal’s diet; C. elegans feed on bac- teria, and we chose a collection of bacteria that included the stan- dard laboratory bacteria E. coli as well as bacteria isolated from natural sites where C. elegans were also collected (Fr�ezal & F�elix, 2015; Samuel et al, 2016). Measurements were taken for three wild isolates of C. elegans, each fed one of four different bacterial diets, and two mutants of the laboratory strain N2. Using these data, we demonstrate that the space of developmen- tal trajectories can be captured by a low-dimensional manifold and show a correspondence between the directions of fluctuations in a fixed context and the directions in which populations will shift due to genetic or environmental changes. We find that the manifold obtained using nonlinear dimensionality reduction techniques cap- tures developmental variations in a way that allows one to neatly decompose the contributions of genetics and environment indepen- dently, with the major mode of variation corresponding to environ- mental shifts and the second mode corresponding to genetic changes. Results Characteristics that change over time are more difficult to quantify and compare than those that are static or near a steady state. Often, dynamic phenotypes such as these are measured and compared at a single fixed time, but in this case, proper alignment or synchronisa- tion can be challenging. Ideally, one would like to compare the time series of time-varying phenotypes and compare these directly. The growth of a multicellular organism is a good example of such a time-dependent phenotype. Ideally, one would like to compare growth curves aligned to an unambiguous starting time. To this end, we designed and constructed a low-cost parallel imag- ing platform capable of measuring C. elegans growth for 60 individual animals simultaneously over the course of their �70 h development at a temporal resolution of �0.001 Hz, resulting in a time series of �200 observations per animal. In addition to length and area mea- sured automatically, egg hatching and first egg laying by mature adults are manually recorded. These not only provide estimates of the ani- mal’s reproductive development but also provide a fixed time to which the time series can be aligned. Reproductive development is generally correlated with growth, but it does not necessarily have to be. Because animals grow from approximately 0.2 to 1 mm in length during their lifetime, any wide-field system capable of imaging many isolated worms simultaneously would lack the necessary resolution. To solve this, we developed a system that uses a fixed lens and USB camera that is programmed to move between 6-mm-diameter custom-made wells using an XY-plotting robot. Sample images from the time series are shown (Fig 1A; inset, 1–8), along with the associated time series and the best-fit logistic function of the form: l tð Þ ¼ lmax 1þ e�r t�Að Þ (1) as determined by nonlinear least squares fitting, giving three parameters for each curve (Fig 1E). We used this system to record 2 of 16 Molecular Systems Biology e11835 | 2023 � 2023 The Authors Molecular Systems Biology David J Jordan & Eric A Miska development in a collection of C. elegans isolated from the wild and fed on various bacteria, some of which were collected from sites where C. elegans were found. In total, we assayed five unique C. elegans genotypes, three wild isolates (Fig 1B) and two N2 mutants, each fed on four different bacterial diets (Fig 1C) and col- lected a total of 673 growth curves (Fig 1D), in addition to devel- opmental data, across these conditions. The developmental data consist of the duration of ex utero embryonic development and the duration of reproductive maturation. These are measured as the time from egg-laying to egg-hatching thatch and the time from hatching until the animal grows to reproductive age and lays its first egg tdev = (tlaid � thatch) as shown in Fig 1A. While these durations were single scalar quantities, the growth curves consisted of hundreds of points. Dimensionality reduction could have been performed directly on these high-dimensional vectors after appro- priate regularisation, but instead these growth curves were fit with the logistic function (equation 1) which performed similarly well and whose parameters are easier to interpret (see Fig EV1). The three parameters, which we call the maximum length lmax, the growth rate r and the shift A, determine the saturation value of the growth curve, the growth rate at the inflection point and the temporal shift of the inflection point, respectively. For each growth curve, we have an independent measurement of the reproductive development time tdev and of the animal’s length at this time l(tlaid) = ldev (Fig 2A, squares). We can use these addi- tional parameters to rescale each growth curve. First, we divide the length as a function of time by ldev, yielding: l̂ tð Þ ¼ lmax=ldev 1þ e�r t�Að Þ Then, rescaling time t → tdev l̂ t̂ � � ¼ lmax=ldev 1þ e�rtdev t̂�A=tdevð Þ (2) Thus, the normalisation simply rescales the fit parameters from [lmax,r,A] → [lmax/ldev,r�tdev,A/tdev] yielding unit-less quantities for the fit parameters. The fitting procedure on the normalised curves recovers the rescaled fitting parameters from the unnormalised data (see Fig EV2). Interestingly, we find that rescaling the curves to an independently measured parameter, the reproductive developmental duration, seems to collapse the growth curves, consistent with tem- poral scaling observed previously (Filina et al, 2022), for example (Fig 2B). To compare the variance of all the rescaled growth curves with the variance of the nonrescaled growth curves, the raw data were normalised to the mean duration of development and the mean and standard deviation of the resulting growth curves were compared, showing that the rescaled growth curves “collapse” and that their resulting standard deviation is smaller than for normalised raw data (Fig 2C). The growth curves of C. elegans of different genetic backgrounds and grown on different food sources differ in their maximum length, their growth rate and in the shift, as well in the durations of ex-utero development and reproductive development. However, we find that these parameters do not vary independently. For example, some recent work has shown a negative correlation between growth rate and developmental duration in C. elegans and we also observe this in our data. They suggest that this may be a mechanism to reduce the variability in adult size in Stojanovski et al (2022). If correla- tions between traits arise from a mechanism for control such as this, we would expect variation within populations to be correlated in the same way as variations between populations. Strikingly, we find that if parameters are correlated among individuals in one context, these correlations are more likely to also appear between popula- tions in different contexts (To see how such correlations may arise, refer to Box 1). For example, there is a strong negative correlation between the maximum length and the growth rate among individ- uals in all combinations of conditions. Likewise, the mean values of these parameters among populations grown in different conditions are also negatively correlated (Fig 3A and B). In contrast, the length and the duration of reproductive development are not strongly cor- related among individuals, and neither are they correlated between populations (Fig 3C and D). Computing the correlation coefficient (ρ) among individuals in a sample in fixed conditions and plotting it against the value computed between-population means among dif- ferent conditions shows that this seems to be true in general for dif- ferent pairs of parameters of C. elegans development (Figs 3E and EV2). The geometric structure of these correlations can be captured by a low-dimensional manifold in the ambient phenotypic space (Appendix Figs S1 and S2). Nonlinear principal component analysis (Scholz et al, 2005) was used to determine the shape of this mani- fold from the ambient space of rescaled logistic parameters with the developmental durations included (Fig 4A). The first two nonlinear principal components capture 93% of the variance. While this dimensionality reduction may seem modest, in fact, the true dimen- sionality of the growth curve space is higher, some of the dimen- sions having already been compressed by the logistic fit. Within this embedding of growth curves, animals tend to cluster in ϕ1 according to their food source (Fig 4B) and in ϕ2 according to their genetic background (Fig 4C; note only natural genetic variants are shown). This decomposition can be quantified with a linear regression model predicting either the nonlinear principal component embedding ϕ1;ϕ2ð Þ or the rescaled logistic fit parameters l̂; r̂; Â � � using the genetic or environmental conditions as independent variables. Lin- ear regression models of the form: y G; Eð Þ ¼ β0 þ∑iβ1;iGþ∑jβ2;jE þ ∈ were fit where ∈ , with y∈ ϕ1;ϕ2; l̂; r̂; Â � � , G and E are indicator variables which take the value of 1 or 0 depending of the strain i and food j in that condition, and ∈ are the residuals to be mini- mised. In Fig 4B and C, the distributions grouped by environment and genotype, respectively, have different means. We can quantify this using the F-statistic, which is the variance of the between- group means divided by the mean of the within-group variances. Using this, we can see that grouping the data by environment does in fact give distinct distributions in ϕ1 and grouping by genotype gives distinct distributions in ϕ2 (Fig 4E). However, a linear regres- sion of the parameters before dimensionality reduction does not give a clean separation between genotype and environment. The eigenfunctions of the nonlinear principal component analy- sis cannot be derived analytically but can be investigated by varying each component while keeping the others fixed. If ϕ2 is fixed ϕ2 ¼ 0ð Þ, the shapes of the resulting growth curves as ϕ1 is varied � 2023 The Authors Molecular Systems Biology e11835 | 2023 3 of 16 David J Jordan & Eric A Miska Molecular Systems Biology reflect the strong anticorrelation between the parameters lmax and r. For positive values of ϕ1, animals grow quickly during their maxi- mum growth phase but are ultimately shorter (Fig 4F, blue curve). In contrast, for negative ϕ1, animals grow more slowly at their peak, but are longer as adults (Fig 4F green curve). This may indicate a potential trade-off between the speed of growth during development and the ultimate size achieved by adult animals. In contrast, varia- tion along ϕ2 results in both slower growth and shorter animals as adults (Fig 4G, green curve) or both faster and longer (Fig 4G, blue curve). Interestingly, variation along ϕ1 does not appear to affect the duration of reproductive development as much as along ϕ2 as shown by the colour of the plotted points (Fig 4D). Points away Bacillus pumilus Escherichia coli Pseudomonas fragi Comamonas aquatica N 2 A B 1 C B 4856 D L238 M Y 10 P S 2025 A D B E 1 2 3 4 5 6 7 8 0 1,000 2,000 3,000 4,000 L e n g th ( µ m ) 1,200 1,000 800 600 400 200 Time from Hatching (min) xy - Robot Camera LED Panel 6 mm Wells 1 2 3 4 75 6 8 t0 thatch tlaidtdev raw data fit L e n g th ( µ m ) 1,200 1,000 800 600 400 200 Time from Hatching (min) 500 1,000 1,500 2,000 2,500 3,000 3,500 C lmax : 1390 r : 9.10E-3 A : 2564 lmax : 1260 r : 11.23E-3 A : 1873 lmax : 1030 r : 8.97E-3 A : 2278 Figure 1. Overview of the experimental apparatus, protocol and analysis. A A schematic of the imaging apparatus. Synchronised eggs are transferred into the mini wells at t0. Hatching thatch and egg-laying time tlaid are manually recorded and give development time tdev. Example images are shown for eight time points (A, lower), with the computed outline of the worm shown (yellow). B An example image of an adult C. elegans and an egg (scale bar: 200 μm). A phylogenetic tree pruned from the CaeNDR (formerly CeNDR) database full tree. This shows the relationship of some common wild isolates (CB4856 is the Hawaiian strain) to the three strains used in this work, marked in red. C Micro-graphs of the four bacteria used as food sources imaged at 100× magnification (scale bar: 10 μm). D The entire developmental time course of an animal from hatching through adulthood. The blue points show calculated length over time, with eight specific points highlighted (red plus) corresponding to each of the images in (A, lower). E Shows the length measurements and the computed logistic best-fit curves from three example time courses. The parameters of each logistic fit are shown to the left, with each colour corresponding to the relevant data and curve. Source data are available online for this figure. 4 of 16 Molecular Systems Biology e11835 | 2023 � 2023 The Authors Molecular Systems Biology David J Jordan & Eric A Miska from the boundary in the positive ϕ2 direction correspond to slower reproductive development. In addition to natural genetic variation and diet perturbations, the development of two C. elegans mutants was also measured. Because these mutants were not tested on the entire collection of diets, we chose not to include them in the analysis presented in Fig 4. The first mutant is sid-2, a deletion in a protein that is required for RNAi by feeding. This mutant has a mild developmental phenotype and hatches a few per cent longer than N2 and maintains this extra length through adulthood (Braukmann et al, 2021) with a faster mean development time (47.74 h). The second, C28H8.3, is a deletion in an uncharacterised gene that is orthologous to several human DExD/H-Box helicases, including DDX60. This deletion has a more severe developmental phenotype, increasing the mean devel- opment time by about 4.5 h ≈9%ð Þ compared with N2 (50.83 h) fed the same diet. The mild sid-2 mutant lies very close to wild-type N2 (Fig 5C) and its distribution largely overlaps the marginal distribu- tions for both its diet and the N2 genotype (Fig 5A and B, cyan curves). In contrast, the more severe mutant breaks both trends, clustering as an outlier to other wild-type N2 worms, and as an out- lier to other worm strains fed on E. coli (Fig 5A and B, purple curves). However, the two mutants do cluster together along the third nonlinear principal component (Fig 5D, red curve). If we con- sider the 2d manifold defined by the first two nonlinear principal components, then mutations do indeed seem to drive development away from this manifold in the positive direction (Fig 5D and E). Interestingly, natural genetic variation also drives development off the manifold but in the opposite direction (Fig 5D, green curve) and only in combination with diet perturbations (Fig 5E). It is also inter- esting to note that the sid-2 mutant is off-manifold in the same way as the C28 mutant despite it having a mild phenotype and not break- ing diet and genotype trends. Discussion Biological systems are remarkable in part because they are both incredibly robust and simultaneously flexible. Biological processes faithfully regenerate complex developmental programmes every generation, yet these same processes have given rise to an over- whelming diversity of complex matter. Canalisation and develop- mental plasticity appear to be opposing forces. One can imagine that these forces ebb and flow, acting in sequence. Cryptic varia- tion, that is, genetic variation that does not result in phenotypic dif- ferences under normal conditions, but can be revealed in certain circumstances, for example, by the chaperone HSP-90 (Rutherford & Lindquist, 1998), is a good example of this, and it has been shown that disruption of the chaperone at the molecular level acts as the switch from robustness to plasticity, even in complex pheno- types such as the morphology of an animal (Sieriebriennikov & Sommer, 2018). However, this may be only part of the story. Cana- lisation and flexibility may in fact be complementary properties of A B C L e n g th ( m m ) Time (min) Rescaled Time (1) R e s c a le d L e n g th ( 1 ) 0 2 4 × 103 0.2 0.4 0.6 0.8 1.0 1.2 0.3 0.6 0.9 1.2 N o rm a li s e d L e n g th ( 1 ) 0.3 0.6 0.9 1.2 Normalised Time (1) 0.3 0.6 0.9 1.2 0.3 0.6 0.9 1.2 raw (normalized) rescaled AB1 N2 PS2025 PS2025 E. coli E. coli B. pumilus C. aquatica Figure 2. Canalisation of developmentally rescaled growth curves. A Developmental time course of selected individuals where the square indicates tdev and ldev for each. B The data can be rescaled to plot length l as a fraction of development length l̂ ¼ l=ldev and time t as a fraction of development time t̂ ¼ t=tdev . Rescaling the data in this way changes the fit parameters of each logistic function. lmax; r; A½ � ! lmax=ldev; r � tdev; A=tdev½ �. C The rescaling of the curves in B appears to collapse the growth curves. Growth curves rescaled by their individual tdev and ldev show a smaller standard deviation (shaded region = 1 SD) for the rescaled data (black) than for the raw data (red) (which is normalised to the average tdev and ldev to facilitate plotting on the same scale). Solid lines show the mean growth curve. Source data are available online for this figure. � 2023 The Authors Molecular Systems Biology e11835 | 2023 5 of 16 David J Jordan & Eric A Miska Molecular Systems Biology systems that are organised to generate low-dimensional phenotypic manifolds. Concentration of dimensionality is an intrinsic property of complex high-dimensional systems, even if only trivially because of the Johnson–Lindenstrauss lemma (Johnson & Linden- strauss, 1984). However, the emergence of low dimensions in bio- logical systems is often orders of magnitude smaller than random Box 1. Illustration of relationships between genetic architecture, dimensionality and the correlation structure of traits A A simple genetic circuit where green ϕ1 and red ϕ2 phenotypes are controlled by a single transcription factor χ1. The imaginary gene products produce red colour and green colour, which combines to produce yellow. B Different populations (P1–P3) have different mean expression of χ1 due to either genetic or environmental changes, changing the brightness of the yel- low colour. Fluctuations in χ1 around these mean values lead to correlated noise within populations. C The mean values of ϕ1 and ϕ2 are also correlated between populations (right panel) in the same way. This leads to a one-dimensional phenotypic manifold (B, grey line), which is consistent with the single knob χ1. D Here, there are two independent transcription factors χ1; χ2ð Þ that control ϕ1 and ϕ2. χ1 controls both outputs in a correlated manner similarly to (A); however, the second factor χ2 activates ϕ2 but represses ϕ1. E In each population, P1–P3 E, χ1 sets the mean value of both ϕ1 and ϕ2 in a correlated manner, changing the overall brightness of the yellow colour. F Fluctuations in χ2 introduce anticorrelated noise within populations (left panel), shifting the phenotype to either green or red. The between- population correlation is dominated by the changes in χ1, resulting in correlated changes in both ϕ1 and ϕ2 (right panel). The within-population antic- orrelation expands the dimensionality of the manifold (E, grey plane). Dimensionality can also be expanded by uncorrelated noise both within and between populations (not illustrated). Within Population P1 P2 P3 P1 P2 P3 P1 P2 P3 P1 P2 P3 Within Population Between Population Between Population A B C D E F χ1 χ 1 χ 2 zsample ) z s am p le ) zpop ) z p o p ) zsample ) z s am p le ) zpop ) z p o p ) χ 1 χ2 χ 1 6 of 16 Molecular Systems Biology e11835 | 2023 � 2023 The Authors Molecular Systems Biology David J Jordan & Eric A Miska projection would predict (Eckmann & Tlusty, 2021). At the heart of concentration of dimensionality lies projection. We propose that it is natural to view robustness and plasticity as analogous to projec- tion of variation onto, or orthogonal to, a low-dimensional mani- fold, respectively. Given a distribution of variations that arise from environmental heterogeneity, stochastic fluctuations and genetic mutations, whether a given phenotype or set of phenotypes is buff- ered or responsive will depend on the projection function from high-dimensional chemical space to the low-dimensional phenotype space. In cases where a phenotype is highly buffered to some set of variations, almost all those variations will be orthogonal to the phe- notypic manifold, and variations for which the phenotype is plastic will have large projections onto the manifold (Appendix Fig S3). In this scenario, one might envision the process of evolution as one of shaping the projection function, and thus the resulting phenotypic manifold, given the statistics of distribution of variations. Adapta- tion can then be viewed as the process of learning the optimal pro- jection over the prior distribution of fluctuations, that is, an environment to phenotype rather than genotype to phenotype map (Xue et al, 2019), although the environment to phenotype map is certainly a product of the architecture of the gene regulatory net- works of the organism. It is tempting to look for a genetic basis for the emergence of low-dimensional phenotypes, especially in C. elegans, which has many available genetic tools. However, we believe that it is unlikely that a single-gene or a genetic module will underlie this process. Rather, it will likely be a property of the entire network of molecular interactions that underlie complex phenotypes. Nevertheless, there ρ (within) ρ ( be tw ee n) z s a m p le (r ) (1 ) z sample (l m a x ) (1) z p o p (r ) (1 ) z pop (l m a x ) (1) z sample (t dev ) (1) z s a m p le (l m a x ) (1 ) z pop (t dev ) (1) z p o p (l m a x ) (1 ) Between Population Between Population Within Population Within Population A C B D E Figure 3. Within-population correlations predict between-population correlations. A An example showing a pair of developmental parameters l and r that are correlated within each populations measured. The scatter plot shows z score of these parameters with respect to each conditions mean and variance, along with the best-fit linear regression for that condition (colours indicate the food source). There is a significant within-population negative correlation of these parameters in each condition. B The z score is also calculated with respect to the mean and variance of all conditions taken together. These are then grouped by condition, and the mean and standard deviation of the z scores of both parameters are plotted (colours indicate the food source). The means are also correlated between conditions, as indicated by the confidence interval (green dashed lines) of the slope of the linear regression (red line). (N = {64, 81, 22, 53, 79, 75, 13, 23, 64, 37, 24, 52, 64, 22} biological replicates). C For other pairs of traits, there is no significant within-population correlation between traits, for example, between the development time tdev and l. D Similarly, the between-population means are also not significantly correlated for this pair of traits. The means are plotted with their standard deviations. (N = {64, 81, 22, 53, 79, 75, 13, 23, 64, 37, 24, 52, 64, 22} biological replicates). E A summary of the within- and between-population correlation coefficient for all pairs or developmental parameters, the mean correlation coefficient for each of the within condition groups is plotted with error bars showing the standard deviation. For the correlation among the mean values, there is only a single value; thus, there are no vertical error bars. The red dashed line indicates equivalence, showing that the within-population correlations are largely but not exclusively predictive of the between-population correlations. (N = 673 biological replicates for each trait pair). Source data are available online for this figure. � 2023 The Authors Molecular Systems Biology e11835 | 2023 7 of 16 David J Jordan & Eric A Miska Molecular Systems Biology may be important connections between the structure of phenotypic manifolds and evolutionary dynamics. Movement along, rather than away from such manifolds, has been proposed to constitute a path of “least resistance” for genetic changes that could fix phenotypic variations. While some evidence supports this hypothesis, for exam- ple, a study of the integration of phenotypic and life-history traits in the flowering plant Arabidopsis thaliana (Pigliucci & Hayden, 2001), other evidence from the morphospace of the greenfinch Carduelis chloris suggests that within-population correlations may not predict between-population correlations (Meril€a & Björklund, 1999). This discrepancy may be due in part to the way traits are quantified, how the dimensionality reduction is performed and to whether the populations included in the analysis sample natural genetic varia- tion, genetic mutations, environmental variations or combinations of all three. While the question of when and under what conditions the “directions” of phenotypic plasticity are predictive of the direc- tions of subsequent genetic evolution remains open, phenotypic plasticity in canalised traits has been shown to be associated with rapid evolutionary diversification. An example comes from the poly- phenism in the feeding structures of nematodes, in which the acqui- sition of mouth-form plasticity is associated with an increase in evolutionary rates. Interestingly, even if only one of the two alterna- tive forms is fixed subsequently, the underlying genetic architecture seems to maintain an expanded phenotypic manifold that facilitates p d f( ) E. coli C. aquatica B. pumilus P. fragi pdf( ) A C B D l ^ (1) A^ ( 1 ) tdev (h) Time (min) L e n g th ( μ m ) Time (min) L e n g th ( μ m ) GFE Genotype Environment F -s ta ti s ti c 80 40 120 160 0 l ^ A ^ r^ t dev r^ (1) Figure 4. 8 of 16 Molecular Systems Biology e11835 | 2023 � 2023 The Authors Molecular Systems Biology David J Jordan & Eric A Miska future exploration (Susoy et al, 2015). Although traditional forward and reverse genetics may not be the ideal approach, artificial evolu- tion for expanded phenotypic manifolds seems promising, especially with an automated system that can map individuals to the pheno- typic manifold in real time and use this as a selection criterion. In physics, statistical mechanics provides tools to connect micro- scopic and macroscopic dynamics and to describe the behaviour of the macroscopic observables that arise from systems with large numbers of identical interacting components. Examples include the Navier–Stokes equation for fluid flow or the Fokker-Planck equation for diffusion. The coarse graining of many interacting degrees of freedom into a few dominant modes can be formulated precisely in terms of projection operators (Nakajima, 1958; Zwanzig, 1961; Mori, 1965). These projections often make use of time scale separa- tions. For example, the random motion of a Brownian particle results from its many collisions with surrounding molecules, but these collisions occur very fast and can thus be treated as white noise. For near equilibrium systems, the rigorous relation between the fluctuations arising from the many unobserved degrees of free- dom and the evolution of the system’s macroscopic observables are known collectively as fluctuation–dissipation relations (Callen & Welton, 1951; Kubo, 1966). While biological systems are character- istically far from equilibrium, similar relations have been observed between phenotypic variability and evolutionary response (Sato et al, 2003; Tang & Kaneko, 2021). In fact, even in systems far from equilibrium, relations of this sort can arise similarly when there is a separation of both observed and unobserved degrees of freedom and of timescales (Jung & Schmid, 2021). While the origin of low- dimensional dynamics and of fluctuation–dissipation-like relations in biological systems has not been completely solved, some intrigu- ing mechanisms have recently been proposed. For example, Furu- sawa & Kaneko (2018) propose that the evolution of robustness may be the origin of fluctuation–dissipation relations and of the emer- gence of low dimensionality in biological systems. They propose that if the same evolutionary forces that increase robustness to noise also tend to increase robustness to genetic perturbations, this could account for both phenomena (Furusawa & Kaneko, 2015; Kaneko & Furusawa, 2021). Recently, Murugan and colleagues have presented a model describing how mutational perturbations may be constrained by global epistasis to excite only a few slow or soft modes with examples from protein elasticity and gene regulatory dynamics (Husain & Murugan, 2020). Their work shows that such a ◀ Figure 4. Dimensionality reduction of developmental space using nonlinear principal component analysis. A The z scores of the three rescaled logistic fit parameters are shown in a 3d scatterplot (blue dots). These points lie close to a curved 2d manifold (mesh grid), which was found by performing nonlinear principal component analysis (NLPCA). The flattened manifold is shown in (D). B On this manifold, ϕ1 seems to separate the environmental conditions as shown in by the marginal distribution over bacterial food sources. C In contrast, the marginal distributions over the C. elegans strains shows separation in ϕ2 . Marginal distributions were computed with a kernel density estimator implemented with the ksdensity function in MATLAB. D The principal component weights for first two nonlinear principal components ϕ1 and ϕ2 for each C. elegans shown as a scatterplot. The colour of each point indicates that worm’s development time, tdev, as indicated by the colour bar. The mean ϕ1 and ϕ2 for each condition are shown as a combination of symbols (C. elegans strain) and colour (Bacterial food source). E This separation in ϕ1 and ϕ2 can be quantified by computing the F-statistic for a linear regression model taking genotype and environment as regressors. ϕ1 regresses primarily on environment and ϕ2 on genotype. Interestingly, both genotype and environment together are generally required to explain the variance in the three logistic fit parameters and the developmental time alone. F To determine the effect of varying ϕ1 on the shape of the growth curve, ϕ2 was fixed to 0 and ϕ1 was varied through a range, as indicated by the colour bar, with coordinates being converted back from the unit-less quantities. In this case, there appears to be a trade-off between fast growth (blue curves) and larger adult size (green curves). G Similarly, when ϕ1 is fixed and ϕ2 varied, the resulting growth curves change from slower growth and smaller adult size (green) to faster growth and larger adult size (blue). Source data are available online for this figure. ▸Figure 5. Analysis of genetic mutations in relation to natural genetic variation and diet perturbations. A As in Fig 4B, marginal distributions over each bacterial diet, however, this panel includes the two mutant C. elegans strains, sid-2 and C28H8.3, which were only tested on E. coli. Distributions are estimated using a kernel density estimator. The sid-2 mutant has a mild developmental phenotype and its distribution (cyan curve) is similar to the other E. coli fed animals. However, the more severe mutant, C28H8.3 (magenta curve), breaks this pattern and does not cluster together with the other E. coli fed animals. B Likewise, as in Fig 4C, marginal distributions estimated with a kernel density estimator for each genotype, including the two mutants. The sid-2 mutant (cyan curve) is similar to the other strain from which it was derived from N2 (grey triangles). However, the C28H8.3 mutant (magenta curve) is distinct from N2 and from the other nonmutant strains in general (grey curves). C Like Fig 4D, the nonlinear principal component embedding projected onto the first two nonlinear components ϕ1 and ϕ2 and including the mean for the two mutant strains, sid-2 (cyan circle) and C28H8.3, (magenta triangle). The sid-2 mutant is closer than C28H8.3 to N2, from which both were derived (blue triangle), and they fall on opposite sides on N2 in this plane. D If we group the genotypes with wild-type, including N2 (blue), natural variation, including AB1 and PS2025 (green), and mutants, including sid-2 and C28H8.3 (red), these cluster along the third nonlinear principal component ϕ3 . If we define the manifold as the plane spanned by ϕ1 and ϕ2 (blue line), then mutation drives development off the manifold. Natural genetic variation also does this, but these have opposite effects, with mutation increasing ϕ3 and natural variation decreasing it. E A scatter plot of the embedding projected onto the first and third nonlinear principal components. The mean embedding is shown for each combination of diet and genotype. The two mutants lie above the manifold (blue line), and a few of the wild isolates lie below, but this only occurs in combination with a diet perturbation (yellow square, orange and yellow diamonds). The symbols are the same as in (C). Source data are available online for this figure. � 2023 The Authors Molecular Systems Biology e11835 | 2023 9 of 16 David J Jordan & Eric A Miska Molecular Systems Biology relationship between mutational induced and physically induced deformations is expected mathematically for protein elasticity. The existence of such slow modes may also be related to the seemingly universal emergence of stiff and sloppy modes from parameter space compression (Gutenkunst et al, 2007; Machta et al, 2013). In C. elegans, development has been shown to be controlled by a massive gene expression oscillator that is comprised of ≈3; 700 genes (Hendriks et al, 2014; Meeuse et al, 2020). The existence of such an oscillator evokes a direct analogy to projection operator techniques as this network only has a few excitable oscillatory modes similar to a Fourier transform or to the cosine transform employed in Box 2. Recently, the components of a gene regulatory network that comprises a central clock to control this oscillator have been identified (Meeuse et al, 2023). Interestingly, Matsushita & Kaneko (2020) have shown how epigenetic modulation of oscilla- tions can give rise to homeorhesis. As a follow-up to this work, we would like to assess how mutations in these core components affect how fluctuations in gene expression are projected onto the main oscillatory modes that control moulting, and how this in turn mani- fests on the developmental manifold of C. elegans. p d f( ) E. coli C. aquatica B. pumilus P. fragi E. coli (sid-2) E. coli (C28) 0.3 –0.5 0 0.5 0 –0.2 –0.1 0.2 0.1 N2 AB1 PS2025 sid-2 C28 tdev (h)AB1 N2 PS2025 C28 sid-2 pdf( ) –0.5 0 0.5 0 –0.2 –0.1 0.2 0.1 0 –0.2 –0.1 0.2 0.1 0510 pdf( ) wild-type natural variation genetic mutants E A CB D Figure 5. 10 of 16 Molecular Systems Biology e11835 | 2023 � 2023 The Authors Molecular Systems Biology David J Jordan & Eric A Miska In this work, we have focussed on variability resulting from com- binations of dietary and genetic perturbations, and we have done so in only fixed environments. In the future, it will be interesting to perform experiments in which developmental trajectories are perturbed by some impulse or step function and to observe the resulting relaxation. C. elegans development is highly temperature dependent, and it would be interesting to first assay developmental trajectories at different temperatures and then to perturb develop- ment using various temperature shift protocols. In addition, if we could find perturbations that tend to generate displacement in spe- cific directions on the phenotypic manifold, we could test quantita- tively for fluctuation response type relationships. The projection operator perspective naturally leads one to consider how organisms might learn their particular projection function. Evolution by mutation and selection will surely play a large part in this process, especially when an organism must adapt to changes to the prior dis- tribution of variations, but there is no reason, in principle, that this operator cannot also be shaped by within generation “learning” and other nongenetic and epigenetic feedback mechanisms. Biological systems are remarkably robust, and yet the same mechanisms that underlie this robustness have also generated the incredible diversity of life on Earth. In this work, we attempt to rec- oncile these seemingly opposing forces by studying the structure of variability in the development of an animal in the contexts of differ- ent genetic backgrounds and environments. We have used high- resolution imaging data of the growth of the nematode C. elegans in environmental and genetic contexts to map the variability of devel- opment. We find that traits that are correlated within a particular Box 2. Illustration of robustness and flexibility resulting from projection onto a low-dimensional manifold. A In this example, 64 × 64 pixel leaf images represent a high-dimensional environment to be sensed. The 2d discrete cosine transform (DCT) is then used as a projection operator and implemented by the dct2 function in MATLAB. Past evolutionary history has optimised this projection function to recognise the top left leaf in panel (A), and for this, the top 35 modes capture 90% of the energy. The retained modes are shown in (B, inset). In (A), we can see a variety of environmental inputs and their reconstructions from their low-dimensional representations; the first 10 mode weightings are shown as colour band below each image in (A). Although this projection operator evolved for the leaf in the top left, it reconstructs the different leaf images reasonably well. High-frequency variation around the original leaf shape is buffered (A, blue box), resulting in canalisation of the representa- tion such fluctuations in the environmental input. Furthermore, even distinct environmental inputs such as the flowers (A, orange box) are recon- structed fairly well, demonstrating that this projection operator is robust, even for divergent inputs. B This shows the embedding onto the first two DCT components. From this we can visualise both the robustness and the plasticity associated with this projection operator. The leaves cluster separately from the flowers, revealing flexibility, and the two leaves that only differ in high-frequency modes cluster together, showing robustness, with the variation between reconstructed representations smaller than that of inputs. Here, the first component is correlated with overall size and the second seems to separate leaves from flowers. It is interesting to note that the projection operator was not opti- mised in any way to distinguish leaves from flowers, and in fact, the eigenfunctions used are the generic ones from the DCT. Over evolutionary time, organisms could not only optimise which modes are retained to better capture the prior distribution over contexts but also could optimise the shape of the eigenfunctions themselves. This panel depicts an environmental sensing mechanism, but equally important, though not shown here, would be the phenotypic execution mechanism. The structure of the network that maps the low-dimensional representation of the environment back into a high-dimensional phenotype will also be an important source of variability and affect both robustness and plasticity. A B –1.5 –1 –0.5 0 0.5 1 1.5 2 DCT Component 1 –2 –1.5 –1 –0.5 0 0.5 1 1.5 2 D C T C o m p o n en t 2 � 2023 The Authors Molecular Systems Biology e11835 | 2023 11 of 16 David J Jordan & Eric A Miska Molecular Systems Biology context predict whether the mean values of those traits will be cor- related among different contexts. Correlation between traits indi- cates that the true dimensionality of the system may be lower, and we find a parsimonious low-dimensional representation of the vari- ability, in which the contributions of genetic variation on the pheno- type are separable from those of environmental variation by a simple linear model. We present a framework in which the emer- gence of low dimensionality provides a basis for both robustness and plasticity. Concentration of dimensionality seems to be an intrinsic property of the chemical and molecular networks that gen- erate living systems and of complex dynamical systems in general. While beyond the scope of this work, we hope that the connection between projection operators in physical systems, which explain how reproducible coarse-grained dynamics arise from the stochastic influences of innumerable microscopic components, may inform the theory in biology of how reproducible developmental dynamics arise from the interactions of similarly large numbers of biomolecules. Materials and Methods Reagents and Tools table Reagent/Resource Reference or Source Identifier or Catalogue Number Experimental Models C. elegans N2 CGC https://cgc.umn.edu N2 C. elegans AB1 CGC https://cgc.umn.edu AB1 C. elegans PS2025 CGC https://cgc.umn.edu PS2025 C. elegans sid2 (mj465) III Fabian Braukmann EM Lab SX3237 C. elegans c28h8.3 (mj649) III Alexandra Dallaire EM Lab SX3614 Comamonas aquatica DA1877 CGC https://cgc.umn.edu DA1877 Escherichia coli HB101 CGC https://cgc.umn.edu HB101 Bacillus pumilus Marie-Anne Felix Lab JUb197-19 Pseudomonas fragi Marie-Anne Felix Lab JUb239-19 Chemicals, Enzymes and other reagents Gelzan CM Gelrite Merck/Sigma Life Sciences Cat # G1910-250G, Lot # SLBL5174V KH2PO4 Merck/Sigma Aldrich Cat # P9791 NaCl Merck/Sigma Aldrich Cat # S3014 CaCl2 Honeywell/Fluka Cat # 223506 MgCl2 Merck/Sigma Aldrich Cat # M8266 Software MATLAB R2018b, R2022b http://www.mathworks.com License # 666637 Other 40 mm lens tube Thor Labs CML40 50 mm f/1.4 lens Navitar NMV-50 M1 Eleksdraw Robot Eleksmaker https://www.tomtop.com/p-e2350.html Item#: E2350 Flea3 3.2 MP monochrome camera (discontinued) Teledyne FLIR FL3-U3-32S2M-CS (discontinued) Replacement for Flea3, Blackfly S USB3 Camera Teledyne FLIR BFS-U3-31S4M-C A4 LED Light Panel www.amazon.com Petroff-Hausser Counting Chamber Hausser Scientific Item# 3900 LabJack U3 USB DAQ LabJack U3-LV Linear Thermistor Omega Engineering Part # 44204 Peltier Heater Custom Thermoelectric Part # 03111-5 M31-24CQ NanoDrop One Thermo Scientific Cat # ND-ONE-W 12 of 16 Molecular Systems Biology e11835 | 2023 � 2023 The Authors Molecular Systems Biology David J Jordan & Eric A Miska https://cgc.umn.edu https://cgc.umn.edu https://cgc.umn.edu https://cgc.umn.edu https://cgc.umn.edu http://www.mathworks.com https://www.tomtop.com/p-e2350.html http://www.amazon.com Methods and Protocols Imaging hardware The imaging hardware consists of a single 3.2 MP monochrome camera (Flea3, Teledyne FLIR) mounted to the arm of an XY- plotting robot (Eleksdraw, Eleksmaker) that moves the arm using two stepper motors controlled by a GRBL stepper motor controller that interprets the G code sent via a USB serial connection using the MATLAB (R2022b, Mathworks) fprintf command. Returning to the same position was accurate to with a few hundred microns, and wells were recentred periodically by fitting a circle to an intensity-threshold image of the well and zeroing the offset. Illumi- nation was provided from above by an LED light panel. To maxi- mise contrast, oblique illumination was blocked by a sheet of black acrylic in which an array of square holes was laser cut to allow direct illumination to pass though. A 50 mm f/1.4 lens (NMV-50M1, Navitar) was attached to the camera through a 40 mm lens tube (CML40, Thorlabs). In this system, one pixel in the image corresponded to 2.67 μm, giving an effective magnifica- tion of 0.93×. Detailed instructions and a supplier parts list are available on request. Multiwell plates Multiwell plates were made by gently placing a pre-cut acrylic form into a standard 30 mm Falcon petri dish filled with 5 ml of liquefied NGM-Gelrite medium. The multiwell acrylic forms were 24 mm square, with a 2 × 2 grid of 6 mm diameter circular wells 4 cm from the edge and 10 cm centre to centre, and were cut from 3-mm-thick black acrylic using an LS 6090 PRO Laser Cutter (HPC Laser Ltd). After placing the multiwell plate into the molten solid media so that it rested on the surface, the plate was allowed to set and to dry for 1 h covered at 20°C and seeded with 2 μl of bacteria corresponding to approximately 18 million cells, as measured using a Petroff- Hausser Counting Chamber (Hausser Scientific). Temperature control Plates were kept in an insulated, temperature-controlled box during imaging, which itself was in a temperature-controlled room set to 20°C. The actual average temperature in the room was 19.7°C. Tem- peratures were measured with a custom thermometer; the signal from a linear thermistor (Omega Engineering) was difference ampli- fied against a known voltage corresponding to 20°C. The amplified signal (Gain = 2) was recorded in MATLAB (Mathworks) from the analogue input of a DAQ (Labjack). This was fed into a proportional–integral–differential (PID) control script whose output was a voltage that controlled a push-pull current amplifier driving a Peltier effect element (Custom Thermoelectric). Fans were used to distribute air from the heat sinks on each face of the Peltier element either into the enclosure or as exhaust. Temperature was main- tained to within 70 mK of the set point (20°C). Image processing Images were captured directly into MATLAB using the built-in video input object class. Moving objects were extracted from each image using background subtraction, to generate a region of interest with the largest amount of detected motion, as measured by the largest pixel difference. Within this region of interest, objects were detected by contrast with the background by applying a threshold to a Laplacian of Gaussian filtered image. Filtering was performed using the MATLAB imfilter function with the fspecial function with filter size 15 × 15 and filter standard deviation of 1.5. Connected compo- nents were extracted from this thresholded image, and the locations and properties of connected components were recorded from the resulting black and white image. Area (in pixels) and centroid loca- tion were calculated using the MATLAB function regionprops, and length was computed using the MATLAB function bwmorph to extract the skeleton and the function bwgeodesic to compute the length. These properties were used to classify each blob as a worm or not, based on the output of a pretrained support vector machine. All images were also saved for later inspection, which was used to determine the egg-hatching and egg-laying times manually. All code used for analysis and data processing is available on GitHub at https://github.com/davex0r/DevelopmentalManifold, including data from processed images (raw images are available on request). Nematode culture and strains Caenorhabditis elegans was grown on NGM agar plates and fed E. coli HB101 for standard maintenance at 20°C. Bristol N2 was used as wild-type strain (Brenner, 1974). In addition to N2, two wild isolate strains were used, AB1 isolated in Adelaide, Australia and PS2025 isolated in Pasadena, California. In addition, two single-gene mutants of N2 were used, sid-2 (mj465), a mutant that is not compe- tent for RNAi by feeding, and c28h8.3 (mj649) a catalytic mutant of a putative helicase. Bacterial culture and strains In addition to E. coli HB101, three other bacteria were used as food sources. Comamonas aquatica DA1877 (Avery & Shtonda, 2003) was used as it has been previously shown to increase the rate of C. elegans development by providing supplemental vitamin B12 (MacNeil et al, 2013; Watson et al, 2014). We also chose Bacillus pumilus and Pseudomonas fragi from the wild bacteria collection (Fr�ezal & F�elix, 2015) as we had previously observed large develop- mental differences in these food sources. Optical density (OD) of bacterial cultures was measured on a NanoDrop spectrophotometer (Thermo Scientific). Synchronisation by coordinated egg laying To generate populations of synchronised animals without bleaching and starvation, young egg-laying adults (50–75-h posthatching) were gently picked with a platinum wire to a fresh NGM plate seeded with the appropriate bacteria and allowed to lay eggs for a fixed duration, after which the adults were removed from the plate and the eggs were collected. The egg-laying rate of animals at this stage is ≈6 eggs/animal/h. Synchronisation could be tightened by shortening the duration of egg laying, and the number of synchro- nised eggs could be increased by using more egg-laying adults. Nematode growth media—Gelrite NGM-Gelrite plates were made by replacing the Agar in normal NGM recipe with Gelzan Gellan Gum (Sigma Life Sciences), a poly- mer derived from algae. The recipe is given in Table 1. In addition, peptone and cholesterol were omitted to prevent bacterial growth on the plate, so that the only available bacterial food was that which was initially inoculated. In this way, there were areas on the plate both where food was present and absent (Appendix Fig S5). � 2023 The Authors Molecular Systems Biology e11835 | 2023 13 of 16 David J Jordan & Eric A Miska Molecular Systems Biology https://github.com/davex0r/DevelopmentalManifold Additive salts were prepared in 1 M stock solutions, and the KH2PO4 stock solution was adjusted to pH 6. Phylogenetic tree of wild isolates The phylogenetic tree of C. elegans wild isolates was generated from the full CaeNDR (formally CeNDR) phylogenetic tree (Cook et al, 2017) which was hard-filtered by isotype using the prune func- tion in MATLAB. Logistic fits Logistic fits were calculated using the MATLAB implementation of the Levenberg Marquardt (Marquardt, 1963) nonlinear least squares algorithm within the lsqcurvefit package. Logistic fits were performed on the raw data as well as the rescaled data. The rescaled fits were nearly identical to the raw-fit parameters, which were transformed according to equation (2) (see Fig EV1). Nonlinear PCA Nonlinear principal component analysis attempts to find an optimal auto encoder that can recreate the input data after passing it through a bottleneck layer with fewer components than the input. The num- ber of components in the bottleneck layer corresponds to the num- ber of desired principal components (Appendix Fig S4). The NLPCA implementation we use is from NLPCA toolbox for MATLAB (Scholz et al, 2005) and uses a multilayer perceptron architecture with a hyperbolic tangent activation function in the hidden layers. Probability density function (PDF) estimation A probability density function (PDF) is a nonnegative Lebesgue inte- grable function fx, which satisfies the equation: Pr a