# Patterns of eukaryotic diversity from surface to deep ocean sediments

For diversity analyzes a and b, we used functions from the vegan R package v2.5-3 (

*77*), unless otherwise stated. Since the size fractionation of pelagic samples from the*Tara*Ocean datasets have a large effect on diversity measures a and b, we compared eukaryotic diversity patterns in pelagic and benthic domains considering only nano-size fractions (3–20 μm) and pico- (0.2 to 5 μm) the richest pelagic samples. Curves of eukaryotic ASV accumulation versus sampling effort were calculated with the*speculum*work with the “random” method. We calculated Shannon diversity for each sample and compared the distribution of sample diversity across pelagic and aphotic euphotics with strictly benthic diversity using the*stat_compare_means*package function ggpubr R v0.2.5 (*78*) with default settings. For b-diversity analysis, we removed samples with less than 1000 reads and discarded ASVs represented by less than 100 reads in the dataset. We then normalized the ASV-to-sample matrix with the cumulative sum scaling (CSS) method (*79*) and calculated a Bray-Curtis dissimilarity matrix between pairs of samples. The dissimilarity matrix was used to perform non-metric multidimensional scaling (NMDS) ordination on two axes. The sample depth and absolute latitude variables were fitted to the NMDS as smooth surfaces using the*computersurf*a function. The dissimilarity matrix was also used as input to the*Adonis*function for PERMANOVA models testing differences in eukaryotic compositional structure between domains (pelagic euphotic, pelagic aphotic, and sediment) and along an absolute latitude gradient (nested within the domain type and restricting permutations within the domain type with the “strata” option, using 999 permutations. Finally, we measured the dispersion of diversity b within each domain using the*betadisper*function and compared the distribution of distances to cluster centroids between domains using the*stat_compare_means*function of the ggpubr R package.For the a and b diversity analyzes of deep-sea benthic communities, we focused only on ocean samples, i.e. we did not consider samples from the Mediterranean Sea nor those from the Gulf of California, to avoid potential effects on coastal ecosystems. . We calculated normalized ASV richness per sample for all benthic communities and for selected benthic groups (nematodes, foraminifera, flatworms, polychaetes, molluscs and ciliates) by rarefiing each benthic sample at the lowest remaining sequencing depth (after removing planktonic ASVs and after focusing on a given benthic taxonomic group). We used generalized additive models (GAMs) to investigate the possible nonlinear variation of Shannon richness and diversity along gradients of latitude, primary production, and export of POC from the surface and reaching the seabed. using the

*game*function of the mgcv R package (https://cran.r-project.org/web/packages/mgcv/) and the smoothing parameter set to 3. For the b-diversity analyses, we used an approach similar to that detailed above (normalized CSS dissimilarity matrix and Bray-Curtis), although here we only have not filter rare ASVs. We used the*pcoa*ape R package function (*80*) to perform a principal coordinate analysis of the Bray-Curtis dissimilarity matrix and calculate the structural variation explained by the first two axes of the ordination. We used the*computersurf*and*envfit*functions to respectively adjust the absolute latitude and a selection of environmental variables (seafloor variables: salinity, temperature, silicate, nitrate, dissolved oxygen, POC reaching the seafloor, and pelagic variables that relate the surface to the DOS, namely the primary productivity and POC export from the surface) to ordination. PERMANOVA models were used to test differences in benthic composition between postulated abyssal biogeographic provinces (*42*) and along an absolute latitude gradient using 999 permutations. Next, we used the selected environmental variables in a stepwise model construction for constrained ordination (distance-based redundancy analysis) using the*computer2step*forward function and using 999 permutations, to explain the observed benthic community structure. We calculated the proportion of ASV shared between pairs of benthic samples to investigate the decrease in the proportion of ASV shared with increasing spatial distance (calculated from GPS coordinates, see script “companionFunctions .R”). We also calculated the main parameters of distance decay as in (*81*), i.e. the initial similarities (Sørensen similarities between pairs of samples less than one kilometer apart), the slope of the distance-decay relationship (here in the form log-linear regression), and halving distances, i.e., the spatial distance after which initial similarities are halved. We calculated these parameters on a mean Sørensen dissimilarity matrix calculated over 10 rarefaction draws at the minimum possible sequencing depth (the sequencing depth of the sample with the smallest number of reads) and considering the complete benthic community or focusing only on selected benthic groups. , for example, polychaetes, molluscs or flatworms (size classes of macrofauna); nematodes or foraminifera (meiofauna); and amoebae or ciliates (microbes). We used the*chimney*function to test the correlation between spatial distance and community dissimilarities using 999 permutations. We used the*betadisper*function to calculate the dispersion b of benthic communities at an increasing spatial sampling scale (between repeated samples of a sediment core, between cores from the same deployment, between deployments at a given station and within an abyssal basin given). Finally, we pooled all samples at the station scale and fitted the neutral community assembly models as in (*82*) to determine whether the distribution of ASVs in the pelagic and benthic realms is less or more geographically widespread than predicted by neutral models.We compared the inferred functional attributes (size and trophic mode) of sinking pelagic ASVs with their non-sinking counterparts to determine whether these traits could explain their transfer to DOS. We also explored the variation of functional groups and size classes of planktonic communities sinking in sediments along the latitude gradient. We investigated the spatial pattern of planktonic abundance on the seafloor by fitting a GAM to the proportion of planktonic DNA reads in sediment as a function of latitude (with a smoothing parameter set to 3). Next, we aggregated planktonic DNA reads from all station-scale sediment samples and used random forest models to predict POC export from the surface and POC reaching the seafloor in an approach cross-validation without one. We used the

*tidy*package function ranger R (*83*) in regression mode, growing 300 trees and setting the “mtry” parameter to one-third of the total number of items (number of sinking pelagic ASVs). Linear models were used to measure the performance of the predictive models. Finally, we used partial sparse least squares regression [mixOmics R package (*84*)] identify pelagic ASVs detected in sediments that are best correlated with variation in POC export and POC reaching the seafloor and with primary productivity and latitudes. We then focused on pelagic ASVs that were reported with a correlation coefficient greater than 0.3 with POC export and POC reaching the seabed and presented them in a clustered heatmap.
Comments are closed.